Common: Introduce StopAllIteration filter status for decoding and encoding filters (#5954)
mattklein123 merged 46 commits into envoyproxy:master
Conversation
Signed-off-by: Yang Song <yasong@google.com>
Signed-off-by: Yang Song <yasong@google.com>
I only include the decoding-path change in this PR. Because the code change is very simple, I suspect I must have missed some cases, so it's better to get feedback before I stray too far from the right path. ;) The plan is to change the decoding path first, then change the existing filters (I think all the impacted filters are decoding filters?). After that, I will modify the metadata decoding path accordingly. WDYT?

cc @birenroy

Signed-off-by: Yang Song <yasong@google.com>

@soya3129 nit: do you mind adding a more expressive PR title, e.g.

Oops. Thanks for the reminder! Fixed.
mattklein123
left a comment
Very nice. A few comments to get started. Thank you!
/wait
include/envoy/http/filter.h (outdated):
    // rate-limiting filter can be added before the filter that can return StopAllTypesIteration.
    // Only used in decoding path.
    // TODO(soya3129): add tests for encoding path, and remove the decoding path condition.
    StopAllTypesIteration

nit: I would call this StopAllIteration personally.
Also, I think we likely need StopAllIterationAndBuffer as well as StopAllIterationAndWatermark or something along those lines to match the options we give on data? WDYT?
    static_cast<const void*>(this));
    ASSERT(stopped_);
    stopped_ = false;
    stopped_all_ = false;

nit: Instead of stopped_ and stopped_all_, can we make this a tri-state enum? I think it will help make the logic clearer.
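For illustration, a tri-state enum along these lines (names are hypothetical, not the PR's actual code) could replace the stopped_ / stopped_all_ boolean pair while still answering both questions the connection manager asks:

```cpp
#include <cassert>

// Hypothetical tri-state replacing the stopped_ / stopped_all_ boolean pair.
enum class IterationState {
  Continue,     // iteration is not stopped
  StopSingle,   // stopped for the current frame type only
  StopAllTypes, // stopped for headers, data, and trailers
};

// The two predicates the connection manager needs, expressed over one enum.
inline bool stopped(IterationState s) { return s != IterationState::Continue; }
inline bool stoppedAll(IterationState s) { return s == IterationState::StopAllTypes; }
```

A single enum makes the invalid state (stopped_all_ set while stopped_ is clear) unrepresentable.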
    return false;
    } else if (status == FilterHeadersStatus::StopAllTypesIteration) {
    stopped_ = true;
    stopped_all_ = true;

Where do we look at stopped_all_ to determine not to call any filter callbacks? I don't see that. This is also the place where, for data, we will have to decide between buffer and watermark processing?
I checked the stopped_all_ value at lines 1597 and 1623 in conn_manager_impl.cc. Let's say there are 3 filters:
filter1 ----- filter2 ----- filter3
where filter2::decodeHeaders() returns StopAllIteration. We still allow data to reach filter1::decodeData() and filter2::decodeData(), but will not call filter3::decodeData(). This is achieved by checking filter2's stopped_all_ value at line 1597: if it is true, we buffer the data but do not call commonContinue(), and because commonHandleAfterDataCallback() returns false, filter3::decodeData() is not called.
The same logic applies to trailers, where the stopped_all_ value is checked at line 1623. But I could be completely wrong; please let me know. Thanks!
OK I see. I think though we don't want to call any other filter2 methods, including data and trailers. This is how most filters would use this I think, including auth, rate limit, etc. where they make an outbound call as part of the decodeHeaders() path. Right?
Assuming ^ is what we want, I think this means that you need to check stop all status before calling filter functions, and then also handle buffer/watermark for data?
Ah, I see! I mistakenly thought we only skipped callbacks for the filters after filter2, not filter2 itself. Will make the change. Thanks!
About stop-all options for data: I thought we wanted stop-all for headers only, at least for now? (#5842 (comment)). But it makes sense to be consistent with data as well. Will make the change and address the other comments as well. Thanks for the feedback!
About stop-all options for data. I thought we want stop-all for headers only at least for now?
We do want it only for headers, but what I'm saying is that we need to tell the HCM how to handle subsequent data. Do we buffer up to the limit and then error? Or do we buffer and try to apply flow control? Those are the two options.
Sorry, I misread the comment!
My original thought was that if users don't want to buffer large data, or they want the flow-control feature, they could add a rate-limiting filter before the filter that can return StopAllIteration for headers:
"// To avoid buffering large amount of data, a rate-limiting filter can be added before the filter that
// can return StopAllTypesIteration."
If users don't care, we can buffer up to the limit and then error. But I think I forgot to set decoder_filters_streaming_ to false in this case. Sorry, I will fix it.
Do you think it's reasonable to have users add a rate-limiting filter, and otherwise fail when the limit is reached?
Do you think it's reasonable to have users add a rate-limiting filter, and otherwise fail when the limit is reached?

I don't think so. I think it would be more natural, given the rest of the API, to offer StopAllIterationAndBuffer and StopAllIterationAndWatermark; then, given the return code, subsequent data can be handled in either the streaming or the buffered way, just as with the data return codes. I don't think it would be very hard to implement. WDYT?
I see! Is the suggestion to introduce StopAllIterationAndBuffer and StopAllIterationAndWatermark as decodeHeaders()'s return statuses (not as data statuses)? I think that totally makes sense! Sorry for being slow!! I will change accordingly.
Yup exactly! I think these are new headers return codes, and they guide what we do when data comes in. You aren't being slow, this is very tough stuff! :)
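The agreement above can be summarized with a toy model (the enum mirrors the codes being proposed; the helper name and simplified types are made up for illustration): the new headers return codes stop iteration for all frame types at the returning filter, filters earlier in the chain still see data frames, and the Buffer/Watermark distinction only controls what the HCM does with the data it is holding.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sketch of the headers return codes discussed above.
enum class FilterHeadersStatus {
  Continue,
  StopIteration,
  StopAllIterationAndBuffer,    // buffer subsequent data; error past the buffer limit
  StopAllIterationAndWatermark, // buffer subsequent data; apply watermark flow control
};

// Which filters' decodeData() runs for a data frame while the filter at
// `stop_index` is stopped with a StopAll status: only the filters strictly
// before it. The frame is then held at the stopping filter until
// continueDecoding() is called.
std::vector<std::string> dataRecipients(const std::vector<std::string>& filters,
                                        std::size_t stop_index) {
  std::vector<std::string> out;
  for (std::size_t i = 0; i < filters.size() && i < stop_index; ++i) {
    out.push_back(filters[i]);
  }
  return out;
}
```

With filter2 (index 1) stopped, only filter1 receives the data frame, matching the corrected understanding in the thread above.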
include/envoy/http/filter.h (outdated):
    // that filter stop processing. The filters before are not impacted. continueDecoding() MUST be
    // called if continued filter iteration is desired. To avoid buffering large amount of data, a
    // rate-limiting filter can be added before the filter that can return StopAllTypesIteration.
    // Only used in decoding path.

It would be nice to add a line on the use case for this type of status.
Fixed. Thanks for the feedback.
Signed-off-by: Yang Song <yasong@google.com>
Signed-off-by: Yang Song <yasong@google.com>
Signed-off-by: Yang Song <yasong@google.com>
Signed-off-by: Yang Song <yasong@google.com>
Signed-off-by: Yang Song <yasong@google.com>
Signed-off-by: Yang Song <yasong@google.com>
I am still working on enabling h2 for one of the tests, but I figure I should ask for another review in case I am off track again. :) Thanks!
Signed-off-by: Yang Song <yasong@google.com>
Signed-off-by: Yang Song <yasong@google.com>
Signed-off-by: Yang Song <yasong@google.com>
mattklein123
left a comment
Thanks this definitely looks like it's on the right track. Great work! A couple of questions.
/wait
    stopped_ = true;
    return false;
    } else if (status == FilterHeadersStatus::StopAllIterationAndBuffer) {
    stopped_ = true;

Can we merge stopped_ and stop_all_state_ into a single enum? I think it will make the overall flow easier to understand. Is that possible?
    entry = decoder_filters_.begin();
    } else {
    entry = std::next(filter->entry());
    if ((*(filter->entry()))->iterate_from_current_filter_) {

Is there a way to remove iterate_from_current_filter_ to make the logic simpler? I'm wondering if we can check the current filter's stop state and, if stopped, just do the common after-data-stop logic?
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

Sorry about the delay! I am working on the comments. Thanks!
Signed-off-by: Yang Song <yasong@google.com>
Signed-off-by: Yang Song <yasong@google.com>
    void addStreamEncoderFilterWorker(StreamEncoderFilterSharedPtr filter, bool dual_filter);
    void chargeStats(const HeaderMap& headers);
    // Returns the encoder filter to start iteration with. If the function is called from a filter
    // that should always iterate from the next filter, always_start_next should be set to true.

always_start_next is not one of the arguments.
    commonEncodePrefix(ActiveStreamEncoderFilter* filter, bool end_stream,
    FilterIterationStartState filter_iteration_start_state);
    // Returns the decoder filter to start iteration with. If the function is called from a filter
    // that should always iterate from the next filter, always_start_next should be set to true.

always_start_next is not one of the arguments.
    void maybeEndEncode(bool end_stream);
    uint64_t streamId() { return stream_id_; }
    // Returns true if filter has stopped iteration for all frame types. Otherwise, returns false.
    // filter_streaming is the variable to indicate if stream is streaming.

Indicate that |filter_streaming| is an output parameter.
    size_t added_size_ = 0;
    size_t buffer_limit_ = 0;
    bool watermark_enabled_ = false;
    bool first_trigger_ = false;

nit: rename to is_first_trigger to reflect that it's a boolean.
    ActiveStreamDecoderFilter* filter, FilterIterationStartState filter_iteration_start_state) {
    if (!filter) {
    return decoder_filters_.begin();
    } else {

The function has already early-returned, so the else block isn't needed.
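The suggested shape, shown on a stand-in function rather than the actual conn_manager_impl.cc code:

```cpp
#include <cassert>
#include <iterator>
#include <vector>

using FilterList = std::vector<int>;

// Stand-in for the quoted logic: once the null case early-returns,
// the remaining code needs no `else` wrapper.
FilterList::iterator startIterator(FilterList& filters, FilterList::iterator* filter) {
  if (filter == nullptr) {
    return filters.begin();
  }
  return std::next(*filter); // no `else` after the early return
}
```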
Signed-off-by: Yang Song <yasong@google.com>
Signed-off-by: Yang Song <yasong@google.com>
danzh2010
left a comment
A few more nits in the tests. Looks good otherwise!
    ASSERT(entry_content != nullptr && entry_added != nullptr);
    content_size_ = std::stoul(std::string(entry_content->value().getStringView()));
    added_size_ = std::stoul(std::string(entry_added->value().getStringView()));
    Http::HeaderEntry* entry_is_first_trigger =

Misfire of s/first_trigger/is_first_trigger?

Not fixed. Is the name entry_is_first_trigger a misfire of s/first_trigger/is_first_trigger?

No. entry_is_first_trigger means the variable is the entry for the "is_first_trigger" header key.
    } else {
    // Create a new timer to try again later.
    delay_timer_->disableTimer();
    delay_timer_.reset();

These 2 lines are not needed:
    delay_timer_->disableTimer();
    delay_timer_.reset();
    // If decodeHeaders() returns StopAllIterationAndBuffer, triggers the timer when all the
    // request data has been received. If decodeHeaders() returns StopAllIterationAndWatermark,
    // triggers the timer when received data exceed buffer limit.
    if ((content_size_ > 0 &&

What would happen if "buffer_limit" or "content_size" is 0 in the request headers? Will the pipeline stop forever?

I think buffer_limit should not be 0; otherwise that says we don't want to buffer any data. content_size can be 0; it just means we send no data. :)

Can you add a DCHECK to prevent "buffer_limit" from being set to 0? Though we have full control over the test input, it's better to fail the test early for easy debugging.
If "content_size" is 0, according to my reading of the if condition, continueDecoding() will never be called. Do we need content_size > 0 there?

This filter is designed to call continueDecoding() when all the data has been received. It exists to ease testing of the StopAll status, not to check whether continueDecoding() is called. I think it should be OK since it's test-only. :)
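The trigger condition being discussed reads as a small predicate. This is a sketch of the test filter's logic as described in the thread (names are illustrative, not the actual test source):

```cpp
#include <cassert>
#include <cstddef>

// When does the test filter's timer fire and call continueDecoding()?
// Buffer mode: once all `content_size` bytes have arrived.
// Watermark mode: once the buffered bytes exceed `buffer_limit`.
bool shouldContinue(bool watermark_mode, std::size_t received,
                    std::size_t content_size, std::size_t buffer_limit) {
  if (watermark_mode) {
    return received > buffer_limit;
  }
  // As noted above, with content_size == 0 this branch never fires.
  return content_size > 0 && received >= content_size;
}
```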
    } else {
    EXPECT_EQ(count_ * size_ + added_decoded_data_size_ * 1, upstream_request_->bodyLength());
    }
    EXPECT_EQ(true, upstream_request_->complete());

The check after getting the response is repeated in the tests above. Can you move the response verification into a helper function and call it at the end of each test?

Helper function added. Thanks!
    TEST_P(DownstreamProtocolIntegrationTest, testDecodeHeadersReturnsStopAll) {
    config_helper_.addFilter(R"EOF(
    name: call-decodedata-once-filter
    )EOF");

Can you add a comment on why call-decodedata-once-filter's expectation can be met after decode-headers-return-stop-all-filter?

I have comments in call_decodedata_once_filter.cc that explain the number. Basically, depending on where decodeData is triggered, either from decodeTrailers()'s addDecodedData() or from the filter's decodeData() callback, the data received can be different.
    name: decode-headers-return-stop-all-filter
    )EOF");
    config_helper_.addFilter(R"EOF(
    name: passthrough-filter

Why do we need a passthrough-filter at the beginning?

This is to test that we can stop all iteration when there is another filter at the front of the filter chain. It's not required; it's just for testing. :)
    name: passthrough-filter
    )EOF");

    // Sets initial stream window to min value to make the client sensitive to a low watermark.

s/to make the client sensitive to a low watermark/to make sure the client initially sends less data than the buffer_limit

I think it may not guarantee the client sends less than buffer_limit. I think when the watermark is reached, we stop reading, but the client can still send. :)

I think it may not guarantee the client sends less than buffer_limit.

Even not at the beginning of sending? I thought we used the small flow-control window to prevent the client from sending more than the buffer_limit, in order to let the filter start buffering.

When the buffer limit is reached, we stop reading. But that can't guarantee the client sends less than buffer_limit.
mattklein123
left a comment
Thanks, this looks awesome; just 2 small nits. Really love how much we were able to simplify and clarify this. Will defer to @alyssawilk and @danzh2010 for further review and merge.
    ~ActiveStream();

    // Indicates which filter to start the iteration with.
    enum class FilterIterationStartState { Always_start_from_next, Can_start_from_current };

nit: AlwaysStartFromNext and CanStartFromCurrent
    void addStreamDecoderFilterWorker(StreamDecoderFilterSharedPtr filter, bool dual_filter);
    void addStreamEncoderFilterWorker(StreamEncoderFilterSharedPtr filter, bool dual_filter);
    void chargeStats(const HeaderMap& headers);
    // Returns the encoder filter to start iteration with. If the function is called from a filter

nit: now with the enum, I think the 2nd part of this comment is obvious and can be removed; same below.
Signed-off-by: Yang Song <yasong@google.com>
@@ -1044,9 +1048,29 @@ ConnectionManagerImpl::ActiveStream::commonEncodePrefix(ActiveStreamEncoderFilte
    if (!filter) {

This block can be combined with the if block above.
    namespace Envoy {

    // A filter that only allows decodeData() to be called once with fixed data length.

Where do we guarantee that decodeData() can only be called once?
Signed-off-by: Yang Song <yasong@google.com>
The good news is this is ready to land. Hurrah! The bad news is our release got pushed back from last week to this week, and after some discussion on #envoy-maintainers, since we're close enough to cutting this release (and doing our first round of security patches), I think we'll merge this first thing after the release is cut (theoretically Friday). Sorry about the additional delay, and thanks again for all your work getting this tricky feature ready to land!

Thank you so much @soya3129 for your awesome work on this change. I know it took a lot of iterations but I'm very happy with how it turned out. We will get this merged ASAP!

Thanks for the opportunity and all the reviews! I learned a lot. :)

@soya3129 can you merge master and we can get this in? Thank you! /wait
Signed-off-by: Yang Song <yasong@google.com>
* master: (137 commits)
  test: router upstream log to v2 config stubs (envoyproxy#6499)
  remove idle timeout validation (envoyproxy#6500)
  build: Change namespace of chromium_url. (envoyproxy#6506)
  coverage: exclude chromium_url (envoyproxy#6498)
  fix(tracing): allow 256 chars in path tag (envoyproxy#6492)
  Common: Introduce StopAllIteration filter status for decoding and encoding filters (envoyproxy#5954)
  build: update PGV url (envoyproxy#6495)
  subset lb: avoid partitioning host lists on worker threads (envoyproxy#6302)
  ci: Make envoy_select_quiche no-op. (envoyproxy#6393)
  watcher: notify when watched files are modified (envoyproxy#6215)
  stat: Add counterFromStatName(), gaugeFromStatName(), and histogramFromStatName() (envoyproxy#6475)
  bump to 1.11.0-dev (envoyproxy#6490)
  release: bump to 1.10.0 (envoyproxy#6489)
  hcm: path normalization. (#1)
  build: import manually minified Chrome URL lib. (envoyproxy#3)
  codec: reject embedded NUL in headers. (envoyproxy#2)
  Added veryfication if path contains query params and add them to path header (envoyproxy#6466)
  redis: basic integration test for redis_proxy (envoyproxy#6450)
  stats: report sample count as an integer to prevent loss of precision (envoyproxy#6274)
  Added VHDS protobuf message and updated RouteConfig to include it. (envoyproxy#6418)
  ...

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Signed-off-by: Yang Song <yasong@google.com>
Description: Created new filter return statuses for headers: StopAllIterationAndBuffer and StopAllIterationAndWatermark (originally proposed as StopAllTypesIteration). Returning one of these statuses means the filter and the filters following it stop iteration on not only headers but also data and trailers; filters earlier in the chain are not impacted.
Risk Level: High. HCM refactor. Should be a no-op as no filters return the new status but a lot of code is changing.
Testing: Integration test for decoding path.
Docs Changes: Comments added for the new status.
#5842
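As a rough illustration of the intended use case (the class, the simplified signature, and the callback comment are hypothetical, not Envoy's actual Http::StreamDecoderFilter API): a filter that needs an out-of-band result, such as auth or rate limiting, can stop all iteration while the call is in flight and resume later via continueDecoding().

```cpp
#include <cassert>

// Simplified stand-ins for Envoy's filter API.
enum class FilterHeadersStatus { Continue, StopAllIterationAndBuffer };

class AuthCheckFilter {
public:
  // Hypothetical decodeHeaders(): stop everything until the auth check finishes.
  FilterHeadersStatus decodeHeaders(bool auth_cached) {
    if (auth_cached) {
      return FilterHeadersStatus::Continue;
    }
    waiting_ = true; // a real filter would start an async call here and
                     // call decoder_callbacks_->continueDecoding() on completion
    return FilterHeadersStatus::StopAllIterationAndBuffer;
  }
  bool waiting() const { return waiting_; }

private:
  bool waiting_ = false;
};
```

While the filter is waiting, no data or trailer callbacks fire for it or any later filter; incoming data is buffered (or flow-controlled, with the Watermark variant) by the connection manager.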