[http1] Buffer pending http/1 body before dispatching to the filter chain #10406
Conversation
…nd do a single dispatch call for the buffered body either after the http_parser_execute call completes or after the last byte of the body is processed. Signed-off-by: Antonio Vicente <avd@google.com>
add missing comments rename buffered_body_ and associated methods Signed-off-by: Antonio Vicente <avd@google.com>
CI bazel compile_time_options: There seems to be a namespace "data" under the "envoy" namespace in v2 and v3 API proto definitions which interacts poorly with arguments named "data". Example: "package envoy.data.core.v3" in api/envoy/data/core/v3/health_check_event.proto. Changing the argument name to work around it.
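The collision can be reproduced outside of protobuf-generated code with a minimal sketch (names like `process` and `Event` are hypothetical, not part of the Envoy API):

```cpp
#include <cassert>

// Minimal illustration of the collision described above: once a
// nested namespace called "data" exists, a function parameter also
// named "data" shadows the namespace inside that function, so
// qualified names like data::Event stop resolving. Renaming the
// parameter avoids the ambiguity.
namespace envoy {
namespace data {
struct Event {
  int id;
};
} // namespace data

int process(int payload) {
  // If this parameter were named "data" instead of "payload", the
  // next line would fail to compile: "data" would name the parameter,
  // not the namespace envoy::data.
  data::Event event{payload};
  return event.id * 2;
}
} // namespace envoy
```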
…e envoy::data in bazel compile_time_options CI build. Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
yanavlasov
left a comment
This looks good to me. I think a protocol integration test would be good for this case. Although it would have to be limited to H1 downstreams for now.
What kind of test do you have in mind? Something that verifies that at least some of the chunks are consolidated when proxying H1 transfer-encoding chunked? I can look into it; the difficulty is in getting raw data from the upstream or downstream in order to verify the transformations performed by the proxy.
Yes, I see your point. It would be difficult to control that the body arrives in multiple reads. My idea was to have an integration test with a small read buffer and a large POST and just verify that it succeeds. We have a test like this for H2 requests: Http2IntegrationTest.RouterRequestAndResponseWithBodyNoBuffer. But we do not have anything like this for H1.
I'll look into what large transfers with and without content length exist in tests. I think we probably have decent coverage, but it wouldn't hurt to double check.
…red_body Signed-off-by: Antonio Vicente <avd@google.com>
e2e HTTP benchmark Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
IntegrationTest.RouterRequestAndResponseWithBodyNoBuffer covers the same for H1. The other problem with these tests is that they only verify the body length, not the actual body bytes proxied. The body is just a large string of 'a', so if the code were to swap two sections the test wouldn't notice. I added test cases that cover proxying requests and responses with content length and a large body with a larger buffer. And a basic HTTP benchmark.
@antoniovicente I know we've had multiple conversations around batching to reduce extra work done during the processing of requests. This PR seems like quite a big hammer for HTTP/1, so before diving into it deeply I'd like to get a sense for the intuition. Some questions:
Yes, streaming still works. We only buffer under the dispatch call stack. There are asserts that verify that the temporary buffer is empty as we enter and exit the dispatch loop. As we unroll after dispatch, we do the onBody upcall through the filter chain. Long polling continues to work, as does proxying of large responses while bounding internal buffers as we did before this change.
It should be fine, since we're still streaming. This just optimizes processing under a single dispatch call.
H2 is harder due to stream interleaving. Let's say "maybe".
TEST_P(Http2IntegrationTest, RouterRequestAndResponseWithBodyNoBuffer) {
  testRouterRequestAndResponseWithBody(1024, 512, false);
  testRouterRequestAndResponseWithBody(1024, 512, false, false);
The growth of parameters is getting a bit out of hand in some of these test calls, do you think it would make sense to add an options struct?
Yes, but in a different PR.
OK, would be a great followup PR, thanks.
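A sketch of what such an options struct might look like (field names and the helper are hypothetical, not Envoy's actual test API):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical parameter struct replacing the growing list of
// positional arguments to testRouterRequestAndResponseWithBody.
// Defaults let call sites override only the fields under test.
struct BodyTestParams {
  uint64_t request_size = 1024;
  uint64_t response_size = 512;
  bool big_header = false;
  bool set_content_length_header = true;
};

// Illustrative helper showing how the struct is consumed.
uint64_t totalBodyBytes(const BodyTestParams& params) {
  return params.request_size + params.response_size;
}
```

Call sites could then pass `BodyTestParams{}` and set only the fields they care about, so adding a new option no longer requires touching every caller.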
htuch
left a comment
Thanks for the explanation. Implementation looks good, a bunch of testing comments.
/wait
Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
…red_body Signed-off-by: Antonio Vicente <avd@google.com>
htuch
left a comment
LGTM. @yanavlasov @alyssawilk would be great to get your thoughts on this
alyssawilk
left a comment
Looks great! Mostly have test / comment requests
nullptr, // on_chunk_header
nullptr  // on_chunk_complete
[](http_parser* parser) -> int {
  const bool is_final_chunk = (parser->content_length == 0);
no no, content length is the length of the body, see?
https://github.com/nodejs/http-parser/blob/master/http_parser.h#L307
.....
reads further: https://github.com/nodejs/http-parser/blob/master/http_parser.h#L337
.....
never mind :-(
optionally, maybe add a comment that this field is overloaded, in case other people have concerns?
Added a comment. Thanks!
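For readers following the thread: during chunked transfer decoding, http_parser overloads its `content_length` field to hold the size of the chunk whose header was just parsed, which is why `content_length == 0` in `on_chunk_header` identifies the terminating chunk. A toy model of that check (not http_parser itself):

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Chunk sizes are hexadecimal per RFC 7230 section 4.1; this mimics
// parsing the chunk-size line of a chunked body.
uint64_t parseChunkSize(const std::string& chunk_size_line) {
  return std::stoull(chunk_size_line, nullptr, 16);
}

// A zero-length chunk ("0\r\n\r\n") terminates the body, mirroring
// the parser->content_length == 0 test in the on_chunk_header
// callback above.
bool isFinalChunk(const std::string& chunk_size_line) {
  return parseChunkSize(chunk_size_line) == 0;
}
```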
Buffer::OwnedImpl expected_data1("Hello Worl");
EXPECT_CALL(decoder, decodeData(BufferEqual(&expected_data1), false));
Buffer::OwnedImpl expected_data2("d");
EXPECT_CALL(decoder, decodeData(BufferEqual(&expected_data2), false));
can you move these down to after the first dispatch to make the order of dispatch and decodes more clear?
if (data.length() > 0) {
  for (const Buffer::RawSlice& slice : data.getRawSlices()) {
    total_parsed += dispatchSlice(static_cast<const char*>(slice.mem_), slice.len_);
    if (HTTP_PARSER_ERRNO(&parser_) != HPE_OK) {
do we have tests for garbled chunks which would make sure we don't dispatch buffered body when we fail parsing mid-dispatch? Or do we dispatch given the break rather than return? I'd think we would want to halt work.
The parser can only be in 2 states when we hit this statement: HPE_OK or HPE_PAUSED.
The reason is that other error codes throw an exception in ConnectionImpl::dispatchSlice which unrolls the stack to the HCM.
I had written an invalid chunk headers test to cover this, but it seems to have disappeared due to a bad merge... See codec_impl_test.cc Http1ServerConnectionImplTest.InvalidChunkHeader
Thanks for catching this!
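The invariant described above can be sketched as a simplified model (assumed shape, not Envoy's actual code): a slice dispatch either succeeds, pauses the parser, or throws, so the status checked after each `dispatchSlice` call can only be ok or paused.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the parser state after dispatching a
// slice. Hard errors throw (mirroring the codec exception that
// unrolls the stack to the HCM), so Error is never observed by the
// caller's status check.
enum class ParserStatus { Ok, Paused, Error };

std::size_t dispatchSlices(const std::vector<std::pair<std::string, ParserStatus>>& slices) {
  std::size_t total_parsed = 0;
  for (const auto& [slice, status] : slices) {
    if (status == ParserStatus::Error) {
      // Analogue of throwing CodecProtocolException mid-dispatch;
      // no buffered body is dispatched on this path.
      throw std::runtime_error("codec protocol error");
    }
    total_parsed += slice.size();
    if (status != ParserStatus::Ok) {
      break; // Paused: stop dispatching, keep bytes parsed so far.
    }
  }
  return total_parsed;
}
```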
      break;
    }
  }
  dispatchBufferedBody();
if we hit a parser error above and HTTP_PARSER_ERRNO(&parser_) != HPE_OK don't we break out of the for loop and then fail the assert in dispatchBufferedBody?
The ASSERT is avoided due to exceptions. As we remove exceptions, the ASSERT should help us make sure we end up handling this case correctly.
if (is_final_chunk) {
  // Dispatch body before parsing trailers, so body ends up dispatched even if an error is found
  // while processing trailers.
  dispatchBufferedBody();
lazy ask: we have a test for this?
Http1ServerConnectionImplTest.Http11InvalidTrailerPost
I'm not 100% sure if this is actually necessary. If we find a trailer error, an exception is thrown even if trailer proxying is disabled. Pushing the body or throwing it away when trailer processing fails seems about equally good. Thoughts?
Mainly I wanted to make sure we regression tested that all body was sent before trailers in the happy path. I don't have strong opinions about making sure all data gets processed in case of errors.
I enhanced Http1ServerConnectionImplTest.RequestWithTrailersKept to verify that body is delivered before trailers.
The call to dispatchBufferedBody() in ConnectionImpl::onMessageCompleteBase is enough to accomplish that.
 * Called when body data is received.
 * @param data supplies the start address.
 * @param length supplies the length.
 * Called when body data is available for processing.
how about adding some comments here, or on our buffered data, about when we buffer, when we process, and why we do it this way?
Signed-off-by: Antonio Vicente <avd@google.com>
- Add some comments to codec_impl and test - Move test expectations down to improve clarity - Tighten chunked body with trailers test Signed-off-by: Antonio Vicente <avd@google.com>
…error is found while processing trailers. Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
alyssawilk
left a comment
Looks good! Yan: any final comments on your end?
…red_body Signed-off-by: Antonio Vicente <avd@google.com>
Looks good to me. Thanks for adding these integration tests.
/azp run envoy-presubmit
Azure Pipelines successfully started running 1 pipeline(s).
Hmm, picked this up on Fri and earlier today I saw this in prod:
@rgs1 can you open a fresh issue on this? Also, what commit range did you deploy? I think this is probably a regression from #10561. cc @euroelessar
Sure -- #10655.
Description: Optimize http/1 request and response body processing by accumulating the body into a buffer during parse, and doing a single dispatch call for the buffered body either after the last byte of the body is processed by http_parser or after the http_parser_execute call completes. A possible future optimization would be to dispatch directly from the read buffer to avoid a copy, but doing so would require some changes to the parser API to correctly account for body bytes that are dispatched directly.
Risk Level: low
Testing: unit
Docs Changes: n/a
Release Notes: n/a
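The buffering scheme in the description can be modeled as a small sketch; class and method names here are illustrative, not Envoy's actual codec implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of the optimization: body fragments seen while under the
// http_parser dispatch call stack are accumulated into a pending
// buffer, then flushed to the filter chain in a single call when the
// last body byte is parsed or when dispatch completes.
class BufferedBodyCodec {
public:
  // Analogue of the per-fragment onBody callback: append only, no
  // upcall to the filter chain yet.
  void onBody(const std::string& fragment) { buffered_body_ += fragment; }

  // Analogue of dispatchBufferedBody: one upcall for all pending data.
  void dispatchBufferedBody() {
    if (!buffered_body_.empty()) {
      dispatched_.push_back(buffered_body_);
      buffered_body_.clear();
    }
  }

  const std::vector<std::string>& dispatched() const { return dispatched_; }

private:
  std::string buffered_body_;
  std::vector<std::string> dispatched_;
};
```

Three parser callbacks followed by one flush yield a single decodeData-style call instead of three, which is the batching win the PR aims for.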