[http1] Buffer pending http/1 body before dispatching to the filter chain#10406

Merged
mattklein123 merged 19 commits into envoyproxy:master from antoniovicente:h1_dispatch_buffered_body
Apr 3, 2020
Conversation

@antoniovicente
Contributor

Description: Optimize http/1 request and response body processing by accumulating the body into a buffer during parse, and do a single dispatch call for the buffered body either after the last byte of the body is processed by http_parser or after the http_parser_execute call completes. A possible future optimization would be to direct dispatch from the read buffer to avoid a copy, but doing so would require some changes to the parser API to correctly account for body bytes that are dispatched directly.
Risk Level: low
Testing: unit
Docs Changes: n/a
Release Notes: n/a

…nd do a single dispatch call for the buffered body either after the http_parser_execute call completes or after the last byte of the body is processed.

Signed-off-by: Antonio Vicente <avd@google.com>
add missing comments
rename buffered_body_ and associated methods

Signed-off-by: Antonio Vicente <avd@google.com>
@antoniovicente
Contributor Author

CI bazel compile_time_options: There seems to be a namespace "data" under the "envoy" namespace in the v2 and v3 API proto definitions, which interacts poorly with arguments named "data".

Example "package envoy.data.core.v3" in api/envoy/data/core/v3/health_check_event.proto

Changing argument name to work around it.

…e envoy::data in bazel compile_time_options CI build.

Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
Contributor

@yanavlasov left a comment

This looks good to me. I think a protocol integration test would be good for this case. Although it would have to be limited to H1 downstreams for now.

@antoniovicente
Contributor Author

> This looks good to me. I think a protocol integration test would be good for this case. Although it would have to be limited to H1 downstreams for now.

What kind of test do you have in mind? Something that verifies that at least some of the chunks are consolidated when proxying H1 transfer-encoding chunked? I can look into it, the difficulty is in getting raw data from the upstream or downstream in order to verify the transformations performed by the proxy.

@yanavlasov
Contributor

> This looks good to me. I think a protocol integration test would be good for this case. Although it would have to be limited to H1 downstreams for now.

> What kind of test do you have in mind? Something that verifies that at least some of the chunks are consolidated when proxying H1 transfer-encoding chunked? I can look into it, the difficulty is in getting raw data from the upstream or downstream in order to verify the transformations performed by the proxy.

Yes, I see your point. It would be difficult to control that body arrives in multiple reads. My idea was to have integration test with small read buffer and large POST and just verify that it succeeds. We have a test like this for H2 requests. Http2IntegrationTest.RouterRequestAndResponseWithBodyNoBuffer But we do not have anything like this for H1.

@antoniovicente
Contributor Author

> This looks good to me. I think a protocol integration test would be good for this case. Although it would have to be limited to H1 downstreams for now.

> What kind of test do you have in mind? Something that verifies that at least some of the chunks are consolidated when proxying H1 transfer-encoding chunked? I can look into it, the difficulty is in getting raw data from the upstream or downstream in order to verify the transformations performed by the proxy.

> Yes, I see your point. It would be difficult to control that body arrives in multiple reads. My idea was to have integration test with small read buffer and large POST and just verify that it succeeds. We have a test like this for H2 requests. Http2IntegrationTest.RouterRequestAndResponseWithBodyNoBuffer But we do not have anything like this for H1.

I'll look into what large transfers with and without content length exist in tests. I think we probably have decent coverage but it wouldn't hurt to double check.

…red_body

Signed-off-by: Antonio Vicente <avd@google.com>
e2e HTTP benchmark

Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
@antoniovicente
Contributor Author

> This looks good to me. I think a protocol integration test would be good for this case. Although it would have to be limited to H1 downstreams for now.

> What kind of test do you have in mind? Something that verifies that at least some of the chunks are consolidated when proxying H1 transfer-encoding chunked? I can look into it, the difficulty is in getting raw data from the upstream or downstream in order to verify the transformations performed by the proxy.

> Yes, I see your point. It would be difficult to control that body arrives in multiple reads. My idea was to have integration test with small read buffer and large POST and just verify that it succeeds. We have a test like this for H2 requests. Http2IntegrationTest.RouterRequestAndResponseWithBodyNoBuffer But we do not have anything like this for H1.

IntegrationTest.RouterRequestAndResponseWithBodyNoBuffer covers the same for H1, but it only covers the chunk-encoded case, not bodies sent with a content-length header, nor large bodies when buffer limits are higher than 16KiB. Coverage with content-length seems somewhat sporadic; at least I haven't found how e2e tests tell the framework to add a content-length header.

The other problem with these tests is that they only verify the body length, not the actual body bytes proxied. The body is just a large string of 'a', so if two sections were swapped the test wouldn't notice.

I added test cases that cover proxying requests and responses with content length and large body with a larger buffer. And a basic HTTP benchmark.

> I'll look into what large transfers with and without content length exist in tests. I think we probably have decent coverage but it wouldn't hurt to double check.

@htuch
Member

htuch commented Mar 23, 2020

@antoniovicente I know we've had multiple conversations around batching to reduce extra work done during the processing of requests. This PR seems quite a big hammer for HTTP/1, so before diving into it deeply I'd like to get a sense for the intuition. Some questions:

  1. Does this prohibit HTTP/1 response streaming? If we're going to buffer the entire thing, that doesn't seem compatible with HTTP/1 long poll streams, e.g. where some browser script is connected to a remote service and is consuming data line-by-line as it is produced. I realize there are better technologies for this today, including Websockets and H2, but I think there is a class of HTTP/1 use that resembles this.
  2. How does this interact with retries? I think we already have some whole request buffering for this purpose, in particular gRPC retries (maybe the HTTP/1 is a downstream client being transcoded to gRPC).
  3. Is there an HTTP/2 analogy planned?

@antoniovicente
Contributor Author

> @antoniovicente I know we've had multiple conversations around batching to reduce extra work done during the processing of requests. This PR seems quite a big hammer for HTTP/1, so before diving into it deeply I'd like to get a sense for the intuition. Some questions:
>
> 1. Does this prohibit HTTP/1 response streaming? If we're going to buffer the entire thing, that doesn't seem compatible with HTTP/1 long poll streams, e.g. where some browser script is connected to a remote service and is consuming data line-by-line as it is produced. I realize there are better technologies for this today, including Websockets and H2, but I think there is a class of HTTP/1 use that resembles this.

Yes, streaming still works. We only buffer under the dispatch call stack. There are asserts that verify that the temporary buffer is empty as we enter and exit the dispatch loop. As we unroll after dispatch, we do the onBody upcall through the filter chain. Long polling continues to work, as does proxying of large responses while bounding internal buffers as we did before this change.

> 2. How does this interact with retries? I think we already have some whole request buffering for this purpose, in particular gRPC retries (maybe the HTTP/1 is a downstream client being transcoded to gRPC).

It should be fine, since we're still streaming. This just optimizes processing under a single dispatch call.

> 3. Is there an HTTP/2 analogy planned?

H2 is harder due to stream interleaving. Let's say "maybe".


TEST_P(Http2IntegrationTest, RouterRequestAndResponseWithBodyNoBuffer) {
testRouterRequestAndResponseWithBody(1024, 512, false);
testRouterRequestAndResponseWithBody(1024, 512, false, false);
Member

The growth of parameters is getting a bit out of hand in some of these test calls, do you think it would make sense to add an options struct?

Contributor Author

Yes, but in a different PR.

Member

OK, would be a great followup PR, thanks.

Member

@htuch left a comment

Thanks for the explanation. Implementation looks good, a bunch of testing comments.
/wait

@htuch self-assigned this Mar 24, 2020
Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
…red_body

Signed-off-by: Antonio Vicente <avd@google.com>
Member

@htuch left a comment

LGTM. @yanavlasov @alyssawilk would be great to get your thoughts on this



Contributor

@alyssawilk left a comment

Looks great! Mostly have test / comment requests

nullptr, // on_chunk_header
nullptr // on_chunk_complete
[](http_parser* parser) -> int {
const bool is_final_chunk = (parser->content_length == 0);
Contributor

no no, content length is the length of the body, see?
https://github.com/nodejs/http-parser/blob/master/http_parser.h#L307

.....
reads further: https://github.com/nodejs/http-parser/blob/master/http_parser.h#L337
.....

never mind :-(
optionally, maybe comment this field is overloaded, in case other people have concerns?

Contributor Author

Added a comment. Thanks!

Buffer::OwnedImpl expected_data1("Hello Worl");
EXPECT_CALL(decoder, decodeData(BufferEqual(&expected_data1), false));
Buffer::OwnedImpl expected_data2("d");
EXPECT_CALL(decoder, decodeData(BufferEqual(&expected_data2), false));
Contributor

can you move these down to after the first dispatch to make the order of dispatch and decodes more clear?

Contributor Author

Done

if (data.length() > 0) {
for (const Buffer::RawSlice& slice : data.getRawSlices()) {
total_parsed += dispatchSlice(static_cast<const char*>(slice.mem_), slice.len_);
if (HTTP_PARSER_ERRNO(&parser_) != HPE_OK) {
Contributor

do we have tests for garbled chunks which would make sure we don't dispatch buffered body when we fail parsing mid-dispatch? Or do we dispatch given the break rather than return? I'd think we would want to halt work.

Contributor Author

The parser can only be in 2 states when we hit this statement: HPE_OK or HPE_PAUSED.
The reason is that other error codes throw an exception in ConnectionImpl::dispatchSlice which unrolls the stack to the HCM.

I had written an invalid chunk headers test to cover this, but it seems to have disappeared due to a bad merge... See codec_impl_test.cc Http1ServerConnectionImplTest.InvalidChunkHeader

Thanks for catching this!

break;
}
}
dispatchBufferedBody();
Contributor

if we hit a parser error above and HTTP_PARSER_ERRNO(&parser_) != HPE_OK don't we break out of the for loop and then fail the assert in dispatchBufferedBody?

Contributor Author

The ASSERT is avoided due to exceptions. As we remove exceptions, the ASSERT should help us make sure we end up handling this case correctly.

if (is_final_chunk) {
// Dispatch body before parsing trailers, so body ends up dispatched even if an error is found
// while processing trailers.
dispatchBufferedBody();
Contributor

lazy ask: we have a test for this?

Contributor Author

Http1ServerConnectionImplTest.Http11InvalidTrailerPost

I'm not 100% sure if this is actually necessary. If we find a trailer error, an exception is thrown even if trailer proxying is disabled. Pushing the body through or throwing it away when trailer processing fails seems about equally good. Thoughts?

Contributor

Mainly I wanted to make sure we regression tested that all body was sent before trailers in the happy path. I don't have strong opinions about making sure all data gets processed in case of errors.

Contributor Author

I enhanced Http1ServerConnectionImplTest.RequestWithTrailersKept to verify that body is delivered before trailers.

The call to dispatchBufferedBody() in ConnectionImpl::onMessageCompleteBase is enough to accomplish that.

* Called when body data is received.
* @param data supplies the start address.
* @param length supplies the length.
* Called when body data is available for processing.
Contributor

how about some comments here or on our buffered data when we buffer and when we process, and why we do it this way?

Contributor Author

Done.

Signed-off-by: Antonio Vicente <avd@google.com>
- Add some comments to codec_impl and test
- Move test expectations down to improve clarity
- Tighten chunked body with trailers test

Signed-off-by: Antonio Vicente <avd@google.com>
…error is found while processing trailers.

Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
@alyssawilk previously approved these changes Mar 30, 2020
Contributor

@alyssawilk left a comment

Looks good! Yan: any final comments on your end?


…red_body

Signed-off-by: Antonio Vicente <avd@google.com>
@yanavlasov
Contributor

Looks good to me. Thanks for adding these integration tests.

@mattklein123
Member

/azp run envoy-presubmit

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mattklein123 self-assigned this Apr 2, 2020
@mattklein123 merged commit 17e5f5c into envoyproxy:master Apr 3, 2020
@rgs1
Member

rgs1 commented Apr 5, 2020

Hmm picked up this on Fri and earlier today I saw this in prod:

Caught Segmentation fault, suspect faulting address 0x0
Backtrace (use tools/stack_decode.py to get line numbers):
Envoy version: 0/1.14.0-dev//RELEASE/BoringSSL
#0: __restore_rt [0x7fa186a08890] ??:0
#1: Envoy::Network::ConnectionImpl::onLowWatermark() [0x55940ecdef7a] ??:0
#2: Envoy::Network::ConnectionImpl::closeSocket() [0x55940ecde4a8] ??:0
#3: Envoy::Network::ConnectionImpl::close() [0x55940ecde1aa] ??:0
#4: Envoy::Http::Http1::ConnPoolImpl::onResponseComplete() [0x55940ee5d145] ??:0
#5: Envoy::Http::ResponseDecoderWrapper::decodeHeaders() [0x55940ee5d74c] ??:0
#6: Envoy::Http::Http1::ClientConnectionImpl::onMessageComplete() [0x55940ef78073] ??:0
#7: Envoy::Http::Http1::ConnectionImpl::onMessageCompleteBase() [0x55940ef75945] ??:0
#8: Envoy::Http::Http1::ConnectionImpl::$_8::__invoke() [0x55940ef79b8d] ??:0
#9: http_parser_execute [0x55940f1553e6] ??:0
#10: Envoy::Http::Http1::ConnectionImpl::dispatchSlice() [0x55940ef74a17] ??:0
#11: Envoy::Http::Http1::ConnectionImpl::dispatch() [0x55940ef7477f] ??:0
#12: Envoy::Http::CodecClient::onData() [0x55940eedf248] ??:0
#13: Envoy::Http::CodecClient::CodecReadFilter::onData() [0x55940eee007d] ??:0
#14: Envoy::Network::FilterManagerImpl::onContinueReading() [0x55940ece49c9] ??:0
#15: Envoy::Network::ConnectionImpl::onReadReady() [0x55940ecdfee7] ??:0
#16: Envoy::Network::ConnectionImpl::onFileEvent() [0x55940ecdf20d] ??:0
#17: Envoy::Event::FileEventImpl::assignEvents()::$_0::__invoke() [0x55940ecda5f5] ??:0
#18: event_process_active_single_queue [0x55940f14eea4] ??:0
#19: event_base_loop [0x55940f14d75e] ??:0
#20: Envoy::Server::WorkerImpl::threadRoutine() [0x55940eccf1ff] ??:0
#21: Envoy::Thread::ThreadImplPosix::ThreadImplPosix()::$_0::__invoke() [0x55940f206055] ??:0
#22: start_thread [0x7fa1869fd6db] ??:0

cc: @antoniovicente @alyssawilk

@mattklein123
Member

@rgs1 can you open a fresh issue on this? Also, what commit range did you deploy? I think this is probably a regression from #10561. cc @euroelessar

@rgs1 mentioned this pull request Apr 5, 2020
@rgs1
Member

rgs1 commented Apr 5, 2020

Sure -- #10655.
