Conversation
Signed-off-by: Dan Rosen <mergeconflict@google.com>
…flow behavior, and update tests Signed-off-by: Dan Rosen <mergeconflict@google.com>
…onnection_impl Signed-off-by: Adi Suissa-Peleg <adip@google.com>
There are many changes in this PR, mostly due to merging ~6 months of upstream work. Note that in this PR I only address the downstream buffer overflow (upstream will be handled after we decide on the right approach).
I'll add more discussion points shortly.
/cc @htuch @yanavlasov
Discussion points:
    : below_low_watermark_(below_low_watermark), above_high_watermark_(above_high_watermark),
      above_overflow_watermark_(above_overflow_watermark),
      overflow_multiplier_(Runtime::LoaderSingleton::get().threadsafeSnapshot()->getInteger(
          "buffer.overflow.high_watermark_multiplier", 2)) {}
Is a factor of 2 safe based on the maximum expansion of a request that can happen during proxying due to encoding related issues?
Some cases that come to mind:
- H2 compressed headers expanding to H1 headers. I think this is limited by max header size bytes, which is not related to watermark parameters.
- H2 connection flow control windows vs buffer watermarks.
- H2 control frames which are not limited by flow control.
- H1 to H2 chunk body conversions. I think a 1-byte H1 chunk can be encoded in either 4 or 6 bytes depending on which line termination the sender uses. The H2 data frame header is 9 bytes, so the H2 encoding of a 1-byte chunk is 10 bytes (up to 2.5x larger).
- Asymmetric read/write buffer configs. If the read buffer size is 10x the write buffer, you are very likely to hit the overflow condition after a large read.
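The size arithmetic in the chunk-conversion bullet above can be sketched as a couple of hypothetical helpers (these are illustrative only, not Envoy code; the 9-byte H2 DATA frame header overhead is from the HTTP/2 spec, and the H1 numbers assume standard chunked framing):

```cpp
#include <cassert>
#include <cstdint>

// H1 chunked encoding of an n-byte chunk: hex length line + data + line ending.
// With CRLF a 1-byte chunk is "1\r\nX\r\n" = 6 bytes; with a bare LF it is 4.
uint64_t h1ChunkSize(uint64_t payload, bool crlf) {
  uint64_t hex_digits = 1; // payload sizes < 16 need one hex digit
  for (uint64_t v = payload; v >= 16; v /= 16) {
    ++hex_digits;
  }
  const uint64_t eol = crlf ? 2 : 1; // "\r\n" vs "\n"
  return hex_digits + eol + payload + eol;
}

// H2 DATA frame: fixed 9-byte frame header plus the payload.
uint64_t h2DataFrameSize(uint64_t payload) { return 9 + payload; }
```

So a 1-byte H1 chunk (4 bytes with bare LF) becomes a 10-byte H2 frame, the 2.5x worst case cited above.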
Good points...
To add to the first bullet, should we verify that the max header size in bytes is less than or equal to the overflow watermark value?
Regarding point 5 (asymmetric read/write buffers), IIUC it also covers the case where the producer is much faster than the consumer. I think that comes down to the configuration parameters, but if there's a way to warn that a configuration might be wrong, this is an example where we should.
The max header limit excludes framing overhead. Also, you can end up writing headers and body in the same stack, so you need space for both.
I think your changes effectively make it unsafe to set the buffer high watermark to anything below around 32kb, possibly more, yet this is not enforced anywhere in the config plane. Note that there's widespread use of high watermarks of exactly 32kb.
I think the goal of high watermark is to deal with highly unusual situations where normal watermarks have failed us. Examples are where a number of connection buffers are close to their watermark, but not over, and a large read occurs, adding a bunch of data before watermarking or overload manager can respond. Or when we forget to do watermark flow control push back due to a missing feedback path (e.g. for proxy generated direct responses).
Thus, I'm not super worried about the user having to explicitly size them; they should be sizing the normal watermarks (or the knobs implying them) based on an understanding of traffic. Whenever compression is involved, e.g. zlib or HPACK, this might be tricky. Headers are a special case; in general, though, compression could be 100x. Presumably the basic watermark limit is set to reflect this.
Setting some fixed constant as the fudge factor then makes sense to provide a zero-tune capability here IMHO; it probably doesn't matter as long as it's some reasonable distance from the normal operational limit.
void WatermarkBuffer::checkLowWatermark() {
-  if (!above_high_watermark_called_ ||
+  if (above_overflow_watermark_called_ || !above_high_watermark_called_ ||
Why do we check for above_overflow_watermark_called_?
At the moment, although an overflow occurs, the buffer can still be used. Once the buffer is read from, the low watermark callback will be called. I'm not sure that, in cases where the connection is terminated, readDisable should be called again.
If I'm missing a scenario, please let me know.
That said, a different buffer design that should be considered is to disable all operations once an overflow has been detected.
Here's the problem: if the overflow callback is called, the buffer stops functioning properly, since low watermark callbacks are no longer called when they should be.
This PR feels high risk. I think we may be jumping the gun by going straight to closing the connection/resetting the stream on overflow. It would be good to gather some data about what can cause the overflow cb to trigger. I know there's a runtime flag to control this behavior, but having it default on with a 2x overflow ratio could cause some problems.
Agreed that runtime control over the high watermark constant seems a useful capability.
@antoniovicente I'm unsure we have valid causes for the overflow to trigger at the moment. This feature to me is a safety valve to account for unknown bugs that may cause the buffer to grow way beyond the high watermark. As such, I would think outright socket closure is appropriate.
I agree, though, that the multiplier is way too low and perhaps not quite the right approach; see my comment above.
@yanavlasov triggering the 2x multiplier overflow watermark is very easy, especially when you factor in that the H2 connection flow control window is not coordinated with high watermarks. Going straight to closing the connection is very risky.
I'm just arguing for keeping the high/low watermarks in working order even if the overflow one is hit, so we can add overflow implementations that just increment counters.
if (!above_overflow_watermark_called_ && overflow_watermark_ != 0 &&
    OwnedImpl::length() > overflow_watermark_) {
  above_overflow_watermark_called_ = true;
  above_overflow_watermark_();
This early return combined with above_overflow_watermark_called_ means that the second check can end up calling the high watermark cb. Should these checks happen in the opposite order: high watermark first, then overflow?
It really depends on whether the normal flow control should be preserved or not, and whether the buffer should remain usable.
To the best of my understanding, we should first check whether the overflow watermark cb was called, and if so, return early.
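To make the ordering question concrete, here is a hypothetical, heavily simplified sketch of the check sequence being debated (not the actual WatermarkBuffer code; names and structure are assumptions). With overflow tested first and an early return, the high-watermark callback is skipped on the growth event that crosses the overflow line, but a later check can still fire it:

```cpp
#include <cstdint>
#include <functional>

// Simplified stand-in for the watermark check logic under discussion.
struct Checker {
  uint64_t high_watermark_ = 0;
  uint64_t overflow_watermark_ = 0;
  bool above_high_called_ = false;
  bool above_overflow_called_ = false;
  std::function<void()> on_high_;
  std::function<void()> on_overflow_;

  void checkHighAndOverflowWatermarks(uint64_t length) {
    if (overflow_watermark_ != 0 && !above_overflow_called_ && length > overflow_watermark_) {
      above_overflow_called_ = true;
      on_overflow_();
      return; // early return: the high-watermark callback is skipped this time
    }
    // On a later call, the first branch is skipped (already called), so this
    // can invoke the high-watermark callback *after* the overflow one fired.
    if (high_watermark_ != 0 && !above_high_called_ && length > high_watermark_) {
      above_high_called_ = true;
      on_high_();
    }
  }
};
```

This illustrates the reviewer's point: with this ordering, the high-watermark callback can end up being invoked after the overflow callback, which only matters if the buffer is meant to stay usable after overflow.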
Is there any behavior we want to support other than connection shutdown and resource teardown? Presumably initiated within the space of a single epoll event call stack (we would do a hard connection shutdown, with no drain).
}
StreamRateLimiter::StreamRateLimiter(uint64_t max_kbps, uint64_t max_buffered_data,
                                     std::function<void()> overflow_cb,
overflow cb seems to be the last argument most of the time. Why is it the first in this case?
Thanks for pointing it out... Actually it seems that it is reversed compared to the others. Usually the order is low_cb, high_cb, overflow_cb.
Not sure why; I'll try to find out.
overflow_watermark_ = high_watermark * overflow_multiplier_;
// TODO(adip): What should be done if there's an overflow (overflow_watermark_ <
// high_watermark)? Should this be a release_assert?
ASSERT((high_watermark <= overflow_watermark_) || (overflow_multiplier_ == 0));
|| overflow_watermark_ == 0
I'll add a comment here. This checks that there's no overflow due to multiplication:
overflow_watermark_ = high_watermark * overflow_multiplier_;
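An explicit pre-check of the multiplication, rather than an after-the-fact ASSERT, could look like this hypothetical helper (illustrative only; the name and placement are assumptions, not Envoy code):

```cpp
#include <cstdint>
#include <limits>

// Returns true if high_watermark * multiplier would wrap around uint64_t.
// A wrapped product ends up smaller than high_watermark, which is the
// condition the ASSERT in the diff above is trying to catch.
bool multiplyOverflows(uint64_t high_watermark, uint64_t multiplier) {
  if (multiplier == 0) {
    return false; // a zero multiplier disables the overflow watermark entirely
  }
  return high_watermark > std::numeric_limits<uint64_t>::max() / multiplier;
}
```

Checking before multiplying avoids relying on the wrapped value at all, which may be clearer if this ever becomes a release assert.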
// Overflow should happen, and the response should be reset.
// Wait for the response to be read by the codec client.
codec_client_->waitForDisconnect();
Do we fail due to overflow limit being hit or because headers are too large?
The failure is due to the buffer limit. I'll add a stats counter and an assert that verifies this.
file_event_->activate(Event::FileReadyType::Write);
// If the writing to the buffer caused an overflow-watermark breach, the connection should be
// closed.
// TODO(adip): what should be done if the state is Closing?
I think for connection-level resets, since this is highly unusual, it should just do a hard connection shutdown.
htuch left a comment
I think this fits the high watermark notion that @alyssawilk has been championing, would be great to get her take.
I'll send a preliminary design doc that describes the goal, and we can start the discussion from there.
low_watermark_ = low_watermark;
high_watermark_ = high_watermark;
checkHighWatermark();
overflow_watermark_ = high_watermark * overflow_multiplier_;
I wonder if a multiplier is the right approach. I think the worst-case formula for figuring out how much data can end up in the downstream outgoing buffer is (for an H/2 upstream):
min(max_concurrent_streams_downstream, max_concurrent_streams_upstream) * initial_stream_window_size_upstream
So if the client makes a lot of simultaneous requests which land on distinct upstream servers, all responses may end up in the outbound buffer before flow control can do anything about it. In this case a low multiplier may cause spurious disconnects.
I wonder if we should use an absolute number instead of a multiplier, or at the very least check whether it makes sense given the H2 settings for downstream and upstream?
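The worst-case formula above can be written out as a small helper (parameter names are assumptions for illustration; this is not Envoy code):

```cpp
#include <algorithm>
#include <cstdint>

// Worst-case bytes that can land in the downstream outbound buffer before
// flow control reacts, per the formula in the comment above: the number of
// streams that can be concurrently active end-to-end, each filling its
// upstream initial stream window.
uint64_t worstCaseBufferedBytes(uint64_t max_concurrent_streams_downstream,
                                uint64_t max_concurrent_streams_upstream,
                                uint64_t initial_stream_window_upstream) {
  return std::min(max_concurrent_streams_downstream, max_concurrent_streams_upstream) *
         initial_stream_window_upstream;
}
```

For example, with 100 concurrent streams and the default 65535-byte H2 initial window, this is already ~6.5 MB, far beyond a 32kb high watermark times any small multiplier.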
I think these are separate issues. We should have better safety guards to ensure the overflow watermark is high enough from an HTTP/2 perspective (taking into account flow control windows, streams, etc.). In terms of overflow watermark shutdown behavior, I think we need to keep the config as simple as possible while providing the maximum chance of runtime recovery. What if we have a runtime knob that globally decides whether to do a hard shutdown or a stats bump on overflow? That requires regular HW/LW functions. I think this would be default enabled on every edge Envoy; the entire goal of OW is to deal with the real risk of OOM from unknown watermark snafus.
Given the complexity of this PR, do you think it would be reasonable to split it into at least 2 PRs, the first being the actual overflow implementation in the watermark buffer, and the second actually using it by providing non-nullptr callbacks in various places? (The latter could possibly be split further, but that might not be needed.) WDYT?
@mattklein123 I agree that this is a complex PR and that it should probably be divided into multiple PRs.
Awesome, a design proposal sounds great, thank you.
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
This pull request has been automatically closed because it has not had activity in the last 14 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
Continues the PR of mergeconflict.
Description: Continuation of the overflow watermark work, which closes the connection/stream when there is too much data in the buffer.
Risk Level: Medium - modifies the workflow when an overflow occurs.
Testing: Added unit tests and an integration test that exercises downstream buffer overflow. Need to add more.
Docs Changes: Need to update docs.