buffer: add "overflow" watermark by mergeconflict · Pull Request #7619 · envoyproxy/envoy

mergeconflict · 2019-07-17T17:12:21Z

WIP: Add a new "overflow" watermark to watermark buffers. Intended usage: if a buffer continues to be written to after exceeding its high watermark, and exceeds its overflow watermark (by default 2x the high watermark), Envoy should protect itself by closing the associated stream.

Risk Level: Medium
Testing: Only unit tests so far; needs integration tests.
Docs Changes: TBD
Release Notes: TBD

Signed-off-by: Dan Rosen mergeconflict@google.com
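
The intended behavior described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SketchWatermarkBuffer` and its members are hypothetical names, the buffer tracks only a byte count, and the overflow watermark is assumed to be a fixed 2x multiple of the high watermark.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical, simplified stand-in for a watermark buffer. It tracks only a
// byte count, not the buffered data itself.
class SketchWatermarkBuffer {
public:
  using Callback = std::function<void()>;

  SketchWatermarkBuffer(Callback below_low, Callback above_high, Callback above_overflow)
      : below_low_(std::move(below_low)), above_high_(std::move(above_high)),
        above_overflow_(std::move(above_overflow)) {}

  // Assumption: overflow watermark is 2x the high watermark.
  void setWatermarks(uint64_t low, uint64_t high) {
    low_ = low;
    high_ = high;
    overflow_ = high * 2;
  }

  void add(uint64_t bytes) {
    length_ += bytes;
    if (high_ == 0) {
      return; // Watermarks disabled.
    }
    if (!high_called_ && length_ > high_) {
      high_called_ = true;
      above_high_();
    }
    // Fired at most once; the owner is expected to tear the buffer down.
    if (!overflow_called_ && length_ > overflow_) {
      overflow_called_ = true;
      above_overflow_();
    }
  }

  void drain(uint64_t bytes) {
    length_ -= std::min(bytes, length_);
    if (high_called_ && length_ < low_) {
      high_called_ = false;
      below_low_();
    }
  }

  uint64_t length() const { return length_; }

private:
  Callback below_low_, above_high_, above_overflow_;
  uint64_t low_{0}, high_{0}, overflow_{0}, length_{0};
  bool high_called_{false}, overflow_called_{false};
};
```

In this sketch the owner of the buffer would pass a callback that closes the stream or connection as the third argument; the buffer itself only reports the event.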


mergeconflict · 2019-07-17T17:16:13Z

Looking for some initial feedback as I start working on integration tests. Is this roughly what we had in mind?
/assign @alyssawilk @htuch @yanavlasov

alyssawilk

I think this has the right overall shape.
I think we can simplify by dealing with events fairly locally. If we go over the limit on a connection or codec buffer, just close the connection (no need to inform every stream as we do for overflow).
I think that'll simplify this quite a bit, then I'll do another pass and you can start in on tests.

source/common/buffer/watermark_buffer.cc

include/envoy/network/connection.h

source/common/buffer/watermark_buffer.h

alyssawilk · 2019-07-17T18:59:21Z

source/common/buffer/watermark_buffer.h

+  // below_low_watermark_ has been called.
  bool above_high_watermark_called_{false};
+  // Set to true after above_overflow_watermark_ has been called. Never reset, because we assume
+  // the stream will be forcibly closed in response.


object will be destroyed? I don't think we want this to be stream-centric, given it could be a connection.

Sorry, which object will be destroyed?

My point is that in a buffer which is used by http streams, raw tcp connections, and arbitrary other objects, we shouldn't be talking about streams :-)
For L7 the stream will be closed, for L4 the connection will be closed, so let's find some neutral way of wording that the owning object will take care of the buffer going away.

source/common/http/codec_client.h

source/common/http/http2/codec_impl.h

mattklein123 · 2019-07-18T16:46:33Z

QQ: What's the config plan for this? Correctly configuring all of the flow control settings is already extremely complicated. Can we potentially start with this just being a multiple of the high watermark? Or is the plan to have independent config?

Also, is this a defense in depth measure, or are there specific cases we are targeting? One case I know this will fix is the local reply case which is not well covered today.

mergeconflict · 2019-07-19T15:05:54Z

> QQ: What's the config plan for this? Correctly configuring all of the flow control settings is already extremely complicated. Can we potentially start with this just being a multiple of the high watermark? Or is the plan to have independent config?

I'm not currently planning to add any new configuration for this. The only existing configuration I found so far are initial_stream_window_size and initial_connection_window_size in Http2ProtocolOptions, but I've quite possibly missed something. My guess is that setting the overflow watermark at 2x the high watermark (both for the connection and per H2 stream) is probably fine, at least as a starting point, but I'll defer to @alyssawilk.

> Also, is this a defense in depth measure, or are there specific cases we are targeting? One case I know this will fix is the local reply case which is not well covered today.

I'm not aware of specific cases we're trying to cover, I think this is just defense in depth.

mergeconflict · 2019-07-19T15:06:42Z

/wait while I work on addressing Alyssa's feedback.

mattklein123 · 2019-07-20T19:42:10Z

> My guess is that setting the overflow watermark at 2x the high watermark (both for the connection and per H2 stream) is probably fine, at least as a starting point, but I'll defer to @alyssawilk.

Yes agreed, this makes sense to me. Ping me when this is more fleshed out and I'm happy to take a look also.

mergeconflict · 2019-07-23T21:31:40Z

@alyssawilk - Some updates here, still a lot of work to do, but wondering if you could take another quick look? I'm now just closing the stream or connection immediately whenever I get an overflow, rather than passing the event through callbacks, and I've started fixing some tests.

alyssawilk

Yeah, sorry for the delay on comments but I think this is much cleaner and will hopefully be correspondingly safer.

Couple of drive by comments while I'm in the file, and I'll do a full pass once you're ready.

source/common/buffer/watermark_buffer.h

alyssawilk · 2019-07-25T15:37:11Z

My point is that in a buffer which is used by http streams, raw tcp connections, and arbitrary other objects, we shouldn't be talking about streams :-)
For L7 the stream will be closed, for L4 the connection will be closed, so let's find some neutral way of wording that the owning object will take care of the buffer going away.

mergeconflict · 2019-08-02T21:19:14Z

/wait

mergeconflict · 2019-08-02T21:29:26Z

tools/spelling_dictionary.txt

 istream
 istringstream
 iteratively
+janky


This is my most important contribution to Envoy.

mergeconflict · 2019-08-06T21:14:00Z

Status update on this PR:

The high watermark overflow multiplier is now controlled by a runtime variable, buffer.overflow.high_watermark_multiplier. Its default value is 2, and it can be disabled by setting it to 0. I haven't documented this yet.
There are a bunch of tests that indirectly create watermarked buffers and therefore require the runtime to be present. I've created a Runtime::ScopedMockLoaderSingleton class to deal with these easily.
There are a lot of existing tests that set a relatively small buffer size (e.g. 1 KB) and then flood it with data (e.g. 1 MB), to exercise the various backpressure mechanisms. These tests break with high_watermark_multiplier=2, because the new behavior is to immediately close the connection or stream. So I've disabled the feature and left a TODO in each of those specific tests, to write another test that covers the new behavior. There are about 35 of these.
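
To make the test breakage concrete, here is a rough sketch of why a 1 KB limit flooded with 1 MB trips the new check when the multiplier is 2 but not when it is 0. The helper `floodTriggersOverflow` is hypothetical and assumes nothing drains the buffer during the flood.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true if writing `total_bytes` in `chunk`-sized
// writes into an undrained buffer ever crosses the overflow watermark.
// A multiplier of 0 disables the overflow check, as described above.
inline bool floodTriggersOverflow(uint64_t high_watermark, uint32_t multiplier,
                                  uint64_t total_bytes, uint64_t chunk) {
  if (multiplier == 0 || high_watermark == 0 || chunk == 0) {
    return false;
  }
  const uint64_t overflow_watermark = high_watermark * multiplier;
  uint64_t buffered = 0;
  for (uint64_t written = 0; written < total_bytes;) {
    const uint64_t n = std::min(chunk, total_bytes - written);
    written += n;
    buffered += n;
    if (buffered > overflow_watermark) {
      return true; // The owner would close the stream or connection here.
    }
  }
  return false;
}
```

With a 1024-byte high watermark and multiplier 2, the flood crosses 2048 bytes almost immediately, which is why those tests need the feature disabled or a dedicated variant.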

So now is, I think, the hard part: writing new tests. @alyssawilk and @mattklein123, if you get a chance, please let me know if my methodology seems reasonable to you, and have another look at the code. You can mainly limit your attention to the changes under /source, since the /test changes are all basically mechanical so far. Thanks again!

mergeconflict · 2019-08-07T15:15:01Z

/unassign @yanavlasov @htuch

mattklein123

@mergeconflict thanks, agreed, this is the right approach. I left some comments on the source and didn't look at tests, as you requested.

/wait

source/common/buffer/watermark_buffer.cc

mattklein123 · 2019-08-07T17:09:19Z

source/common/buffer/watermark_buffer.h

+  void checkHighAndOverflowWatermarks();
  void checkLowWatermark();

+  Runtime::Loader& runtime_;


If we snap the multiplier, I think we won't need to store the reference and can just use the free variant to grab the multiplier at construction?

Updated; wasn't sure what you meant about the "free variant" - lmk if this isn't what you had in mind.
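
One way to read "snapping" the multiplier is sketched below: the value is read once in the constructor and stored by value, so the buffer holds no reference to the runtime. Everything in the `sketch` namespace is hypothetical; `runtimeMultiplier()` merely stands in for a process-wide lookup and is not Envoy's Runtime API.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical stand-in for a process-wide runtime lookup.
inline uint32_t& runtimeMultiplier() {
  static uint32_t value = 2; // Assumed default, per the status update above.
  return value;
}

class SnappedBuffer {
public:
  // The multiplier is read once here; later runtime changes do not affect
  // buffers that already exist.
  SnappedBuffer() : overflow_multiplier_(runtimeMultiplier()) {}

  void setHighWatermark(uint32_t high) {
    overflow_watermark_ = static_cast<uint64_t>(high) * overflow_multiplier_;
  }

  uint64_t overflowWatermark() const { return overflow_watermark_; }

private:
  const uint32_t overflow_multiplier_;
  uint64_t overflow_watermark_{0};
};

} // namespace sketch
```

The trade-off in this sketch is that a runtime change only takes effect for buffers constructed afterwards.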

mattklein123 · 2019-08-07T17:10:53Z

source/common/http/conn_manager_impl.cc

-                                                [this]() -> void { this->requestDataTooLarge(); });
+  auto buffer = std::make_unique<Buffer::WatermarkBuffer>(
+      [this]() -> void { this->requestDataDrained(); },
+      [this]() -> void { this->requestDataTooLarge(); }, [this]() -> void { this->resetStream(); });


I will need to refresh my memory on this code a bit, but some thought will need to be put into how we reset the stream here. Should it look like a remote reset? Local reset? What reset code do we use? Etc. Mainly just a heads up to think about this a bit. Same below. (You might consider moving buffer creation to a shared function with more comments.) Will also need a stat here.
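
A shared construction helper along the lines suggested here might look like the sketch below. All names (`ResetReason`, `Stats`, `makeBufferCallbacks`) are hypothetical and chosen only for illustration; the point is that the overflow path records a stat and carries an explicit reset reason in one place.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

namespace sketch {

// Hypothetical reset reasons and stats; the names are illustrative only.
enum class ResetReason { None, LocalOverflow };

struct Stats {
  uint64_t buffer_overflow{0};
};

struct BufferCallbacks {
  std::function<void()> below_low;
  std::function<void()> above_high;
  std::function<void()> above_overflow;
};

// Shared helper so every call site wires the overflow path identically:
// bump a counter first, then reset with an explicit reason.
inline BufferCallbacks makeBufferCallbacks(std::function<void()> on_drained,
                                           std::function<void()> on_too_large,
                                           std::function<void(ResetReason)> reset,
                                           Stats& stats) {
  return BufferCallbacks{std::move(on_drained), std::move(on_too_large),
                         [reset = std::move(reset), &stats]() {
                           ++stats.buffer_overflow;
                           reset(ResetReason::LocalOverflow);
                         }};
}

} // namespace sketch
```

Whether the reset should look local or remote to the peer is exactly the open question in this comment; the sketch only shows where that decision would live.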

mattklein123 · 2019-08-07T17:12:21Z

source/common/router/router.cc

          [this]() -> void { this->enableDataFromDownstream(); },
-          [this]() -> void { this->disableDataFromDownstream(); });
+          [this]() -> void { this->disableDataFromDownstream(); },
+          [this]() -> void { this->resetStream(); });


Same comment about thinking about reset type, etc.

mattklein123 · 2019-08-07T17:12:39Z

source/extensions/filters/http/fault/fault_filter.cc


  response_limiter_ = std::make_unique<StreamRateLimiter>(
      rate_kbps.value(), encoder_callbacks_->encoderBufferLimit(),
+      [this] { encoder_callbacks_->resetStream(); },


Will need a stat of some type.

alyssawilk

Source seems overall solid modulo the H2 concern. Hopefully for tests you can mostly crib the existing watermark unit and integration tests? If you're having trouble forcing network back-up for integration tests you could make a fake network listener which backs up. I'd been playing around with randomized backup over in master...alyssawilk:backup_test_fuzzing which might have some code useful to crib off of.

alyssawilk · 2019-08-12T14:39:29Z

source/common/buffer/watermark_buffer.cc

  low_watermark_ = low_watermark;
  high_watermark_ = high_watermark;
-  checkHighWatermark();
+  overflow_watermark_ = high_watermark * overflow_multiplier_;


Can we check that we don't overflow our overflow due to bad config?
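
One possible guard, sketched under the assumption that both values are `uint32_t` as in the diff: do the multiplication in 64 bits and clamp the result. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Sketch: compute high_watermark * multiplier in 64 bits and clamp to the
// uint32_t range instead of letting the product wrap around.
inline uint32_t saturatingOverflowWatermark(uint32_t high_watermark, uint32_t multiplier) {
  const uint64_t product = static_cast<uint64_t>(high_watermark) * multiplier;
  const uint64_t max = std::numeric_limits<uint32_t>::max();
  return static_cast<uint32_t>(product > max ? max : product);
}
```

Without the clamp, a high watermark of 2^31 with multiplier 2 would wrap to 0 in 32-bit arithmetic, which the buffer could misread as a disabled or immediately exceeded limit.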

alyssawilk · 2019-08-12T14:43:28Z

source/common/buffer/watermark_buffer.h

+  const uint32_t overflow_multiplier_{0};
  // Used for enforcing buffer limits (off by default). If these are set to non-zero by a call to
  // setWatermarks() the watermark callbacks will be called as described above.
+  uint32_t overflow_watermark_{0};


Is there no longer a way to set this on a per-buffer basis?

Again my concern is that if you have many streams outputting to one downstream H2 connection, and the Network::Connection goes over-watermark, that the streams can each dump roughly one watermark worth of data into the Network::Connection without being malicious. I think when we talk about memory limits per stream this is accounted for, and we have to make sure that the H2 HCM can set this for downstream H2 and the connection pool can set a higher multiplier for H2 upstream.
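
The concern can be put in numbers with a small sketch. The figures in the usage note are illustrative assumptions, not Envoy defaults, and both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Worst-case bytes queued on a shared connection buffer if each of `streams`
// well-behaved streams writes up to one stream-level high watermark.
inline uint64_t worstCaseConnectionBytes(uint32_t streams, uint32_t per_stream_high_watermark) {
  return static_cast<uint64_t>(streams) * per_stream_high_watermark;
}

// True if `buffered` bytes exceed the connection's overflow watermark.
inline bool exceedsOverflow(uint64_t buffered, uint32_t connection_high_watermark,
                            uint32_t multiplier) {
  return buffered > static_cast<uint64_t>(connection_high_watermark) * multiplier;
}
```

For example, 100 streams at 64 KiB each can legitimately queue 6,553,600 bytes, which exceeds a 1 MiB connection watermark at multiplier 2 (2,097,152 bytes) but not at multiplier 8 (8,388,608 bytes). That gap is the argument for a per-buffer multiplier on H2 connections.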

mergeconflict · 2019-08-12T21:10:16Z

Heads up for all concerned: I have a different project I need to prioritize right now, so I'm stepping away from this PR for a while. I'll try to keep it in sync with master in the meantime.

stale · 2019-08-19T21:57:21Z

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

mergeconflict · 2019-08-21T16:57:25Z

Still not really working on this. Just merged master to prevent excessive bitrot and appease stalebot.
/wait

stale · 2019-08-28T17:32:22Z

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

mergeconflict · 2019-08-28T20:45:29Z

On second thought, @alyssawilk had a great suggestion: just close this until I actually have a chance to get back to it.

buffer: add "overflow" watermark

a87a40e

Signed-off-by: Dan Rosen <mergeconflict@google.com>

repokitteh-read-only bot assigned alyssawilk, htuch and yanavlasov Jul 17, 2019

alyssawilk reviewed Jul 17, 2019

mattklein123 self-assigned this Jul 20, 2019

mattklein123 added the waiting label Jul 22, 2019

wip

478b871

Signed-off-by: Dan Rosen <mergeconflict@google.com>

repokitteh-read-only bot removed the waiting label Jul 23, 2019

alyssawilk reviewed Jul 25, 2019

Merge branch 'master' into overflow_watermark

c929f30

Signed-off-by: Dan Rosen <mergeconflict@google.com>

mattklein123 added the waiting label Jul 26, 2019

Dan Rosen added 4 commits July 29, 2019 10:53

Merge branch 'master' into overflow_watermark

ef8367c

Signed-off-by: Dan Rosen <mergeconflict@google.com>

revert all test changes, introduce runtime configuration

b5dbfbc

Signed-off-by: Dan Rosen <mergeconflict@google.com>

wip: fix tests that require runtime singleton

7a96abf

Signed-off-by: Dan Rosen <mergeconflict@google.com>

Merge branch 'master' into overflow_watermark

7fe7911

Signed-off-by: Dan Rosen <mergeconflict@google.com>

repokitteh-read-only bot removed the waiting label Aug 2, 2019

repokitteh-read-only bot added the waiting label Aug 2, 2019

mergeconflict commented Aug 2, 2019

Dan Rosen added 2 commits August 6, 2019 15:45

wip: enable overflow watermark by default, disable in tests that break

a392911

Signed-off-by: Dan Rosen <mergeconflict@google.com>

Merge branch 'master' into overflow_watermark

921e3f0

Signed-off-by: Dan Rosen <mergeconflict@google.com>

repokitteh-read-only bot removed the waiting label Aug 6, 2019

update comment per @alyssawilk

3ea380a

Signed-off-by: Dan Rosen <mergeconflict@google.com>

repokitteh-read-only bot unassigned yanavlasov and htuch Aug 7, 2019

mattklein123 requested changes Aug 7, 2019

repokitteh-read-only bot added the waiting label Aug 7, 2019

snap overflow multiplier in buffer ctor, fix hcm response buffer overflow behavior, and update tests

33e887d

Signed-off-by: Dan Rosen <mergeconflict@google.com>

repokitteh-read-only bot removed the waiting label Aug 7, 2019

re-add / fix unit test coverage for overflow watermark

839bfc7

Signed-off-by: Dan Rosen <mergeconflict@google.com>

mattklein123 added the waiting label Aug 8, 2019

alyssawilk reviewed Aug 12, 2019

stale bot added the stale label Aug 19, 2019

Merge branch 'master' into overflow_watermark

eb3520b

Signed-off-by: Dan Rosen <mergeconflict@google.com>

stale bot removed the stale label Aug 21, 2019

repokitteh-read-only bot removed the waiting label Aug 21, 2019

repokitteh-read-only bot added the waiting label Aug 21, 2019

stale bot added the stale label Aug 28, 2019

Merge branch 'master' into overflow_watermark

ef8df87

Signed-off-by: Dan Rosen <mergeconflict@google.com>

stale bot removed the stale label Aug 28, 2019

repokitteh-read-only bot removed the waiting label Aug 28, 2019

mergeconflict closed this Aug 28, 2019

alyssawilk mentioned this pull request Oct 16, 2019

quiche: implement http stream interfaces #8556

Merged

adisuissa mentioned this pull request Mar 16, 2020

Overflow watermark #10404

Closed
