buffer: add "overflow" watermark#7619
buffer: add "overflow" watermark#7619mergeconflict wants to merge 14 commits intoenvoyproxy:masterfrom mergeconflict:overflow_watermark
Conversation
Signed-off-by: Dan Rosen <mergeconflict@google.com>
|
Looking for some initial feedback as I start working on integration tests. Is this roughly what we had in mind? |
There was a problem hiding this comment.
I think this has the right overall shape.
I think we can simplify, by basically dealing with events fairly locally. If we go over the limit on a connection or codec buffer, just close the connection (no need to inform every stream as we do for overflow)
I think that'll simplify this quite a bit, then I'll do another pass and you can start in on tests.
| // below_low_watermark_ has been called. | ||
| bool above_high_watermark_called_{false}; | ||
| // Set to true after above_overflow_watermark_ has been called. Never reset, because we assume | ||
| // the stream will be forcibly closed in response. |
There was a problem hiding this comment.
object will be destroyed? I don't think we want this stream-centric given it could be a connection
There was a problem hiding this comment.
Sorry, which object will be destroyed?
There was a problem hiding this comment.
My point is that in a buffer which is used by http streams, raw tcp connections, and arbitrary other objects, we shouldn't be talking about streams :-)
For L7 the stream will be closed, for L4 the connection will be closed, so let's find some neutral way of wording that the owning object will take care of the buffer going away.
|
QQ: What's the config plan for this? Correctly configuring all of the flow control settings is already extremely complicated. Can we potentially start with this just being a multiple of the high watermark? Or is the plan to have independent config? Also, is this a defense in depth measure, or are there specific cases we are targeting? One case I know this will fix is the local reply case which is not well covered today. |
I'm not currently planning to add any new configuration for this. The only existing configuration I found so far are
I'm not aware of specific cases we're trying to cover, I think this is just defense in depth. |
|
/wait while I work on addressing Alyssa's feedback. |
Yes agreed, this makes sense to me. Ping me when this is more fleshed out and I'm happy to take a look also. |
|
@alyssawilk - Some updates here, still a lot of work to do, but wondering if you could take another quick look? I'm now just closing the stream or connection immediately whenever I get an overflow, rather than passing the event through callbacks, and I've started fixing some tests. |
alyssawilk
left a comment
There was a problem hiding this comment.
Yeah, sorry for the delay on comments but I think this is much cleaner and will hopefully be correspondingly safer.
Couple of drive by comments while I'm in the file, and I'll do a full pass once you're ready.
| // below_low_watermark_ has been called. | ||
| bool above_high_watermark_called_{false}; | ||
| // Set to true after above_overflow_watermark_ has been called. Never reset, because we assume | ||
| // the stream will be forcibly closed in response. |
There was a problem hiding this comment.
My point is that in a buffer which is used by http streams, raw tcp connections, and arbitrary other objects, we shouldn't be talking about streams :-)
For L7 the stream will be closed, for L4 the connection will be closed, so let's find some neutral way of wording that the owning object will take care of the buffer going away.
Signed-off-by: Dan Rosen <mergeconflict@google.com>
Signed-off-by: Dan Rosen <mergeconflict@google.com>
Signed-off-by: Dan Rosen <mergeconflict@google.com>
Signed-off-by: Dan Rosen <mergeconflict@google.com>
Signed-off-by: Dan Rosen <mergeconflict@google.com>
|
/wait |
| istream | ||
| istringstream | ||
| iteratively | ||
| janky |
There was a problem hiding this comment.
This is my most important contribution to Envoy.
Signed-off-by: Dan Rosen <mergeconflict@google.com>
Signed-off-by: Dan Rosen <mergeconflict@google.com>
Signed-off-by: Dan Rosen <mergeconflict@google.com>
|
Status update on this PR:
So now is, I think, the hard part: writing new tests. @alyssawilk and @mattklein123, if you get a chance, please let me know if my methodology seems reasonable to you, and have another look at the code. You can mainly limit your attention to the changes under |
|
/unassign @yanavlasov @htuch |
mattklein123
left a comment
There was a problem hiding this comment.
@mergeconflict thanks agreed this is the right approach. I left some comments on the source and didn't look at tests as you requested.
/wait
| void checkHighAndOverflowWatermarks(); | ||
| void checkLowWatermark(); | ||
|
|
||
| Runtime::Loader& runtime_; |
There was a problem hiding this comment.
If we snap the multiplier, I think we won't need to store the reference and can just use the free variant to grab the multiplier at construction?
There was a problem hiding this comment.
Updated; wasn't sure what you meant about the "free variant" - lmk if this isn't what you had in mind.
| [this]() -> void { this->requestDataTooLarge(); }); | ||
| auto buffer = std::make_unique<Buffer::WatermarkBuffer>( | ||
| [this]() -> void { this->requestDataDrained(); }, | ||
| [this]() -> void { this->requestDataTooLarge(); }, [this]() -> void { this->resetStream(); }); |
There was a problem hiding this comment.
I will need to refresh my memory on this code a bit, but some thought will need to be put into how we reset the stream here. Should it look like a remote reset? Local reset? What reset code do we use? Etc. Mainly just a heads up to think about this a bit. Same below. (You might consider moving buffer creation to a shared function with more comments.) Will also need a stat here.
| [this]() -> void { this->enableDataFromDownstream(); }, | ||
| [this]() -> void { this->disableDataFromDownstream(); }); | ||
| [this]() -> void { this->disableDataFromDownstream(); }, | ||
| [this]() -> void { this->resetStream(); }); |
There was a problem hiding this comment.
Same comment about thinking about reset type, etc.
|
|
||
| response_limiter_ = std::make_unique<StreamRateLimiter>( | ||
| rate_kbps.value(), encoder_callbacks_->encoderBufferLimit(), | ||
| [this] { encoder_callbacks_->resetStream(); }, |
There was a problem hiding this comment.
Will need a stat of some type.
…flow behavior, and update tests Signed-off-by: Dan Rosen <mergeconflict@google.com>
Signed-off-by: Dan Rosen <mergeconflict@google.com>
alyssawilk
left a comment
There was a problem hiding this comment.
Source seems overall solid modulo the H2 concern. Hopefully for tests you can mostly crib the existing watermark unit and integration tests? If you're having trouble forcing network back-up for integration tests you could make a fake network listener which backs up. I'd been playing around with randomized backup over in
master...alyssawilk:backup_test_fuzzing
which might have some code useful to crib off of.
| low_watermark_ = low_watermark; | ||
| high_watermark_ = high_watermark; | ||
| checkHighWatermark(); | ||
| overflow_watermark_ = high_watermark * overflow_multiplier_; |
There was a problem hiding this comment.
Can we check that we don't overflow our overflow due to bad config?
| const uint32_t overflow_multiplier_{0}; | ||
| // Used for enforcing buffer limits (off by default). If these are set to non-zero by a call to | ||
| // setWatermarks() the watermark callbacks will be called as described above. | ||
| uint32_t overflow_watermark_{0}; |
There was a problem hiding this comment.
Is there no longer a way to set this on a per-buffer basis?
Again my concern is that if you have many streams outputting to one downstream H2 connection, and the network::Connection goes over-watermark, that the streams can each dump roughly one watermark worth of data into the network::connection without being malicious. I think when we talk about memory limits per stream this is accounted for, and we have to make sure that the H2 HCM can set this for downstream H2 and the connection pool can set a higher multiplier for H2 upstream.
|
Heads up for all concerned: I have a different project I need to prioritize right now, so I'm stepping away from this PR for a while. I'll try to keep it in sync with master in the meantime. |
|
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
Signed-off-by: Dan Rosen <mergeconflict@google.com>
|
Still not really working on this. Just merged master to prevent excessive bitrot and appease stalebot. |
|
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
Signed-off-by: Dan Rosen <mergeconflict@google.com>
|
On second thought, @alyssawilk had a great suggestion: just close this until I actually have a chance to get back to it. |
WIP: Add a new "overflow" watermark to watermark buffers. Intended usage: if a buffer continues to be written to after exceeding its high watermark, and exceeds its overflow watermark (by default 2x the high watermark), Envoy should protect itself by closing the associated stream.
Risk Level: Medium
Testing: Only unit tests so far; needs integration tests.
Docs Changes: TBD
Release Notes: TBD
Signed-off-by: Dan Rosen mergeconflict@google.com