listener manager: implement graceful draining using a local drain manager #1219

mattklein123 merged 6 commits into master from …

Conversation

Listeners now drain as part of the removal/update process using the server-configured drain time options. drainClose() decisions are the union of the local drain manager and the global drain manager.
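The drainClose() union described above can be sketched roughly as follows. This is an illustrative stub, not Envoy's actual classes: a connection is drain-closed if either the per-listener (local) drain manager or the server-wide (global) one says so.

```cpp
#include <cassert>

// Illustrative stand-in for a drain manager; names are hypothetical.
struct DrainManagerStub {
  bool draining_ = false;
  bool drainClose() const { return draining_; }
};

// Union decision: close if EITHER manager is currently draining.
bool shouldDrainClose(const DrainManagerStub& local, const DrainManagerStub& global_mgr) {
  return local.drainClose() || global_mgr.drainClose();
}
```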
dnoe left a comment
Looks good. A few comments/questions.
| /** | ||
| * @return TRUE if the manager is currently draining connections. | ||
| * Invoked to begin the drain procedure. (Making drain close operations more likely). | ||
| * @param completion supplies the completion that will be called when the drain sequence is |
Looks like this is optional (could be unassigned std::function) from the code below. Might be useful to call that out here.
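The "optional completion" the comment refers to relies on a default-constructed `std::function` being falsy. A minimal sketch of that pattern, with hypothetical names rather than the actual Envoy API:

```cpp
#include <functional>
#include <utility>

// Sketch: the completion callback may be left unassigned by the caller.
class DrainSequence {
public:
  void start(std::function<void()> completion = {}) {
    completion_ = std::move(completion);
    // ... arm the drain timer, mark the manager as draining ...
  }

  // Invoked when the drain timer expires.
  void finish() {
    if (completion_) {  // guard: only call if a callback was supplied
      completion_();
    }
  }

private:
  std::function<void()> completion_;
};
```

Calling out the optionality in the doc comment (as the reviewer suggests) saves readers from having to discover the `if (completion_)` guard in the implementation.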
| // TODO(mattklein123): Real listener draining with a per listener drain manager. | ||
| // TODO(mattklein123): Detect runtime worker listener addition failure and handle. | ||
| // TODO(mattklein123): Consider getting rid of pre-worker start and post-worker start code by | ||
| // initializing all listeners after workers are started. This is related to correctly dealing with |
Do you want the text to line up? I don't care either way.
I prefer lining up, but I think when all the TODOs got reformatted a while ago the alignment got removed. I will re-align them.
| Upstream::ClusterManager& clusterManager() override { return parent_.server_.clusterManager(); } | ||
| Event::Dispatcher& dispatcher() override { return parent_.server_.dispatcher(); } | ||
| DrainManager& drainManager() override { return parent_.server_.drainManager(); } | ||
| Network::DrainDecision& drainManager() override { return *this; } |
This is a bit confusing since there is also a local drain manager. Could the member be renamed drainDecision()?
I thought about renaming it, but at the time I thought it was a breaking change. Now I realize the whole factory context thing is new in 1.4.0, so I will change it.
| @@ -37,12 +40,15 @@ TEST(DrainManagerImplTest, All) { | |||
| // Test drain sequence. | |||
| Event::MockTimer* drain_timer = new Event::MockTimer(&server.dispatcher_); | |||
This looks like it leaks memory. Not a big deal for the test but can it be unique_ptr?
It doesn't leak. This is a pretty common pattern in tests where code needs to operate on a unique_ptr but we need access to the returned thing so we have to use the raw pointer. (This timer gets wrapped in a unique_ptr and returned to code under test).
Got it, so the deletion actually happens when that unique_ptr is destroyed.
@alyssawilk who had a question on the Envoy-ish pattern for this form of memory management the other day.
If you would like me to document this somewhere in the style guide, let me know. It allows us to use unique_ptr in product code whenever possible at the expense of making some of the memory management in the tests a bit strange, which IMO is a good compromise.
I think it would be good to document in the style guide since this explicitly departs from the usual unique_ptr hygiene. Agree it makes total sense for test code once I got the explanation.
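The pattern under discussion can be sketched with illustrative types (not Envoy's actual mocks): the test retains a raw pointer so it can keep driving the mock, while ownership transfers into the code under test as a `unique_ptr`, which is what eventually deletes it.

```cpp
#include <cassert>
#include <memory>

// Hypothetical mock: the test needs to poke it after handing it off.
struct MockTimer {
  bool enabled_ = false;
  void enableTimer() { enabled_ = true; }
};

// Hypothetical code under test: takes ownership via unique_ptr, so there is
// no leak; deletion happens when timer_ is destroyed.
struct CodeUnderTest {
  std::unique_ptr<MockTimer> timer_;
  void setTimer(MockTimer* timer) { timer_.reset(timer); }
};
```

In a test, `new MockTimer()` is passed to the code under test, and assertions are then made through the retained raw pointer.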
| std::string listener_foo_json = R"EOF( | ||
| { | ||
| "name": "foo", | ||
| "address": "tcp://127.0.0.1:1234", |
Will this test run in an IPv6 only environment?
Yes, the tests use mock sockets.
@dnoe updated
| * more likely). | ||
| */ | ||
| virtual void startDrainSequence() PURE; | ||
| virtual void startDrainSequence(std::function<void()> completion) PURE; |
Do we drain globally on each LDS update? This seems reasonable given the current semantics (each LDS refresh is an update of the complete set of listeners), but won't this also preclude skipping the drain on listeners that aren't being modified (i.e., those with the same config hash)?
Oh, I see, we create a new drain manager per listener as well as the singleton per-server. Makes sense (could be called out in some comment as well).
OK will find a comment to add to.
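The structure clarified in this thread can be sketched as follows, with illustrative names: the server owns one global drain manager, and each listener gets its own local drain manager, so an LDS update drains only the listeners actually being replaced.

```cpp
#include <memory>

// Hypothetical minimal drain manager.
struct SimpleDrainManager {
  bool draining_ = false;
};

// Hypothetical server sketch: one global singleton plus a factory that
// creates a fresh local drain manager per listener.
struct ServerSketch {
  SimpleDrainManager global_drain_manager_;  // server-wide drains

  std::unique_ptr<SimpleDrainManager> createListenerDrainManager() {
    return std::make_unique<SimpleDrainManager>();
  }
};
```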
| void ListenerManagerImpl::drainListener(ListenerImplPtr&& listener) { | ||
| // TODO(mattklein123): Actually implement timed draining in a follow up. Currently we just | ||
| // correctly synchronize removal across all workers. | ||
| // First add the listener to the draining list. Thist must be done under the lock since it |
Nit: s/Thist/This/
Does the timed draining TODO no longer apply?
It no longer applies. The drain manager handles the timer and then triggers removal via the completion callback. I will add more comments.
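The flow described in this reply might look roughly like the sketch below (hypothetical names): the listener moves onto a draining list under a lock to synchronize across workers, and the local drain manager's completion callback later removes it, which is why the timed-draining TODO is obsolete.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <mutex>

// Hypothetical listener state.
struct DrainingListener {};

// Hypothetical listener-manager sketch.
class ListenerManagerSketch {
public:
  void drainListener(std::unique_ptr<DrainingListener> listener) {
    std::lock_guard<std::mutex> guard(lock_);  // synchronize across workers
    draining_.push_back(std::move(listener));
  }

  // Completion callback from the local drain manager: drop the listener.
  void removeDrainedListener(const DrainingListener* listener) {
    std::lock_guard<std::mutex> guard(lock_);
    draining_.remove_if([listener](const std::unique_ptr<DrainingListener>& l) {
      return l.get() == listener;
    });
  }

  size_t drainingCount() {
    std::lock_guard<std::mutex> guard(lock_);
    return draining_.size();
  }

private:
  std::mutex lock_;
  std::list<std::unique_ptr<DrainingListener>> draining_;
};
```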