Skip to content

response decoder: add new interfaces for life time tracking of response decoder#41048

Merged
danzh2010 merged 21 commits intoenvoyproxy:mainfrom
danzh2010:decoderhandle
Sep 17, 2025
Merged

response decoder: add new interfaces for life time tracking of response decoder#41048
danzh2010 merged 21 commits intoenvoyproxy:mainfrom
danzh2010:decoderhandle

Conversation

@danzh2010
Copy link
Copy Markdown
Contributor

@danzh2010 danzh2010 commented Sep 11, 2025

Commit Message: introduce ResponseDecoderHandle interface via which ResponseDecoder object can be obtained with guarantee that the object is still alive. Also added a new interface getResponseDecoderHandle() to ResponseDecoder and a new base implementation of ResponseDecoder ResponseDecoderImplBase which implements these new interfaces. Existing subclasses of ResponseDecoder are changed to inherit from ResponseDecoderImplBase.

Additional Description: also make EnvoyQuicClientStream and ResponseDecoderWrapper to use the handle instead of directly accessing the passed-in response decoder, guarded by envoy.reloadable_features.use_response_decoder_handle. And when the decoder is dead, log error or abort based on envoy.reloadable_features.abort_when_accessing_dead_decoder.

We should gradually transit from using raw pointer or reference of ResponseDecoder to using the handle interface everywhere else. This PR makes the transition in the 2 classes which we are observing life time issues today.

Risk Level: low
Testing: existing tests passed
Docs Changes: N/A
Release Notes: Y
Platform Specific Features: N/A
Runtime guard: envoy.reloadable_features.abort_when_accessing_dead_decoder(default false) and envoy.reloadable_features.use_response_decoder_handle
Issue: #41060

Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/runtime-guard-changes: FYI only for changes made to (source/common/runtime/runtime_features.cc).

🐱

Caused by: #41048 was opened by danzh2010.

see: more, trace.

@danzh2010 danzh2010 changed the title response decoder: add new interfaces for life time assurance response decoder: add new interfaces for life time tracking of response decoder Sep 11, 2025
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
@danzh2010
Copy link
Copy Markdown
Contributor Author

/assign @RyanTheOptimist @abeyad

Copy link
Copy Markdown
Contributor

@abeyad abeyad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for working on this Dan! I really like this fix.

* and/or router components
*/
class UpstreamToDownstream : public Http::ResponseDecoder, public Http::StreamCallbacks {
class UpstreamToDownstream : public Http::ResponseDecoderImplBase, public Http::StreamCallbacks {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to change the inheritance to ResponseDecoderImplBase?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is so we get an implementation of the new getHandle method?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResponseDecoderImplBase implements the new interface getResponseDecoderHandle()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry this was a comment i had left in before i read the full PR but forgot to remove :/ sorry about that!

// ResponseDecoder
void decode1xxHeaders(ResponseHeaderMapPtr&& headers) override {
inner_.decode1xxHeaders(std::move(headers));
if (Http::ResponseDecoder* inner = getInnerDecoder()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I leave it up to you if you think it helps readability, but you could refactor this into a function:

void runWithInnerDecoder(std::function<>(Http::ResponseDecoder*) callback) {
  if (Http::ResponseDecoder* inner = getInnerDecoder()) {
      callback(inner);
  } else {
      onInnerDecoderDead();
  }
}

then all of these functions would look like:

runWithInnerDecoder([headers](Http::ResponseDecoder* inner) -> { inner->decode1xxHeaders(std::move(headers)); });

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stack trace with lambda function is hard to read. I'd prefer inlining the if else block.

* @return A handle to the response decoder. Caller can check the response decoder's liveness via
* the handle.
*/
virtual ResponseDecoderHandlePtr getResponseDecoderHandle() PURE;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this can be a pure virtual function, b/c there are several classes (including within Google, several classes) that inherit from ResponseDecoder, so unless we add an override in all those functions, we'd fail to compile, no?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this perhaps be named createResponseDecoderHandle() to make it clear that it is creating a new object (as suggested by the return type of std::unique_ptr)?

Ali: I think this will be a breaking API change, but we get those all the time in Envoy which makes import so much fun. When this gets imported, can we change those classes to extend the new base class?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ResponseDecoderImplBase implements it. So all the subclasses of ResponseDecoder are now inheriting from ResponseDecoderImplBase

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just meant that there are subclasses of ResponseDecoder in Google specific code that will need to be changed. As Ryan said, I guess the import will just need to handle it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed to createResponseDecoderHandle

// In case the status is invalid or missing, the response_decoder_.decodeHeaders() will fail the
// request
response_decoder_->decodeHeaders(std::move(headers), fin);
if (Http::ResponseDecoder* decoder = getResponseDecoder()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: why aren't we using a ResponseDecoderWrapper here? It seems like logic is duplicated here and in ResponseDecoderWrapper

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wrapper is extra layer of indirection and it has other virtual interfaces which are not needed by quic client stream.

Signed-off-by: Dan Zhang <danzh@google.com>
Copy link
Copy Markdown
Contributor

@RyanTheOptimist RyanTheOptimist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to understand why we only need this mechanism in QUIC and not in other codecs. Perhaps we can update the CL description to flesh this out.


class ResponseDecoder;

class ResponseDecoderHandle {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a brief comment explaining the purpose of this class.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

* @return A handle to the response decoder. Caller can check the response decoder's liveness via
* the handle.
*/
virtual ResponseDecoderHandlePtr getResponseDecoderHandle() PURE;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this perhaps be named createResponseDecoderHandle() to make it clear that it is creating a new object (as suggested by the return type of std::unique_ptr)?

Ali: I think this will be a breaking API change, but we get those all the time in Envoy which makes import so much fun. When this gets imported, can we change those classes to extend the new base class?

* and/or router components
*/
class UpstreamToDownstream : public Http::ResponseDecoder, public Http::StreamCallbacks {
class UpstreamToDownstream : public Http::ResponseDecoderImplBase, public Http::StreamCallbacks {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is so we get an implementation of the new getHandle method?

}

protected:
std::shared_ptr<bool> live_trackable_;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So effectively, this Handle is basically a weak pointer to the decoder, but manually implemented?

Looking at the CL in more detail, I wonder if we could do something a bit simpler. What if we added a new capability to the response decoder, which is a callback to run when the decoder is destroyed. We could then implement this decoder something like:

class ResponseDecoderHandleImpl : public ResponseDecoderHandle {
public:
  ResponseDecoderHandleImpl(explicit ResponseDecoder& decoder)
      : ldecoder_(&decoder) {
    decoder_.SetDestructionCallback( [] { OnDecoderDestroyed();} );
  }

  void OnDecoderDestroyed() {
    decoder_ = nullptr;
  }

  OptRef<ResponseDecoder> get() override {
    if (decoder_ != nullpter) {
      return *decoder_;
    }
    return {};
  }

private:
  ResponseDecoder* decoder_;
};

Then we don't need explicit ref-counting in the base class, and we don't need the Decoder to create Handles (though maybe we would still want to?). WDYT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResponseDecoderHandleImpl also needs to notify the decoder_ about its own lifetime then. There would be mutual references and detaching upon destruction between the handle and the decoder. It's possible, but I feel shared_ptr is less error-prone. Happy to chat more offline about the approach.

@abeyad
Copy link
Copy Markdown
Contributor

abeyad commented Sep 11, 2025

I think it would be good to understand why we only need this mechanism in QUIC and not in other codecs. Perhaps we can update the CL description to flesh this out.

This bug effects HTTP/1 too from what we saw, it's not protocol specific. So the fix will need to handle all protocols. I think that's what ResponseDecoderWrapper does anyway, so I'm not sure why we need the specific checks in EnvoyQuicClientStream

@danzh2010
Copy link
Copy Markdown
Contributor Author

I think it would be good to understand why we only need this mechanism in QUIC and not in other codecs. Perhaps we can update the CL description to flesh this out.

This bug effects HTTP/1 too from what we saw, it's not protocol specific. So the fix will need to handle all protocols. I think that's what ResponseDecoderWrapper does anyway, so I'm not sure why we need the specific checks in EnvoyQuicClientStream

It's upto each codec whether to use the handle interface or not. I change the quic codec to use the handle as an example.

Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
@danzh2010 danzh2010 requested a review from zuercher as a code owner September 12, 2025 14:39
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
@danzh2010
Copy link
Copy Markdown
Contributor Author

/retest

Signed-off-by: Dan Zhang <danzh@google.com>
@danzh2010
Copy link
Copy Markdown
Contributor Author

The PR notifier seems unrelated. This PR is ready for review!

@danzh2010
Copy link
Copy Markdown
Contributor Author

/retest

abeyad
abeyad previously approved these changes Sep 16, 2025
Copy link
Copy Markdown
Contributor

@abeyad abeyad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks Dan!

@abeyad
Copy link
Copy Markdown
Contributor

abeyad commented Sep 16, 2025

Looks like the PR notifier job is broken at tip of tree (unrelated to this PR)

Copy link
Copy Markdown
Contributor

@RyanTheOptimist RyanTheOptimist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

void useCapsuleProtocol();
#endif

Http::ResponseDecoder* getResponseDecoder();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Please add a comment which explains when/why it can be null.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
@danzh2010 danzh2010 dismissed stale reviews from RyanTheOptimist and abeyad via 4627f86 September 16, 2025 21:15
@danzh2010 danzh2010 enabled auto-merge (squash) September 16, 2025 21:22
@danzh2010
Copy link
Copy Markdown
Contributor Author

/retest

1 similar comment
@danzh2010
Copy link
Copy Markdown
Contributor Author

/retest

@danzh2010 danzh2010 merged commit 566b192 into envoyproxy:main Sep 17, 2025
25 checks passed
danzh2010 pushed a commit to danzh2010/envoy that referenced this pull request Sep 19, 2025
…f response decoder (envoyproxy#41048)"

This reverts commit 566b192.

Signed-off-by: Dan Zhang <danzh@google.com>
botengyao pushed a commit that referenced this pull request Sep 19, 2025
#41141)

…f response decoder (#41048)"

This reverts commit 566b192.

It's causing crash at ResponseDecoderWrapper

Signed-off-by: Dan Zhang <danzh@google.com>
Co-authored-by: Dan Zhang <danzh@google.com>
mbadov pushed a commit to mbadov/envoy that referenced this pull request Sep 22, 2025
…se decoder (envoyproxy#41048)

Commit Message: introduce ResponseDecoderHandle interface via which
ResponseDecoder object can be obtained with guarantee that the object is
still alive. Also added a new interface getResponseDecoderHandle() to
ResponseDecoder and a new base implementation of ResponseDecoder
ResponseDecoderImplBase which implements these new interfaces. Existing
subclasses of ResponseDecoder are changed to inherit from
ResponseDecoderImplBase.

Additional Description: also make EnvoyQuicClientStream and
ResponseDecoderWrapper to use the handle instead of directly accessing
the passed-in response decoder, guarded by
`envoy.reloadable_features.use_response_decoder_handle`. And when the
decoder is dead, log error or abort based on
`envoy.reloadable_features.abort_when_accessing_dead_decoder`.

We should gradually transit from using raw pointer or reference of
ResponseDecoder to using the handle interface everywhere else. This PR
makes the transition in the 2 classes which we are observing life time
issues today.


Risk Level: low
Testing: existing tests passed
Docs Changes: N/A
Release Notes: Y
Platform Specific Features: N/A
Runtime guard:
envoy.reloadable_features.abort_when_accessing_dead_decoder(default
false) and envoy.reloadable_features.use_response_decoder_handle
Issue: envoyproxy#41060

---------

Signed-off-by: Dan Zhang <danzh@google.com>
Co-authored-by: Dan Zhang <danzh@google.com>
Signed-off-by: Misha Badov <mbadov@google.com>
mbadov pushed a commit to mbadov/envoy that referenced this pull request Sep 22, 2025
envoyproxy#41141)

…f response decoder (envoyproxy#41048)"

This reverts commit 566b192.

It's causing crash at ResponseDecoderWrapper

Signed-off-by: Dan Zhang <danzh@google.com>
Co-authored-by: Dan Zhang <danzh@google.com>
Signed-off-by: Misha Badov <mbadov@google.com>
danzh2010 pushed a commit to danzh2010/envoy that referenced this pull request Sep 23, 2025
…of response decoder (envoyproxy#41048)"

This reverts commit bfd5b10.

Signed-off-by: Dan Zhang <danzh@google.com>
danzh2010 pushed a commit to danzh2010/envoy that referenced this pull request Sep 23, 2025
…of response decoder (envoyproxy#41048)"

This reverts commit bfd5b10.

Signed-off-by: Dan Zhang <danzh@google.com>
lucaschimweg pushed a commit to lucaschimweg/envoy that referenced this pull request Sep 23, 2025
envoyproxy#41141)

…f response decoder (envoyproxy#41048)"

This reverts commit 566b192.

It's causing crash at ResponseDecoderWrapper

Signed-off-by: Dan Zhang <danzh@google.com>
Co-authored-by: Dan Zhang <danzh@google.com>
danzh2010 added a commit that referenced this pull request Sep 26, 2025
Commit Message: Reapply "response decoder: add new interfaces for life
time tracking of response decoder (#41048)"

Additional Description: In addition to changes in #41048, this PR added
a new constructor to ResponseDecoderWrapper to take a
ResponseDecoderHandle to avoid accessing the decode object at anytime
without checking its liveliness. It also changes ConnectivityGrid and
HttpConnPoolImplBase to retain an ResponseDecoderHandle rather than
directly referencing to ResponseDecoder during the async connection
establishment.

Added a regression test for the crash observed with the previous change.

Risk Level: medium
Testing: new integration tests
Docs Changes: N/A
Release Notes: Y
Platform Specific Features: N/A
Runtime guard: envoy.reloadable_features.use_response_decoder_handle
default true and
envoy.reloadable_features.abort_when_accessing_dead_decoder default
false

---------

Signed-off-by: Dan Zhang <danzh@google.com>
Co-authored-by: Dan Zhang <danzh@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants