Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

onCancel is not invoked up on server-side deadline expiration #11872

Open
nyukhalov opened this issue Feb 3, 2025 · 4 comments
Open

onCancel is not invoked up on server-side deadline expiration #11872

nyukhalov opened this issue Feb 3, 2025 · 4 comments

Comments

@nyukhalov
Copy link

nyukhalov commented Feb 3, 2025

What version of gRPC-Java are you using?

1.67.1

What is your environment?

OS: MacOS 15.2
JDK: openjdk version "17.0.12" 2024-07-16 LTS

What did you expect to see?

When a request times out on the server, the server invokes the ServerCall.Listener::onCancel callback.

What did you see instead?

The ServerCall.Listener::onCancel callback is not invoked.

Steps to reproduce the bug

  1. Start a grpc-java server with a method that runs for a while (e.g. Thread.sleep(10_000));
  2. Call the method using curl to ensure the client does not send a cancellation signal after the client-side deadline expires.
$ curl  -v -k --raw --http2 \
   -H "Content-Type: application/grpc" \
   -H "TE: trailers" \
   -H "grpc-timeout: 500m" \
   --data-binary @frame.bin \
   https://localhost:8080/com.service/method --output -
  1. Observe the the onCancel callback is not invoked despite the request's context being cancelled correctly with a TimeoutException after 500ms.
@nyukhalov
Copy link
Author

nyukhalov commented Feb 3, 2025

Our issue is that the resource cleanup logic residing in the Listener::onCancel callback isn't triggered on the server after a request timed out and the client was unable to send a cancellation signal. This may happen in a setup where the traffic goes via a proxy (e.g. AWS ALB) which does not forward RST_STREAM frames, hence, the server does not receive client cancellations.

As a workaround, I am considering moving the cleanup logic from the Listener::onCancel callback to a context's CancellationListener which seems to be invoked in all scenarios. However, I find onCancel more appropriate for cleanup logic because it has a more strict contract than CancellationListener (e.g. CancellationListener is invoked after a request completes successfully).

@kannanjgithub
Copy link
Contributor

kannanjgithub commented Feb 4, 2025

Does setting ServerCallStreamObserver.onCloseHandler help? If onCancel is not called, onClose will be called (reference) when the RPC completes from the server perspective.

@nyukhalov
Copy link
Author

nyukhalov commented Feb 7, 2025

A quick update: I started getting mixed results as I investigated further.

Now I can see that in some scenarios onCancel is invoked correctly via the following chain of methods:

  • transportReportStatus() ->
  • closeListener() ->
  • listener().closed() ->
  • JumpToApplicationThreadServerStreamListener::closeInternal() ->
  • Closed ->
  • ServerStreamListenerImpl::closedInternal() ->
  • listener.onCancel()
Image

My current hypothesis is that because onCancel's invokaction is serialized on the callExecutor it runs only after the request handler (business logic) completes, whereas context cancellation happens asynchronously.

If the above is true, is the onCancel invocation serialized by design?

Edit:

in some scenarios onCancel is invoked correctly

My current understanding is that onCancel is always invoked correctly, but when the request handling took too long my test terminated earlier (after registering gRPC context cancellation), hence, I thought onCancel is never invoked.

@kannanjgithub
Copy link
Contributor

Whether callExecutor or cancelExecutor gets used is dependent on the status passed here. For the screenshot you have shown since it shows DEADLINE_EXCEEDED and a reset frame sent from the server, in this case it should be on the cancel executor. For status code Ok which would be the case when the server normally completes a RPC, it will run on the callExecutor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants