Description
HTTP/2 pings are used to keep connections from being completely idle, and to detect dead ones.
With long lived requests that have no implied timeout (e.g. gRPC streaming), they are the way to eventually detect and abort unresponsive servers.
The way our logic works today, we only run heartbeat logic on HTTP/2 connections that are in the _availableHttp2Connections list.
This means that pings are broken in the following scenarios:
- The connection reached its stream limit and was temporarily removed from the available list.
- We received a GOAWAY frame indicating that existing requests will be processed. The first thing we do when receiving the frame is to mark the connection as shut down, and remove it from the available pool.
- The SocketsHttpHandler instance is disposed. Today, any existing requests on the instance continue as normal, but we also stop the heartbeat timer and therefore break pings. This case could be considered user error to a degree, though.
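The first two scenarios can be illustrated with a deliberately simplified model. This is only a sketch: `FakeConnection`, `FakePool`, and `Demo` are hypothetical names invented for illustration and are not the real pool or connection types. The only behavior it assumes is the one described above, namely that the heartbeat walks the available list only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an HTTP/2 connection (not the real type).
sealed class FakeConnection
{
    public int PingsSent { get; private set; }

    // In this model, a heartbeat simply counts as one keep-alive ping.
    public void HeartBeat() => PingsSent++;
}

// Hypothetical stand-in for the connection pool.
sealed class FakePool
{
    // Connections that can currently accept new requests.
    private readonly List<FakeConnection> _available = new();

    public void Add(FakeConnection connection) => _available.Add(connection);

    // Models "stream limit reached" or "GOAWAY received": the connection
    // leaves the available list while its existing requests keep running.
    public void MarkUnavailable(FakeConnection connection) => _available.Remove(connection);

    // Assumption taken from the description: only available connections are visited.
    public void HeartBeat()
    {
        foreach (FakeConnection connection in _available)
        {
            connection.HeartBeat();
        }
    }
}

static class Demo
{
    public static (int Available, int Draining) Run()
    {
        var pool = new FakePool();
        var available = new FakeConnection();
        var draining = new FakeConnection();
        pool.Add(available);
        pool.Add(draining);

        // The second connection receives a GOAWAY but still has active streams.
        pool.MarkUnavailable(draining);

        for (int i = 0; i < 3; i++)
        {
            pool.HeartBeat();
        }

        return (available.PingsSent, draining.PingsSent);
    }
}
```

In this model, `Demo.Run()` returns `(3, 0)`: the draining connection still carries requests but never receives a ping, so a dead peer would go unnoticed.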
The second scenario was hit by a user who reported it on Discord (after a lot of back and forth to figure out why things weren't working as expected). In their case, they are using long-lived gRPC streams and have configured HTTP/2 keep-alive pings and timeouts. When the backend service was restarting, scaling down, failing, etc., they observed a GOAWAY frame, sometimes eventually followed by the request failing due to the connection observing an EOF.
But sometimes they would receive a GOAWAY frame and then nothing would happen, with all requests appearing stuck forever. This is the case where we might not notice that a TCP connection is dead until we try writing to it or enforce a read timeout, which is precisely what HTTP/2 pings are supposed to do.
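For context, a client configuration of the kind described above might look roughly like the following. The delay and timeout values are illustrative assumptions, not the reporter's actual settings.

```csharp
using System;
using System.Net;
using System.Net.Http;

var handler = new SocketsHttpHandler
{
    // Send a PING after this much time without receiving any frames (illustrative value).
    KeepAlivePingDelay = TimeSpan.FromSeconds(30),
    // Treat the connection as dead if the PING is not acknowledged in time (illustrative value).
    KeepAlivePingTimeout = TimeSpan.FromSeconds(10),
    // Ping even when the connection has no active requests.
    KeepAlivePingPolicy = HttpKeepAlivePingPolicy.Always,
};

using var client = new HttpClient(handler)
{
    // Ask for HTTP/2 or higher, which long-lived gRPC streams rely on.
    DefaultRequestVersion = HttpVersion.Version20,
    DefaultVersionPolicy = HttpVersionPolicy.RequestVersionOrHigher,
};
```

With settings like these, one would expect a connection that stops responding after a GOAWAY to be torn down once the ping timeout elapses, which is the behavior that is missing in the scenarios listed above.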
Example traces
