Kubernetes: Handle GOAWAY requests #61142
Merged
rosstimothy merged 1 commit into master on Nov 11, 2025
d18d544 to
897ac00
Compare
smallinsky
reviewed
Nov 10, 2025
smallinsky
reviewed
Nov 10, 2025
897ac00 to
9ee15d6
Compare
smallinsky
approved these changes
Nov 10, 2025
        rw.Header().Set("Retry-After", "1")
        rw.WriteHeader(http.StatusTooManyRequests)
        return
    }
Contributor
nit:
Suggested change
    }
    if isHTTP2GoawayError(respErr) {
        // When Kubernetes API servers are configured with --goaway-chance they may send
        // HTTP/2 GOAWAY frames to distribute load across replicas.
        // If a request cannot be automatically retried because the request body was already sent,
        // we return HTTP 429 with a Retry-After header to instruct clients to retry.
        rw.Header().Set("Retry-After", "1")
        rw.WriteHeader(http.StatusTooManyRequests)
        return
    }
espadolini
approved these changes
Nov 10, 2025
a6be2da to
9099446
Compare
This is an attempt to address #57766. When a request fails because the upstream Kubernetes API server, configured with --goaway-chance, sent an HTTP/2 GOAWAY frame, Teleport replies with a 429 status code and a Retry-After header so that clients know to retry. This deviates from the approaches taken in #57881 and #60695 in order to favor simplicity and avoid buffering request data in a Teleport process. The downside of this approach is that it requires clients to handle retry responses properly.
9099446 to
cd5cfe5
Compare
Contributor
@rosstimothy See the table below for backport results.
This was referenced Nov 11, 2025
rosstimothy
added a commit
that referenced
this pull request
Nov 11, 2025
Follow up to #61142 which sets the response body so that clients which only look at the reason and not the headers will behave appropriately.
github-merge-queue bot
pushed a commit
that referenced
this pull request
Nov 17, 2025
* Kubernetes: Handle GOAWAY requests
  This is an attempt to address #57766. When a request is terminated because the upstream Kubernetes API Server GOAWAY chance is exceeded, clients are informed to retry by replying with a 429 status code and a Retry-After header. This deviates from the approaches taken in #57881 and #60695 to favor simplicity and avoid buffering request data in a teleport process. The downside to this approach is that it requires clients to properly handle retry requests.
* Populate GOAWAY response body (#61264)
  Follow up to #61142 which sets the response body so that clients which only look at the reason and not the headers will behave appropriately.
This is an attempt to address #57766.
When a request fails because the upstream Kubernetes API server, configured with --goaway-chance, sent an HTTP/2 GOAWAY frame, Teleport replies with a 429 status code and a Retry-After header so that clients know to retry.
This deviates from the approaches taken in #57881 and #60695 in order to favor simplicity and avoid buffering request data in a Teleport process. The downside of this approach is that it requires clients to handle retry responses properly. Since we cannot guarantee that every Kubernetes client used by customers will retry a request properly, I've opted not to close the linked issue as a result of this change. Instead, we'll wait for feedback from customers who have been experiencing this issue to see whether this truly resolves the problem for them, at which time I'll circle back and close the issue. If this doesn't remediate the problems, we can pursue more expensive solutions similar to those taken in the linked PRs.
changelog: GOAWAY errors received from Kubernetes API servers configured with a non-zero --goaway-chance are now forwarded to clients to be retried.
Manual Testing
Testing was adapted from #57881 and #60695 on a local Kubernetes cluster and an EKS cluster.
--goaway-chance=0.0 prior to this change
--goaway-chance=0.0 with this change
--goaway-chance=0.2 prior to this change
--goaway-chance=0.2 with this change