Fix Kubernetes load balancing GOAWAY errors by buffering request body#60695
Conversation
Force-pushed from 84ecb8c to a47cc32.
Force-pushed from ce110e7 to 0eff6f5.
How much disk space and memory can be consumed at any given time? Is this guaranteed to only ever happen in the agent, or will we also buffer things in the proxy? Is the temporary path guaranteed to be ephemeral disk storage in the teleport-kube-agent deployments?
Force-pushed from 4571de1 to 34fd919.
@espadolini Based on our Slack discussions, buffering to disk is removed, and a weighted semaphore for in-memory buffering is added.
Force-pushed from b893409 to d34e1eb.
```go
// RetryBufferTotal
if f.RetryBufferTotal <= 0 {
	if env := os.Getenv(envRetryBufferTotal); env != "" {
		if val, err := strconv.ParseInt(env, 10, 64); err == nil && val > 0 {
			f.RetryBufferTotal = val
		}
	}
}
if f.RetryBufferTotal <= 0 {
	f.RetryBufferTotal = defaultRetryBufferTotal
}
// RetryBufferPerRequest
if f.RetryBufferPerRequest <= 0 {
	if env := os.Getenv(envRetryBufferPerRequest); env != "" {
		if val, err := strconv.ParseInt(env, 10, 64); err == nil && val > 0 {
			f.RetryBufferPerRequest = val
		}
	}
}
```
In what scenarios would users need to alter these values? If we are allowing users to edit these fields, should we also allow them to explicitly opt out of this change?
RetryBufferTotal would be increased if requests are being blocked at a high rate. RetryBufferPerRequest would be increased if large payloads are not allowed to be transferred. That scenario seems highly unlikely though.
I added the ability to disable the feature by setting RetryBufferTotal / TELEPORT_UNSTABLE_KUBE_RETRY_BUFFER_TOTAL to zero or a negative value.
After some searching, GCP and Azure don't explicitly document use of the --goaway-chance flag. The issue may be specific to AWS, so allowing the feature to be disabled may benefit some customers.
Is it possible to detect that a Kubernetes cluster is configured with a 0% goaway chance and disable this feature for requests to it?

I don't think so. The Kubernetes
```go
// If the connection closed in the middle of sending
// due to a GOAWAY, read remaining data for a retry attempt.
remaining, readErr := io.ReadAll(rb.src)
if readErr != nil && !errors.Is(readErr, io.EOF) && rb.readErr == nil {
	rb.readErr = readErr
}
if len(remaining) > 0 {
	rb.buf.Write(remaining)
}

// Close source and mark as done.
err := rb.src.Close()
rb.src = nil
rb.cond.Broadcast()
```
We don't need to read the full request here, and we definitely must not block Close on I/O from the original client while we buffer the request, especially because we might not retry at all, depending on why the body is getting closed before reaching the end. The second body can just read from the buffer until the end of the buffer, then read (and buffer) from the source.
```go
		"error", err,
	)
}
rt.semaphore.Release(req.ContentLength)
```
This is still happening while Body and GetBody exist and rb.buf is holding on to its buffer.
Changed to release in a runtime finalizer.
Using a finalizer (which shouldn't be used because Go 1.24+ has cleanups, and cleanups shouldn't be used either) just means that we are going to hold on to the buffer and the memory even while we're streaming the response back to the client, and for an arbitrary amount of time after that, up to forever, since finalizers and cleanups are outright not guaranteed to actually run.
To solve this and the full read on close problem we should probably have a background goroutine that reads from the original body and fills in the buffer and some refcounting of bodies to close the original body, drop the buffer and release the semaphore when all bodies are closed and the call to the inner RoundTrip returns.
(technically there should also be a guarantee that the original body is guaranteed not to be interacted with anymore after the response body is closed but that is extreme pedantry and we are the only users of this anyway)
By the way, this is literally the "output log and streaming" portion of the backend coding challenge, with cleanup at the end and with the constraint of having to use io.ReadClosers for readers.
espadolini left a comment
I don't think it's ok to have this enabled by default in v18, it can be a significant increase in memory usage depending on the workload.
For those rare requests that hit the GOAWAY condition, can't we just return a 429 with a `Retry-After` header?
Force-pushed from 5f8d054 to bb0a216.
This is an attempt to fix #57766. When a request is terminated because the upstream Kubernetes API Server GOAWAY chance is exceeded, clients are informed to retry by replying with a 429 status code and a Retry-After header. This deviates from the approaches taken in #57881 and #60695 to favor simplicity and avoid buffering request data in a teleport process. The downside to this approach is that it requires clients to properly handle retry requests.
* Kubernetes: Handle GOAWAY requests This is an attempt to address #57766. When a request is terminated because the upstream Kubernetes API Server GOAWAY chance is exceeded, clients are informed to retry by replying with a 429 status code and a Retry-After header. This deviates from the approaches taken in #57881 and #60695 to favor simplicity and avoid buffering request data in a teleport process. The downside to this approach is that it requires clients to properly handle retry requests. * Populate GOAWAY response body (#61264) Follow up to #61142 which sets the response body so that clients which only look at the reason and not the headers will behave appropriately.
Kubernetes API servers send HTTP/2 GOAWAY errors to redistribute load across replicas for up to 2% of requests. Setting the request's `GetBody` function enables automatic retries.

For HTTP/2 requests, `GetBody` is set, and request bodies are incrementally buffered and accumulated as reads occur. In the case of GOAWAY errors occurring mid-send, body buffering is completed before closing. HTTP/1.1 protocol upgrades are not buffered, since they wouldn't receive HTTP/2 GOAWAY errors.

A weighted semaphore limits total concurrent buffering to prevent OOM. The default global memory limit is 500 MiB and can be adjusted with an environment variable. Each request body is limited to 50 MiB by default, also adjustable with an environment variable.

In this PR:

- Added `retryableTransport` and `retryBuffer` enabling incremental request body buffering
- Added a weighted semaphore limiting total concurrent buffer size to 500 MiB by default
- Added a per-request buffer size limit with a 50 MiB default
- Added tunable parameters `RetryBufferTotal` and `RetryBufferPerRequest`
- Added environment variable `TELEPORT_UNSTABLE_KUBE_RETRY_BUFFER_TOTAL`
- Added environment variable `TELEPORT_UNSTABLE_KUBE_RETRY_BUFFER_PER_REQ`
- Added unit tests

Fixes #57766
Changelog: Fixed intermittent connection errors when accessing Kubernetes clusters, particularly EKS 1.27+
Manual Testing
A `test-load-balance` app was written to reproduce the load balancing `GOAWAY` errors with Kubernetes. Three manual tests were run and compared.

- Test run (no Teleport): load balancing `GOAWAY` errors were seen.
- Test run (Teleport, no fix): load balancing `GOAWAY` errors were seen.
- Test run (Teleport, fix applied): no load balancing `GOAWAY` errors were seen.

Test Runs
Latency Comparison
Test app
A `test-load-balance` app exercises Kubernetes operations to display load balancing `goaway-chance` errors. It records statistics for display and test comparison.

Kind config
A `kind` configuration file sets the load balancing `goaway-chance` to the maximum of 2%. The actual chance per request is randomized, with 2% being the maximum probability.

Teleport config
The Teleport config file used during testing.
Teleport role config
A Teleport role config file granting Kubernetes administrator permissions which enable the test app to run.
Manual Testing
Environment: macOS, Kind, Kubernetes v1.34.0

1. Kind setup

The intent is to set up a local Kubernetes cluster with the load balancing option `--goaway-chance=0.02` turned on.

- Create the cluster with `kind` and the `kind.yaml` configuration file.
- Verify the API server runs with the `--goaway-chance=0.02` flag.
- Create the test namespace:

```
> kubectl create namespace teleport-test
namespace/teleport-test created
```

2. Test run (no Teleport)
The intent is to see load balancing errors with the `test-load-balance` app and Kubernetes. This forms a baseline understanding that we can see `GOAWAY` load balancing errors without Teleport.

- Run the `test-load-balance` app. Load balancing `GOAWAY` errors are seen.

3. Test run (Teleport, no fix)
The intent is to see load balancing errors through Teleport, without the bug fix. We'll compare this test run with a run with the bug fix.
```
sudo rm -rf /var/lib/teleport
sudo mkdir -p -m0700 /var/lib/teleport
sudo chown $USER /var/lib/teleport
```

- Teleport built from the `master` branch. No bug fix present.

Teleport started with the `teleport.yaml` config file, and is configured with auth + proxy + kube agent.

Looking at the Teleport terminal output, Kubernetes health checks show the local kind cluster `kind-load-balance` is healthy.

- Log in to the Teleport Kubernetes cluster `kind-load-balance`.
- Run the `test-load-balance` app. Load balancing `GOAWAY` errors are seen.

4. Test run (Teleport, fix applied)
The intent is to see no load balancing errors through Teleport when the bug fix is applied. The existing backend database is reused.
- Teleport built from the `rana/kube-retryable-transport` branch.

Teleport started with the same `teleport.yaml` config file.

Looking at the Teleport terminal output, Kubernetes health checks show the local kind cluster `kind-load-balance` is healthy.

- Run the `test-load-balance` app 5 times. No load balancing `GOAWAY` errors were seen. The last run is shown.

```
./test-load-balance compare \
  -files baseline.json,without-fix.json,with-fix-run5.json \
  -markdown > test-results.md
```