Wrap http transport with retry in kube proxy to handle GOAWAY.#57881

Closed
creack wants to merge 4 commits into master from creack/proxy-retry

Conversation

@creack (Contributor) commented Aug 14, 2025

Fixes #57766

Adding a test is quite tricky: it needs a kube-apiserver with the --goaway-chance flag, but that flag is capped at 2%.

Go script to reproduce the issue / confirm the fix
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"path/filepath"
	"strings"
	"sync"
	"time"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	ctx := context.Background()

	kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		log.Fatalf("Failed to build config: %s.", err)
	}

	// Increase limits to avoid rate limiting.
	config.QPS = 100
	config.Burst = 200

	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatalf("Failed to create clientset: %s.", err)
	}

	const numWorkers = 20
	const opsPerWorker = 50

	var wg sync.WaitGroup
	for workerID := range numWorkers {
		wg.Add(1)
		go func() {
			defer wg.Done()

			for i := range opsPerWorker {
				cm := &v1.ConfigMap{
					ObjectMeta: metav1.ObjectMeta{
						Name:      fmt.Sprintf("test-cm-%d-%d-%d", workerID, i, time.Now().Unix()),
						Namespace: "default",
					},
					Data: map[string]string{
						"key":    fmt.Sprintf("value-%d", i),
						"worker": fmt.Sprintf("%d", workerID),
						"data":   strings.Repeat("x", 1024), // Add some data to make the body larger.
					},
				}

				created, err := clientset.CoreV1().ConfigMaps("default").Create(ctx, cm, metav1.CreateOptions{})
				if err != nil {
					if isGoawayError(err) {
						fmt.Printf("[Worker %d] GOAWAY on CREATE: %s.\n", workerID, err)
					}
					continue
				}

				created.Data["updated"] = "true"
				created.Data["timestamp"] = time.Now().String()
				if _, err := clientset.CoreV1().ConfigMaps("default").Update(ctx, created, metav1.UpdateOptions{}); err != nil {
					if isGoawayError(err) {
						fmt.Printf("[Worker %d] GOAWAY on UPDATE: %s.\n", workerID, err)
					}
				}

				if err := clientset.CoreV1().ConfigMaps("default").Delete(ctx, cm.Name, metav1.DeleteOptions{}); err != nil {
					if isGoawayError(err) {
						fmt.Printf("[Worker %d] GOAWAY on DELETE: %s.\n", workerID, err)
					}
				}

				// No delay - stress the connection.
			}
		}()
	}

	wg.Wait()
}

func isGoawayError(err error) bool {
	if err == nil {
		return false
	}

	errStr := err.Error()
	goawayPatterns := []string{
		"cannot retry err",
		"GOAWAY",
		"http2: Transport received Server's graceful shutdown",
		"after Request.Body was written",
		"graceful shutdown",
	}

	for _, pattern := range goawayPatterns {
		if strings.Contains(errStr, pattern) {
			fmt.Printf("<<< %#v -- %T\n", err, err) // Debug: dump the concrete error value and type.
			return true
		}
	}

	return false
}
Kind config with goaway-chance
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: teleport-goaway-test
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: ClusterConfiguration
    apiServer:
      extraArgs:
        goaway-chance: "0.02"
        v: "2"
kind create cluster --config goaway.yaml

changelog: Add retry logic in kube proxy to handle EKS goaway chance

@creack creack marked this pull request as ready for review August 14, 2025 02:53
@github-actions github-actions bot requested review from bl-nero and eriktate August 14, 2025 03:02
@creack creack force-pushed the creack/proxy-retry branch from 86c5f51 to 0a06ea2 Compare August 14, 2025 03:18
@creack creack marked this pull request as draft August 15, 2025 04:58
// transportWrapper wraps an http.RoundTripper and ensures requests with bodies
// are retryable by setting GetBody when possible. This allows the underlying
// transport (typically http2.Transport) to automatically retry on GOAWAY.
type transportWrapper struct {
Contributor:
We should move this under the forward library so that if App Access needs it, it's available there and can be enabled. I didn't enable it there because I was afraid of breaking anything without running the test plan.

Contributor (author):
While there may be a case for making it widely available under the forward lib, here we discriminate on which endpoints get the retryable handling, which is quite specific to kube proxy. To move it somewhere else we'd need to add an option for passing a function that decides whether to apply the logic, which is a lot of overhead.

Contributor:
Let's start simple and iterate here and keep this limited to Kubernetes for now.

@bl-nero bl-nero removed their request for review August 18, 2025 13:08
@creack creack force-pushed the creack/proxy-retry branch from a566bdb to 6ecb987 Compare September 8, 2025 14:02
@creack creack requested a review from tigrato September 8, 2025 14:14
@creack creack marked this pull request as ready for review September 8, 2025 14:14
@ravicious ravicious removed their request for review September 8, 2025 15:26

// transportWrapper wraps an http.RoundTripper and ensures requests with bodies
// are retryable by setting GetBody when possible. This allows the underlying
// transport (typically http2.Transport) to automatically retry on GOAWAY.
type transportWrapper struct {
Contributor:
Suggestion: Rename this to something less generic and more descriptive.

// are retryable by setting GetBody when possible. This allows the underlying
// transport (typically http2.Transport) to automatically retry on GOAWAY.
type transportWrapper struct {
http.RoundTripper
Contributor:
Suggestion: Remove the embedding if possible.

Comment on lines +111 to +114
// Check for log streaming / watch operations. Those can be HTTP/2 and susceptible to the GOAWAY issue;
// however, as they are long-running, the only way to recover would be to dump the body to disk,
// raising concerns about an infinitely growing file, as well as privacy/security, since we deal with
// unencrypted traffic and don't want to store it somewhere, even in a tmpfs.
Contributor:
As discussed offline, is this accurate? Aren't we storing the request body, and not the response body? Why would this be any less of a problem for requests which are not for /log or are for a watch?

@tigrato (Contributor) commented Sep 8, 2025:
All endpoints - watch requests included - suffer from this issue. Only HTTP/1 requests (such as WebSocket or SPDY) do not, and a simple protocol check avoids the problem.

In the Kubernetes codebase, every other type of request is required to retry the request body - for them it's not a problem, because request bodies are a bytes.Buffer and the http client already populates GetBody after a type assertion.

Contributor:
Basically, Kubernetes added this feature for watch requests.

Contributor:
I was asking why the comment indicated that logs and watches needed special handling. If we are only caching the request and not the response, does it really matter that the response for a long-lived request could be very large? Shouldn't this approach work for all requests?

Contributor:
Let me explain what I meant.

The reason goaway-chance was introduced in Kubernetes was to close long-lived connections to a single Kubernetes API server and rebalance incoming connections across all replicas - it was designed especially for watch streams.

Looking at the Kubernetes implementation, you can see that only HTTP/2 requests might receive the GOAWAY / connection close. This means that HTTP/1 requests never receive the connection close header.

This HTTP/2 filter excludes long-lived bidirectional connections such as kubectl exec, kubectl port-forward, and kubectl cp, where the user can send very large payloads, since all of these use HTTP/1 + WebSocket/SPDY.

Long-lived read-only streams - such as watch or log streams - aren't affected either, because the Go HTTP/2 client only retries the request body, which means we only need to record the request body and never the response. This makes our life simpler, given that Teleport only needs to be able to reset the request body, never the response body. Request bodies in Kubernetes are far smaller than responses for every endpoint apart from the bidirectional streams mentioned above.

In theory, we can record the body into memory and retry - the Go HTTP/2 client already does that automatically if you provide a GetBody function. In practice, we could set a threshold of 1MB: bodies up to 1MB go into memory, and anything bigger is written to a temp file.

I still think the approach this PR tries to implement isn't the best - loading the body in advance, trimming the body... - and we can definitely improve it. This PR also includes HTTP/1 requests, which will cause problems, especially with connection upgrade mechanisms.

rosstimothy added a commit that referenced this pull request Nov 6, 2025
This is an attempt to fix #57766.

When a request is terminated because the upstream Kubernetes API
Server GOAWAY chance is exceeded, clients are informed to retry
by replying with a 429 status code and a Retry-After header.

This deviates from the approaches taken in
#57881 and
#60695 to favor
simplicity and avoid buffering request data in a teleport process.
The downside to this approach is that it requires clients to properly
handle retry requests.
rosstimothy added a commit that referenced this pull request Nov 7, 2025

rosstimothy added a commit that referenced this pull request Nov 7, 2025

rosstimothy added a commit that referenced this pull request Nov 10, 2025

rosstimothy added a commit that referenced this pull request Nov 11, 2025

rosstimothy added a commit that referenced this pull request Nov 11, 2025

github-merge-queue bot pushed a commit that referenced this pull request Nov 11, 2025

backport-bot-workflows bot pushed a commit that referenced this pull request Nov 11, 2025

backport-bot-workflows bot pushed a commit that referenced this pull request Nov 11, 2025
github-merge-queue bot pushed a commit that referenced this pull request Nov 17, 2025
* Kubernetes: Handle GOAWAY requests

This is an attempt to address #57766.

When a request is terminated because the upstream Kubernetes API
Server GOAWAY chance is exceeded, clients are informed to retry
by replying with a 429 status code and a Retry-After header.

This deviates from the approaches taken in
#57881 and
#60695 to favor
simplicity and avoid buffering request data in a teleport process.
The downside to this approach is that it requires clients to properly
handle retry requests.

* Populate GOAWAY response body (#61264)

Follow up to #61142 which
sets the response body so that clients which only look at the reason and
not the headers will behave appropriately.
github-merge-queue bot pushed a commit that referenced this pull request Nov 17, 2025
Successfully merging this pull request may close these issues.

Kube Agent should handle GOAWAY