[v16] Fixes Kubernetes Service using expired credentials #50198

Merged
tigrato merged 2 commits into branch/v16 from bot/backport-50074-branch/v16
Dec 13, 2024
@tigrato tigrato commented Dec 13, 2024

Backport #50074 to branch/v16

changelog: Fixes an intermittent EKS authentication failure when dealing with EKS auto-discovery.

The Kubernetes service occasionally fails to forward requests to EKS clusters or retrieve the cluster schema due to AWS rejecting the request with an "expired token" error.

EKS access tokens are generated using STS presigned URLs, which include details such as the cluster, backend credentials, and assumed roles. By default, these tokens are valid for 15 minutes, and the Kubernetes service refreshes them every $(15 - 1) / 2 = 7$ minutes.
However, our cloud SDK caches the underlying `aws.Session` objects, particularly those with assumed roles, for 15 minutes.

This leads to a scenario where the token is refreshed a second time at approximately 14 minutes, close to the 15-minute validity limit. That refresh reuses the session cached since the first query, so the underlying credentials can expire before the next token refresh. When that happens, the Kubernetes Service still considers the token valid (it is a Base64-encoded presigned URL with no knowledge of the credentials), but the AWS EKS cluster rejects the request and treats the credentials as expired.

This PR adds an option to disable the cache for EKS STS token signing, which results in a new session being created for each EKS cluster signing process.

Below is the error message that EKS returns.
```
2024-12-09T17:00:15Z ERRO [KUBERNETE] Failed to update cluster schema error:[
ERROR REPORT:
Original Error: *errors.StatusError the server has asked for the client to provide credentials
Stack Trace:
	github.com/gravitational/teleport/lib/kube/proxy/scheme.go:140 github.com/gravitational/teleport/lib/kube/proxy.newClusterSchemaBuilder
	github.com/gravitational/teleport/lib/kube/proxy/cluster_details.go:193 github.com/gravitational/teleport/lib/kube/proxy.newClusterDetails.func1
	runtime/asm_amd64.s:1695 runtime.goexit
User Message: the server has asked for the client to provide credentials] pid:7.1 start_time:2024-12-09T17:00:15Z proxy/cluster_details.go:210
2024-12-09T17:00:24Z ERRO [KUBERNETE] Failed to update cluster schema  error:[
ERROR REPORT:
Original Error: *errors.StatusError the server has asked for the client to provide credentials
Stack Trace:
	github.com/gravitational/teleport/lib/kube/proxy/scheme.go:140 github.com/gravitational/teleport/lib/kube/proxy.newClusterSchemaBuilder
	github.com/gravitational/teleport/lib/kube/proxy/cluster_details.go:193 github.com/gravitational/teleport/lib/kube/proxy.newClusterDetails.func1
	runtime/asm_amd64.s:1695 runtime.goexit
User Message: the server has asked for the client to provide credentials] pid:7.1 start_time:2024-12-09T17:00:24Z proxy/cluster_details.go:210
```

Changelog: Fixes an intermittent EKS authentication failure when dealing with EKS auto-discovery.

Signed-off-by: Tiago Silva <tiago.silva@goteleport.com>
@aws-amplify-us-west-2

This pull request is automatically being deployed by Amplify Hosting (learn more).

Access this pull request here: https://pr-50198.d1v2yqnl3ruxch.amplifyapp.com

@public-teleport-github-review-bot public-teleport-github-review-bot bot removed the request for review from creack December 13, 2024 12:59
@tigrato tigrato enabled auto-merge December 13, 2024 14:26
@tigrato tigrato added this pull request to the merge queue Dec 13, 2024
Merged via the queue into branch/v16 with commit cfb3726 Dec 13, 2024
@tigrato tigrato deleted the bot/backport-50074-branch/v16 branch December 13, 2024 22:01
@doggydogworld doggydogworld mentioned this pull request Dec 18, 2024
@fheinecke fheinecke mentioned this pull request Apr 9, 2025
@fheinecke fheinecke mentioned this pull request Jan 8, 2026