Skip to content

Enable HTTP caching for Kubernetes OIDC joining#57789

Closed
timothyb89 wants to merge 4 commits intomasterfrom
timothyb89/oidc-caching-k8s-validator
Closed

Enable HTTP caching for Kubernetes OIDC joining#57789
timothyb89 wants to merge 4 commits intomasterfrom
timothyb89/oidc-caching-k8s-validator

Conversation

@timothyb89
Copy link
Copy Markdown
Contributor

@timothyb89 timothyb89 commented Aug 11, 2025

This enables HTTP caching for the Kubernetes OIDC join subtype.

This adds a new ValidateTokenWithClient helper that can be used to provide a custom client for anything currently using our shared OIDC validation helper. Clients just need to swap in a caching *http.Client; this PR only updates the Kubernetes OIDC join method for backporting purposes, but it should be straightforward to enable caching on other join methods in the future.

This relies on caching at the HTTP client layer as we otherwise don't have any ability to cache responses without significant tweaks to our OIDC library. This makes use of gregjones/httpcache which is already in our dependency tree; see this thread for some additional library selection notes.

changelog: Cache OIDC responses for Kubernetes OIDC joining

This enables HTTP caching for the Kubernetes OIDC join subtype.

This adds a new `ValidateTokenWithClient` helper that can be used to
provide a custom client for anything currently using our shared OIDC
validation helper. Clients just need to swap in a caching
`*http.Client`.

changelog: Cache OIDC responses for Kubernetes OIDC joining
Comment thread go.mod
github.com/gravitational/roundtrip v1.0.2
github.com/gravitational/teleport/api v0.0.0
github.com/gravitational/trace v1.5.1
github.com/gregjones/httpcache v0.0.0-20190611155906-901d90724c79
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some library usage notes:

  • gregjones/httpcache is deprecated and archived, however...
  • We already include this library via the Kubernetes client library
  • It otherwise appears to be fit for purpose, and no other similarly mature libraries exist (bartventer/httpcache exists but is much less mature)
  • It is MIT licensed

I think we can switch to an alternative if/when we find a compelling reason to do so - it should just mean swapping out the implementation we return in NewCachingHTTPClient().

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use a third party library for caching jwks?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, we certainly could build everything ourselves, but we'd need to rewrite a meaningful portion of zitadel/oidc to support caching natively, and then still build our own RFC 9111-compliant HTTP client anyway since the only way for providers to provide cache hints is through standard HTTP caching headers. We could disregard cache hints and just set a modest value, but then we're still stuck making fairly sweeping changes to our OIDC library.

From what I can tell, just using a caching *http.Client achieves fundamentally the same thing with very little effort required on our end. That said, I'd be very happy to hear suggestions if there's simpler alternatives I've missed!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is that the memory cache is unbounded, and entries are only cleared upon retries. Since the cache key is based on the URL, if we ever introduce dynamic fields in the JWKS endpoint-such as a timestamp-or encounter an unusually large number of OIDC tokens, memory usage could grow significantly.

Copy link
Copy Markdown
Contributor Author

@timothyb89 timothyb89 Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's a fair point. We could use a LRU cache backend instead, https://github.com/die-net/lrucache is linked directly from gregjones/httpcache - though similar maintenance caveats apply - or I could build one off github.com/hashicorp/golang-lru which we already include.

That said, I don't expect many problems here in practice:

  • OIDC endpoints themselves are user specified and cannot change regularly
  • JWKS URIs technically could change but zero providers I'm currently aware of do so.

I do agree that an LRU cache of some sort is much better but this trivial implementation is unlikely to actually grow unbounded in practice. In any case, I'll look into the best path for evicting entries.

@timothyb89 timothyb89 marked this pull request as ready for review August 12, 2025 03:10
@github-actions github-actions bot requested a review from creack August 12, 2025 03:10
@timothyb89 timothyb89 requested a review from strideynet August 12, 2025 03:11
timothyb89 added a commit that referenced this pull request Aug 12, 2025
This partially backports various dependencies of the original change.
Note that this is not a clean backport, and minor code and
organizational changes were made to facilitate this backport,
including copying some OIDC code directly into the token validator.

Additionally, caching support from #57789 was integrated directly. It
inherently depends on the same OIDC code that required modification,
so it has been pulled in directly.
timothyb89 added a commit that referenced this pull request Aug 13, 2025
This adds a generic caching OIDC validator which is initially used
to cache Kubernetes OIDC requests. This takes a simpler approach to
the one explored in #57789 and explicitly caches the two resources
in question:
- We cache the OIDC discovery configuration for an hour
- We cache the JWKS keyset for 24 hours, and additionally allow the
  library to cache internally. It currently has no internal method
  for invalidating old keys, so we purge the `oidc.KeySet` regularly.
  However, upon encountering an unknown `kid`, the library will
  attempt to fetch new keys on its own.

This validator is designed to be generic and should be easy to
apply to other users of our existing OIDC helper.
@timothyb89
Copy link
Copy Markdown
Contributor Author

Closing this in favor of a different approach in #57862: caching at the HTTP layer doesn't have desirable "refresh on unknown key ID" behavior and the library doesn't provide any sane way of building that beyond what it already provides.

@timothyb89 timothyb89 closed this Aug 14, 2025
timothyb89 added a commit that referenced this pull request Aug 14, 2025
I've abandoned the caching approach in #57789, so this PR will proceed
without caching, as in branch/v18.
github-merge-queue bot pushed a commit that referenced this pull request Aug 22, 2025
* [v18] Add new `oidc` subtype for Kubernetes joining (#57538) (#57683)

* MWI: Add `oidc` subtype for Kubernetes joining

This adds a new `oidc` subtype for Kubernetes joining, which allows
workloads to join using a Kubernetes service account token (JWT)
as usual, but verifies it using an OIDC flow rather than a static JWKS
keyset. This should better support platforms like EKS where the JWKS
keyset is rotated regularly.

* Validate OIDC subtype properly

* Initialize OIDC validator at startup

* Fix OIDC claims implementation

* Force use of cluster name as audience

* Remove authorized party workaround

* Reintroduce kubernetes claims check

* Remove unused constant

* Update generated operator resources

* Update generated TF docs

* Add validator tests

* Improve proto docs, enforce HTTP issuers, and add insecure override

* Update autogenerated docs

* Partially backport OIDC changes to fix build, include #57789

This partially backports various dependencies of the original change.
Note that this is not a clean backport, and minor code and
organizational changes were made to facilitate this backport,
including copying some OIDC code directly into the token validator.

Additionally, caching support from #57789 was integrated directly. It
inherently depends on the same OIDC code that required modification,
so it has been pulled in directly.

* Remove caching support from the backport

I've abandoned the caching approach in #57789, so this PR will proceed
without caching, as in branch/v18.
github-merge-queue bot pushed a commit that referenced this pull request Sep 25, 2025
* Add a caching OIDC validator for use with Kubernetes OIDC

This adds a generic caching OIDC validator which is initially used
to cache Kubernetes OIDC requests. This takes a simpler approach to
the one explored in #57789 and explicitly caches the two resources
in question:
- We cache the OIDC discovery configuration for an hour
- We cache the JWKS keyset for 24 hours, and additionally allow the
  library to cache internally. It currently has no internal method
  for invalidating old keys, so we purge the `oidc.KeySet` regularly.
  However, upon encountering an unknown `kid`, the library will
  attempt to fetch new keys on its own.

This validator is designed to be generic and should be easy to
apply to other users of our existing OIDC helper.

* Manage validator instances by issuer; use in Kubernetes validator

This adds an indirection layer to automatically manage validator
instances as cached data is specific to a particular (issuer,
audience) combination.

Additionally, this swaps the new caching validator in in place of the
standard non-caching validator.

* Fix comments and add missing mutex lock

* Add tests for the caching validator

* Fix lints

* Add logging

* Fix imports

* Code review suggestions: atomic expiry, misc fixes

This changes `validatorExpires` to be an atomic int64, which removes
a locks from `Expires()` / `IsStale()` and ensures an ongoing pruning
won't block `ValidateToken()`.

* Replace validator cache and pruning with `utils.FnCache`

This swaps the hand-rolled cache for validators with utils.FnCache,
since it's more robust and its TTL mechanism works properly for our
needs.

* Remove commented out test code

* Fix build errors and enable ReloadOnErr

* Code review suggestions
timothyb89 added a commit that referenced this pull request Oct 29, 2025
* Add a caching OIDC validator for use with Kubernetes OIDC

This adds a generic caching OIDC validator which is initially used
to cache Kubernetes OIDC requests. This takes a simpler approach to
the one explored in #57789 and explicitly caches the two resources
in question:
- We cache the OIDC discovery configuration for an hour
- We cache the JWKS keyset for 24 hours, and additionally allow the
  library to cache internally. It currently has no internal method
  for invalidating old keys, so we purge the `oidc.KeySet` regularly.
  However, upon encountering an unknown `kid`, the library will
  attempt to fetch new keys on its own.

This validator is designed to be generic and should be easy to
apply to other users of our existing OIDC helper.

* Manage validator instances by issuer; use in Kubernetes validator

This adds an indirection layer to automatically manage validator
instances as cached data is specific to a particular (issuer,
audience) combination.

Additionally, this swaps the new caching validator in in place of the
standard non-caching validator.

* Fix comments and add missing mutex lock

* Add tests for the caching validator

* Fix lints

* Add logging

* Fix imports

* Code review suggestions: atomic expiry, misc fixes

This changes `validatorExpires` to be an atomic int64, which removes
a locks from `Expires()` / `IsStale()` and ensures an ongoing pruning
won't block `ValidateToken()`.

* Replace validator cache and pruning with `utils.FnCache`

This swaps the hand-rolled cache for validators with utils.FnCache,
since it's more robust and its TTL mechanism works properly for our
needs.

* Remove commented out test code

* Fix build errors and enable ReloadOnErr

* Code review suggestions
github-merge-queue bot pushed a commit that referenced this pull request Oct 30, 2025
…60711)

* Add a caching OIDC validator for use with Kubernetes OIDC

This adds a generic caching OIDC validator which is initially used
to cache Kubernetes OIDC requests. This takes a simpler approach to
the one explored in #57789 and explicitly caches the two resources
in question:
- We cache the OIDC discovery configuration for an hour
- We cache the JWKS keyset for 24 hours, and additionally allow the
  library to cache internally. It currently has no internal method
  for invalidating old keys, so we purge the `oidc.KeySet` regularly.
  However, upon encountering an unknown `kid`, the library will
  attempt to fetch new keys on its own.

This validator is designed to be generic and should be easy to
apply to other users of our existing OIDC helper.

* Manage validator instances by issuer; use in Kubernetes validator

This adds an indirection layer to automatically manage validator
instances as cached data is specific to a particular (issuer,
audience) combination.

Additionally, this swaps the new caching validator in in place of the
standard non-caching validator.

* Fix comments and add missing mutex lock

* Add tests for the caching validator

* Fix lints

* Add logging

* Fix imports

* Code review suggestions: atomic expiry, misc fixes

This changes `validatorExpires` to be an atomic int64, which removes
a locks from `Expires()` / `IsStale()` and ensures an ongoing pruning
won't block `ValidateToken()`.

* Replace validator cache and pruning with `utils.FnCache`

This swaps the hand-rolled cache for validators with utils.FnCache,
since it's more robust and its TTL mechanism works properly for our
needs.

* Remove commented out test code

* Fix build errors and enable ReloadOnErr

* Code review suggestions
rhammonds-teleport pushed a commit that referenced this pull request Nov 6, 2025
* Add a caching OIDC validator for use with Kubernetes OIDC

This adds a generic caching OIDC validator which is initially used
to cache Kubernetes OIDC requests. This takes a simpler approach to
the one explored in #57789 and explicitly caches the two resources
in question:
- We cache the OIDC discovery configuration for an hour
- We cache the JWKS keyset for 24 hours, and additionally allow the
  library to cache internally. It currently has no internal method
  for invalidating old keys, so we purge the `oidc.KeySet` regularly.
  However, upon encountering an unknown `kid`, the library will
  attempt to fetch new keys on its own.

This validator is designed to be generic and should be easy to
apply to other users of our existing OIDC helper.

* Manage validator instances by issuer; use in Kubernetes validator

This adds an indirection layer to automatically manage validator
instances as cached data is specific to a particular (issuer,
audience) combination.

Additionally, this swaps the new caching validator in in place of the
standard non-caching validator.

* Fix comments and add missing mutex lock

* Add tests for the caching validator

* Fix lints

* Add logging

* Fix imports

* Code review suggestions: atomic expiry, misc fixes

This changes `validatorExpires` to be an atomic int64, which removes
a locks from `Expires()` / `IsStale()` and ensures an ongoing pruning
won't block `ValidateToken()`.

* Replace validator cache and pruning with `utils.FnCache`

This swaps the hand-rolled cache for validators with utils.FnCache,
since it's more robust and its TTL mechanism works properly for our
needs.

* Remove commented out test code

* Fix build errors and enable ReloadOnErr

* Code review suggestions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants