Relax Kubernetes CRD discovery when building cache #36214
Merged
AntonAM
approved these changes
Jan 2, 2024
Teleport Kubernetes Service has a monitor that constantly watches the resources registered in the Kubernetes cluster via API discovery. The goal is to keep an up-to-date representation of all resources existing in the cluster so that they can be registered for Teleport per-resource RBAC. Having an up-to-date representation allows us to unmarshal the API responses and filter them when the custom resources are local.

When Kubernetes APIs are registered using non-local services (i.e. the API is served by a Pod running within the cluster, like the metrics API) and those services aren't healthy (e.g. the Pod isn't running, the selector is invalid, or the cluster has no nodes), the discovery watcher returns an error and fails. This is an improper configuration, but it seems to be a common problem.

This PR relaxes the discovery mechanism so that it no longer requires every API to return its resources when some of them aren't currently available. When `client.Discovery().ServerGroupsAndResources()` returns a `*discovery.ErrGroupDiscoveryFailed`, it also returns the partial results, which we use for registration.

Signed-off-by: Tiago Silva <tiago.silva@goteleport.com>
c3907f0 to
29fd151
rosstimothy
approved these changes
Jan 3, 2024
Changelog: Safeguard against the disruption of cluster access caused by incorrect Kubernetes APIService configurations.