tigrato
left a comment
Kubernetes health checks should go beyond simply verifying that the Kubernetes API server is responsive.
A critical aspect to validate is whether each agent can operate correctly - specifically, that it has the necessary permissions to impersonate users and groups within the cluster.
While not strictly required, it’s also valuable to test Kubernetes GET Pod requests. This ensures that session start/end events are properly populated with fields like kubernetes_pod_labels and kubernetes_pod_image.
Another important area comes from a recent discussion with @programmerq, @webvictim, and a customer. The customer had a script that mistakenly copied Teleport agent state between different clusters. This led to agents across clusters sharing the same host_uid, which caused proxies to make incorrect routing decisions. Instead of forwarding requests to Cluster A, they were sent to Cluster B, since both clusters reported identical host_uid values. Detecting and preventing this scenario - and providing visibility when it occurs - would be a significant improvement.
I strongly agree here. These health checks should be able to indicate that the Kubernetes cluster is usable by Teleport users.
I don't think that we should be using health checks to validate the presence of data in audit events.
I think this should be out of scope for health checks. Duplicate UUIDs are a massive footgun, but I don't know that we should be complicating health checking to catch this scenario.
I added c-cv as they are interested in this feature but specifically for databases.
@TeleLos Thank you, health checks for databases happen to already be released in
The RFD proposes automated health checks for Kubernetes clusters with monitoring from the Web UI, `tctl`, and Prometheus metrics. Health checks use the Kubernetes `SelfSubjectAccessReview` API to verify connectivity and RBAC configuration.
Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>
Relates to #58413
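For readers unfamiliar with `SelfSubjectAccessReview`, here is a rough sketch of what such a check could look like. It only builds the request bodies and interprets the responses; actually sending them (a POST to `/apis/authorization.k8s.io/v1/selfsubjectaccessreviews` with the agent's credentials) is left out. The `CHECKS` list and the helper names are hypothetical, chosen to mirror the impersonation and GET Pod permissions raised in the review above; they are not taken from the RFD.

```python
import json

# Hypothetical set of checks (verb, API group, resource). This list is an
# assumption based on the review discussion, not the RFD's final list.
CHECKS = [
    ("impersonate", "", "users"),
    ("impersonate", "", "groups"),
    ("get", "", "pods"),
]


def build_review(verb, group, resource, namespace=""):
    """Return a SelfSubjectAccessReview body for authorization.k8s.io/v1."""
    attrs = {"verb": verb, "group": group, "resource": resource}
    if namespace:
        attrs["namespace"] = namespace
    return {
        "apiVersion": "authorization.k8s.io/v1",
        "kind": "SelfSubjectAccessReview",
        "spec": {"resourceAttributes": attrs},
    }


def encode_review(review):
    """Serialize a review body to the JSON bytes that would be POSTed."""
    return json.dumps(review).encode("utf-8")


def is_allowed(response):
    """Read status.allowed from an API response, treating absence as denied."""
    return bool(response.get("status", {}).get("allowed", False))


def missing_permissions(responses):
    """Given {(verb, group, resource): response}, list the denied checks."""
    return [check for check in CHECKS if not is_allowed(responses.get(check, {}))]
```

A health check built this way could report a cluster as unhealthy whenever `missing_permissions` returns a non-empty list, which distinguishes "API server reachable" from "agent can actually serve Teleport users".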
A request for discussion on Kubernetes health checks.
Relates to: