-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not store local service account token and CA to config. #122
Conversation
When defaulting to local JWT token and CA certificate in a pod, always read them from local filesystem and do not store them persistenly with the config. Token will be re-read periodically to avoid using expired token. The change allows running Vault on Kubernetes 1.21 and newer, which switched to ID token that is bound to the pod and will expire. Signed-off-by: Tero Saarni <[email protected]>
2fa1796
to
b657e15
Compare
@hsimon-hashicorp Kindly asking if someone could take a look at this PR? Currently Vault's Kubernetes service account authentication method is broken and it would be really great to get it working again! |
Chatted with @tvoran about this briefly; this is a note to myself to circle back on this later. |
Hi @tsaarni, thanks for the PR! Sorry for the delay reviewing, but we're definitely interested in this feature. I haven't reviewed in detail yet, but will try to do so this week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good 👍 I'd like to remove the CA changes from the scope if possible, and I also want to make sure this is as robust as practically possible, so one comment on the caching approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review @tomhjp, looking forward for further comments & doing final adjustments to the PR!
BTW Sometimes I find github really confusing 🤔 My response to ^^^ above ^^^ review comment is only visible in Files Changed
view...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay - I got part-way through a review and then got to the point where I was going to need to do some more involved testing for backwards compatibility/different deployment scenarios etc. I'm still yet to do that, so some of the comments are only semi-actionable. I'll leave what I've got here for now and revisit once I've had a chance to do the testing.
BTW I think it could be nice if there was a test environment with "fake" Kubernetes API server, allowing more thorough testing without compiling this plugin into Vault and running that against actual Kubernetes, in cluster and outside. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this approach! Left a few comments, but so far it has tested well for me for both short and long-lived JWTs, and when using the client token as the reviewer JWT.
I think in the future we may want to expose the local JWT and CA paths as options, but I think we can worry about that in another PR when the need arises.
I've realized that I should rework the lock handling a bit: I need to have write lock for backend, when loading local files and assigning them to variables in |
I wrote my manual test procedures here https://gist.github.com/tsaarni/66b085e0244fe1b0635cc1f806b5c334 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, the test updates are really nice. Thanks for contributing such a significant feature! I'll let @tvoran have one last look but I think we can land this soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great and worked well for me when I did some testing. Just one more small suggestion and then I think it'll be good to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again!
When defaulting to local JWT token and CA certificate in a pod, always read them from local filesystem and do not store them persistently with the config. Token will be re-read periodically to avoid using expired token. The change allows running Vault on Kubernetes 1.21 and newer, which switched to ID token that is bound to the pod and will expire. Signed-off-by: Tero Saarni <[email protected]> * review comment fix: load only token or ca cert if other is given in config * changed the reload period to 1 minute * fixed review comments * take lock also on alias lookahead path * more review fixes * proposal to fix the read/write lock issue * fixed typo * cachedFile by value to avoid mutation while not holding log * acquire lock in same place as in pathLogin * added debug log entry when local token is not found and falling back to client token
When defaulting to local JWT token and CA certificate in a pod, always read them from local filesystem and do not store them persistently with the config. Token will be re-read periodically to avoid using expired token. The change allows running Vault on Kubernetes 1.21 and newer, which switched to ID token that is bound to the pod and will expire. Signed-off-by: Tero Saarni <[email protected]> * review comment fix: load only token or ca cert if other is given in config * changed the reload period to 1 minute * fixed review comments * take lock also on alias lookahead path * more review fixes * proposal to fix the read/write lock issue * fixed typo * cachedFile by value to avoid mutation while not holding log * acquire lock in same place as in pathLogin * added debug log entry when local token is not found and falling back to client token Co-authored-by: Tero Saarni <[email protected]>
Hi All, We are facing a similar issue with Vault in our production workloads & and creating downtime. Can someone assist me with the vault part? I mean resolution for this. A few days back we upgraded our Azure AKS Cluster from 1.20.7 to 1.21.7 and updated "disable_iss_validation=true" in vault config. It worked for some time, but intermittently we are facing issues with application pods which are crashing with permission denied errors and in vault logs, I can see below errors. After going through multiple issues/docs, what I understood is vault is not able to load/read dynamically the service account/JWT tokens of pods, as it's changing frequently/expiring every hour in K8s version 1.21.7. Please assist me on this [ERROR] auth.kubernetes.auth_kubernetes_34b3b46a: login unauthorized due to: lookup failed: service account unauthorized; this could mean it has been deleted or recreated with a new token Error making API request. Code: 403. Errors:
Config File: Config File: aultWriteAuthKubeConfig() { echo && echo "[INFO]:: Writing Kubernetes auth config..." vault write auth/kubernetes/config \ token_reviewer_jwt="$(cat ${KUBERNETES_SERVICE_TOKEN})" \ kubernetes_host=${KUBERNETES_HOST} \ kubernetes_ca_cert=@${KUBERNETES_CA_CERT} \ disable_iss_validation=true |
@zohebs341 upgrade vault to V1.9.3 or + , it will support for dynamically reloading short-lived tokens. |
@krishnamohan987 Thank you Krishna. We will try upgrading version. |
@krishnamohan987 Hi, the helm chart is still on v1.9.2, does it take a while to update there? . |
@OneideLuizSchneider you can override image version in values yaml while you apply the chart. it works. downlaod the image ready and push to your registry. |
Overview
Kubernetes 1.21 switched to new ID tokens which are now bound to a specific pod. When pod is deleted and recreated the old token will be invalidated and not accepted by Kubernetes API server anymore. Tokens will also expire and Kubernetes will renew them periodically. The new tokens cause major problem with Vault since currently, even when using local tokens, the token and CA certificate are read when the authentication method is configured and stored persistently to config.
Kubernetes authentication backend will stop working when:
Vault will not recover automatically, it requires reconfiguration of the authentication method.
Design of Change
This change avoids storing local token and local CA certificate in config. Instead, they are loaded lazily when needed and kept in memory only. Token will also re-read periodically to facilitate token rotation and to avoid authentication failure due to expired token.
Related Issues/Pull Requests
Fixes #121
Updates #95
Tests
Acceptance tests pass.
New tests added.
Manual test on Kubernetes pass..
Backwards compatibility
I believe that previously
GET /auth/kubernetes/config
returned a copy of local CA file. This is not done anymore. The justification is that local file is not part of stored configuration. CA file is still returned if user explicitly configured it withPOST /auth/kubernetes/config
.Contributor Checklist
Issue link Kubernetes Auth Method documentation not compatible with Kubernetes v1.21+ vault#12855