Deleting service account keys not retried #75

Open
david-behnke opened this issue Jan 31, 2020 · 3 comments

david-behnke commented Jan 31, 2020

Describe the bug
We discovered that short-lived service account keys are slowly accumulating in some of our environments.
After investigating, we found that Vault (or the plugin) does not retry deleting a key when the initial deletion request fails, even though the corresponding lease no longer exists in Vault.

Updating Vault from 1.3.0 to 1.3.2 did not resolve the issue: the expired service account keys are still not deleted.

To Reproduce
We use a short TTL (30 minutes) for these keys and request a new key every 5 minutes.
Requesting keys at such a high frequency, or forcing GCP API errors, might help reproduce this.
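
As a rough illustration of why failed deletions matter at this request rate, the sketch below estimates how many keys are live at once and how much headroom remains. The function names are hypothetical, and the limit of 10 keys per service account is taken from the 429 error reported later in this thread.

```python
import math


def live_keys(ttl_minutes: int, interval_minutes: int) -> int:
    """Approximate number of unexpired keys at steady state."""
    return math.ceil(ttl_minutes / interval_minutes)


def leak_headroom(ttl_minutes: int, interval_minutes: int, key_limit: int) -> int:
    """Number of leaked (never deleted) keys that still fit under the
    limit alongside the live ones."""
    return max(key_limit - live_keys(ttl_minutes, interval_minutes), 0)
```

With the values from this report (`live_keys(30, 5)` and `leak_headroom(30, 5, 10)`), that is about 6 live keys and room for only 4 leaked ones, so a handful of failed deletions is enough to exhaust the limit.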

Expected behavior
Service accounts managed by Vault should be checked regularly for expired or unused keys.
From what I understand, this should already be done by the Rollback functionality.

Environment
Vault Server Version: v1.3.2
Vault Client Version: v1.3.2

@sethvargo
Contributor

@emilymye might be able to speak to it more, but I don't believe Vault gives us a good mechanism to retry these deletion requests.

Author

david-behnke commented Feb 5, 2020

I was mistaken about the existing implementation. WALRollbacks are implemented, but the missing functionality should be part of the PeriodicFunc.

The way I see it, there are two options (option 2 sounds better to me):

  1. Implement a periodic function that iterates through the keys of the service account tied to the role set, compares each key's creation date to the MaxTTL of the config, and deletes the key if it should already have expired.
  2. Capture failed deletion attempts, write the necessary data to storage, and retry those deletions within the periodic function.
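
A minimal sketch of option 2, written in Python purely for illustration (the plugin itself is not written this way). `DeletionRetrier`, `PendingDeletion`, and the `delete_key` callable are hypothetical names; the dictionary stands in for Vault storage and `delete_key` for the GCP key deletion call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class PendingDeletion:
    key_name: str
    attempts: int = 1  # number of failed deletion attempts so far


class DeletionRetrier:
    """Hypothetical helper: records failed deletions and retries them."""

    def __init__(self, delete_key: Callable[[str], None],
                 storage: Optional[Dict[str, PendingDeletion]] = None) -> None:
        self._delete_key = delete_key  # stand-in for the GCP delete call
        # Stand-in for persistent Vault storage.
        self.pending = storage if storage is not None else {}

    def revoke(self, key_name: str) -> bool:
        """Try to delete a key; remember it if the deletion fails."""
        try:
            self._delete_key(key_name)
        except Exception:
            self.pending[key_name] = PendingDeletion(key_name)
            return False
        return True

    def periodic(self) -> None:
        """Retry every recorded deletion; keep the ones that fail again."""
        for key_name in list(self.pending):
            try:
                self._delete_key(key_name)
            except Exception:
                self.pending[key_name].attempts += 1
            else:
                del self.pending[key_name]
```

A real implementation would probably also need to treat a "key not found" response as success, so that entries for keys that are already gone do not stay in storage forever.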

Alternatively, we could probably work around the issue on our side by regularly rotating the service account via the Vault API.

What's your take on this?

frodera commented Feb 14, 2020

We're having the same issue described by @david-behnke.

One of our monitoring processes requests a new key every 2 minutes (with a low TTL) via a Vault roleset, and after a few weeks of running we hit `Error 429: Maximum number of keys on account reached.` After investigating, we noticed that old service account keys had piled up and reached the maximum of 10.

While we can easily automate cleaning up "orphan" keys ourselves, the ideal and more robust solution would be for Vault to handle the deletion retries itself.
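
For reference, such an external cleanup could look roughly like the sketch below. It is only an assumption of how one might do it: `list_keys` and `delete_key` are hypothetical callables standing in for the GCP IAM calls, keys are modelled as `(name, creation time)` pairs, and the grace period is an arbitrary margin to avoid racing Vault's own revocation.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable, List, Optional, Tuple

Key = Tuple[str, datetime]  # (key name, creation time); hypothetical shape


def expired_keys(keys: Iterable[Key], max_ttl: timedelta, now: datetime,
                 grace: timedelta = timedelta(minutes=5)) -> List[str]:
    """Names of keys older than max_ttl plus a grace period."""
    cutoff = now - max_ttl - grace
    return [name for name, created in keys if created < cutoff]


def cleanup_orphans(list_keys: Callable[[], Iterable[Key]],
                    delete_key: Callable[[str], None],
                    max_ttl: timedelta,
                    now: Optional[datetime] = None) -> List[str]:
    """Delete keys that should already have expired; return deleted names."""
    now = now or datetime.now(timezone.utc)
    deleted = []
    for name in expired_keys(list_keys(), max_ttl, now):
        try:
            delete_key(name)
        except Exception:
            continue  # leave the key for the next run
        deleted.append(name)
    return deleted
```

The caller is responsible for making `list_keys` return only keys that Vault issued, so that keys created by other means are not deleted.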

Vault Server Version: v1.3.1
