Backup with transit seal method and revoked token silently fails #13130
Comments
Hi @laugmanuel - were you testing your snapshot restores previously? In #12388, the changes were made to expose broken seals that were resulting in unusable snapshots. Prior to the changes, the snapshot creation would appear to be successful, but the snapshots could not be restored. If you could let us know, I'd appreciate it. :)
Hi @hsimon-hashicorp. Nevertheless, the other points regarding docs and serving a broken backup through API and UI are still valid 😉
When a snapshot is initiated via the API, a success is returned immediately upon the snapshot starting to stream. The snapshot is not buffered on the server, because the size of the snapshot is unknown. So, the snapshot API request returns a "success", starts to stream, and then if at some point the seal isn't available, the snapshot will be broken. This is why testing restores is a critical part of any backup process.
I've tested with Vault 1.8.5 and Vault 1.7.4 (which, according to the changelog, does not contain the above fix). In both cases, the snapshot was valid and restorable with a valid token and became broken after the token expired. For us, I fixed it temporarily by issuing a token with a relatively long lifetime (based on an approle which overrides the default TTL of 32d).
Hi @taoism4504 - we were discussing this today - this might be good to clarify and expand in the snapshot and restore documentation with regard to token longevity and not breaking snapshots. :)
Hi @hsimon-hashicorp, what's the status on this?
We had this problem happen today: the token in the config for the auto-unseal had expired. We renewed the token, updated the config, and reloaded Vault (using kill -HUP), but the snapshot still failed with the same error until we actually restarted all our nodes. Is the transit token not reloaded on SIGHUP?
Pinging @schavis for docs update. Thanks @laugmanuel!
What's the status here?
Describe the bug
We use Raft as our storage backend.
We also use transit sealing against a secondary Vault instance to provide auto-unsealing for our primary Vault installed in Kubernetes. The token we use for that is created by an init-container and is only valid for a few minutes.
Until recently, this setup worked fine for us. The pods got unsealed automatically and the backups were present and valid (could be successfully restored).
Probably due to #12388, this behaviour changed!
Creating a backup using
vault operator raft snapshot save <snapshot file>
results in an error regarding the SHA256SUMS.sealed file. Using the API endpoint, we can successfully download the snapshot without any error.
In both cases the snapshot file gets created and looks to contain data: using
file <snapshot file>
the backup is recognized as gzip compressed data.
However, the backup cannot be restored, and Vault complains about
Load error
in the UI. Restoring using the CLI also fails. If I try to unpack the backup using gzip, I get
unexpected end of file
-> it looks like the backup file is corrupted.
If I extend the lifetime of the unseal token, the backup gets created and can be restored successfully!
The docs say nothing about the transit token used in the Vault config/env variables having to still be valid for a backup to succeed!
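Until the docs cover this, one way to guard a backup job is to check the remaining lifetime of the transit token before taking a snapshot. The following is only a sketch under assumptions: it presumes the backup job can read the same token the primary uses for the transit seal, it queries the token `lookup-self` endpoint on the secondary (transit) Vault, and the helper names and the 300 second threshold are made up for illustration.

```python
import json
import urllib.request


def lookup_self(vault_addr, token):
    """Fetch token metadata from the Vault instance that issued the token.

    Assumption: an expired or revoked token makes this request fail with an
    HTTP error, which urllib raises as urllib.error.HTTPError.
    """
    request = urllib.request.Request(
        f"{vault_addr.rstrip('/')}/v1/auth/token/lookup-self",
        headers={"X-Vault-Token": token},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def token_seconds_left(lookup_response):
    """Return the remaining TTL in seconds, or None for a non-expiring token."""
    data = lookup_response["data"]
    ttl = int(data.get("ttl", 0))
    # Assumption: ttl == 0 with no expire_time means the token never expires.
    if ttl == 0 and not data.get("expire_time"):
        return None
    return ttl


def safe_to_snapshot(lookup_response, min_seconds=300):
    """True if the token will outlive a snapshot taking up to min_seconds."""
    left = token_seconds_left(lookup_response)
    return left is None or left >= min_seconds
```

A backup script could call `safe_to_snapshot(lookup_self(addr, token))` and refuse to run `vault operator raft snapshot save` when it returns false. This does not replace a restore test; it only catches the expired-token case described here.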
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Either a valid backup file (a file that can be extracted using gzip+tar and restored) should be created, even though there is the warning about the
SHA256SUMS.sealed
file.
OR the creation of the backup should hard fail without any file being created.
If someone uses the API to create the backup but does not regularly test the restore, there is no way to see that the backup file is corrupted.
Also, the docs about raft snapshotting should mention that the seal configuration (including the token) must be valid for the backup to fully work.
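As a stopgap for API users, the gzip+tar check mentioned above can be automated right after the download. This is a minimal sketch, not a replacement for a real restore test: it only reads the whole archive to detect truncation such as the "unexpected end of file" case. The function name is hypothetical, and which member names to require is an assumption left to the caller (SHA256SUMS.sealed is the one named in this issue).

```python
import tarfile
import zlib


def snapshot_is_readable(path, required_members=()):
    """Read the whole gzip+tar archive at `path` and return (ok, detail)."""
    names = []
    try:
        with tarfile.open(path, mode="r:gz") as archive:
            for member in archive:
                names.append(member.name)
                if member.isfile():
                    # Read every byte so truncation inside a member is detected.
                    extracted = archive.extractfile(member)
                    while extracted.read(1024 * 1024):
                        pass
    except (tarfile.TarError, EOFError, OSError, zlib.error) as exc:
        return False, f"unreadable archive: {exc}"
    # Optional check for members the caller expects to be present.
    missing = [name for name in required_members if name not in names]
    if missing:
        return False, f"missing members: {missing}"
    return True, f"{len(names)} members read"
```

Calling `snapshot_is_readable("<snapshot file>", required_members=("SHA256SUMS.sealed",))` after each download would at least flag a truncated file, so the backup job can fail loudly instead of keeping a corrupted snapshot.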
Environment:
Vault Server Version (retrieve with vault status): 1.8.3, 1.8.4
Vault CLI Version (retrieve with vault version): 1.8.3, 1.8.4
Vault server configuration file(s):
Additional context
There must be a notice in the docs about the token used for transit. The docs and the how-to guides only say to create a new token and put it in the config/env variable. This would also break after the default lifetime of 32d: