Protections on PVs to prevent data loss #2700

Open
Adam-D-Lewis opened this issue Sep 3, 2024 · 0 comments

Context

We recently had a situation where a storage PR caused users upgrading an existing deployment to the develop branch to lose the data associated with JupyterHub users. The problem was resolved by #2639 and #2673 before the next Nebari release.

Value and/or benefit

As usual, we recommend making a backup before every upgrade. However, given how critical the data on those drives is, there is also great benefit in preventing the PVs for user home directories, shared storage, and conda envs from being deleted. More investigation is needed, but some ideas on how best to accomplish this are as follows:

  • Use an approach similar to the one in Support disallowed nebari config changes #2660: inspect the terraform plan for a particular terraform stage prior to deployment. If the plan includes destruction of a critical PVC, raise an error. This would only apply to terraform stages (as opposed to other types of Nebari stages), but so far the plan is to keep provisioning infrastructure with terraform, so this seems like it would be fairly robust once set up.
  • Set prevent_destroy on the critical PVCs in terraform, ideally by adding a new class in https://github.com/nebari-dev/nebari/blob/develop/src/_nebari/stages/tf_objects.py. Not sure this would work, but it seems like a promising solution if we can find a way to patch resources defined in the .tf files. We'd then remove prevent_destroy when nebari destroy is run.
  • Change the PVs' reclaim policy to Retain. The policy may need to be changed back when nebari destroy ... is run, but only then. I'm not sure this would cover every scenario with terraform, so more testing would be needed, and this might have to be combined with other measures.
  • Other options yet to be suggested.
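
To make the first idea more concrete, here is a rough sketch of what a plan check could look like. It assumes the plan has been exported with `terraform show -json` and parsed into a dict, and it relies on the `resource_changes` / `change.actions` fields of that JSON format. The function names and the set of protected resource types are hypothetical, not existing Nebari code, and the real check would probably want to match on specific resource addresses rather than on types.

```python
from __future__ import annotations

# Assumption: resource types used by the terraform kubernetes provider for
# PVs/PVCs. Adjust to whatever the Nebari stages actually declare.
PROTECTED_TYPES = {
    "kubernetes_persistent_volume",
    "kubernetes_persistent_volume_claim",
}


def find_destroyed_storage(plan: dict, protected_types=PROTECTED_TYPES) -> list[str]:
    """Return addresses of protected resources the plan would delete.

    `plan` is the parsed output of `terraform show -json <planfile>`.
    A replacement is reported as a combination of "delete" and "create",
    so it is caught here as well.
    """
    destroyed = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if rc.get("type") in protected_types and "delete" in actions:
            destroyed.append(rc["address"])
    return destroyed


def check_plan(plan: dict) -> None:
    """Hypothetical guard: raise before apply if critical storage would be destroyed."""
    destroyed = find_destroyed_storage(plan)
    if destroyed:
        raise RuntimeError(
            "Plan would destroy protected storage: " + ", ".join(destroyed)
        )
```

A stage could call `check_plan(json.loads(plan_json))` between plan and apply, and skip the check when `nebari destroy` is the command being run.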

Anything else?

No response
