Bug 1791400: cmd/openshift-install/destroy: Remove terraform.tfstate in 'destroy cluster' #2433
8e18cbb to
06a836c
Compare
cmd/openshift-install/destroy.go
Outdated
You might be able to tighten this up a little:
if err = os.Remove(tfStateFilePath); !os.IsNotExist(err) {
	return errors.Wrap(err, "failed to remove Terraform state")
}
(stolen from here)
if err = os.Remove(tfStateFilePath); !os.IsNotExist(err) {
We don't want to error on nil, and IsNotExist(nil) is false. I like separating the call from the error-handling conditionals, but I can squash down to:
if err = os.Remove(tfStateFilePath); err != nil && !os.IsNotExist(err) {
if folks see that as a blocker ;).
We don't want to error on nil
Oh of course. I left out the most important condition. Nvm!
|
Should https://github.com/openshift/installer/blob/master/cmd/openshift-install/destroy.go#L69 already be removing the tfstate file or is that referring to a different "state file"? |
Different (that |
|
This lgtm but I will give others a chance to review. |
|
How is the tfstate not part of the asset graph, unlike the tfvars files? Can we first define the case where the tfstate is left behind, instead of blindly deleting this file in destroy? |
Hmm, yeah. We should be adding it to the state here. I'll hunt through CI to try and find a reproducer. |
|
Here is a job that leaked But an asset is only committed to the store if its |
I'm still getting familiar with this code, but this approach of adding tfstate to the store looks cleaner and less of a special case than direct handling of the state file. I'm not sure of the exact definition of assets, but I think it should encompass this file (something written to disk and needed by the installer). |
|
So for the leaked run, was the terraform.tfstate file created in the install_dir without the help of the asset graph? Or did we copy the state file from tmp_terraform_workspace to install_dir but not report it as part of the asset output? If it's the latter, that seems to me like the bug. |
We created it through the asset graph in
if err := a.Generate(parents); err != nil {
	if a was the Cluster asset {
		assetState.asset = a
		assetState.source = generatedSource
	}
	return errors.Wrapf(err, "failed to generate asset %q", a.Name())
}
assetState.asset = a
assetState.source = generatedSource
I'm fine either way. |
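To make the pseudocode above concrete, here is a toy model of the idea. Every type and name in it (Asset, store, fetch, keepOnFailure, failingCluster) is a hypothetical stand-in and not the installer's real asset store. It only shows an asset being recorded even though its Generate call failed, so that later cleanup can still find what it wrote.

```go
package main

import (
	"errors"
	"fmt"
)

// Asset is a hypothetical, pared-down stand-in for an installer asset.
type Asset interface {
	Name() string
	Generate() error
}

// store is a toy asset store keyed by asset name.
type store struct {
	states map[string]Asset
}

// fetch generates a and records it on success. When keepOnFailure is
// true, the asset is also recorded if Generate fails, mirroring the
// "if a was the Cluster asset" branch in the pseudocode above.
func (s *store) fetch(a Asset, keepOnFailure bool) error {
	if err := a.Generate(); err != nil {
		if keepOnFailure {
			s.states[a.Name()] = a
		}
		return fmt.Errorf("failed to generate asset %q: %w", a.Name(), err)
	}
	s.states[a.Name()] = a
	return nil
}

// failingCluster simulates a cluster whose provisioning dies partway.
type failingCluster struct{}

func (failingCluster) Name() string    { return "Cluster" }
func (failingCluster) Generate() error { return errors.New("provisioning failed") }

func main() {
	s := &store{states: map[string]Asset{}}
	err := s.fetch(failingCluster{}, true)
	_, tracked := s.states["Cluster"]
	fmt.Println(err != nil, tracked) // true true
}
```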
|
If assets are only added to the asset store as a result of Generate(), it seems like the best solution would be to delegate that action to the assets themselves. In this case, I think that would mean expanding Generate to accept assetState. That is probably more code than we want to write to fix this (though not that bad I think), but worth discussing. |
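As a rough illustration of the delegation idea in this comment, the sketch below lets an asset report what it has written through a handle passed to Generate. The Recorder type and the changed Generate signature are assumptions made up for this example. The real framework's Generate takes parent assets, and nothing here reflects an agreed design.

```go
package main

import (
	"errors"
	"fmt"
)

// Recorder is a hypothetical handle through which an asset reports the
// files it has written, independent of whether Generate succeeds.
type Recorder struct {
	files []string
}

// Record notes that the asset has written path to disk.
func (r *Recorder) Record(path string) {
	r.files = append(r.files, path)
}

// clusterAsset simulates a cluster asset that fails after writing state.
type clusterAsset struct{}

// Generate records the state file first and then fails, so the caller
// still learns about the file that was left behind.
func (clusterAsset) Generate(r *Recorder) error {
	r.Record("terraform.tfstate")
	return errors.New("provisioning failed")
}

func main() {
	r := &Recorder{}
	err := clusterAsset{}.Generate(r)
	fmt.Println(err, r.files) // provisioning failed [terraform.tfstate]
}
```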
My asset-graph opinions are in #556, which is far enough from what we have now that I don't have opinions about minor pivots ;). Restructuring the asset framework to allow assets to decide if/when to save themselves would work, but it's a larger pivot than either of the two alternatives I gave here. But folks should pick an approach, and then tell me and I'll implement it ;). |
|
This looks like a good quick fix. We should create a card to revisit if/how to do it using asset-graph. |
|
/hold |
|
/retest Before we reopen discussion here, I want to see if this thing actually compiles ;) |
|
You put a hold on it. |
|
/retitle Bug 1791400: cmd/openshift-install/destroy: Remove terraform.tfstate in 'destroy cluster' |
|
/approve /hold cancel |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: abhinavdahiya, jstuever |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
Update job 4676 failed with: Can't be related to my teardown change. |
@wking: All pull requests linked via external trackers have merged. Bugzilla bug 1791400 has been moved to the MODIFIED state. |
So destroy cluster gets you all the way back to a clean slate, even if the cluster in question died during infrastructure provisioning (leaking mentioned here and here).

This will obviously not clean up the original asset directory in workflows where the user copies their metadata.json over into a new directory and runs destroy cluster in the new directory. But since we have existing asset-state removal code that also behaves that way, I don't think it's a big deal.

cmd/openshift-install/destroy.go is a convenient place to put this now, when all of our providers are Terraform-based. If, in the future, we move some providers off of Terraform (or add new, non-Terraform providers), we can push this down into the platform-specific destroyers. We could also leave it here, because terraform.tfstate is unlikely to exist in the asset directory for non-Terraform providers and be something that the user wants to keep around, so the risk of false-positive removal is low.