-
Notifications
You must be signed in to change notification settings - Fork 1.5k
pkg/destroy: data/aws: delete instance profiles even if they are detached #1268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pkg/destroy: data/aws: delete instance profiles even if they are detached #1268
Conversation
|
@wking get ready to throw up when you see what I did to your lovely destroy code :) |
|
With cluster IDs in the profile names, new clusters won't be bothered by any previous instance profiles leaked by buggy reapers (the #1174 issue). So I'm in favor of adding the cluster ID here (and maybe removing the cluster name if we are concerned about name-length limits?), but I don't think we need to change the destroy code. |
pkg/destroy/aws/aws.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use the term clusterName instead of clusterID for this elsewhere in the installer code.
pkg/destroy/aws/aws.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preference would be to have a deleteIAMInstanceProfileByName function that is passed the name of the profile rather than making up an ARN with a fake account ID that the caller is trusting won't actually be used.
dad35dd to
fc40883
Compare
|
updated to address review comments. I've also changed my mind, I don't think this is a 'hack'. It is cleaning up what we created. We know we created it, we should clean it up. |
I think once another reaper leans in and deletes our instances and IAM roles, it's assuming responsibility for reaping instance profiles too. I don't think we need to support step 3 in:
Are users who pass step 1 likely to take step 3 when creating a new cluster with the same name no longer has resource conflicts (because of your appended cluster ID)? |
|
I'm not sure why #3 is 'User reassembles`. All the user needs to do is call destroy_cluster in #3 for this PR to be helpful. While I agree that if something else starts to clean up a cluster it probably should clean up the entire cluster, I say that if the installer creates something the installer should be able to destroy it. A customer could (less likely admittedly) call |
We currently order removal so roles and instances are not removed before the instance profile. So "network hiccup" is not sufficient, you'd need "AWS API returns no associated instance-profiles when that instance profile actually did exist" to get into trouble here. |
|
I just need to remove the role, instance profile removal will fail. remove the instance, hickup, stuck. No? |
We only attempt to remove roles after successfully removing associated instance profiles. Same for instances. So how do you get the role and instance removed first, except via a buggy external reaper? |
|
/retest |
|
/retest |
|
/retest |
|
/rest |
ec88a6a to
d9f8f73
Compare
pkg/destroy/aws.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should us metadata.InfraID now.
d9f8f73 to
64b7938
Compare
Today we find instance profiles that we need to delete through either the associated role or the associated instance. We find those objects via aws tags. If those objects have been deleted we are unable to find the instance profiles because those are not tagged. Since we embed both the cluster id inside the name we can find the instance profiles we created by name and destroy them that way.
64b7938 to
6ae7598
Compare
|
ping @wking can you take another look at the PR? |
| Filters: filters, | ||
| Region: metadata.ClusterPlatformMetadata.AWS.Region, | ||
| Logger: logger, | ||
| ClusterID: metadata.InfraID, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will cause a trivial conflict with #1365, which is a higher-priority bugfix.
| } | ||
| return err | ||
| } | ||
| logger.WithField("name", name).Info("Deleted") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd rather have the caller set this context, like we do here, etc. The caller can chose which of the information available to it should be logged. Functions can add additional logging context as new information become available, but they shouldn't write their arguments directly into the logging context.
|
I left two minor nits inline. Otherwise, this looks fine to me, and I'm ok with it landing (with or without the nits getting addressed), although I still don't feel like we need it ;). |
the nits don't look like blockers. /lgtm PS: It's sad that the instance profiles cannot be tagged :( . |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: abhinavdahiya, eparis The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
This reverts commit 6ae7598, openshift#1268. That was a workaround to recover from openshift-dev clusters where an in-house pruner is removing instances but not their associated instance profiles. Folks using the installer's destroy code won't need it, and while the risk of accidental name collision is low, I don't think it's worth taking that risk. With this commit, folks using external reapers are responsible for ensuring that they reap instance profiles when they reap instances, and we get deletion logic that is easier to explain to folks mixing multiple clusters in the same account.
Today we find instance profiles that we need to delete through either the
associated role or the associated instance. We find those objects via aws
tags. If those objects have been deleted we are unable to find the instance
profiles because those are not tagged.
Since we now embed both cluster id inside the name this PR adds the ability to find those object and to delete them even if the roles and instances have been deleted.