Bug 2047732: [IBM]Volume is not deleted after destroy cluster #5962
Conversation
Hi @sameshai. Thanks for your PR. I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
You need to fix the title to start with "Bug XXXX" so it's properly linked with bugzilla.

/ok-to-test

/bugzilla-refresh

/bugzilla refresh
@r4f4: No Bugzilla bug is referenced in the title of this pull request.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/hold

/bugzilla refresh
@jsafrane: No Bugzilla bug is referenced in the title of this pull request.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
missing double colon :-)
@sameshai: This pull request references Bugzilla bug 2047732, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug.

Requesting review from QA contact.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Uninstall logs for verification
jstuever
left a comment
The PR looks functionally OK. However, I suspect there is some unnecessary code here. This borrows heavily from the GCP destroy. One significant difference is that GCP destroy runs in an infinite loop until everything is deleted, whereas this code runs only once, with loops on individual items. It uses the pendingItems() functionality, which I'm not sure makes as much sense without that primary loop (see the sketch below). However, this PR didn't introduce this pattern, so I don't feel we should block on it.
Please add a description to the PR.
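For context, here is a minimal sketch of the GCP-style destroy loop being contrasted above. The names (destroyGCPStyle, listDisks, deleteDisk) are hypothetical, not the installer's actual identifiers:

```go
package sketch

import "time"

// Hypothetical sketch: the GCP-style destroy runs an outer loop that
// re-lists resources until nothing is pending, so every pass re-discovers
// remaining disks and retries failed deletions.
func destroyGCPStyle(listDisks func() []string, deleteDisk func(string) error) {
	for {
		pending := listDisks() // re-listed on every pass
		if len(pending) == 0 {
			return // everything is gone
		}
		for _, id := range pending {
			// Errors are tolerated here; the next pass retries the disk.
			_ = deleteDisk(id)
		}
		time.Sleep(10 * time.Second)
	}
}
```

The pendingItems() bookkeeping pays off inside this outer re-listing loop; with a single listing pass, as in this PR, it mostly tracks deletion retries.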
arahamad-zz
left a comment
Please check these comments and take action accordingly.
@sameshai: This pull request references Bugzilla bug 2047732, which is valid. 3 validation(s) were run on this bug.

Requesting review from QA contact.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
1 similar comment
@jstuever thanks for the quick review. Apologies, I missed the PR description. Please refer to the logic description: we are doing it like GCP but avoiding the redundant list-disk calls.
pre-merge test OK.
@MayXuQQ: This pull request references Bugzilla bug 2047732, which is valid. 3 validation(s) were run on this bug.

Requesting review from QA contact.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@r4f4 can we go ahead with the merge?
arahamad-zz
left a comment
lgtm. The only issue I see is at https://github.com/openshift/installer/pull/5962/files#diff-c98fece70f92d74f8db64396e4256d51b5ff7c1861479479360410ec388556e3R23, where we return on error: if any page in that retry fails, we mark the entire list as failed, and this loop will keep going.
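A minimal sketch of the pattern being flagged; listAllDisks, fetchPage, and their signatures are hypothetical, not the PR's actual code:

```go
package sketch

// Hypothetical sketch of the concern above: returning on the first page
// error abandons the whole paginated listing, so the caller marks every
// disk as failed and the retry loop starts over from scratch.
func listAllDisks(fetchPage func(start string) (disks []string, next string, err error)) ([]string, error) {
	var all []string
	start := ""
	for {
		page, next, err := fetchPage(start)
		if err != nil {
			// Discards everything collected so far; one bad page
			// fails the entire list.
			return nil, err
		}
		all = append(all, page...)
		if next == "" {
			return all, nil
		}
		start = next
	}
}
```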
@arahamad I have already mentioned this in the Slack channel discussion, so we are following the GCP flow as of now.
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: arahamad, GunaKKIBM, patrickdillon. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Approvers can indicate their approval by writing /approve in a comment.
/lgtm

/hold cancel
@sameshai: The following tests failed; say /retest to rerun all failed tests.

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@jstuever there is no duplicate code; it is similar to the GCP code.
@sameshai: Some pull requests linked via external trackers have merged; the following pull requests linked via external trackers have not merged. These pull requests must merge or be unlinked from the Bugzilla bug in order for it to move to the next state. Once unlinked, request a bug refresh with /bugzilla refresh. Bugzilla bug 2047732 has not been moved to the MODIFIED state.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Description
The existing GCP destroy-disk logic keeps calling listDisks() until there are zero pending items in the list, i.e., until all disks are deleted successfully. We found the repeated listing redundant: there is no point in re-listing disks during the uninstallation phase, since no new disks can appear.

Hence we came up with a combination of both flows. Note that we still run until all the items in the list are deleted successfully. The difference is that we avoid the repeated listDisks() calls that GCP makes and, in the same first destroy pass, wait for all disks to go away. Anything still pending is retried from the pending list. (A sketch of the flow follows the steps and log line below.)
1. List all disks using pagination and filter them based on tags.
2. Add the disks from step 1 to the pendingItems list.
3. Invoke deletion for every disk; if any call fails, it returns an error and is retried later via the pendingItems list logic.
4. Once deletion has been invoked for all disks, wait for all the deletion calls to complete. Typically only the first few deletions take time; the rest will already have been deleted.
5. The pendingItems list becomes empty only when every disk has been deleted successfully.
6. If a disk deletion keeps failing for around 20 minutes, the user has to take the action suggested in the error for the respective disk, e.g.:
time="2022-06-10T12:20:22Z" level=debug msg="Failed to delete disk name=pvc-0d111d56-cf74-4045-8df0-fc439b1b4cb8, id=r018-ef565808-3491-4262-a6d5-27428e27ec0b.If this error continues to persist for more than 20 minutes then please try to manually cleanup the volume using - ibmcloud is vol r018-ef565808-3491-4262-a6d5-27428e27ec0b: Delete volume failed. Volume can be deleted only when its status is available or failed."