-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Bug 1916692: OpenStack: Delete leftover LBs when destroying cluster #4563
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1916692: OpenStack: Delete leftover LBs when destroying cluster #4563
Conversation
|
@mandre: This pull request references Bugzilla bug 1916692, which is invalid:
Comment DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/bugzilla refresh |
|
@mandre: This pull request references Bugzilla bug 1916692, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/bugzilla refresh |
|
@mandre: This pull request references Bugzilla bug 1916692, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/label platform/openstack |
pkg/destroy/openstack/openstack.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps this should rely on Loadbalancer tags rather than text on the description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't. The in-tree cloud provider doesn't tag the LB resources.
pkg/destroy/openstack/openstack.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleteLeftoverLoadBalancers returns bool and error. are you going to handle it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't planning to. I don't want to change the return value of deleteNetworks() based on the return of deleteLeftoverLoadBalancers(), since this would only be triggered when deleteNetworks() has failed already.
IMO, the destroy module needs a good refactoring that I don't want to commit to right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean you created a new function deleteLeftoverLoadBalancers and it returns two values that you ignore later. I think we need to either change the signature of the function, or handle the returned values properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I changed the signature. PTAL.
pkg/destroy/openstack/openstack.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to handle ErrDefault403 errors separately to prevent situations when a user doesn't have permissions to list load balancers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this is a pattern we're using all over the place in this file, and if we wanted to treat 403 differently I think we should do it in a separate change.
How do you suggest we should handle a ErrDefault403 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if err != nil {
var gerr gophercloud.ErrDefault403
if !errors.As(err, &gerr) {
logger.Debugf("It's forbidden to list load balancers")
return true, nil
}
logger.Error(err)
return false, nil
}
pkg/destroy/openstack/openstack.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here you use it as a pointer type, and below in the similar clause, as a refular type. This doesn't look consistent?
pkg/destroy/openstack/openstack.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please compare to the line #660
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this is correct... This is how errors in gophercloud are organized
If the user destroyed a cluster without removing all the associated service LBs, the `destroy` command would fail to remove the network and loop until it hits the timeout. The destroy command now looks if there are any leftover LBs where its `VipNetworkID` matches the network ID and deprovisions it. We filter on services LBs created by the openstack cloud provider, matching the `Kubernetes external service` string in the description [1], to ensure we're not destroying a user-created resource by mistake. [1] https://github.com/openshift/kubernetes/blob/442a69c/staging/src/k8s.io/legacy-cloud-providers/openstack/openstack_loadbalancer.go#L446
23052aa to
003baff
Compare
|
/retest |
|
@mandre: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Fedosin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
/lgtm
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Fedosin The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@mandre: All pull requests linked via external trackers have merged: Bugzilla bug 1916692 has been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
If the user destroyed a cluster without removing all the associated
service LBs, the
destroycommand would fail to remove the networkand loop until it hits the timeout.
The destroy command now looks if there are any leftover LBs where its
VipNetworkIDmatches the network ID and deprovisions it. We filter onservices LBs created by the openstack cloud provider, matching the
Kubernetes external servicestring in the description [1], to ensurewe're not destroying a user-created resource by mistake.
[1] https://github.com/openshift/kubernetes/blob/442a69c/staging/src/k8s.io/legacy-cloud-providers/openstack/openstack_loadbalancer.go#L446