Conversation

@mandre
Member

@mandre mandre commented Jan 18, 2021

If the user destroyed a cluster without removing all the associated
service LBs, the destroy command would fail to remove the network
and loop until it hits the timeout.

The destroy command now checks for leftover LBs whose
VipNetworkID matches the network ID and deprovisions them. We filter on
service LBs created by the OpenStack cloud provider, matching the
Kubernetes external service string in the description [1], to ensure
we're not destroying a user-created resource by mistake.

[1] https://github.com/openshift/kubernetes/blob/442a69c/staging/src/k8s.io/legacy-cloud-providers/openstack/openstack_loadbalancer.go#L446
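
The filter described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `loadBalancer` struct is a hypothetical stand-in for the Octavia record, the sample descriptions are made up, and treating the marker as a prefix (rather than a substring) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// loadBalancer is a hypothetical stand-in for an Octavia load balancer;
// only the fields the filter needs are modelled.
type loadBalancer struct {
	ID           string
	Description  string
	VipNetworkID string
}

// serviceLBMarker is the description string quoted in the PR text.
const serviceLBMarker = "Kubernetes external service"

// leftoverServiceLBs returns the load balancers attached to networkID whose
// description starts with the marker (prefix matching is an assumption).
func leftoverServiceLBs(lbs []loadBalancer, networkID string) []loadBalancer {
	var out []loadBalancer
	for _, lb := range lbs {
		if lb.VipNetworkID == networkID && strings.HasPrefix(lb.Description, serviceLBMarker) {
			out = append(out, lb)
		}
	}
	return out
}

func main() {
	// Made-up sample data for illustration only.
	lbs := []loadBalancer{
		{ID: "lb-1", Description: "Kubernetes external service default/web", VipNetworkID: "net-a"},
		{ID: "lb-2", Description: "user-managed", VipNetworkID: "net-a"},
		{ID: "lb-3", Description: "Kubernetes external service default/api", VipNetworkID: "net-b"},
	}
	for _, lb := range leftoverServiceLBs(lbs, "net-a") {
		fmt.Println(lb.ID)
	}
}
```

Requiring both conditions is what keeps user-created load balancers on the same network out of the deletion set.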

@openshift-ci-robot openshift-ci-robot added the bugzilla/severity-unspecified Referenced Bugzilla bug's severity is unspecified for the PR. label Jan 18, 2021
@openshift-ci-robot
Contributor

@mandre: This pull request references Bugzilla bug 1916692, which is invalid:

  • expected the bug to target the "4.7.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1916692: OpenStack: Delete leftover LBs when destroying cluster

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Jan 18, 2021
@mandre
Member Author

mandre commented Jan 18, 2021

/bugzilla refresh

@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Jan 18, 2021
@openshift-ci-robot
Contributor

@mandre: This pull request references Bugzilla bug 1916692, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

/bugzilla refresh

@openshift-ci-robot openshift-ci-robot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Jan 18, 2021
@mandre
Member Author

mandre commented Jan 18, 2021

/bugzilla refresh

@openshift-ci-robot openshift-ci-robot removed the bugzilla/severity-unspecified Referenced Bugzilla bug's severity is unspecified for the PR. label Jan 18, 2021
@openshift-ci-robot
Contributor

@mandre: This pull request references Bugzilla bug 1916692, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

/bugzilla refresh

@openshift-ci-robot openshift-ci-robot added the bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. label Jan 18, 2021
@mandre
Member Author

mandre commented Jan 18, 2021

/label platform/openstack

Contributor

Perhaps this should rely on load balancer tags rather than text in the description.

Member Author

We can't. The in-tree cloud provider doesn't tag the LB resources.

Contributor

deleteLeftoverLoadBalancers returns a bool and an error. Are you going to handle them?

Member Author

I wasn't planning to. I don't want to change the return value of deleteNetworks() based on the return of deleteLeftoverLoadBalancers(), since this would only be triggered when deleteNetworks() has failed already.

IMO, the destroy module needs a good refactoring that I don't want to commit to right now.

Contributor

@Fedosin Fedosin Jan 21, 2021

I mean you created a new function deleteLeftoverLoadBalancers and it returns two values that you ignore later. I think we need to either change the signature of the function, or handle the returned values properly.

Member Author

OK, I changed the signature. PTAL.

Contributor

I suggest handling ErrDefault403 errors separately, to cover the case where a user doesn't have permission to list load balancers.

Member Author

Hmm, this is a pattern we're using all over the place in this file, and if we wanted to treat 403 differently I think we should do it in a separate change.
How do you suggest we handle an ErrDefault403?

Contributor

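if err != nil {
    // A 403 means the user may not list load balancers:
    // log it and treat this step as done instead of retrying.
    var gerr gophercloud.ErrDefault403
    if errors.As(err, &gerr) {
        logger.Debugf("It's forbidden to list load balancers")
        return true, nil
    }
    logger.Error(err)
    return false, nil
}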

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you use it as a pointer type, and below, in the similar clause, as a regular type. This doesn't look consistent?

please compare to the line #660

Contributor

Actually this is correct... This is how errors in gophercloud are organized

@mandre mandre force-pushed the openstack_leftover_lb branch from 23052aa to 003baff Compare February 3, 2021 14:15
@mandre
Member Author

mandre commented Feb 3, 2021

/retest

@openshift-ci
Contributor

openshift-ci bot commented Feb 3, 2021

@mandre: The following test failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-aws-workers-rhel7 | 003baff | link | /test e2e-aws-workers-rhel7 |

Full PR test history. Your PR dashboard.

Contributor

@Fedosin Fedosin left a comment

/approve
/lgtm

@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Fedosin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Feb 3, 2021
@openshift-merge-robot openshift-merge-robot merged commit 8973686 into openshift:master Feb 3, 2021
@openshift-ci-robot
Contributor

@mandre: All pull requests linked via external trackers have merged:

Bugzilla bug 1916692 has been moved to the MODIFIED state.

In response to this:

Bug 1916692: OpenStack: Delete leftover LBs when destroying cluster

@pierreprinetti pierreprinetti deleted the openstack_leftover_lb branch June 19, 2023 08:57
Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • bugzilla/severity-medium: Referenced Bugzilla bug's severity is medium for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.
  • platform/openstack

6 participants