@danehans
Contributor

Previously, a service of type LoadBalancer could still exist after its ingresscontroller resource was deleted. Leaving the service in place after the ingresscontroller is deleted can cause it to be reused when the same ingresscontroller is recreated.

@openshift-ci-robot
Contributor

@danehans: This pull request references Bugzilla bug 1766141, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Bug 1766141: Ensures LB service finalizer is removed

@openshift-ci-robot added the bugzilla/valid-bug and size/M labels Oct 31, 2019
@danehans
Contributor Author

/assign @Miciah @ironcladlou

@openshift-ci-robot added the approved label Oct 31, 2019
// finalizer does not exist for the load balancer service of ic. For additional background on
// this finalizer, see:
// https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/20190423-service-lb-finalizer.md
func (r *reconciler) ensureLoadBalancerCleanupFinalizer(ic *operatorv1.IngressController) error {
Contributor

If this logic were moved inline to finalizeLoadBalancerService(), you could avoid a new function and an extra call to r.currentLoadBalancerService() (and I think it makes sense anyway because we're literally delaying finalization of the LB service until some other condition is true).

Contributor Author

@danehans Nov 1, 2019

@ironcladlou I tried that initially, but (a) the "service.kubernetes.io/load-balancer-cleanup" finalizer needs to be checked after the deployment (ownerReferences) is deleted, and (b) ensureIngressDeleted() never proceeds to remove the other ingresscontroller-dependent resources until the cloud infra is cleaned up. With this approach, the final ingresscontroller reconciliation only needs to remove the "ingresscontroller.operator.openshift.io/finalizer-ingresscontroller" finalizer.
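
The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the operator's code: it models finalizers as plain string slices, and `hasFinalizer` and `removeFinalizer` are hypothetical helper names introduced only for this example. The two finalizer names are the ones quoted in the comment.

```go
package main

import "fmt"

// Finalizer names quoted in the review comment.
const (
	lbCleanupFinalizer = "service.kubernetes.io/load-balancer-cleanup"
	icFinalizer        = "ingresscontroller.operator.openshift.io/finalizer-ingresscontroller"
)

// hasFinalizer reports whether name appears in finalizers.
// (Hypothetical helper for this sketch.)
func hasFinalizer(finalizers []string, name string) bool {
	for _, f := range finalizers {
		if f == name {
			return true
		}
	}
	return false
}

// removeFinalizer returns a copy of finalizers without name.
// (Hypothetical helper for this sketch.)
func removeFinalizer(finalizers []string, name string) []string {
	out := make([]string, 0, len(finalizers))
	for _, f := range finalizers {
		if f != name {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	svcFinalizers := []string{lbCleanupFinalizer}
	icFinalizers := []string{icFinalizer}

	// While the service still carries the cleanup finalizer, the cloud load
	// balancer has not been released, so the ingresscontroller must wait.
	if hasFinalizer(svcFinalizers, lbCleanupFinalizer) {
		fmt.Println("load balancer cleanup pending; retry later")
	}

	// Assumption: something outside the operator removes the cleanup
	// finalizer once the cloud resources are gone. Simulated here.
	svcFinalizers = removeFinalizer(svcFinalizers, lbCleanupFinalizer)

	// Only now is it safe to drop the ingresscontroller's own finalizer.
	if !hasFinalizer(svcFinalizers, lbCleanupFinalizer) {
		icFinalizers = removeFinalizer(icFinalizers, icFinalizer)
		fmt.Println("ingresscontroller finalizers remaining:", len(icFinalizers))
	}
}
```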

Contributor

Hm, is this back to @Miciah's point about failing too soon? To avoid dealing with graphs directly (a path we played with once), the easy thing I've learned to do (from @deads2k) is to make as much progress as possible with parents, aggregate errors, and retry.

Right now we do

finalizeLoadBalancerService() // if fail return early
deleteWildcardDNSRecord() // if fail return early
ensureRouterDeleted() // if fail return early — but finalizeLoadBalancerService will never succeed until this is called!
// finalize ingresscontroller

Would this make things better?

finalizeLoadBalancerService() // append errors
deleteWildcardDNSRecord() // append errors
ensureRouterDeleted() // append errors
if errors > 0 return // and retry
// finalize ingresscontroller
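
A minimal Go sketch of the "append errors, then retry" shape proposed above, using only the standard library. The `cleanupStep` type, `runCleanup`, `aggregate`, and the simulated step bodies are assumptions made for illustration; the step names mirror the pseudocode, but nothing here is the operator's actual implementation. In a Kubernetes codebase an existing aggregate helper (for example `NewAggregate` in `k8s.io/apimachinery/pkg/util/errors`) would likely replace the hand-rolled `aggregate`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errFinalizerPresent simulates the LB service still being finalized.
var errFinalizerPresent = errors.New("load-balancer-cleanup finalizer still present")

// cleanupStep is a hypothetical stand-in for one cleanup call.
type cleanupStep struct {
	name string
	run  func() error
}

// aggregate folds errs into a single error, or nil when errs is empty.
func aggregate(errs []error) error {
	if len(errs) == 0 {
		return nil
	}
	msgs := make([]string, 0, len(errs))
	for _, err := range errs {
		msgs = append(msgs, err.Error())
	}
	return errors.New(strings.Join(msgs, "; "))
}

// runCleanup attempts every step, even after an earlier one fails, and
// returns the aggregated error. A nil result means the caller may go on
// to finalize the ingresscontroller; otherwise it should requeue.
func runCleanup(steps []cleanupStep) error {
	var errs []error
	for _, s := range steps {
		if err := s.run(); err != nil {
			errs = append(errs, fmt.Errorf("%s: %v", s.name, err))
		}
	}
	return aggregate(errs)
}

func main() {
	routerDeleted := false
	steps := []cleanupStep{
		{name: "finalizeLoadBalancerService", run: func() error {
			// Cannot succeed until the router deployment is gone.
			if !routerDeleted {
				return errFinalizerPresent
			}
			return nil
		}},
		{name: "deleteWildcardDNSRecord", run: func() error { return nil }},
		{name: "ensureRouterDeleted", run: func() error {
			routerDeleted = true
			return nil
		}},
	}

	// First pass: the LB step fails, but the router is still deleted.
	fmt.Println("pass 1:", runCleanup(steps))
	// Retry: the LB step can now succeed.
	fmt.Println("pass 2:", runCleanup(steps))
}
```

In this simulation the first pass reports only the load balancer step's error, and the retry returns nil because the router deletion was not blocked by the earlier failure.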

Contributor Author

@ironcladlou PTAL at the latest commit, which follows your guidance.

@openshift-ci-robot added the size/S label and removed the size/M label Nov 1, 2019
@openshift-ci-robot added the size/M label and removed the size/S label Nov 1, 2019
return err
}
}
log.Info("deleted deployment for ingress", "namespace", ci.Namespace, "name", ci.Name)
Contributor

If it's worth logging at v=0, would an event be better? Not sure I have a consistent set of principles to apply for logging nothing vs. logging v=0, v=1, vs. events in some of these cases. cc @Miciah

Contributor

Both log.Info and emitting an event for "I deleted your router" seem reasonable to me. The operator's log level is hard-coded to "Debug", which is I guess like v=6 or higher (because glog and zap have to be contrary for whatever reason).

@ironcladlou
Contributor

Just a non-blocking question about logging, lgtm though. Will let @Miciah tag after his review

@danehans
Contributor Author

danehans commented Nov 4, 2019

Build error:
The build "a234567890123456789012345678901234567890123456789012345678-1" status is "Failed"

e2e-aws passes when run locally.

/test e2e-aws

@danehans
Contributor Author

danehans commented Nov 5, 2019

openshift/origin#24085 should have fixed the test flake.

/test e2e-aws

@knobunc left a comment

/lgtm

@openshift-ci-robot added the lgtm label Nov 5, 2019
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danehans, knobunc

@openshift-merge-robot merged commit 1e604c6 into openshift:master Nov 6, 2019
@openshift-ci-robot
Contributor

@danehans: All pull requests linked via external trackers have merged. Bugzilla bug 1766141 has been moved to the MODIFIED state.

@danehans
Contributor Author

danehans commented Nov 7, 2019

/cherry-pick release-4.2

@openshift-cherrypick-robot

@danehans: new pull request created: #325

@danehans deleted the bz_1766141 branch July 31, 2020 16:47