WIP [release-4.9] Bug 2047416: restart pod on non-retriable failures when deleting stale objects#936

Closed
flavio-fernandes wants to merge 1 commit into openshift:release-4.9 from flavio-fernandes:fatal_on_rm_stale_4.9

@flavio-fernandes
Contributor

In cases where we currently do not retry the removal of stale
objects, it is better to restart the pod than to simply log an error
and bring the pod up. This change applies that behavior to functions
that run early during pod startup.

Conflicts:
go-controller/pkg/ovn/pods.go
go-controller/pkg/ovn/policy.go

Signed-off-by: Flavio Fernandes flaviof@redhat.com
(cherry picked from commit 033cc76)

@openshift-ci openshift-ci bot added the bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. label Jan 27, 2022
@openshift-ci
Contributor

openshift-ci bot commented Jan 27, 2022

@flavio-fernandes: This pull request references Bugzilla bug 2047416, which is invalid:

  • expected dependent Bugzilla bug 2042999 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is POST instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

[release-4.9] Bug 2047416: restart pod on non-retriable failures when deleting stale objects

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Jan 27, 2022
@openshift-ci
Contributor

openshift-ci bot commented Jan 27, 2022

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: flavio-fernandes
To complete the pull request process, please assign squeed after the PR has been reviewed.
You can assign the PR to them by writing /assign @squeed in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot requested review from abhat and dcbw January 27, 2022 20:54
In cases where we currently do not retry the removal of stale
objects, it is better to restart the pod than to simply log an error
and bring the pod up. This change applies that behavior to functions
that run early during pod startup.

Conflicts:
  go-controller/pkg/ovn/pods.go
  go-controller/pkg/ovn/policy.go

Signed-off-by: Flavio Fernandes <flaviof@redhat.com>
(cherry picked from commit 7954229)
@openshift-ci
Contributor

openshift-ci bot commented Jan 28, 2022

@flavio-fernandes: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/okd-e2e-gcp-ovn | db006dd | link | false | /test okd-e2e-gcp-ovn |
| ci/prow/e2e-azure-ovn | db006dd | link | false | /test e2e-azure-ovn |
| ci/prow/e2e-gcp-ovn | db006dd | link | true | /test e2e-gcp-ovn |
| ci/prow/e2e-metal-ipi-ovn-dualstack | db006dd | link | true | /test e2e-metal-ipi-ovn-dualstack |
| ci/prow/4.9-upgrade-from-stable-4.8-e2e-aws-ovn-upgrade | db006dd | link | false | /test 4.9-upgrade-from-stable-4.8-e2e-aws-ovn-upgrade |
| ci/prow/e2e-openstack-ovn | db006dd | link | false | /test e2e-openstack-ovn |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@dcbw
Contributor

dcbw commented Jan 28, 2022

/hold
while Flavio reworks the upstream PR

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 28, 2022
@flavio-fernandes flavio-fernandes changed the title from "[release-4.9] Bug 2047416: restart pod on non-retriable failures when deleting stale objects" to "WIP [release-4.9] Bug 2047416: restart pod on non-retriable failures when deleting stale objects" Jan 28, 2022
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 28, 2022
@flavio-fernandes
Contributor Author

This change has been folded into #981.
Closing.

@flavio-fernandes flavio-fernandes deleted the fatal_on_rm_stale_4.9 branch April 8, 2022 20:04
Labels

bugzilla/invalid-bug: Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting.
bugzilla/severity-high: Referenced Bugzilla bug's severity is high for the branch this PR is targeting.
do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.

2 participants