OCPBUGS-62670: [release-4.19] Networking: reset ovn-remote config and allow ovnkube controller to set it#5324
…et it This fixes the issue where ovn-remote is set prior to reboot and, when boot occurs, ovn-controller syncs quickly with a stale SB DB. This PR is part of the EIP GARP issue fix. It's required because when the ovnkube-controller and ovn-controller containers start on boot, there is no guaranteed order in which they start, and we don't want ovn-controller to connect to the SB DB before ovnkube-controller has added the drop flows. Ideally, we would only allow ovn-controller to sync with the SB DB once ovnkube-controller has finished syncing and the changes are available in the SB DB. That may be future work. Signed-off-by: Martin Kennelly <mkennell@redhat.com> (cherry picked from commit 567a191)
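For readers unfamiliar with the setting, here is a minimal sketch of what "resetting ovn-remote" could look like on a node. It is not the change made in this PR. It relies only on the fact that ovn-controller reads its SB DB address from `external_ids:ovn-remote` in the `Open_vSwitch` table. The function name `reset_ovn_remote` and the `OVS_VSCTL` override variable are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Illustrative sketch only, not the code from this PR.
# OVS_VSCTL is a hypothetical override so the command can be dry-run
# on a machine without Open vSwitch installed.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

reset_ovn_remote() {
    # Drop the ovn-remote key left over from before the reboot, so that
    # ovn-controller has no SB DB endpoint to connect to until
    # ovnkube-controller sets a fresh value.
    $OVS_VSCTL remove Open_vSwitch . external_ids ovn-remote
}
```

Setting `OVS_VSCTL=echo` before calling `reset_ovn_remote` prints the command instead of running it, which is a cheap way to check what would be executed.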
|
@martinkennelly: This pull request references Jira Issue OCPBUGS-62670, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/payload-with-prs 4.19 nightly blocking openshift/cluster-network-operator#2809 openshift/ovn-kubernetes#2774 |
|
@martinkennelly: trigger 11 job(s) of type blocking for the nightly release of OCP 4.19
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/a89d3170-9f8b-11f0-93fd-ee70f1d60e20-0 |
|
@yuqi-zhang Thank you for reviewing the 4.20 PR. It's not merged yet, but we want the approvers lined up and labels added. It's a critical bug and we have the fastfix label applied. We will only merge when QE has verified. It's a clean cherry-pick. |
|
/approve The only thing I'd like to add is that the manual bugs for 4.19 and 4.18 currently have weird cloning, and I think prow expects a clone of the previous version (so in the clone links, the "depends on" should be the 4.20 bug and not the 4.18 bug) |
|
/payload-with-prs 4.19 nightly blocking openshift/cluster-network-operator#2809 openshift/ovn-kubernetes#2774 |
|
@martinkennelly: trigger 11 job(s) of type blocking for the nightly release of OCP 4.19
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c3464cd0-a43d-11f0-8c73-68fbceb6751c-0 |
|
/test e2e-aws-mco-disruptive Unrelated: |
|
/jira refresh |
|
@zshi-redhat: This pull request references Jira Issue OCPBUGS-62670, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (jechen@redhat.com), skipping review request. |
|
@yuqi-zhang Hey Jerry, can you take a look? It's missing a label. You looked at the higher-version PR of this. Thank you for your support throughout this. |
|
/unhold |
|
Nightly blocking is good :) |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: martinkennelly, yuqi-zhang. The full list of commands accepted by this bot can be found here. The pull request process is described here. Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/tide refresh |
|
Tide is giving me conflicting info, it says |
|
/tide refresh |
|
/test e2e-gcp-op |
|
/tide refresh |
|
Job The test is called Connection refused is coming from the Requesting override. |
|
/test e2e-gcp-op In case the MCO team doesn't act on my request in time. |
|
/test e2e-gcp-op Failed to build image, unrelated: |
|
Twice in a row CI is borked for job I think it should be overridden anyway based on history. |
|
/test e2e-gcp-op I see nothing on the test platform regarding any error. |
|
/test e2e-gcp-op See previous comment - CI was borked for building image. Still requesting override based on job history. |
|
@martinkennelly: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/test e2e-gcp-op |
|
CI is borked :/ |
|
/override ci/prow/e2e-gcp-op This shouldn't affect gcp-op |
|
@yuqi-zhang: Overrode contexts on behalf of yuqi-zhang: ci/prow/e2e-gcp-op |
Merged 49dbecf into openshift:release-4.19
|
@martinkennelly: Jira Issue Verification Checks: Jira Issue OCPBUGS-62670. Jira Issue OCPBUGS-62670 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓 |
/hold
Depends on #5317