OCPBUGS-50709: DownStream Merge [10-28-2025] #2832

Merged
openshift-merge-bot[bot] merged 4 commits into openshift:master from jluhrsen:d/s-merge-10-28-2025
Oct 30, 2025

@jluhrsen
Contributor

📑 Description

Fixes #

Additional Information for reviewers

✅ Checks

  • My code requires changes to the documentation
  • if so, I have updated the documentation as required
  • My code requires tests
  • if so, I have added and/or updated the tests as required
  • All the tests have passed in the CI

How to verify it

tssurya and others added 4 commits October 21, 2025 00:45
We need to differentiate between
the field not being set and the field
being set to 0, which means no
available capacity on that node.

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
…d-capacity-support

EIP: Change capacity storage to pointers
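
The distinction this commit describes (field unset versus field set to 0) is the usual reason to store a Go value behind a pointer. The following is a minimal sketch of that idea only; `nodeCapacity`, `describe`, and `intPtr` are hypothetical names chosen for illustration and are not taken from the ovn-kubernetes code.

```go
package main

import "fmt"

// nodeCapacity is a hypothetical type for illustration only.
// A nil pointer means "capacity not set"; a pointer to 0 means
// "set, and no capacity is available on this node".
type nodeCapacity struct {
	IPv4 *int
	IPv6 *int
}

// describe reports which of the three states a capacity value is in.
func describe(c *int) string {
	switch {
	case c == nil:
		return "unset"
	case *c == 0:
		return "no capacity"
	default:
		return fmt.Sprintf("%d available", *c)
	}
}

// intPtr is a small helper that returns a pointer to a copy of v.
func intPtr(v int) *int { return &v }

func main() {
	caps := nodeCapacity{IPv4: intPtr(0)} // IPv6 is left nil (unset)
	fmt.Println(describe(caps.IPv4))      // no capacity
	fmt.Println(describe(caps.IPv6))      // unset
	fmt.Println(describe(intPtr(3)))      // 3 available
}
```

With a plain `int` field, the first two cases would both read as 0 and could not be told apart.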
Scenario:
- Nodes: node-1, node-2, node-3
- Egress IPs: EIP-1
- Pods: pod1 on node-1, pod2 on node-3 (pods are created via deployment replicas)
- Egress-assignable nodes: node-1, node-2
- EIP-1 assigned to node-1

During a simultaneous reboot of node-1 and node-2, EIP-1 failed over to node-2 and
ovnkube-controller restarted at nearly the same time:

1) EIP-1 was reassigned to node-2 by the cluster manager.
2) The EgressIP sync ran for EIP-1 with a stale status, but it still cleaned
   up the SNATs/LRPs referring to node-1 because their pod IPs were outdated
   (the pods are recreated as a result of the node reboots).
3) pod1/pod2 Add events arrived while the informer cache still had the
   old EIP status, so new SNATs/LRPs were created pointing to node-1.
4) The EIP-1 Add event arrived with the new status; entries for node-2
   were added/updated.
5) Result: stale SNATs and LRPs with stale nexthops for node-1 remained.

Fix:
- Populate pod EIP status during EgressIP sync so podAssignment has
  accurate egressStatuses.
- Reconcile stale assignments using podAssignment (egressStatuses) when
  the informer cache is not up to date, ensuring SNAT/LRP for the
  previously assigned node are corrected.
- Remove stale EIP SNAT entries for remote-zone pods accordingly.
- Add coverage for simultaneous EIP failover and controller restart.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
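
The reconciliation idea in the fix above can be sketched as a set difference: compare the statuses recorded for a pod against the statuses currently reported for the EgressIP, and treat anything recorded but no longer reported as stale. This is a simplified illustration under stated assumptions, not the actual ovn-kubernetes implementation; `egressStatus`, `staleStatuses`, and the address 192.0.2.10 are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// egressStatus is a hypothetical, simplified (node, egress IP) pair.
type egressStatus struct {
	Node     string
	EgressIP string
}

// staleStatuses returns every status recorded for a pod that is absent
// from the current EgressIP status. In this sketch, those are the entries
// whose SNAT/LRP configuration would need to be corrected or removed.
func staleStatuses(recorded map[egressStatus]struct{}, current []egressStatus) []egressStatus {
	live := make(map[egressStatus]struct{}, len(current))
	for _, s := range current {
		live[s] = struct{}{}
	}
	var stale []egressStatus
	for s := range recorded {
		if _, ok := live[s]; !ok {
			stale = append(stale, s)
		}
	}
	// Sort so the result does not depend on map iteration order.
	sort.Slice(stale, func(i, j int) bool {
		if stale[i].Node != stale[j].Node {
			return stale[i].Node < stale[j].Node
		}
		return stale[i].EgressIP < stale[j].EgressIP
	})
	return stale
}

func main() {
	// The pod was configured while the cache still pointed at node-1.
	recorded := map[egressStatus]struct{}{
		{Node: "node-1", EgressIP: "192.0.2.10"}: {},
	}
	// The EgressIP has since failed over to node-2.
	current := []egressStatus{{Node: "node-2", EgressIP: "192.0.2.10"}}

	fmt.Printf("%+v\n", staleStatuses(recorded, current))
}
```

In the scenario from the commit message, the node-1 entry is the one reported as stale, which matches the leftover SNATs and LRPs described in step 5.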
@openshift-ci-robot added labels on Oct 28, 2025:
  • jira/severity-critical: Referenced Jira bug's severity is critical for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
@openshift-ci-robot
Contributor

@jluhrsen: This pull request references Jira Issue OCPBUGS-50709, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @huiran0826

The bug has been updated to refer to the pull request using the external bug tracker.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jluhrsen
Contributor Author

/payload 4.21 ci blocking
/payload 4.21 nightly blocking

@openshift-ci
Contributor

openshift-ci bot commented Oct 28, 2025

@jluhrsen: trigger 5 job(s) of type blocking for the ci release of OCP 4.21

  • periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.21-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-hypershift-release-4.21-periodics-e2e-aks
  • periodic-ci-openshift-hypershift-release-4.21-periodics-e2e-aws-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5aa6ad90-b433-11f0-8890-276b2b004eb1-0

trigger 13 job(s) of type blocking for the nightly release of OCP 4.21

  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-aws-ovn-upgrade-fips
  • periodic-ci-openshift-release-master-ci-4.21-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade
  • periodic-ci-openshift-hypershift-release-4.21-periodics-e2e-aws-ovn-conformance
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-aws-ovn-serial-1of2
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-aws-ovn-serial-2of2
  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-ovn-techpreview
  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-ovn-techpreview-serial-1of3
  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-ovn-techpreview-serial-2of3
  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-ovn-techpreview-serial-3of3
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-metal-ipi-ovn-bm
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-metal-ipi-ovn-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5aa6ad90-b433-11f0-8890-276b2b004eb1-1

@asood-rh
Contributor

/test e2e-aws-ovn-fdp-qe

@jluhrsen
Contributor Author

/retest

and, wow, probably the best first-run of payloads I've ever seen. only one to re-run:

/payload-job periodic-ci-openshift-release-master-ci-4.21-e2e-azure-ovn-upgrade

@openshift-ci
Contributor

openshift-ci bot commented Oct 29, 2025

@jluhrsen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-ci-4.21-e2e-azure-ovn-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/a021ab20-b480-11f0-8bc2-78c486e67beb-0

@asood-rh
Contributor

From QE's perspective, we can mark this PR as verified with CI, since the FDP QE workflow succeeded.

@pperiyasamy
Member

/test e2e-gcp-ovn
/test 4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade

@huiran0826
Contributor

/verified by 'Pre-merge testing'

@openshift-ci-robot added the verified label (Signifies that the PR passed pre-merge verification criteria) on Oct 29, 2025
@openshift-ci-robot
Contributor

@huiran0826: This PR has been marked as verified by 'Pre-merge testing'.


@openshift-ci
Contributor

openshift-ci bot commented Oct 29, 2025

@jluhrsen: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/okd-scos-e2e-aws-ovn | e9ebeac | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/security | e9ebeac | link | false | /test security |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@pperiyasamy
Member

/payload-job periodic-ci-openshift-release-master-ci-4.21-e2e-azure-ovn-upgrade

@openshift-ci
Contributor

openshift-ci bot commented Oct 29, 2025

@pperiyasamy: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-ci-4.21-e2e-azure-ovn-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/2460b670-b4df-11f0-8a53-1c8c74f9199a-0

@jluhrsen
Contributor Author

@jcaamano we can /override e2e-metal-ipi-ovn-dualstack-bgp-local-gw here, right? Need to get lint too.

@pperiyasamy
Member

/payload-job periodic-ci-openshift-release-master-ci-4.21-e2e-azure-ovn-upgrade

@openshift-ci
Contributor

openshift-ci bot commented Oct 30, 2025

@pperiyasamy: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-ci-4.21-e2e-azure-ovn-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/52ac41f0-b54f-11f0-86d7-8f46367acd44-0

@jcaamano
Contributor

/override ci/prow/lint
/override ci/prow/e2e-metal-ipi-ovn-dualstack-bgp-local-gw

@openshift-ci
Contributor

openshift-ci bot commented Oct 30, 2025

@jcaamano: Overrode contexts on behalf of jcaamano: ci/prow/e2e-metal-ipi-ovn-dualstack-bgp-local-gw, ci/prow/lint


@jcaamano
Contributor

/payload-job periodic-ci-openshift-release-master-ci-4.21-e2e-azure-ovn-upgrade

@openshift-ci
Contributor

openshift-ci bot commented Oct 30, 2025

@jcaamano: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-ci-4.21-e2e-azure-ovn-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/a388f780-b57c-11f0-8a13-9a1701500db7-0

@huiran0826
Contributor

/remove-label verified

@openshift-ci
Contributor

openshift-ci bot commented Oct 30, 2025

@huiran0826: The label(s) /remove-label verified cannot be applied. These labels are supported: acknowledge-critical-fixes-only, platform/aws, platform/azure, platform/baremetal, platform/google, platform/libvirt, platform/openstack, ga, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, px-approved, docs-approved, qe-approved, ux-approved, no-qe, downstream-change-needed, rebase/manual, cluster-config-api-changed, run-integration-tests, approved, backport-risk-assessed, bugzilla/valid-bug, cherry-pick-approved, jira/valid-bug, ok-to-test, stability-fix-approved, staff-eng-approved. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?


@jluhrsen
Contributor Author

jluhrsen commented Oct 30, 2025

@martinkennelly @jcaamano I think we are good here.

@huiran0826, why did you try to remove the verified label?

@jcaamano
Contributor

/lgtm
/approve

@openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged) on Oct 30, 2025
@openshift-ci
Contributor

openshift-ci bot commented Oct 30, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcaamano, jluhrsen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files) on Oct 30, 2025
@openshift-merge-bot bot merged commit f165407 into openshift:master on Oct 30, 2025
31 of 33 checks passed
@openshift-ci-robot
Contributor

@jluhrsen: Jira Issue Verification Checks: Jira Issue OCPBUGS-50709
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-50709 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓


@openshift-merge-robot
Contributor

Fix included in accepted release 4.21.0-0.nightly-2025-11-03-191704

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/severity-critical: Referenced Jira bug's severity is critical for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria

8 participants