OCPBUGS-66139: Branch Sync release-4.20 to release-4.19 [12-03-2025] #2890
Conversation
Currently, when a pod is reconciled and IPs are requested, the ipAllocator's "already allocated" errors are ignored. In order to block IP conflicts with other pods, only ignore duplicate IP errors when the PreconfiguredUDNAddressesEnabled FG is enabled and:
- the NAD's network annotation has already been persisted/allocated, indicating that the pod was already reconciled and assigned IPs.
- the ipamClaim has already been updated with status.IPs, indicating that OVNK already allocated the IPs for this object requesting persistent IPs (pod/VM).
Under these new terms, when ErrAllocated is encountered and not ignored, publish it as a pod event. Signed-off-by: Ram Lavi <ralavi@redhat.com> (cherry picked from commit 66298cf)
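A minimal Go sketch of the decision rule this commit message describes, with simplified inputs; the names (shouldIgnoreErrAllocated, allocState) are illustrative, not the actual ovn-kubernetes API:

```go
package main

import (
	"errors"
	"fmt"
)

// ErrAllocated is a stand-in for the IP allocator's "already allocated" error.
var ErrAllocated = errors.New("provided IP is already allocated")

// allocState captures the two conditions under which a duplicate-IP error is benign.
type allocState struct {
	preconfiguredUDNAddressesEnabled bool // feature gate
	podAnnotationPersisted           bool // NAD network annotation already written for this pod
	ipamClaimHasIPs                  bool // ipamClaim.status.IPs already populated for this pod/VM
}

// shouldIgnoreErrAllocated returns true only when the duplicate allocation is a
// re-reconcile of IPs already assigned to this pod; otherwise the error must
// surface (and be published as a pod event by the caller).
func shouldIgnoreErrAllocated(err error, s allocState) bool {
	if !errors.Is(err, ErrAllocated) {
		return false
	}
	if !s.preconfiguredUDNAddressesEnabled {
		return false
	}
	return s.podAnnotationPersisted || s.ipamClaimHasIPs
}

func main() {
	err := fmt.Errorf("allocating 10.10.0.5: %w", ErrAllocated)
	s := allocState{preconfiguredUDNAddressesEnabled: true, podAnnotationPersisted: true}
	fmt.Println(shouldIgnoreErrAllocated(err, s)) // true: the pod was already assigned these IPs
}
```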
Signed-off-by: Ram Lavi <ralavi@redhat.com> (cherry picked from commit 9b16a7f)
Signed-off-by: Ram Lavi <ralavi@redhat.com> (cherry picked from commit 481e954)
Enable tracking used MAC addresses with owner identification, enabling MAC address conflict detection when multiple entities try to use the same address within the same network. The MAC manager will be integrated with the cluster-manager's pod-allocator code in follow-up commits. Since the pod-allocator is run by multiple goroutines, use a mutex to prevent race conditions when reserving and releasing MACs. Signed-off-by: Ram Lavi <ralavi@redhat.com> Co-authored-by: Or Mergi <ormergi@redhat.com> (cherry picked from commit c8b18bd)
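A rough sketch of the reserve/release bookkeeping this commit describes; the type and method names (macManager, Reserve, Release) are illustrative, not the real implementation:

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// macManager tracks which MAC addresses are in use on a network and by whom.
// A mutex guards the map because the pod allocator reserves and releases MACs
// from multiple goroutines.
type macManager struct {
	mu    sync.Mutex
	owner map[string]string // MAC (string form) -> owner ID
}

func newMACManager() *macManager {
	return &macManager{owner: map[string]string{}}
}

// Reserve records mac as owned by ownerID. It fails when the MAC is already
// reserved by a different owner; reserving the same MAC for the same owner
// again is a no-op, so repeated reconciles stay idempotent.
func (m *macManager) Reserve(ownerID string, mac net.HardwareAddr) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	key := mac.String()
	if cur, ok := m.owner[key]; ok && cur != ownerID {
		return fmt.Errorf("MAC %s is already in use by %q", key, cur)
	}
	m.owner[key] = ownerID
	return nil
}

// Release removes the reservation, but only if ownerID still owns it.
func (m *macManager) Release(ownerID string, mac net.HardwareAddr) {
	m.mu.Lock()
	defer m.mu.Unlock()
	key := mac.String()
	if m.owner[key] == ownerID {
		delete(m.owner, key)
	}
}

func main() {
	mgr := newMACManager()
	mac, _ := net.ParseMAC("0a:58:0a:80:00:05")
	fmt.Println(mgr.Reserve("blue/mypod", mac))   // <nil>
	fmt.Println(mgr.Reserve("red/otherpod", mac)) // conflict error
}
```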
Integrate the MAC manager into the podAllocator: instantiate the MAC manager on primary L2 UDNs with persistentIPs enabled, when EnablePreconfiguredUDNAddresses is enabled. The pod-allocator is instantiated per network, so network isolation is maintained and MACs can be reused in different UDNs. On pod allocation, record the used MAC address and its owner-id; if the MAC is already used, raise a MAC conflict error. Compose the owner-id in the format <pod.metadata.namespace>/<metadata.name>, e.g. given pod namespace=blue, name=mypod, the owner-id is blue/mypod. To allow the VM migration scenario, where two pods should use the same MAC, relax MAC conflicts by composing the owner-id from the associated VM name: <pod.metadata.namespace>/<VM name label value>, e.g. given pod namespace=blue, name=virt-launcher-myvm-abc123, VM name=myvm, the owner-id is "blue/myvm". The VM name is reflected by the "vm.kubevirt.io/name" label. In addition, when a repeated request (same MAC and owner) that was already handled is rolled back due to a failure (e.g. a pod update failure), do not release the reserved MAC as part of the pod-allocation rollback. Releasing MAC addresses on pod deletion, and initializing the MAC manager on startup, will be done in follow-up commits. Signed-off-by: Ram Lavi <ralavi@redhat.com> Co-authored-by: Or Mergi <ormergi@redhat.com> (cherry picked from commit 1d5045a)
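A sketch of the owner-id composition described above; the vm.kubevirt.io/name label key comes from the commit message, while the helper name (ownerID) is illustrative:

```go
package main

import "fmt"

const kubevirtVMNameLabel = "vm.kubevirt.io/name"

// ownerID returns the identity under which a pod's MAC is reserved.
// Plain pods are keyed by "<namespace>/<pod name>"; virt-launcher pods of the
// same VM are keyed by "<namespace>/<VM name>" so that the source and target
// pods of a live migration can share the MAC without raising a conflict.
func ownerID(namespace, podName string, labels map[string]string) string {
	if vmName, ok := labels[kubevirtVMNameLabel]; ok && vmName != "" {
		return namespace + "/" + vmName
	}
	return namespace + "/" + podName
}

func main() {
	fmt.Println(ownerID("blue", "mypod", nil)) // blue/mypod
	fmt.Println(ownerID("blue", "virt-launcher-myvm-abc123",
		map[string]string{kubevirtVMNameLabel: "myvm"})) // blue/myvm
}
```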
Emit a pod event when a MAC conflict is detected during the pod allocation process. Avoid leaking the user-defined network name into pod events, as they are visible to non-cluster-admin users. Signed-off-by: Or Mergi <ormergi@redhat.com> (cherry picked from commit 0b38dd8)
On pod deletion, remove the MAC address used by the pod from the MAC manager store. To allow the VM migration scenario, do not release the MAC when there is at least one VM pod that is not in a completed state. Resolve the VM pod owner-id by composing it from the associated VM name. Initializing the MAC manager on startup will be done in follow-up commits. Signed-off-by: Ram Lavi <ralavi@redhat.com> Co-authored-by: Or Mergi <ormergi@redhat.com> (cherry picked from commit b6965d9)
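A simplified sketch of the deletion-time check described above, assuming a reduced pod view; the names (podView, shouldReleaseMAC) are illustrative:

```go
package main

import "fmt"

const vmNameLabel = "vm.kubevirt.io/name"

// podView is a simplified view of a pod; Completed mirrors a terminal
// (Succeeded/Failed) pod phase.
type podView struct {
	Namespace string
	Name      string
	Labels    map[string]string
	Completed bool
}

// shouldReleaseMAC decides whether the deleted pod's MAC can be returned to the
// pool. For VM pods the MAC stays reserved while any other pod of the same VM
// is still not completed, which covers live migration.
func shouldReleaseMAC(deleted podView, livePods []podView) bool {
	vmName, isVMPod := deleted.Labels[vmNameLabel]
	if !isVMPod {
		return true
	}
	for _, p := range livePods {
		if p.Namespace == deleted.Namespace && p.Name != deleted.Name &&
			p.Labels[vmNameLabel] == vmName && !p.Completed {
			return false // another pod of the same VM (e.g. the migration target) still needs the MAC
		}
	}
	return true
}

func main() {
	src := podView{Namespace: "blue", Name: "virt-launcher-myvm-a", Labels: map[string]string{vmNameLabel: "myvm"}}
	dst := podView{Namespace: "blue", Name: "virt-launcher-myvm-b", Labels: map[string]string{vmNameLabel: "myvm"}}
	fmt.Println(shouldReleaseMAC(src, []podView{dst})) // false: migration target still running
}
```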
Initialize the pod-allocator MAC manager with the MACs of the network GW and management ports, preventing conflicts with new pods requesting those MACs. The MAC manager is instantiated on primary L2 UDNs with persistent IPs enabled, when EnablePreconfiguredUDNAddresses is enabled. The network logical switch has GW (.1) and management (.2) ports whose MAC addresses are generated from their IP addresses, so calculate the GW and management MAC addresses from those IPs. Signed-off-by: Or Mergi <ormergi@redhat.com> Co-authored-by: Ram Lavi <ralavi@redhat.com> (cherry picked from commit eb4789b)
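A self-contained approximation of deriving the GW and management port MACs from their IPs, assuming the common 0a:58-prefix convention for IPv4-derived MACs; the helper name (ipToMAC) is illustrative:

```go
package main

import (
	"fmt"
	"net"
)

// ipToMAC derives a deterministic MAC from an IPv4 address using the 0a:58
// prefix followed by the four IPv4 octets, approximating how the switch's
// gateway (.1) and management (.2) port MACs are generated from their IPs.
func ipToMAC(ip net.IP) (net.HardwareAddr, error) {
	v4 := ip.To4()
	if v4 == nil {
		return nil, fmt.Errorf("%s is not an IPv4 address", ip)
	}
	return net.HardwareAddr{0x0a, 0x58, v4[0], v4[1], v4[2], v4[3]}, nil
}

func main() {
	// Seeding the MAC manager with these addresses keeps new pods from
	// requesting the gateway's or management port's MAC.
	for _, addr := range []string{"10.128.0.1", "10.128.0.2"} {
		mac, _ := ipToMAC(net.ParseIP(addr))
		fmt.Printf("%s -> %s\n", addr, mac)
	}
}
```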
Initialize the pod-allocator MAC manager with the MACs of existing pods in the network, preventing unexpected conflicts in scenarios where the control-plane restarts. The MAC manager is instantiated on primary L2 UDNs with persistent IPs enabled, when EnablePreconfiguredUDNAddresses is enabled. VMs can have multiple associated pods with the same MAC address (migration scenario); allow VM-associated pods to have the same MAC by composing the owner-id from the associated VM name. Signed-off-by: Ram Lavi <ralavi@redhat.com> (cherry picked from commit 89d1696)
Signed-off-by: Or Mergi <ormergi@redhat.com> (cherry picked from commit f46a21c)
Signed-off-by: Or Mergi <ormergi@redhat.com> (cherry picked from commit e4835ab)
In a scenario of a primary CUDN where multiple NADs exist, all with the same spec,
NetworkInfo.GetNADs returns multiple NADs across the selected namespaces.
The GetPodNADToNetworkMappingWithActiveNetwork helper assumed the
active network (NetworkInfo{}) consists of a single NAD, and returned
the mapping with the first NAD of the active network it found.
This approach falls short when the given pod is connected to a CUDN that spans
multiple namespaces, i.e. the active network consists of multiple NADs:
the helper returned an inconsistent mapping where the NAD key doesn't match
the pod namespace (a NAD from another namespace).
Change the helper to find the active network's matching NAD, i.e. the NAD that
resides in the same namespace as the given pod.
Change the tests to always set an appropriate namespace on the tested pod.
Extend the test suite to allow injecting multiple NADs for the
active network, simulating the CUDN use case.
Signed-off-by: Or Mergi <ormergi@redhat.com>
(cherry picked from commit cedfc13)
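A sketch of the namespace-matching selection this commit introduces, using plain "<namespace>/<name>" NAD keys; the helper (pickNADForPod) is illustrative and not the actual GetPodNADToNetworkMappingWithActiveNetwork signature:

```go
package main

import (
	"fmt"
	"strings"
)

// pickNADForPod returns the active network's NAD that lives in the pod's
// namespace, instead of blindly taking the first entry. nadKeys are
// "<namespace>/<name>" identifiers; with a CUDN they span several namespaces.
func pickNADForPod(nadKeys []string, podNamespace string) (string, error) {
	for _, key := range nadKeys {
		if ns, _, ok := strings.Cut(key, "/"); ok && ns == podNamespace {
			return key, nil
		}
	}
	return "", fmt.Errorf("no NAD of the active network found in namespace %q", podNamespace)
}

func main() {
	nads := []string{"red/cluster-udn", "blue/cluster-udn", "green/cluster-udn"}
	fmt.Println(pickNADForPod(nads, "blue")) // blue/cluster-udn <nil>
}
```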
EgressFirewall objects were retaining managedFields entries for nodes that had been deleted. When a node was deleted, the cleanup logic would apply an empty status object using the deleted node as the field manager. This incorrectly signaled to the API server that the manager was now managing an empty status, leaving a stale entry in managedFields. This change corrects the cleanup logic in cleanupStatus: the manager for the deleted node now applies an EgressFirewall configuration that completely omits the status field. This correctly signals that the manager is giving up ownership of the field via the server-side apply mechanism, causing the API server to remove the manager's entry from managedFields and preventing the buildup of stale data in etcd on large clusters with frequent node churn. The same logic is applied to the other resource types using the status manager: ANP, APBRoute, EgressQoS, NetworkQoS, EgressService. Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com> (cherry picked from commit e3863d9)
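A sketch of the server-side apply payload this fix sends on behalf of a deleted node, shown as plain Go maps; the builder function is illustrative, and the real status manager uses its own apply machinery:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildOwnershipReleasePatch returns the server-side apply payload the deleted
// node's field manager sends: the object identity only, with no status at all.
// Applying this (with that node as field manager) tells the API server the
// manager no longer asserts ownership of any status field, so its managedFields
// entry is dropped instead of lingering as {"f:status":{}}.
func buildOwnershipReleasePatch(namespace, name string) map[string]any {
	return map[string]any{
		"apiVersion": "k8s.ovn.org/v1",
		"kind":       "EgressFirewall",
		"metadata": map[string]any{
			"name":      name,
			"namespace": namespace,
		},
		// Note: no "status" key. Sending "status": {} would instead claim
		// ownership of an empty status and leave a stale managedFields entry.
	}
}

func main() {
	patch, _ := json.MarshalIndent(buildOwnershipReleasePatch("default", "default"), "", "  ")
	fmt.Println(string(patch))
}
```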
Resources with a condition-based status (EgressQoS, NetworkQoS) store the zone name
in the condition Type field ("Ready-In-Zone-$zoneName"), but not in the
message field. This caused cleanup to fail because GetZoneFromStatus()
couldn't extract the zone name from the message.
Fix this by transforming the output of getMessages(): extract the zone from the
condition and prepend it to the returned message as "$zoneName: message",
matching the format used by message-based resources (EgressFirewalls, AdminPolicyBasedExternalRoutes).
This also fixes needsUpdate(), which now properly detects zone-specific changes, since it compares messages that include the zone name.
Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
(cherry picked from commit d9ae873)
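A sketch of the message transformation described above, assuming conditions reduced to Type/Message pairs; the function name (messagesWithZone) illustrates getMessages()'s new behavior rather than copying it:

```go
package main

import (
	"fmt"
	"strings"
)

// condition mirrors the relevant bits of metav1.Condition.
type condition struct {
	Type    string
	Message string
}

const zoneConditionPrefix = "Ready-In-Zone-"

// messagesWithZone prepends the zone name (parsed from the condition type) to
// each message, producing the "$zoneName: message" format that zone-aware
// cleanup and needsUpdate-style comparisons expect.
func messagesWithZone(conds []condition) []string {
	out := make([]string, 0, len(conds))
	for _, c := range conds {
		if zone, ok := strings.CutPrefix(c.Type, zoneConditionPrefix); ok {
			out = append(out, zone+": "+c.Message)
			continue
		}
		out = append(out, c.Message)
	}
	return out
}

func main() {
	conds := []condition{{Type: "Ready-In-Zone-node-a", Message: "EgressQoS rules applied"}}
	fmt.Println(messagesWithZone(conds)) // [node-a: EgressQoS rules applied]
}
```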
When zones are deleted, empty ApplyStatus patches are sent to remove status ownership. Due to a previous bug, these patches left behind stale managedFields entries with signature {"f:status":{}}.
This commit adds a one-time startup cleanup that detects and removes these stale entries by checking if managedFields have an empty status and belong to zones that no longer exist. The purpose is to distinguish managedFields that belong to deleted zones from managedFields that belong to external clients (e.g. kubectl). The cleanup runs once when the status manager starts and zones are first discovered.
Also added a unit test to verify the startup cleanup logic.
Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
(cherry picked from commit 5597bf8)
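A simplified sketch of the startup detection described above; the manager-name-to-zone mapping and the exact fieldsV1 comparison are assumptions, with illustrative names (managedFieldsEntry, staleStatusManagers):

```go
package main

import "fmt"

// managedFieldsEntry is a simplified view of metav1.ManagedFieldsEntry: the
// manager name and the raw fieldsV1 payload it owns.
type managedFieldsEntry struct {
	Manager  string
	FieldsV1 string
}

// staleStatusManagers returns the managers that (a) own nothing but an empty
// status and (b) do not correspond to a currently known zone. External
// managers such as kubectl are left alone because they own real fields.
func staleStatusManagers(entries []managedFieldsEntry, knownZones map[string]bool) []string {
	var stale []string
	for _, e := range entries {
		if e.FieldsV1 == `{"f:status":{}}` && !knownZones[e.Manager] {
			stale = append(stale, e.Manager)
		}
	}
	return stale
}

func main() {
	entries := []managedFieldsEntry{
		{Manager: "zone-a", FieldsV1: `{"f:status":{}}`},                  // deleted zone: stale
		{Manager: "zone-b", FieldsV1: `{"f:status":{"f:conditions":{}}}`}, // live ownership: keep
		{Manager: "kubectl", FieldsV1: `{"f:metadata":{"f:labels":{}}}`},  // external client: keep
	}
	fmt.Println(staleStatusManagers(entries, map[string]bool{"zone-b": true})) // [zone-a]
}
```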
ANP/BANP don't use a typed status manager, so add an explicit startup cleanup to remove any stale managed fields that might be present from previous versions. Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com> (cherry picked from commit 0a004f3)
OCPBUGS-65514: [4.20] status manager: remove managedFields for deleted zone upon zone deletion
[release-4.20] OCPBUGS-64836: back-port IP & MAC conflict detection
…4.20-to-release-4.19-12-03-2025
/ok-to-test
@openshift-pr-manager[bot]: This pull request explicitly references no jira issue.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@openshift-pr-manager[bot]: trigger 5 job(s) of type blocking for the ci release of OCP 4.19. See details on https://pr-payload-tests.ci.openshift.org/runs/ci/aa814aa0-d026-11f0-9026-1a02617a781a-0
trigger 14 job(s) of type blocking for the nightly release of OCP 4.19. See details on https://pr-payload-tests.ci.openshift.org/runs/ci/aa814aa0-d026-11f0-9026-1a02617a781a-1
/retitle OCPBUGS-66139: Branch Sync release-4.20 to release-4.19 [12-03-2025]
@openshift-pr-manager[bot]: This pull request references Jira Issue OCPBUGS-66139, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug.
Requesting review from QA contact:
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/test 4.19-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade
/payload-job periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade
@jluhrsen: trigger 6 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command.
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/05c3fcd0-d065-11f0-9a38-02bc6f28e65c-0
/test e2e-aws-ovn-windows
/payload-job periodic-ci-openshift-microshift-release-4.19-periodics-e2e-aws-ovn-ocp-conformance-serial
@jluhrsen: trigger 3 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command.
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ee64f570-d54c-11f0-899b-ecc6399fe619-0
These aren't expected to pass; they actually don't even run. Ignore them. And I thought windows was better, but maybe not: /test e2e-aws-ovn-windows
Looks like the backport we might need to get the windows job to work has merged. Let's see: /test e2e-aws-ovn-windows
@openshift-pr-manager[bot]: all tests passed! Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: openshift-pr-manager[bot], tssurya. The full list of commands accepted by this bot can be found here. The pull request process is described here.
/label backport-risk-assessed
/verified by e2e and payload jobs
@jluhrsen: This PR has been marked as verified by e2e and payload jobs.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@openshift-pr-manager[bot]: Jira Issue Verification Checks: Jira Issue OCPBUGS-66139 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Automated branch sync: release-4.20 to release-4.19.