OCPBUGS-57179, OCPBUGS-49824: DownStream Merge [07-09-2025] #2659
openshift-merge-bot[bot] merged 17 commits into openshift:master
The FDB lookup is only used for non-destined shared MAC traffic. When OVN or the host sends a packet that hits a NORMAL action, it will initiate MAC learning and can drive up the CPU usage of OVS. We still need the NORMAL action to account for sending to unknown ports like localnet ports, but we do not want to learn the shared MAC. Therefore, create a static entry binding it to the LOCAL port. Signed-off-by: Tim Rozet <trozet@redhat.com>
Commit f978967 caused a performance regression. As the issue below describes, egress traffic from OVN now uses the NORMAL action, which causes an FDB lookup and then a FLOOD if no entry is found. This always ends up being the case because the ARP reply from the physical port is flooded to the patch port and the LOCAL port. The result is increased CPU usage and unnecessary flooding of packets. We need layer 2 packets destined to the shared gateway MAC to go to both the host and OVN, so that both can receive ARP replies, etc. However, we also need the FDB entry in OVS to get updated for our new functionality that uses the NORMAL action. To fix this, add a static FDB entry for LOCAL, then modify the layer 2 flooding flow actions from "output:patch,LOCAL" to "output:patch,NORMAL". Since the FDB entry is bound in the table to LOCAL, it effectively forwards the packets the same as before, but with the added bonus of FDB learning on ingress. Fixes: #5318 Signed-off-by: Tim Rozet <trozet@redhat.com>
This allows a localnet VM ARP reply to go to OVN, rather than relying on a lookup that only hits the LOCAL port in the FDB table. Signed-off-by: Tim Rozet <trozet@redhat.com>
When using Docker, the push image command fails because the quoted push_args variable expands to an empty string. Docker rejects the empty string as an argument and fails with the following error:
$ docker push '' localhost:5000/ovn-daemonset-fedora:latest
docker: 'docker push' requires 1 argument
Remove the quotes wrapping push_args. Signed-off-by: Or Mergi <ormergi@redhat.com>
Since CanServeNamespace filters out namespace events for namespaces not known to be served by this primary network, we need to reconcile namespaces once the network is reconfigured to serve a namespace. Hence this commit reconciles those namespaces and also reconciles each network policy if it contains only a peer namespace selector. Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
This commit exports FilterFunc from the handler and uses it while reconciling network policies for UDN peer namespaces. Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
This commit makes the network reconciliation loop sync only the namespace object, and moves the network policy sync into the namespace reconciliation loop. Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
The diff between v0.7.0 and v0.8.0 is simply a rename from ovn-org/libovsdb to ovn-kubernetes/libovsdb. Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
kind: Rm push_args variable quotes
Initial implementations erroneously assumed a CIDR for the NAT's logicalIP. Also, the EIP controller expects all OVN constructs that support EIP to have this metadata, so if we cannot build this metadata, add dummy data so that it is cleaned up later by the EIP controller. This was not caught by unit tests because the unit test also contained the assumption of only a logical IP with no mask. It was not caught by upstream CI because we have no reboot tests. Signed-off-by: Martin Kennelly <mkennell@redhat.com>
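The parsing concern above can be illustrated with a minimal Go sketch. It assumes a NAT logical IP string may arrive either as a bare address or with a mask, and it accepts both. The helper name parseLogicalIP and the sample addresses are hypothetical and are not taken from the ovn-kubernetes code.

```go
package main

import (
	"fmt"
	"net"
)

// parseLogicalIP is a hypothetical helper. It accepts a NAT logical IP written
// either as a bare address ("10.128.0.5") or with a mask ("10.128.0.5/32")
// and returns the address part, or nil if neither form parses.
func parseLogicalIP(logicalIP string) net.IP {
	// Try the bare-address form first.
	if ip := net.ParseIP(logicalIP); ip != nil {
		return ip
	}
	// Fall back to the CIDR form and keep only the address.
	if ip, _, err := net.ParseCIDR(logicalIP); err == nil {
		return ip
	}
	return nil
}

func main() {
	for _, in := range []string{"10.128.0.5", "10.128.0.5/32", "fd00::5/128", "not-an-ip"} {
		fmt.Printf("%q -> %v\n", in, parseLogicalIP(in))
	}
}
```

A unit test that feeds both the masked and unmasked forms through such a helper would exercise the assumption that the commit message says went untested.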
The startup syncer was removing OVN constructs due to logic bugs introduced when the EIP code was refactored for UDN. They are added again when the EIP controller syncs, but this causes an interruption.
1. Due to poor naming, weak enforcement of types, and programmer error, we were mixing up variables between a pod IP and an EIP IP. See: nodeName, ok := cache.egressIPIPToNodeCache[parsedLogicalIP.String()] where parsedLogicalIP is a pod IP and not an EIP IP.
2. When iterating over the existing config for an EIP, we should delete config for LRPs where an EIP doesn't exist.
3. Remove LRPs when a network isn't found.
Signed-off-by: Martin Kennelly <mkennell@redhat.com>
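One way to guard against the pod IP versus EIP mix-up described in point 1 is to give the two kinds of address distinct types, so the compiler rejects a lookup with the wrong one. The following is a minimal sketch of that idea only. The type names PodIP, EgressIP, and egressIPToNode and the function nodeForEgressIP are hypothetical and do not reflect the actual fix in this PR.

```go
package main

import "fmt"

// PodIP and EgressIP are hypothetical distinct string types. A value of one
// type cannot be passed where the other is expected without a conversion.
type PodIP string
type EgressIP string

// egressIPToNode is a hypothetical cache keyed by egress IP only.
type egressIPToNode map[EgressIP]string

// nodeForEgressIP returns the node recorded for the given egress IP.
func nodeForEgressIP(cache egressIPToNode, eip EgressIP) (string, bool) {
	node, ok := cache[eip]
	return node, ok
}

func main() {
	cache := egressIPToNode{EgressIP("192.0.2.10"): "node-a"}
	pod := PodIP("10.128.0.5")

	// nodeForEgressIP(cache, pod) // does not compile: PodIP is not EgressIP

	node, ok := nodeForEgressIP(cache, EgressIP("192.0.2.10"))
	fmt.Println(pod, node, ok)
}
```

The trade-off is extra conversions at the boundaries where addresses are parsed, in exchange for catching this class of mistake at compile time.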
…readability No functional changes. Check if obj is nil after parsing the IP. Improve logging of stale OVN config. Signed-off-by: Martin Kennelly <mkennell@redhat.com>
Removes config for deleted nodes/pods while controller was down and ensures ovn config is removed while preserving valid config. Signed-off-by: Martin Kennelly <mkennell@redhat.com>
Fixes FDB learning and usage of NORMAL action
chore: bump libovsdb to v0.8.0
EgressIP: fix startup syncer
|
@martinkennelly: This pull request references Jira Issue OCPBUGS-57179, which is valid. 3 validation(s) were run on this bug.
No GitHub users were found matching the public email listed for the QA contact in Jira (jechen@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-49824, which is valid. 3 validation(s) were run on this bug.
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/payload 4.20 nightly blocking |
|
@martinkennelly: trigger 11 job(s) of type blocking for the nightly release of OCP 4.20
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/33b15de0-5cc5-11f0-9598-ed265b50f5d0-0 |
|
/payload 4.20 ci blocking |
|
@jluhrsen: trigger 4 job(s) of type blocking for the ci release of OCP 4.20
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8cefe1d0-5ce6-11f0-84cf-82dbf5953384-0 |
|
/retest |
|
/retitle OCPBUGS-57179, OCPBUGS-49824: DownStream Merge [07-09-2025] |
TBH, this job is NOT healthy right now. I don't think we should care if it passes or not |
|
Looks good to me now @jluhrsen. I checked the single node job and it's unrelated. |
|
/test 4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade-ipsec |
|
/test e2e-aws-ovn-hypershift-kubevirt |
|
/test e2e-aws-ovn-fdp-qe |
|
just need some labels here please |
|
perf scale is failing do we know why? |
did it just time out? I see this: |
|
/lgtm |
|
The perf/scale lane is failing because of known bug: https://redhat-internal.slack.com/archives/GQ0CU2623/p1752603854333399?thread_ts=1752571219.301049&cid=GQ0CU2623 |
|
hypershift-kubevirt has such a low pass rate, ugh, and upgrade-ipsec is perma-failing. cc @pperiyasamy |
|
/approve returning favor to Jaime who took care of the 4.19 one last time |
|
fdp-qe job didn't run - failures unrelated |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jluhrsen, martinkennelly, tssurya. The full list of commands accepted by this bot can be found here. The pull request process is described here. Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest Unrelated failures |
|
/test e2e-aws-ovn-serial |
although this job looks to have taken a turn for the worse recently. we may not get lucky here and will need to override |
|
@martinkennelly: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Merged commit 272896f into openshift:master
|
@martinkennelly: Jira Issue OCPBUGS-57179: All pull requests linked via external trackers have merged. Jira Issue OCPBUGS-57179 has been moved to the MODIFIED state.
Jira Issue OCPBUGS-49824: All pull requests linked via external trackers have merged. Jira Issue OCPBUGS-49824 has been moved to the MODIFIED state.
|
Do we get a downstream merge to versions 4.20 and 4.19 to fix OCPBUGS-57179 there as well? |
Yes, we will continuously do these kinds of downstream merges from upstream ovnk into openshift master, then sync those changes to 4.19 and down to 4.18 in time. There will likely be a continuous flow of these merge PRs: as soon as one gets in, a new one with new changes will be opened. Currently we have: |
|
/jira refresh |
|
@jechen0648: Jira Issue OCPBUGS-57179 is in an unrecognized state (Verified) and will not be moved to the MODIFIED state. Jira Issue OCPBUGS-49824 is in an unrecognized state (ON_QA) and will not be moved to the MODIFIED state.
u/s to d/s merge to main
cc @jcaamano @pperiyasamy @trozet