OCPBUGS-62859, OCPBUGS-59680: [release-4.18] DownStream Merge Sync from 4.19 [09-29-2025] #2768
Signed-off-by: Jitse Klomp <jitse.klomp@conclusionxforce.nl>
Fix node update check for network cluster controller
During node update events, the local and remote addOrUpdate functions are called. A series of sync checks determines what needs to be configured. However, in some cases log messages were printed unconditionally, and hybrid overlay was processed on every node event. This cleans things up so that hybrid overlay is only synced when necessary, and logs are only printed when work is actually being done to add the local or remote node.
Also removes an old test case for hybrid overlay where the node-subnets annotation of a node was being removed. First introduced here: ovn-kubernetes/ovn-kubernetes@aef135c#diff-9ab180ea9a39f81dc8334a00ca8ea5e4cd04f9491c27dcfd910b07929c9ddbb5R193
It's not totally clear what the purpose of this test was, but we do not support clearing OVN configuration when OVNK-assigned annotations are removed by the user. The node-subnets annotation should not be removed, and if it is removed, it should be configured back onto the node by cluster-manager. Signed-off-by: Tim Rozet <trozet@redhat.com>
When remote nodes are added (as new UDNs are created) the first remote add always fails. This is because the controller is waiting for the subnets annotation to be updated for the network. However, it only partially fails. It fails when the routes are attempting to be added, but this is after the logical switch port logic and some other parsing has already been done. Rather than execute this work twice, just bail early if the node does not have all of the annotations yet. This way we can execute the majority of the work only one time. With this change, only once all annotations are present will you see: "Creating interconnect resources for remote zone node" Signed-off-by: Tim Rozet <trozet@redhat.com>
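The early bail described above can be sketched roughly as follows. This is a minimal illustration, not the actual controller code: `addRemoteNode`, `missingAnnotations`, and the exact list of required annotation keys are assumptions made for the example.

```go
package main

import "fmt"

// requiredRemoteNodeAnnotations is an assumed list of annotations the remote
// add depends on; the real controller may check a different set.
var requiredRemoteNodeAnnotations = []string{
	"k8s.ovn.org/node-subnets",
	"k8s.ovn.org/node-transit-switch-port-ifaddr",
	"k8s.ovn.org/node-gateway-router-lrp-ifaddrs",
}

// missingAnnotations returns the required keys that are absent or empty.
func missingAnnotations(annotations map[string]string) []string {
	var missing []string
	for _, key := range requiredRemoteNodeAnnotations {
		if annotations[key] == "" {
			missing = append(missing, key)
		}
	}
	return missing
}

// addRemoteNode is a hypothetical stand-in for the remote zone add path.
// It returns false without doing any work when annotations are incomplete,
// so the expensive steps run only once, on a later node update.
func addRemoteNode(name string, annotations map[string]string) bool {
	if missing := missingAnnotations(annotations); len(missing) > 0 {
		return false
	}
	fmt.Printf("Creating interconnect resources for remote zone node %s\n", name)
	// Logical switch port setup and route adds would follow here.
	return true
}

func main() {
	partial := map[string]string{"k8s.ovn.org/node-subnets": `{"default":["10.244.1.0/24"]}`}
	fmt.Println(addRemoteNode("worker-1", partial))
}
```

Checking all prerequisites up front trades a small map lookup for skipping the partially successful first pass.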
Just execute the 2 route adds in the same txn Signed-off-by: Tim Rozet <trozet@redhat.com>
When a CUDN/UDN is created with the joinSubnets field configured, it should generate the net-attach-def with the `joinSubnet` field. The code was using `joinSubnets`, which is not understood by ovn-kubernetes. Signed-off-by: Enrique Llorente <ellorent@redhat.com>
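To illustrate why the misspelled key goes unnoticed, here is a minimal sketch using Go's `encoding/json`. The `netConf` struct is a hypothetical, trimmed-down stand-in for the NAD config that ovn-kubernetes parses; only the `joinSubnet` key name comes from the commit message above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// netConf is a hypothetical, reduced version of the CNI network config.
type netConf struct {
	Name       string `json:"name"`
	Topology   string `json:"topology"`
	JoinSubnet string `json:"joinSubnet,omitempty"`
}

// parseJoinSubnet decodes the raw config and returns the join subnet value.
func parseJoinSubnet(raw string) (string, error) {
	var conf netConf
	if err := json.Unmarshal([]byte(raw), &conf); err != nil {
		return "", err
	}
	return conf.JoinSubnet, nil
}

func main() {
	// json.Unmarshal ignores unknown keys by default, so "joinSubnets"
	// produces no error and simply leaves JoinSubnet empty.
	wrong := `{"name":"blue","topology":"layer3","joinSubnets":"100.65.0.0/16"}`
	right := `{"name":"blue","topology":"layer3","joinSubnet":"100.65.0.0/16"}`
	for _, raw := range []string{wrong, right} {
		subnet, err := parseJoinSubnet(raw)
		fmt.Printf("joinSubnet=%q err=%v\n", subnet, err)
	}
}
```

With the wrong key the parsed value is empty and no error is reported, so the consumer would fall back to whatever default it uses.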
Configures ephemeral port range for OVN SNAT'ing
udn: Fix NAD template for join subnets field
… module Signed-off-by: Alin Gabriel Serdean <aserdean@nvidia.com>
workflow: Add fix missing and apt update before trying to install VRF…
So that ginkgo times out first and we get useful output. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
We have a flow [1] to prevent leaking traffic towards a ClusterIP. However, we also have a flow to prevent EIP traffic from egressing before being SNATed, and an additional flow that actually allows the traffic to egress in ICNI/BGP scenarios for pods on the node's subnet [2]. The higher priority of flow [2] prevents flow [1] from taking effect. Bump the priority of flow [1], since there is no case where we should leak traffic towards ClusterIPs.
[1] cookie=0xdeff105, duration=492.235s, table=0, n_packets=0, n_bytes=0, priority=105,ipv6,in_port="patch-breth0_ov",ipv6_dst=fd00:10:96::/112 actions=drop
[2] cookie=0xdeff105, duration=2308.615s, table=0, n_packets=4, n_bytes=376, priority=109,ipv6,in_port="patch-breth0_ov",dl_src=96:b0:34:18:12:7c,ipv6_src=fd00:10:244:1::/64 actions=ct(commit,zone=64000,exec(load:0x1->NXM_NX_CT_MARK[])),output:eth0
cookie=0xdeff105, duration=1991.854s, table=0, n_packets=0, n_bytes=0, priority=104,ipv6,in_port="patch-breth0_ov",ipv6_src=fd00:10:244::/48 actions=drop
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
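The priority interaction can be modeled with a small simulation. This is a simplified sketch, not OVS code: the `flow` and `packet` types are invented for the example, and the new priority of 110 is an assumed value, since the commit message does not state the number it was bumped to.

```go
package main

import "fmt"

// packet captures only the two properties relevant to the flows above.
type packet struct {
	fromNodeSubnet bool
	toClusterIP    bool
}

// flow is a toy model of an OpenFlow entry in a single table.
type flow struct {
	name     string
	priority int
	matches  func(packet) bool
	action   string
}

// lookup returns the action of the highest-priority matching flow.
func lookup(flows []flow, p packet) string {
	action := "miss"
	best := -1
	for _, f := range flows {
		if f.matches(p) && f.priority > best {
			best, action = f.priority, f.action
		}
	}
	return action
}

// buildFlows models flow [1] at the given priority and flow [2] at 109.
func buildFlows(clusterIPDropPriority int) []flow {
	return []flow{
		{"drop-to-clusterip", clusterIPDropPriority, func(p packet) bool { return p.toClusterIP }, "drop"},
		{"allow-node-subnet", 109, func(p packet) bool { return p.fromNodeSubnet }, "output:eth0"},
	}
}

func main() {
	// A pod on the node subnet sending to a ClusterIP matches both flows.
	leak := packet{fromNodeSubnet: true, toClusterIP: true}
	fmt.Println(lookup(buildFlows(105), leak)) // before the bump
	fmt.Println(lookup(buildFlows(110), leak)) // after the bump (assumed value)
}
```

With flow [1] at 105 the packet is output on eth0. Once its priority exceeds 109 the drop wins.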
Change configuration in preparation for running all control plane tests:
* Make both dualstack, not much value testing IPv4 single stack
* Make one of the lanes noSnatGW to get signal from that as well
* Enable multicast and empty LB events
* Configure host to be able to route to networks from the external world
* Ensure frr container is not able to route through the host/runner
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Skip those tests that wouldn't be supported or otherwise require additional work. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Run all control plane tests for bgp lane
This PR adds rules to prevent SNAT if source IP belongs to the mgmtport-no-snat-subnets-v4 or mgmtport-no-snat-subnets-v6 sets, which store IPv4 and IPv6 subnets, respectively. Signed-off-by: Yossi Boaron <yboaron@redhat.com>
Currently, traffic gets SNATed at ovn-k8s-mp0 within the mgmtport-snat chain. Since OVNK has transitioned to nftables, this behavior can no longer be overridden. Previously, with iptables, SNAT could be avoided by adding a higher-priority rule in the POSTROUTING chain. However, with nftables, all rules are evaluated before making a final decision, making it impossible to skip SNAT. Some applications, like Submariner, need to preserve the source IP when traffic reaches the destination pod, as certain use cases depend on it. This PR updates the mgmtport-no-snat-subnets-v4 and mgmtport-no-snat-subnets-v6 nftables sets based on the node's annotation values. Signed-off-by: Yossi Boaron <yboaron@redhat.com>
Signed-off-by: Yossi Boaron <yboaron@redhat.com>
Signed-off-by: thisisobate <obasiuche62@gmail.com>
Some quality of life improvements for layer 3 controllers node handling
Every time the node updates, it triggers addEgressNode, which performs a route add libovsdb txn for the default network and every UDN, initiated from the default controller's egress node logic. It now only runs when needed. Signed-off-by: Tim Rozet <trozet@redhat.com>
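One way to picture the "only runs when needed" gate is a comparison of the fields the egress node logic cares about. The sketch below is an assumption-laden illustration: `nodeView`, its fields, and `onNodeUpdate` are hypothetical, and the real controller may key off different node properties.

```go
package main

import "fmt"

// nodeView holds the (assumed) node properties relevant to egress routing.
type nodeView struct {
	egressAssignable bool
	nodeSubnets      string
	gatewayRouterIPs string
}

// controller counts route transactions so the effect of the gate is visible.
type controller struct {
	routeTxns int
}

// addEgressNode stands in for the per-network route add transaction.
func (c *controller) addEgressNode(n nodeView) {
	c.routeTxns++
}

// onNodeUpdate runs addEgressNode only when a relevant field changed.
func (c *controller) onNodeUpdate(oldNode, newNode nodeView) {
	if oldNode == newNode {
		return // nothing egress-related changed, skip the libovsdb work
	}
	c.addEgressNode(newNode)
}

func main() {
	c := &controller{}
	n := nodeView{egressAssignable: true, nodeSubnets: "10.244.1.0/24", gatewayRouterIPs: "100.64.0.2"}
	c.onNodeUpdate(n, n) // e.g. a status heartbeat
	changed := n
	changed.gatewayRouterIPs = "100.64.0.3"
	c.onNodeUpdate(n, changed)
	fmt.Println(c.routeTxns)
}
```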
This is unnecessary because there is another UDN path that will call this code: secondary_layer2/3_controller -> addUpdateLocalNodeEvent -> ensureRouterPoliciesForNetwork -> CreateDefaultRouteToExternal Signed-off-by: Tim Rozet <trozet@redhat.com>
Signed-off-by: Tim Rozet <trozet@redhat.com>
This function is called from many different threads. Relying on nbdb for the GR IP is not safe here, as the GR IP could be changing due to a k8s event, and the route will be wrongly configured with an old IP still in OVN NBDB. Signed-off-by: Tim Rozet <trozet@redhat.com>
chore: update footer with new LF trademark disclaimer
Optimize egress ip performance with UDNs
Enable SNAT bypass in mgmtport-snat chain for specified subnets
OCPBUGS-55098: DownStream Merge [06-04-2025]
/label backport-risk-assessed
@kyrtapz: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/retitle OCPBUGS-59680: [release-4.18] DownStream Merge Sync from 4.19 [09-29-2025]
@kyrtapz: This pull request references Jira Issue OCPBUGS-59680, which is valid. 7 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
There was a bug exposed with the small merge that we did not want to introduce into 4.18. The fix had to go upstream.
I will track the nightlies
honestly the only one I really know is the GCP bug and I added that to the title
/verified
/verified later @jluhrsen not sure what I'm doing here, but seeing if this works.
@jluhrsen: This PR has been marked to be verified later.
82307af
into
openshift:release-4.18
@kyrtapz: Jira Issue OCPBUGS-59680: Some pull requests linked via external trackers have merged: The following pull request, linked via external tracker, has not merged:
All associated pull requests must be merged or unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with Jira Issue OCPBUGS-59680 has not been moved to the MODIFIED state. This PR is marked as verified-later. Jira issue(s) in the title of this PR will require post-merge verification. After testing, it must be manually moved to the
/retitle OCPBUGS-62859, OCPBUGS-59680: [release-4.18] DownStream Merge Sync from 4.19 [09-29-2025]
@kyrtapz: Jira Issue OCPBUGS-62859 is in an unrecognized state (Verified) and will not be moved to the MODIFIED state. Jira Issue OCPBUGS-59680 is in an unrecognized state (Verified) and will not be moved to the MODIFIED state.
This is a sync from 4.19 up to #2704 with a manually cherry-picked commit to fix the GCP issue.