MERGE CONFLICT! NO-JIRA: DownStream Merge [08-16-2025] by openshift-pr-manager[bot] · Pull Request #2722 · openshift/ovn-kubernetes

openshift-pr-manager · 2025-08-16T15:47:53Z

Automated merge of upstream/master → master.

If the `k8s.ovn.org/zone-name` label is not set on the node, the fallback logic now uses the node name as the zone when `ovn_enable_interconnect` is true. Otherwise, the zone defaults to "global" as before. Also updated the empty string check to use `-z`, which is more idiomatic in Bash. Signed-off-by: Flavio Fernandes <ffernandes@nvidia.com>
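The change itself lives in a Bash script; the Go sketch below only restates the decision order described above so the three outcomes are easy to see. The function name `zoneName` and its parameters are hypothetical and do not come from the repository.

```go
package main

import "fmt"

// zoneName restates the fallback order from the commit message:
// an explicit zone label wins; otherwise the node name is used when
// interconnect is enabled; otherwise the zone is "global".
// Name and signature are illustrative assumptions.
func zoneName(label, nodeName string, interconnect bool) string {
	if label != "" {
		return label
	}
	if interconnect {
		return nodeName
	}
	return "global"
}

func main() {
	fmt.Println(zoneName("", "ovn-worker", true))        // label unset, interconnect on
	fmt.Println(zoneName("", "ovn-worker", false))       // label unset, interconnect off
	fmt.Println(zoneName("zone-a", "ovn-worker", false)) // label set
}
```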

Update unit tests to check that the returned error contains the expected message, not just that an error occurred. This ensures the renderer fails for the right reason rather than for an unrelated one. Signed-off-by: Lei Huang <leih@nvidia.com>
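A minimal sketch of the kind of assertion this describes, using only the standard library. `render` is a hypothetical stand-in for the renderer under test and `checkError` is an illustrative helper; neither is taken from the repository.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// render is a hypothetical stand-in for the code under test.
func render(template string) error {
	if template == "" {
		return errors.New("failed to render: template is empty")
	}
	return nil
}

// checkError returns nil only when err is non-nil and its message
// contains want; a bare "some error occurred" is not enough.
func checkError(err error, want string) error {
	if err == nil {
		return fmt.Errorf("expected error containing %q, got nil", want)
	}
	if !strings.Contains(err.Error(), want) {
		return fmt.Errorf("expected error containing %q, got %q", want, err.Error())
	}
	return nil
}

func main() {
	// Passes: the failure is the one we expect.
	fmt.Println(checkError(render(""), "template is empty"))
	// Fails: an error occurred, but for a different reason.
	fmt.Println(checkError(render(""), "permission denied"))
}
```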

Today when default network or UDN networks are advertised using RAs the nodes also learn the routes of other nodes' pod subnets in the cluster. Example snippet of exposing a UDN network on non-vrflite usecase:
    root@ovn-worker2:/# ip r show table 1048
    default via 172.18.0.1 dev breth0 mtu 1400
    10.96.0.0/16 via 169.254.0.4 dev breth0 mtu 1400
    10.244.0.0/24 nhid 39 via 172.18.0.4 dev breth0 proto bgp metric 20
    10.244.2.0/24 nhid 40 via 172.18.0.3 dev breth0 proto bgp metric 20
    103.103.0.0/24 nhid 39 via 172.18.0.4 dev breth0 proto bgp metric 20
    103.103.1.0/24 nhid 40 via 172.18.0.3 dev breth0 proto bgp metric 20
    169.254.0.3 via 203.203.1.1 dev ovn-k8s-mp12
    169.254.0.34 dev ovn-k8s-mp12 mtu 1400
    172.26.0.0/16 nhid 41 via 172.18.0.5 dev breth0 proto bgp metric 20
    203.203.0.0/24 nhid 39 via 172.18.0.4 dev breth0 proto bgp metric 20
    203.203.0.0/16 via 203.203.1.1 dev ovn-k8s-mp12
    203.203.1.0/24 dev ovn-k8s-mp12 proto kernel scope link src 203.203.1.2
    local 203.203.1.2 dev ovn-k8s-mp12 proto kernel scope host src 203.203.1.2
    broadcast 203.203.1.255 dev ovn-k8s-mp12 proto kernel scope link src 203.203.1.2
    203.203.2.0/24 nhid 40 via 172.18.0.3 dev breth0 proto bgp metric 20
    root@ovn-worker2:/# ip r show table 1046
    default via 172.18.0.1 dev breth0 mtu 1400
    10.96.0.0/16 via 169.254.0.4 dev breth0 mtu 1400
    10.244.0.0/24 nhid 39 via 172.18.0.4 dev breth0 proto bgp metric 20
    10.244.2.0/24 nhid 40 via 172.18.0.3 dev breth0 proto bgp metric 20
    103.103.0.0/24 nhid 39 via 172.18.0.4 dev breth0 proto bgp metric 20
    103.103.0.0/16 via 103.103.2.1 dev ovn-k8s-mp11
    103.103.1.0/24 nhid 40 via 172.18.0.3 dev breth0 proto bgp metric 20
    103.103.2.0/24 dev ovn-k8s-mp11 proto kernel scope link src 103.103.2.2
    local 103.103.2.2 dev ovn-k8s-mp11 proto kernel scope host src 103.103.2.2
    broadcast 103.103.2.255 dev ovn-k8s-mp11 proto kernel scope link src 103.103.2.2
    169.254.0.3 via 103.103.2.1 dev ovn-k8s-mp11
    169.254.0.32 dev ovn-k8s-mp11 mtu 1400
    172.26.0.0/16 nhid 41 via 172.18.0.5 dev breth0 proto bgp metric 20
    203.203.0.0/24 nhid 39 via 172.18.0.4 dev breth0 proto bgp metric 20
    203.203.2.0/24 nhid 40 via 172.18.0.3 dev breth0 proto bgp metric 20
    root@ovn-worker2:/#
This happens because we import routes from the default VRF:
            prefixes:
            - 103.103.0.0/24
            - 2014:100:200::/64
            - 2016:100:200::/64
            - 203.203.0.0/24
          - asn: 64512
            imports:
            - vrf: default
            vrf: mp11-udn-vrf
          - asn: 64512
            imports:
            - vrf: default
            vrf: mp12-udn-vrf
      nodeSelector:
        matchLabels:
          kubernetes.io/hostname: ovn-worker
      raw: {}
    root@ovn-worker2:/# ip r
    default via 172.18.0.1 dev breth0 mtu 1400
    10.96.0.0/16 via 169.254.0.4 dev breth0 mtu 1400
    10.244.0.0/24 nhid 39 via 172.18.0.4 dev breth0 proto bgp metric 20
    10.244.2.0/24 nhid 40 via 172.18.0.3 dev breth0 proto bgp metric 20
    103.103.0.0/24 nhid 39 via 172.18.0.4 dev breth0 proto bgp metric 20
    103.103.1.0/24 nhid 40 via 172.18.0.3 dev breth0 proto bgp metric 20
    169.254.0.3 via 203.203.1.1 dev ovn-k8s-mp12
    169.254.0.34 dev ovn-k8s-mp12 mtu 1400
    172.26.0.0/16 nhid 41 via 172.18.0.5 dev breth0 proto bgp metric 20
    203.203.0.0/24 nhid 39 via 172.18.0.4 dev breth0 proto bgp metric 20
    203.203.0.0/16 via 203.203.1.1 dev ovn-k8s-mp12
    203.203.1.0/24 dev ovn-k8s-mp12 proto kernel scope link src 203.203.1.2
    local 203.203.1.2 dev ovn-k8s-mp12 proto kernel scope host src 203.203.1.2
    broadcast 203.203.1.255 dev ovn-k8s-mp12 proto kernel scope link src 203.203.1.2
    203.203.2.0/24 nhid 40 via 172.18.0.3 dev breth0 proto bgp metric 20
which directly breaks UDN isolation. In this commit we are going to remove the support for receiving routes. So advertising routes will only advertise routes and we will no longer make the nodes receive these routes. However in the future when we support overlay-mode with BGP, we will need to re-add these routes and design a better isolation model with UDNs within the cluster if that is desired.
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

This is a temporary commit; a proper follow-up is needed. Please see ovn-kubernetes/ovn-kubernetes#5407 for details. As of today, all NATs created by OVN-Kubernetes are already unique under the existing tuple comparison in IsEquivalentNAT (uuid, type of SNAT, logicalIP, logicalPort, externalIP, externalIDs), so it is OK to get rid of match. It is not the correct way to fix this, though: in the future we might have two NATs that are identical in every field except match. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
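To illustrate the trade-off, here is a simplified sketch of an equivalence check that ignores match. The `nat` struct and `equivalent` function are hypothetical stand-ins, not the real IsEquivalentNAT or the generated NB bindings; the field list follows the commit message.

```go
package main

import "fmt"

// nat is a simplified, hypothetical stand-in for an NB NAT row.
type nat struct {
	UUID        string
	Type        string
	LogicalIP   string
	LogicalPort string
	ExternalIP  string
	ExternalIDs map[string]string
	Match       string
}

// sameIDs reports whether two external-ID maps hold the same pairs.
func sameIDs(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if w, ok := b[k]; !ok || w != v {
			return false
		}
	}
	return true
}

// equivalent compares two NATs without looking at Match.
// If the searched NAT carries a UUID, only the UUID is compared.
func equivalent(existing, searched nat) bool {
	if searched.UUID != "" {
		return existing.UUID == searched.UUID
	}
	return existing.Type == searched.Type &&
		existing.LogicalIP == searched.LogicalIP &&
		existing.LogicalPort == searched.LogicalPort &&
		existing.ExternalIP == searched.ExternalIP &&
		sameIDs(existing.ExternalIDs, searched.ExternalIDs)
}

func main() {
	a := nat{Type: "snat", LogicalIP: "203.203.0.0/16", ExternalIP: "169.254.0.37"}
	b := a
	b.Match = "ip4.dst == $a712973235162149816"
	fmt.Println(equivalent(a, b)) // true: only Match differs
	b.ExternalIP = "169.254.0.38"
	fmt.Println(equivalent(a, b)) // false: ExternalIP differs
}
```

The first result is exactly the limitation the commit message warns about: two NATs that differ only in match are treated as the same row.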

This PR is adding SNAT for advertised UDNs and CDN if the destination of the traffic is towards other nodes in the cluster. This is a design change for BGP from before (where pod->node was not SNATed and podIP was preserved). For normal UDNs we have 2 SNATs.
L3 UDN SNATs:
1) this cSNAT is added to ovn_cluster_router for LGW egress traffic and SGW KAPI/DNS traffic:
    _uuid : 5485a25f-7a83-4dc0-840c-bbfbd0784aad
    allowed_ext_ips : []
    exempted_ext_ips : []
    external_ids : {"k8s.ovn.org/network"=cluster_udn_tenant-green-network, "k8s.ovn.org/topology"=layer3}
    external_ip : "169.254.0.38"
    external_mac : []
    external_port_range : "32768-60999"
    gateway_port : []
    logical_ip : "203.203.0.0/24"
    logical_port : rtos-cluster_udn_tenant.green.network_ovn-control-plane
    match : "eth.dst == 0a:58:cb:cb:00:02"
    options : {stateless="false"}
    priority : 0
    type : snat
2) this SNAT is added to GR for SGW egress traffic:
    _uuid : d85fd65f-e3f3-4d52-95f9-5f88c925aa5a
    allowed_ext_ips : []
    exempted_ext_ips : []
    external_ids : {"k8s.ovn.org/network"=cluster_udn_tenant-green-network, "k8s.ovn.org/topology"=layer3}
    external_ip : "169.254.0.37"
    external_mac : []
    external_port_range : "32768-60999"
    gateway_port : []
    logical_ip : "203.203.0.0/16"
    logical_port : []
    match : ""
    options : {stateless="false"}
    priority : 0
    type : snat
For L2, we have the following two SNATs, both on GR:
    _uuid : a4b9942f-ec1a-42ca-81d9-3e4885ff2470
    allowed_ext_ips : []
    exempted_ext_ips : []
    external_ids : {"k8s.ovn.org/network"=cluster_udn_tenant-blue-network, "k8s.ovn.org/topology"=layer2}
    external_ip : "169.254.0.36"
    external_mac : []
    external_port_range : "32768-60999"
    gateway_port : []
    logical_ip : "93.93.0.0/16"
    logical_port : rtoj-GR_cluster_udn_tenant.blue.network_ovn-control-plane
    match : "eth.dst == 0a:58:5d:5d:00:02"
    options : {stateless="false"}
    priority : 0
    type : snat
and
    _uuid : 24164866-da95-4b6f-9c65-8b16fa202758
    allowed_ext_ips : []
    exempted_ext_ips : []
    external_ids : {"k8s.ovn.org/network"=cluster_udn_tenant-blue-network, "k8s.ovn.org/topology"=layer2}
    external_ip : "169.254.0.35"
    external_mac : []
    external_port_range : "32768-60999"
    gateway_port : []
    logical_ip : "93.93.0.0/16"
    logical_port : []
    match : "outport == \"rtoe-GR_cluster_udn_tenant.blue.network_ovn-control-plane\""
    options : {stateless="false"}
    priority : 0
    type : snat
Now with advertised networks these will change to:
    _uuid : a4b9942f-ec1a-42ca-81d9-3e4885ff2470
    allowed_ext_ips : []
    exempted_ext_ips : []
    external_ids : {"k8s.ovn.org/network"=cluster_udn_tenant-blue-network, "k8s.ovn.org/topology"=layer2}
    external_ip : "169.254.0.36"
    external_mac : []
    external_port_range : "32768-60999"
    gateway_port : []
    logical_ip : "93.93.0.0/16"
    logical_port : rtoj-GR_cluster_udn_tenant.blue.network_ovn-control-plane
    match : "eth.dst == 0a:58:5d:5d:00:02 && (ip4.dst == $a712973235162149816)"
    options : {stateless="false"}
    priority : 0
    type : snat

    _uuid : 24164866-da95-4b6f-9c65-8b16fa202758
    allowed_ext_ips : []
    exempted_ext_ips : []
    external_ids : {"k8s.ovn.org/network"=cluster_udn_tenant-blue-network, "k8s.ovn.org/topology"=layer2}
    external_ip : "169.254.0.35"
    external_mac : []
    external_port_range : "32768-60999"
    gateway_port : []
    logical_ip : "93.93.0.0/16"
    logical_port : []
    match : "outport == \"rtoe-GR_cluster_udn_tenant.blue.network_ovn-control-plane\" && ip4.dst == $a712973235162149816"
    options : {stateless="false"}
    priority : 0
    type : snat

    _uuid : d85fd65f-e3f3-4d52-95f9-5f88c925aa5a
    allowed_ext_ips : []
    exempted_ext_ips : []
    external_ids : {"k8s.ovn.org/network"=cluster_udn_tenant-green-network, "k8s.ovn.org/topology"=layer3}
    external_ip : "169.254.0.37"
    external_mac : []
    external_port_range : "32768-60999"
    gateway_port : []
    logical_ip : "203.203.0.0/16"
    logical_port : []
    match : "ip4.dst == $a712973235162149816"
    options : {stateless="false"}
    priority : 0
    type : snat

    _uuid : 5485a25f-7a83-4dc0-840c-bbfbd0784aad
    allowed_ext_ips : []
    exempted_ext_ips : []
    external_ids : {"k8s.ovn.org/network"=cluster_udn_tenant-green-network, "k8s.ovn.org/topology"=layer3}
    external_ip : "169.254.0.38"
    external_mac : []
    external_port_range : "32768-60999"
    gateway_port : []
    logical_ip : "203.203.0.0/24"
    logical_port : rtos-cluster_udn_tenant.green.network_ovn-control-plane
    match : "eth.dst == 0a:58:cb:cb:00:02 && (ip4.dst == $a712973235162149816)"
    options : {stateless="false"}
    priority : 0
    type : snat
So basically we add this extra match on destination IPs to SNAT to the masqueradeIP for that UDN.
Note: with this PR we will break hardware offload for asymmetric traffic for BGP L2.
As for the CDN, we have 1 SNAT with no matches on GR, and that is being changed to now have a cSNAT in case the default network is advertised.
NOTE: In -ds flag mode, the per-pod SNAT will have this match set.
NOTE2: For all deleteNAT scenarios we purposefully don't pass snat as a match criteria.
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
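The change to the match column can be sketched as appending a destination clause that references an address set. The helper below is a hypothetical illustration, not the function used in the repository; the address-set name is copied from the dumps above. The real output wraps the clause in parentheses in some cases, which this sketch does not reproduce.

```go
package main

import "fmt"

// withDestinationMatch appends "<family>.dst == $<addrSet>" to an
// existing SNAT match, or returns just that clause when base is empty.
// Name and signature are illustrative assumptions.
func withDestinationMatch(base, family, addrSet string) string {
	dst := fmt.Sprintf("%s.dst == $%s", family, addrSet)
	if base == "" {
		return dst
	}
	return base + " && " + dst
}

func main() {
	const set = "a712973235162149816"
	// GR SNAT that previously had an empty match.
	fmt.Println(withDestinationMatch("", "ip4", set))
	// L2 SNAT that already matched on the outport.
	fmt.Println(withDestinationMatch(
		`outport == "rtoe-GR_cluster_udn_tenant.blue.network_ovn-control-plane"`, "ip4", set))
}
```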

Given that some traffic like pod->node and pod->nodeport will be SNATed to nodeIP for UDNs, we will need iprules for both masqueradeIP and nodeIP to be present when networks are advertised. This is nothing complicated, as keeping the masqueradeIP dangling around doesn't hurt anything (I hope :)). For pod->node, traffic follows the normal UDN LGW egress flow:
1) pod->switch->ovn_cluster_router
2) SNAT at the router to masIP
3) ovn_cluster_router->switch->mpX
4) goes out, and then the reply coming from outside will hit these masqueradeIP rules to come back in
Since we SNATed to masqueradeIP on the way out, we need both podsubnet and masqueradeIP rules for advertised networks. For all other traffic no SNATing is done.
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

This commit is a prep-commit that converts the LGW POSTROUTING chain rules from IPT to NFT. Why do we need to do this now? Because for BGP we want to use the PMTUD remote nodeIP NFT sets to also do conditional masquerading in Local Gateway mode when traffic leaves UDNs towards other nodes in the cluster or other nodeports. Given PMTUD rules are in NFT but the lgw and udn masquerade rules are in IPT, we'd need to pick one to express all of them, and since we want to move to NFT, it's better to go that route. Below is how the rules look:
    chain ovn-kube-local-gw-masq {
        comment "OVN local gateway masquerade"
        type nat hook postrouting priority srcnat; policy accept;
        ip saddr 169.254.0.1 masquerade
        ip6 saddr fd69::1 masquerade
        jump ovn-kube-pod-subnet-masq
        jump ovn-kube-udn-masq
    }
    chain ovn-kube-pod-subnet-masq {
        ip saddr 10.244.2.0/24 masquerade
        ip6 saddr fd00:10:244:1::/64 masquerade
    }
    chain ovn-kube-udn-masq {
        comment "OVN UDN masquerade"
        ip saddr != 169.254.0.0/29 ip daddr != 10.96.0.0/16 ip saddr 169.254.0.0/17 masquerade
        ip6 saddr != fd69::/125 ip daddr != fd00:10:96::/112 ip6 saddr fd69::/112 masquerade
    }
This commit was AI-Cursor-gemini/claude assisted under my supervision/prompting/reviewing/back-forth iterations.
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Reuse the PMTUD address set holding the remote nodes' IPs for the cSNAT of BGP advertised networks as well. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

This commit is valid only for default networks as mentioned in title. It's because unlike in UDNs where we do cSNATs in OVN on router at the edge before it leaves to node, for CDN everything happens on the node side already - so we can leverage the nodeIP masquerade bits.
If network is advertised:
    chain ovn-kube-pod-subnet-masq {
        ip saddr 10.244.2.0/24 ip daddr @remote-node-ips-v4 masquerade
        ip6 saddr fd00:10:244:3::/64 ip6 daddr @remote-node-ips-v6 masquerade
    }
else:
    chain ovn-kube-pod-subnet-masq {
        ip saddr 10.244.2.0/24 masquerade
        ip6 saddr fd00:10:244:3::/64 masquerade
    }
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
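The two variants differ only in the extra destination-set condition, which a small rule builder makes explicit. This is a sketch under assumptions: `podSubnetMasqRule` is a hypothetical helper that produces the rule text shown above, and it does not use knftables or any other project API.

```go
package main

import (
	"fmt"
	"net/netip"
)

// podSubnetMasqRule returns the masquerade rule text for one pod
// subnet. When the network is advertised, masquerading is limited to
// destinations in the remote-node-ips set for the subnet's IP family.
// The helper name is an illustrative assumption.
func podSubnetMasqRule(subnet string, advertised bool) (string, error) {
	prefix, err := netip.ParsePrefix(subnet)
	if err != nil {
		return "", fmt.Errorf("invalid pod subnet %q: %w", subnet, err)
	}
	proto, set := "ip", "remote-node-ips-v4"
	if prefix.Addr().Is6() {
		proto, set = "ip6", "remote-node-ips-v6"
	}
	if advertised {
		return fmt.Sprintf("%s saddr %s %s daddr @%s masquerade", proto, prefix, proto, set), nil
	}
	return fmt.Sprintf("%s saddr %s masquerade", proto, prefix), nil
}

func main() {
	for _, advertised := range []bool{true, false} {
		for _, subnet := range []string{"10.244.2.0/24", "fd00:10:244:3::/64"} {
			rule, err := podSubnetMasqRule(subnet, advertised)
			if err != nil {
				fmt.Println("error:", err)
				continue
			}
			fmt.Println(rule)
		}
	}
}
```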

1) remove the L2 failure limitation: since we now use nodeIPs, the reply knows how to get back to the source node because we have routes for that
2) add udn pod -> default network nodeport service (same and diff node)
3) add udn pod -> udn network nodeport service (same and diff node) - same network
4) add udn pod -> udn network nodeport service (same and diff node) - different network
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

In the previous commits we added SNATing to nodeIP for the following traffic flows when pods are part of advertised networks:
    pod -> nodes
    pod -> nodeports
Prior to SNATing to nodeIPs they are SNATed at the ovn_cluster_router to masqueradeIP before being sent into the host. In commit ovn-kubernetes/ovn-kubernetes@75dd73f we had converted all UDN flows that matched on masqueradeIP as the source on breth0 for UDN pods to services traffic flow to instead match on the podsubnets. However, given that pod to node and pod to nodeport traffic flows use masqueradeIP as the SNAT, we now need to re-add the masqueradeIP flows as well to ensure that nodeport isolation between UDNs works correctly.
Before this commit, in LGW/SGW the flow is: UDN pod -> samenodeIP:nodeport in default network -> SNATed to masqueradeIP of that UDN -> sent to host -> DNATed to clusterIP -> hits the default flow in table=2 in br-ex:
    cookie=0xdeff105, duration=15690.053s, table=2, n_packets=0, n_bytes=0, idle_age=15690, priority=100 actions=mod_dl_dst:6e:4d:97:c0:3c:97,output:2
and sends to patch port of default network, and this traffic starts working when it shouldn't. (I mean eventually we want this to work, see ovn-kubernetes/ovn-kubernetes#5410, but that's a future issue - outside my PR's scope.)
In case of L3 UDN advertised pod -> nodeport service in default or other UDN network: ovn-kubernetes/ovn-kubernetes@d63887e is the commit where we added logic to match on srcIP of the traffic and accordingly route it into the respective UDN patchports. So there we use the masqueradeIP of a particular UDN to determine what the source of the traffic was and route it into that particular UDN's patchport, where it would blackhole if there was no matching clusterIP NAT entry there, and this is how isolation was guaranteed. Recently this was changed to a hard drop: ovn-kubernetes/ovn-kubernetes@dcc403c
For L2 topology the logic is the same as above for clusterIPs, but for nodeports the GR itself drops the packets destined towards the other networks, as there is no LB entry present on the GR since the destination IP is that of the router itself. That's how isolation works there. Sample trace:
        next;
    10. ls_out_apply_port_sec (northd.c:6039): 1, priority 0, uuid 2aa6ebd5
        output;
        /* output to "stor-cluster_udn_tenant.blue.network_ovn_layer2_switch", type "l3gateway" */
    ingress(dp="GR_cluster_udn_tenant.blue.network_ovn-worker2", inport="rtos-cluster_udn_tenant.blue.network_ovn_layer2_switch")
    -----------------------------------------------------------------------------------------------------------------------------
     0. lr_in_admission (northd.c:13232): eth.dst == 0a:58:64:41:00:03 && inport == "rtos-cluster_udn_tenant.blue.network_ovn_layer2_switch", priority 50, uuid 7f9af183
        reg9[1] = check_pkt_larger(1414);
        xreg0[0..47] = 0a:58:64:41:00:03;
        next;
     1. lr_in_lookup_neighbor (northd.c:13420): 1, priority 0, uuid d2672052
        reg9[2] = 1;
        next;
     2. lr_in_learn_neighbor (northd.c:13430): reg9[2] == 1 || reg9[3] == 0, priority 100, uuid 84ca0ef4
        mac_cache_use;
        next;
     3. lr_in_ip_input (northd.c:12824): ip4.dst == {172.18.0.4}, priority 60, uuid ea41c4e7
        drop;
Without this fix:
    [FAIL] BGP: isolation between advertised networks Layer3 connectivity between networks [It] pod in the UDN should not be able to access a default network service
The above test will work in LGW when it should not work, as is the case for non-advertised UDNs. This commit adds back the masqueradeIP flow as well for advertised networks that drops all packets that didn't get routed on the higher priority pkt_mark flows at 250.
When 2 UDNs are advertised, this PR added back these two flows with masqueradeIP match:
    cookie=0xdeff105, duration=127.593s, table=2, n_packets=0, n_bytes=0, priority=200,ip,nw_src=169.254.0.12 actions=drop
    cookie=0xdeff105, duration=127.534s, table=2, n_packets=0, n_bytes=0, priority=200,ip,nw_src=169.254.0.14 actions=drop
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
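As a rough illustration of the re-added flows, the sketch below builds one drop flow per masquerade IP in the textual form accepted by `ovs-ofctl add-flow`. The helper `dropFlows` is hypothetical and handles IPv4 only, mirroring the two flows quoted above; the cookie, table and priority values are taken from that dump.

```go
package main

import (
	"fmt"
	"net/netip"
)

// dropFlows returns one table=2, priority=200 drop flow per IPv4
// masquerade IP. Illustrative helper, not the repository's code.
func dropFlows(cookie string, masqueradeIPs []string) ([]string, error) {
	flows := make([]string, 0, len(masqueradeIPs))
	for _, s := range masqueradeIPs {
		addr, err := netip.ParseAddr(s)
		if err != nil {
			return nil, fmt.Errorf("invalid masquerade IP %q: %w", s, err)
		}
		if !addr.Is4() {
			return nil, fmt.Errorf("only IPv4 is handled in this sketch, got %s", addr)
		}
		flows = append(flows, fmt.Sprintf(
			"cookie=%s, table=2, priority=200, ip, nw_src=%s, actions=drop", cookie, addr))
	}
	return flows, nil
}

func main() {
	flows, err := dropFlows("0xdeff105", []string{"169.254.0.12", "169.254.0.14"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range flows {
		fmt.Println(f)
	}
}
```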

Currently there are two bugs around using priority 100 for the ovn-kube-local-gw-masq chain. EgressIP multi-NIC rules are still in legacy IPT:
    [0:0] -A OVN-KUBE-EGRESS-IP-MULTI-NIC -s 10.244.2.6/32 -o eth1 -j SNAT --to-source 10.10.10.105
    [0:0] -A OVN-KUBE-EGRESS-IP-MULTI-NIC -s 10.244.0.3/32 -o eth1 -j SNAT --to-source 10.10.10.105
    [1:60] -A OVN-KUBE-EGRESS-IP-MULTI-NIC -s 10.244.1.3/32 -o eth1 -j SNAT --to-source 10.10.10.105
and in netfilter the priority of the NAT POSTROUTING hook is 100 (NF_IP_PRI_NAT_SRC) and not configurable. For NFTables it is the same value, 100, for the NAT POSTROUTING hook; knftables calls it "srcnat". This is also the priority used by the egress service feature, since that is already converted to NFT:
    chain egress-services {
        type nat hook postrouting priority srcnat; policy accept;
        meta mark 0x000003f0 return comment "DoNotSNAT"
        snat ip to ip saddr map @egress-service-snat-v4
        snat ip6 to ip6 saddr map @egress-service-snat-v6
    }
Now that we have converted the POSTROUTING rules for local-gw to NFT as well, those rules also ended up at priority 100. Unlike IPT rules, where we could jump to the EIP and ESVC chains before the masquerade rules got hit, here those chains in NFT are all parallel at the same priority 100 and we don't know which one will be hit first. Hence we need to change the priority of ovn-kube-local-gw-masq so that EIP/ESVC rules are hit before the default masquerade rules. Without this change EIP/ESVC tests fail in CI.
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Prior to this change, the remote PMTUD address sets only considered the primary IP of each node. That may have been OK for the PMTUD use case, but now that BGP reuses this address set in NFT we need to consider all the IPs on the remote nodes. So this commit changes the code from using node internal IPs to using the HostCIDRs annotation. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
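A minimal sketch of turning such an annotation into a list of node IPs. It assumes the annotation value is a JSON array of CIDR strings; that format, the helper name `hostIPs`, and the sample value are assumptions for illustration and are not confirmed by this commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// hostIPs extracts the host address of every CIDR in the annotation
// value. Assumption: the value is a JSON array of CIDR strings.
func hostIPs(annotation string) ([]string, error) {
	var cidrs []string
	if err := json.Unmarshal([]byte(annotation), &cidrs); err != nil {
		return nil, fmt.Errorf("cannot parse host CIDRs: %w", err)
	}
	ips := make([]string, 0, len(cidrs))
	for _, c := range cidrs {
		ip, _, err := net.ParseCIDR(c)
		if err != nil {
			return nil, fmt.Errorf("invalid CIDR %q: %w", c, err)
		}
		ips = append(ips, ip.String())
	}
	return ips, nil
}

func main() {
	// Hypothetical annotation value with a primary and a secondary IP.
	ips, err := hostIPs(`["172.18.0.3/16","10.10.10.3/24","fc00:f853:ccd:e793::3/64"]`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ips)
}
```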

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

When using the onModelUpdatesAllNonDefault() from NAT updates, it wasn't updating match value when we wanted to reset it. So when we went from advertised network to non-advertised network, we were not changing the SNAT match and hence traffic was still going out with podIP instead of nodeIP. This commit fixes that. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
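The class of bug described here can be shown with a toy update routine that copies only non-default (non-zero) fields: an empty match can never overwrite a stored one. The types and functions below are hypothetical and much simpler than the real libovsdb model code.

```go
package main

import "fmt"

// nat is a hypothetical, reduced NAT row.
type nat struct {
	ExternalIP string
	Match      string
}

// applyNonDefault copies only fields that are non-empty in src, so an
// empty Match in src leaves the stored Match untouched.
func applyNonDefault(dst *nat, src nat) {
	if src.ExternalIP != "" {
		dst.ExternalIP = src.ExternalIP
	}
	if src.Match != "" {
		dst.Match = src.Match
	}
}

// applyIncludingMatch additionally always copies Match, which makes
// resetting it to the empty string possible.
func applyIncludingMatch(dst *nat, src nat) {
	applyNonDefault(dst, src)
	dst.Match = src.Match
}

func main() {
	stored := nat{ExternalIP: "169.254.0.37", Match: "ip4.dst == $a712973235162149816"}
	desired := nat{ExternalIP: "169.254.0.37"} // network no longer advertised

	before := stored
	applyNonDefault(&before, desired)
	fmt.Printf("non-default only: %q\n", before.Match) // stale match survives

	after := stored
	applyIncludingMatch(&after, desired)
	fmt.Printf("including match:  %q\n", after.Match) // match is reset
}
```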

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

See ovn-kubernetes/ovn-kubernetes#5419 for details. The traffic flow looks like this for Layer3 (v4 and v6) and Layer2 (v4):
    pod in UDN A -> sameNodeIP:NodePort i.e 172.18.0.2:30724
    pod (102.102.2.4) -> ovn-switch -> ovn-cluster-router (SNAT to masqueradeIP 169.254.0.14) -> LRP send to mpX -> in the host (IPTable DNAT from nodePort to clusterIP 10.96.96.233:8080) send to breth0
    breth0 flows reroute packet to UDN B's patchport
    hits the GR of UDN B and DNATs from clusterIP to backend pod that lives on another node (103.103.1.5); at the same time SNAT to joinIP in OVN router i.e 100.65.0.4
    response comes back from remote pod and then we see ARP requests trying to understand how to reach the masqueradeIP of the other network, which makes total sense
    so the reply fails: NetworkB doesn't know how to reach back to NetworkA's masqueradeIP, which is the srcIP
    root@ovn-control-plane:/# tcpdump -i any -nneev port 36363 or port 30724 or host 102.102.2.4 or host 169.254.0.14 or host 100.65.0.4
    tcpdump: data link type LINUX_SLL2
    tcpdump: listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
    08:55:14.083364 865a53b516350_3 P ifindex 19 0a:58:66:66:02:04 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 64, id 53100, offset 0, flags [DF], proto TCP (6), length 60)
        102.102.2.4.42720 > 172.18.0.2.30724: Flags [S], cksum 0x14ad (incorrect -> 0x5e6c), seq 432663101, win 65280, options [mss 1360,sackOK,TS val 1239378349 ecr 0,nop,wscale 7], length 0
    08:55:14.084049 ovn-k8s-mp2 In ifindex 14 0a:58:66:66:02:01 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 63, id 53100, offset 0, flags [DF], proto TCP (6), length 60)
        169.254.0.14.42826 > 172.18.0.2.30724: Flags [S], cksum 0x1c60 (correct), seq 432663101, win 65280, options [mss 1360,sackOK,TS val 1239378349 ecr 0,nop,wscale 7], length 0
    08:55:14.084069 breth0 Out ifindex 6 6a:ed:17:fb:28:bd ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 62, id 53100, offset 0, flags [DF], proto TCP (6), length 60)
        169.254.0.14.42826 > 10.96.96.233.8080: Flags [S], cksum 0xb59f (correct), seq 432663101, win 65280, options [mss 1360,sackOK,TS val 1239378349 ecr 0,nop,wscale 7], length 0
    08:55:14.084470 genev_sys_6081 Out ifindex 7 0a:58:64:58:00:04 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 60, id 53100, offset 0, flags [DF], proto TCP (6), length 60)
        100.65.0.4.42826 > 103.103.1.5.8080: Flags [S], cksum 0xfe43 (correct), seq 432663101, win 65280, options [mss 1360,sackOK,TS val 1239378349 ecr 0,nop,wscale 7], length 0
    08:55:14.085494 genev_sys_6081 P ifindex 7 0a:58:64:58:00:02 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 60)
        103.103.1.5.8080 > 100.65.0.4.42826: Flags [S.], cksum 0x1f4f (correct), seq 3390013464, ack 432663102, win 64704, options [mss 1360,sackOK,TS val 1866737591 ecr 1239378349,nop,wscale 7], length 0
    08:55:14.086130 eth0 Out ifindex 2 6a:ed:17:fb:28:bd ethertype ARP (0x0806), length 48: Ethernet (len 6), IPv4 (len 4), Request who-has 169.254.0.14 tell 169.254.0.15, length 28
    08:55:14.086172 breth0 B ifindex 6 6a:ed:17:fb:28:bd ethertype ARP (0x0806), length 48: Ethernet (len 6), IPv4 (len 4), Request who-has 169.254.0.14 tell 169.254.0.15, length 28
    08:55:15.100558 genev_sys_6081 P ifindex 7 0a:58:64:58:00:02 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 60)
        103.103.1.5.8080 > 100.65.0.4.42826: Flags [S.], cksum 0xccdf (incorrect -> 0x1b57), seq 3390013464, ack 432663102, win 64704, options [mss 1360,sackOK,TS val 1866738607 ecr 1239378349,nop,wscale 7], length 0
    08:55:15.101090 eth0 Out ifindex 2 6a:ed:17:fb:28:bd ethertype ARP (0x0806), length 48: Ethernet (len 6), IPv4 (len 4), Request who-has 169.254.0.14 tell 169.254.0.15, length 28
    08:55:15.101124 breth0 B ifindex 6 6a:ed:17:fb:28:bd ethertype ARP (0x0806), length 48: Ethernet (len 6), IPv4 (len 4), Request who-has 169.254.0.14 tell 169.254.0.15, length 28
The above is the same for Layer3 v6 and for Layer2 v4, but Layer2 v6 is weird thanks to these two flows on breth0:
    // cookie=0xdeff105, duration=173.245s, table=1, n_packets=0, n_bytes=0, idle_age=173, priority=14,icmp6,icmp_type=134 actions=FLOOD
    // cookie=0xdeff105, duration=173.245s, table=1, n_packets=8, n_bytes=640, idle_age=4, priority=14,icmp6,icmp_type=136 actions=FLOOD
These seem to be flooding the NDP requests between the GRs of all networks somehow, and v6 works. So the test acknowledges this inconsistency and calls it out.
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Bump OVN to 25.03

Makes this EMEA/US friendly. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

The "pre assigned port net ids" feature requires a NAD for the `default` network to be provisioned. This commit pre-provisions that NAD whenever the feature - EnableCustomNetworkConfig - is enabled, upon starting the cluster manager. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

Change OVN-Kubernetes community meeting time

…-vms-with-ip-requests udn, pre assigned port net ids: provision the default net NAD CR

Signed-off-by: Dave Tucker <dave@dtucker.co.uk>

Signed-off-by: arkadeepsen <arsen@redhat.com>

UDN Isolation with BGP: Remove support for receiving advertised routes from remote nodes

chore: Update libovsdb bindings to ovn 25.03

Signed-off-by: nithyar <nithyar@nvidia.com>

Bump ubuntu version to 25.04

…TL expiry Signed-off-by: arkadeepsen <arsen@redhat.com>

This commit fixes several issues when allocating the management port with a SR-IOV resource instead of hard coding the SR-IOV netdev name. Such issues come from the fact that the gateway and management port interfaces need to be present during Init. Otherwise other parts of the code will not have the proper interface. Signed-off-by: William Zhao <wizhao@redhat.com> Co-Authored-By: Igal Tsoiref <itsoiref@redhat.com>

Just as we currently do with traffic towards nodes. Specifically this allows for networks advertised with a VRF-Lite configuration with a subnet overlap to reach these services. Otherwise the return path could hit an ip rule corresponding to a different advertised network forwarding it to an inappropriate destination. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>

…default VRF" This reverts commit 1ea2739. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>

Adds logging for informer waitForCacheSync

Fix mgmt port allocation with VFs provided as resource for DPU

…-set Fix dnsnameresolver address set

SNAT traffic from advertised UDNs towards UDN enabled default services

This annotation was always evaluated based on node-id, and not actually allocated. Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>

This helps to avoid circular dependency with generator/udn Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>

Some tests were making up random JoinIPs and it worked because they were annotated from the test. Now that the IPs come from the real JoinSubnet configuration for both the default network and UDNs, some IPs have to change.
Default network join subnets: "100.64.0.0/16", "fd98::/64"
UDN default join subnets: "100.65.0.0/16", "fd99::/64"
A node will have the same IP "index" in all join subnets (that means node1 can't have 100.64.0.1 in the default network but 100.65.0.2 in the UDN; it has to be 100.65.0.1).
Delete cluster manager tests for stale annotation.
Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
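The "same index in every join subnet" rule can be pictured as offsetting each subnet's base address by a per-node index. The sketch below assumes the index is a plain offset from the network address; the actual allocator may derive it differently, and `joinIP` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/netip"
)

// joinIP returns the address at the given offset from the subnet's
// network address. Assumption: a node's join IP is base + index.
func joinIP(subnet string, index int) (netip.Addr, error) {
	prefix, err := netip.ParsePrefix(subnet)
	if err != nil {
		return netip.Addr{}, fmt.Errorf("invalid join subnet %q: %w", subnet, err)
	}
	addr := prefix.Masked().Addr()
	for i := 0; i < index; i++ {
		addr = addr.Next()
	}
	if !addr.IsValid() || !prefix.Contains(addr) {
		return netip.Addr{}, fmt.Errorf("index %d is outside %s", index, subnet)
	}
	return addr, nil
}

func main() {
	// Join subnets quoted in the commit message.
	subnets := []string{"100.64.0.0/16", "fd98::/64", "100.65.0.0/16", "fd99::/64"}
	for _, s := range subnets {
		ip, err := joinIP(s, 1) // same index for one node in every subnet
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-14s -> %s\n", s, ip)
	}
}
```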

Interconnect resources (transit switch-related ports and remote static routes) were not created before because the node-id annotation wasn't set, and the interconnect handler was simply failing. Now that node-id is set for GW router IP allocation, the resources are successfully created. Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>

Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>

To only execute it once, run for default network NodeAllocator only, since it is always created. Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>

Remove OVNNodeGRLRPAddrs annotation

openshift-pr-manager · 2025-08-16T15:47:55Z

/ok-to-test
/payload 4.20 ci blocking
/payload 4.20 nightly blocking

openshift-pr-manager · 2025-08-16T15:47:56Z

/hold
needs conflict resolution

openshift-ci-robot · 2025-08-16T15:47:57Z

@openshift-pr-manager[bot]: This pull request explicitly references no jira issue.

In response to this:

Automated merge of upstream/master → master.

openshift-ci · 2025-08-16T15:48:27Z

@openshift-pr-manager[bot]: user openshift-pr-manager[bot] is not trusted for pull request #2722

openshift-ci · 2025-08-16T15:48:41Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: openshift-pr-manager[bot]
Once this PR has been reviewed and has the lgtm label, please assign kyrtapz for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci · 2025-08-16T16:25:48Z

@openshift-pr-manager[bot]: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/4.20-upgrade-from-stable-4.19-images	`03c55f7`	link	true	`/test 4.20-upgrade-from-stable-4.19-images`
ci/prow/e2e-metal-ipi-ovn-ipv6	`03c55f7`	link	true	`/test e2e-metal-ipi-ovn-ipv6`
ci/prow/e2e-aws-ovn-edge-zones	`03c55f7`	link	true	`/test e2e-aws-ovn-edge-zones`
ci/prow/e2e-vsphere-ovn	`03c55f7`	link	false	`/test e2e-vsphere-ovn`
ci/prow/e2e-aws-ovn-windows	`03c55f7`	link	true	`/test e2e-aws-ovn-windows`
ci/prow/e2e-vsphere-ovn-techpreview	`03c55f7`	link	false	`/test e2e-vsphere-ovn-techpreview`
ci/prow/lint	`03c55f7`	link	true	`/test lint`
ci/prow/e2e-metal-ipi-ovn-dualstack-bgp	`03c55f7`	link	true	`/test e2e-metal-ipi-ovn-dualstack-bgp`
ci/prow/e2e-aws-ovn-shared-to-local-gateway-mode-migration	`03c55f7`	link	true	`/test e2e-aws-ovn-shared-to-local-gateway-mode-migration`
ci/prow/okd-scos-images	`03c55f7`	link	true	`/test okd-scos-images`
ci/prow/images	`03c55f7`	link	true	`/test images`
ci/prow/e2e-metal-ipi-ovn-techpreview	`03c55f7`	link	false	`/test e2e-metal-ipi-ovn-techpreview`
ci/prow/e2e-aws-ovn-serial	`03c55f7`	link	true	`/test e2e-aws-ovn-serial`
ci/prow/okd-scos-e2e-aws-ovn	`03c55f7`	link	false	`/test okd-scos-e2e-aws-ovn`
ci/prow/e2e-metal-ipi-ovn-ipv6-techpreview	`03c55f7`	link	false	`/test e2e-metal-ipi-ovn-ipv6-techpreview`
ci/prow/e2e-metal-ipi-ovn-dualstack-bgp-local-gw	`03c55f7`	link	true	`/test e2e-metal-ipi-ovn-dualstack-bgp-local-gw`
ci/prow/e2e-ovn-hybrid-step-registry	`03c55f7`	link	false	`/test e2e-ovn-hybrid-step-registry`
ci/prow/e2e-aws-ovn-local-gateway	`03c55f7`	link	true	`/test e2e-aws-ovn-local-gateway`
ci/prow/e2e-azure-ovn	`03c55f7`	link	false	`/test e2e-azure-ovn`
ci/prow/e2e-aws-ovn-serial-ipsec	`03c55f7`	link	false	`/test e2e-aws-ovn-serial-ipsec`
ci/prow/e2e-aws-ovn-single-node-techpreview	`03c55f7`	link	false	`/test e2e-aws-ovn-single-node-techpreview`
ci/prow/e2e-metal-ipi-ovn-dualstack-techpreview	`03c55f7`	link	false	`/test e2e-metal-ipi-ovn-dualstack-techpreview`
ci/prow/security	`03c55f7`	link	false	`/test security`
ci/prow/qe-perfscale-aws-ovn-small-udn-density-l3	`03c55f7`	link	false	`/test qe-perfscale-aws-ovn-small-udn-density-l3`
ci/prow/4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade	`03c55f7`	link	true	`/test 4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade`
ci/prow/e2e-aws-ovn-hypershift-kubevirt	`03c55f7`	link	false	`/test e2e-aws-ovn-hypershift-kubevirt`
ci/prow/qe-perfscale-aws-ovn-small-udn-density-churn-l3	`03c55f7`	link	false	`/test qe-perfscale-aws-ovn-small-udn-density-churn-l3`
ci/prow/e2e-aws-ovn-upgrade-ipsec	`03c55f7`	link	false	`/test e2e-aws-ovn-upgrade-ipsec`
ci/prow/e2e-aws-ovn-upgrade-local-gateway	`03c55f7`	link	true	`/test e2e-aws-ovn-upgrade-local-gateway`
ci/prow/gofmt	`03c55f7`	link	true	`/test gofmt`
ci/prow/e2e-aws-ovn-upgrade	`03c55f7`	link	true	`/test e2e-aws-ovn-upgrade`
ci/prow/e2e-aws-ovn-hypershift	`03c55f7`	link	true	`/test e2e-aws-ovn-hypershift`
ci/prow/e2e-openstack-ovn	`03c55f7`	link	false	`/test e2e-openstack-ovn`
ci/prow/qe-perfscale-payload-control-plane-6nodes	`03c55f7`	link	true	`/test qe-perfscale-payload-control-plane-6nodes`
ci/prow/openshift-e2e-gcp-ovn-techpreview-upgrade	`03c55f7`	link	false	`/test openshift-e2e-gcp-ovn-techpreview-upgrade`
ci/prow/e2e-azure-ovn-techpreview	`03c55f7`	link	false	`/test e2e-azure-ovn-techpreview`
ci/prow/e2e-metal-ipi-ovn-dualstack-local-gateway-techpreview	`03c55f7`	link	false	`/test e2e-metal-ipi-ovn-dualstack-local-gateway-techpreview`
ci/prow/4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade-ipsec	`03c55f7`	link	false	`/test 4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade-ipsec`
ci/prow/e2e-gcp-ovn-techpreview	`03c55f7`	link	true	`/test e2e-gcp-ovn-techpreview`
ci/prow/e2e-aws-ovn-hypershift-conformance-techpreview	`03c55f7`	link	false	`/test e2e-aws-ovn-hypershift-conformance-techpreview`
ci/prow/e2e-metal-ipi-ovn-dualstack	`03c55f7`	link	true	`/test e2e-metal-ipi-ovn-dualstack`
ci/prow/e2e-aws-ovn	`03c55f7`	link	true	`/test e2e-aws-ovn`
ci/prow/4.20-upgrade-from-stable-4.19-e2e-gcp-ovn-rt-upgrade	`03c55f7`	link	true	`/test 4.20-upgrade-from-stable-4.19-e2e-gcp-ovn-rt-upgrade`
ci/prow/e2e-azure-ovn-upgrade	`03c55f7`	link	true	`/test e2e-azure-ovn-upgrade`
ci/prow/e2e-gcp-ovn	`03c55f7`	link	true	`/test e2e-gcp-ovn`
ci/prow/e2e-aws-ovn-local-to-shared-gateway-mode-migration	`03c55f7`	link	true	`/test e2e-aws-ovn-local-to-shared-gateway-mode-migration`
ci/prow/e2e-aws-ovn-techpreview	`03c55f7`	link	false	`/test e2e-aws-ovn-techpreview`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

jluhrsen · 2025-08-18T16:32:38Z

/close
going with #2724 which was needed to fix a merge conflict

openshift-ci · 2025-08-18T16:32:48Z

@jluhrsen: Closed this PR.

flavio-fernandes and others added 30 commits June 30, 2025 05:01

rename/reuse pmtud nft sets to remote-node-ips

a67872d

let's reuse the pmtud address-set ips of the remote nodes ips also for bgp advertised networks cSNAT Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

cleanupStalePodSNATs: Don't blow all SNATs for advertised Networks

0635cae

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Bump OVN to 25.03

bcd0656

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Merge pull request #5420 from tssurya/bump-ovn-25.03

515b984

Bump OVN to 25.03

Change OVN-Kubernetes community meeting time

9b21fc0

Makes this EMEA/US friendly. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Merge pull request #5424 from tssurya/change-ovnk-upstream-meeting-time

fd29332

Change OVN-Kubernetes community meeting time

Merge pull request #5320 from maiqueb/create-default-net-nad-creating-vms-with-ip-requests

ee8088c

udn, pre assigned port net ids: provision the default net NAD CR

chore: Update libovsdb bindings to ovn 25.03

b85c0f5

Signed-off-by: Dave Tucker <dave@dtucker.co.uk>

dnsnameresolver: fix ever growing address set

575a08c

Signed-off-by: arkadeepsen <arsen@redhat.com>

dnsnameresolver: add unit test for DNSNameResolver resource update

4780a5e

Signed-off-by: arkadeepsen <arsen@redhat.com>

Merge pull request #5140 from tssurya/bgp-isolation-part1

0a88ff7

UDN Isolation with BGP: Remove support for receiving advertised routes from remote nodes

Merge pull request #5432 from dave-tucker/bindings-up

d82b233

chore: Update libovsdb bindings to ovn 25.03

Bump ubuntu to 25.04

03ccdf9

Signed-off-by: nithyar <nithyar@nvidia.com>

Merge pull request #5427 from crnithya/ubuntu_25_04

6082160

Bump ubuntu version to 25.04

dnsnameresolver: add e2e test to verify connectivity after DNS name TTL expiry

6241b27

Signed-off-by: arkadeepsen <arsen@redhat.com>

wizhaoredhat and others added 15 commits August 13, 2025 15:12

Reapply "Add the IP rule for a UDN only when it is advertised to the default VRF"

7db6c99

This reverts commit 1ea2739. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>

Merge pull request #5051 from trozet/print_informer_sync_time

75730a4

Adds logging for informer waitForCacheSync

Merge pull request #5481 from wizhaoredhat/fix_dpu_host

30176f6

Fix mgmt port allocation with VFs provided as resource for DPU

Merge pull request #5429 from arkadeepsen/fix-dnsnameresolver-address-set

5155c91

Fix dnsnameresolver address set

Merge pull request #5463 from jcaamano/udn-snat-kapi-dns

da01d12

SNAT traffic from advertised UDNs towards UDN enabled default services

[node/anno] Stop using OVNNodeGRLRPAddrs annotation.

6cbf833

This annotation was always evaluated based on node-id, and not actually allocated. Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>

[podannotation] Move AddRoutesGatewayIP to the allocator/pod.

f0dd327

This helps to avoid circular dependency with generator/udn Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>

[e2e fix] Parse JoinIPs from the NAD spec instead of annotation.

0bbfdd8

Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>

[node_allocator] Add stale annotation cleanup.

e4d2b28

To only execute it once, run for default network NodeAllocator only, since it is always created. Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>

Merge pull request #5396 from npinaeva/remove-joinip-anno

4a06d4b

Remove OVNNodeGRLRPAddrs annotation

Merge upstream/master into master with conflicts (08-16-2025)

03c55f7

openshift-pr-manager bot changed the title from ~~NO-JIRA: DownStream Merge [08-16-2025]~~ to MERGE CONFLICT! NO-JIRA: DownStream Merge [08-16-2025] on Aug 16, 2025

openshift-ci-robot added the `jira/valid-reference` label (indicates that this PR references a valid Jira ticket of any type) on Aug 16, 2025

openshift-ci bot added the `do-not-merge/hold` label (indicates that a PR should not merge because someone has issued a /hold command) on Aug 16, 2025

openshift-ci bot added the `ok-to-test` label (indicates a non-member PR verified by an org member that is safe to test) on Aug 16, 2025

openshift-ci bot requested review from martinkennelly and tssurya August 16, 2025 15:48

openshift-ci bot closed this Aug 18, 2025

openshift-pr-manager[bot] wants to merge 116 commits into master from d/s-merge-08-16-2025

20 participants
