[wip] lint fix by kyrtapz · Pull Request #2787 · openshift/ovn-kubernetes

kyrtapz · 2025-10-08T07:19:01Z

No description provided.

Adds check to ensure that the status on an EIP matches the IP we have allocated in our cache. If it doesn't consider the status invalid and attempt reallocation. There is a race exposed by the unit test "should update invalid assignments on duplicated node", where an EIP has 2 IPs assigned to node1, and a bogus IP assigned to node2. The test expects that EIP should rebalance the EIPs with one on node 1 and one on node 2. However a race can happen where EIP node allocator cache becomes corrupt due to lack of validation of status. This can happen: 1. EgressIP has node1: EIP1, EIP2 2. Test starts WatchEgressIP 3. EIP->Sync->Annotates 50000 mark on EIP 4. This triggers an update event with 50000 mark, node1: EIP1, EIP2 5. EIP initial add starts with original event of node1: EIP1, EIP2 6. EIP rebalances one EIP per node, patches EIP object, creates update event with 50000 mark, node1: EIP1, node2: EIP2 7. Next update event processed -> RACE happens, retry framework will grab "latest" object. If the informer was lagging, informer cache has object from step 4 8. EIP validation logic considers this a pass, because cache has the both nodes, each with 1 egress IP. No patch is done to the EIP object, but when validStatus is achieved the eNode cache is updated 9. enode cache is updated so now node1 has 2 allocations for EIP 10. Update event from 6 is processed. Validation state fails because node1 has 2 EIPs. During rebalancing it wont be able to find a suitable node for the EIP due to cache corruption, so nothing will be updated. Signed-off-by: Tim Rozet <trozet@nvidia.com>

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Add OKEP for `ovn-kubernetes-mcp` repo

Enhances EIP status validation

Fixes: #5552 Signed-off-by: Tim Rozet <trozet@nvidia.com>

For backwards compatibility reasons, we cannot update the definition of all the attributes that use the CIDR type - only of the new, unreleased objects. Hence, this check is only performed for the `reservedSubnets` and `infrastructureSubnets` attributes. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

The infrastructure IPv4 subnet of the L2 UDN was mistakenly not masked. This commit changes that, effectively masking the provided infrastructure CIDR. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

OKEP-5552: Add support for dynamic UDN allocation per node

Signed-off-by: Yun Zhou <yunz@nvidia.com>

Since commit fa282e7, we test that the allocator cache matches the status of an EgressIP. This change updates both IPv4 and dual-stack test variants to properly validate cache-based stability. Each test now initializes the allocator cache with assignments matching the EgressIP status, then verifies neither the cache nor status change after running WatchEgressIP. A new egressIPsMatch helper performs comparison of EgressIP specs and status items. Signed-off-by: Andreas Karis <ak.karis@gmail.com>

These tests assert that unmasked reserved / infrastructure subnets are caught (both IPv4 and IPv6). We need to skip the test when it tries to run on an unsupported platform, otherwise, it will not fail as expected. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

configure IP/VRF rules only on full/dpu-host mode, and confiugre openflow rules only on full/dpu mode. Signed-off-by: Yun Zhou <yunz@nvidia.com>

EgressIP: Unit test IPv4 should not update valid assignment fix

api, udn: subnets must be masked

Fixes OF flows to be VLAN aware as well as add DPU checks for certain features

The k8s e2e utility functions AddOrUpdateLabelOnNode/RemoveLabelOffNode don't work for labels without a value. The incorrect handling of these labels caused an incorrect sequence of nodes whem migrating different than what the tests intended to test. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>

There's two circumstances when IPs were being released incorrectly: * when a live migratable pod completed with no migration ongoing it was not being released due to IsMigratedSourcePodStale outright assuming a completed pod was stale. * when a live migratable pod completed on a different node than the VM's original as part of a migration it was being released when it shouldn't, we were simply not checking if it was a migration. It also improves the tests to check for IP release. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>

Don't attempt to release IPs that are not managed by the local zone which can happen with live migratable pods, otherwise we would get distracting error logs on release. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>

ConditionalIPRelease would always return false when checking IPs not tracked in the local zone so in that case we were not correctly checking for colliding pods. This was hidden by the fact that IsMigratedSourcePodStale was used just before instead of AllVMPodsAreCompleted until a very recent fix and that would always return false for a completed live migratable pod. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>

Or completion of a failed target pod Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>

As it is the most complex scenario and a superset of testing without it Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>

new target lint-run-natively will now pull the golangci-lint binary based on the version in lint.sh locally and run the lint check. This is useful in the case that some system does not have a container runtime available. Signed-off-by: Jamo Luhrsen <jluhrsen@gmail.com>

openshift-ci · 2025-10-08T07:19:06Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

kyrtapz · 2025-10-08T07:19:12Z

/test lint

openshift-ci · 2025-10-08T07:19:16Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kyrtapz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [kyrtapz]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci · 2025-10-08T07:20:55Z

@kyrtapz: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/lint	`8e457b1`	link	true	`/test lint`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

trozet and others added 24 commits September 15, 2025 15:35

Add OKEP for ovn-kubernetes-mcp repo

e384b37

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Add OKEP to website

026546f

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Merge pull request #5496 from tssurya/okep-mcp-server-troubleshooter

f5931bf

Add OKEP for `ovn-kubernetes-mcp` repo

Merge pull request #5577 from trozet/fix_egress_ip_status_handling

b20fb84

Enhances EIP status validation

OKEP-5552: Add support for UDN node selector

3b22f41

Fixes: #5552 Signed-off-by: Tim Rozet <trozet@nvidia.com>

api, l2 udn, tests: mask the subnet of the preconfigured UDN

3450b35

The infrastructure IPv4 subnet of the L2 UDN was mistakenly not masked. This commit changes that, effectively masking the provided infrastructure CIDR. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

Merge pull request #5553 from trozet/udn_node_selector

2de4d0f

OKEP-5552: Add support for dynamic UDN allocation per node

udn host->ovn flows needs to be vlan aware

60404e5

Signed-off-by: Yun Zhou <yunz@nvidia.com>

make primary UDN node mode aware

a996442

configure IP/VRF rules only on full/dpu-host mode, and confiugre openflow rules only on full/dpu mode. Signed-off-by: Yun Zhou <yunz@nvidia.com>

Merge pull request #5608 from andreaskaris/issue5607

152d434

EgressIP: Unit test IPv4 should not update valid assignment fix

Merge pull request #5588 from maiqueb/udn-require-masked-subnets

ae113e7

api, udn: subnets must be masked

Merge pull request #5554 from cathy-zhou/upstream_udn_fix

0b248a2

Fixes OF flows to be VLAN aware as well as add DPU checks for certain features

kubevirt: prevent error log on IP release

7a155cc

Don't attempt to release IPs that are not managed by the local zone which can happen with live migratable pods, otherwise we would get distracting error logs on release. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>

kubevirt: test OVN DB after completion of source pod

c1b02b5

Or completion of a failed target pod Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>

kubevirt: test with per-pod SNATs

ef92f78

As it is the most complex scenario and a superset of testing without it Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>

Merge branch 'pr/5617' into lint_fix_ds

8e457b1

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 8, 2025

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 8, 2025

kyrtapz closed this Oct 9, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

[wip] lint fix#2787

[wip] lint fix#2787
kyrtapz wants to merge 24 commits intoopenshift:masterfrom
kyrtapz:lint_fix_ds

kyrtapz commented Oct 8, 2025

Uh oh!

openshift-ci bot commented Oct 8, 2025

Uh oh!

kyrtapz commented Oct 8, 2025

Uh oh!

openshift-ci bot commented Oct 8, 2025

Uh oh!

openshift-ci bot commented Oct 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

Comments

Conversation

kyrtapz commented Oct 8, 2025

Uh oh!

openshift-ci bot commented Oct 8, 2025

Uh oh!

kyrtapz commented Oct 8, 2025

Uh oh!

openshift-ci bot commented Oct 8, 2025

Uh oh!

openshift-ci bot commented Oct 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants