OCPBUGS-55098: DownStream Merge [06-04-2025] #2618
openshift-merge-bot[bot] merged 10 commits into openshift:master
Related to investigating the root cause for: #5260. This commit stops adding pods that are not scheduled to the retry framework. When the pod is scheduled, the controller will receive an event. Additionally, the functions that add pods were using the kubeclient instead of the informer cache. That means every time a UDN was added, we would issue a kubeclient call to get all pods, which is really bad for performance. Signed-off-by: Tim Rozet <trozet@redhat.com>
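A minimal sketch of the filtering idea described in this commit message, assuming a simplified `pod` type and a `scheduledPods` helper that are both hypothetical and not taken from the ovn-kubernetes code. It only illustrates skipping pods with no node assigned when walking an already-cached pod list, rather than issuing a fresh API call.

```go
package main

// pod is a hypothetical, simplified stand-in for a cached pod object.
type pod struct {
	Name     string
	NodeName string // empty until the scheduler assigns a node
}

// scheduledPods returns only the pods that already have a node assigned.
// Unscheduled pods are skipped: the controller will see an update event
// once they are scheduled, so retrying them early is unnecessary.
func scheduledPods(cached []pod) []pod {
	var out []pod
	for _, p := range cached {
		if p.NodeName == "" {
			continue
		}
		out = append(out, p)
	}
	return out
}
```

In the real controller the input would presumably come from an informer lister instead of a slice literal; that wiring is left out here.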
There was a previous bug where an egress packet SNAT'ed to the node IP, using a nodeport as its source port, would cause reply traffic to be DNAT'ed to the nodeport load balancer. This happened because the egress connections were not conntracked correctly. This was fixed via: https://issues.redhat.com/browse/OCPBUGS-25889 https://issues.redhat.com/browse/FDP-291 However, that fix was not hardware offloadable. The ideal fix here would be to always commit to conntrack and have it be HW offloadable. Until we have a better solution, we can configure the port range for OVN to use for its SNAT. This applies to all SNATs for traffic that enters the local host or leaves the host. The new config option --ephemeral-port-range "<minPort>-<maxPort>" can be used to specify the port range for OVN to use. If not provided, this value is automatically derived from the ephemeral port range in /proc/sys/net/ipv4/ip_local_port_range, which is typically already set to avoid nodeport range conflicts. Signed-off-by: Tim Rozet <trozet@redhat.com>
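The fallback described above (explicit option first, kernel setting otherwise) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the `PortRange` type and the `parsePortRange`, `parseKernelPortRange`, `newPortRange`, and `resolvePortRange` names are hypothetical. Only the `<minPort>-<maxPort>` format and the `/proc/sys/net/ipv4/ip_local_port_range` path come from the commit message.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// PortRange is a hypothetical type used only in this sketch.
type PortRange struct {
	Min, Max int
}

// parsePortRange parses the "<minPort>-<maxPort>" form of the option value.
func parsePortRange(s string) (PortRange, error) {
	lo, hi, ok := strings.Cut(strings.TrimSpace(s), "-")
	if !ok {
		return PortRange{}, fmt.Errorf("invalid port range %q: expected <minPort>-<maxPort>", s)
	}
	return newPortRange(lo, hi)
}

// parseKernelPortRange parses the contents of ip_local_port_range,
// which holds two whitespace-separated integers.
func parseKernelPortRange(contents string) (PortRange, error) {
	fields := strings.Fields(contents)
	if len(fields) != 2 {
		return PortRange{}, fmt.Errorf("unexpected ip_local_port_range contents %q", contents)
	}
	return newPortRange(fields[0], fields[1])
}

// newPortRange converts both bounds and validates them.
func newPortRange(lo, hi string) (PortRange, error) {
	minPort, err := strconv.Atoi(strings.TrimSpace(lo))
	if err != nil {
		return PortRange{}, fmt.Errorf("invalid min port %q: %w", lo, err)
	}
	maxPort, err := strconv.Atoi(strings.TrimSpace(hi))
	if err != nil {
		return PortRange{}, fmt.Errorf("invalid max port %q: %w", hi, err)
	}
	if minPort < 1 || maxPort > 65535 || minPort > maxPort {
		return PortRange{}, fmt.Errorf("port range %d-%d is out of bounds", minPort, maxPort)
	}
	return PortRange{Min: minPort, Max: maxPort}, nil
}

// resolvePortRange prefers the explicit option value and otherwise
// falls back to the kernel's ephemeral port range.
func resolvePortRange(optionValue string) (PortRange, error) {
	if optionValue != "" {
		return parsePortRange(optionValue)
	}
	data, err := os.ReadFile("/proc/sys/net/ipv4/ip_local_port_range")
	if err != nil {
		return PortRange{}, err
	}
	return parseKernelPortRange(string(data))
}
```

Calling `resolvePortRange("")` on a Linux host would read the kernel setting, while passing a value such as `"32768-60999"` bypasses procfs entirely. How the resulting range is handed to OVN is not shown in the commit message and is not sketched here.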
Signed-off-by: Tim Rozet <trozet@redhat.com>
Kubeclient gets for nodes and pods were being used in other places in the code. Removed all of their uses except for specific cases like the ovn db manager and windows, where we do not have full informer setups. Transitioning to the factory created a cyclical dependency between the metrics and factory libraries, due to the configuration duration recorder. Split the configuration duration recorder into its own sub-package under metrics/recorders. Signed-off-by: Tim Rozet <trozet@redhat.com>
Introduced in 836ec36. This would just cause node updates to fire HandleAddUpdateNodeEvent every time, as the code prior to the aforementioned commit would have. Signed-off-by: Tim Rozet <trozet@redhat.com>
We have unit tests that check whether only certain annotations were removed, rather than taking an all-or-nothing approach. Additionally, this function was added as a failsafe in case a user modified the annotations, or some other unforeseen event left the annotations missing. Change the function to check each annotation (if it applies to the allocator). Signed-off-by: Tim Rozet <trozet@redhat.com>
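The per-annotation check described here could look roughly like the sketch below. The `annotationCheck` type, the `missingAnnotations` function, and the idea of an `Applies` flag per key are assumptions made for illustration; the actual function and annotation keys in the commit are not shown in this conversation.

```go
package main

// annotationCheck is a hypothetical pairing of an annotation key with
// whether the allocator is responsible for it in the current configuration.
type annotationCheck struct {
	Key     string
	Applies bool
}

// missingAnnotations checks each applicable annotation individually and
// returns the keys that are absent, instead of treating the set as
// all-or-nothing.
func missingAnnotations(annotations map[string]string, checks []annotationCheck) []string {
	var missing []string
	for _, c := range checks {
		if !c.Applies {
			continue // this allocator does not manage the annotation
		}
		if _, ok := annotations[c.Key]; !ok {
			missing = append(missing, c.Key)
		}
	}
	return missing
}
```

A non-empty result would indicate that the node needs to be reprocessed so the missing annotations can be restored.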
Retry all pods smarter
Fix node update check for network cluster controller
Configures ephemeral port range for OVN SNAT'ing
/test e2e-aws-ovn-fdp-qe
/retitle OCPBUGS-55098: DownStream Merge [06-04-2025]
@jluhrsen: This pull request references Jira Issue OCPBUGS-55098, which is valid. 3 validation(s) were run on this bug.
Requesting review from QA contact:
The bug has been updated to refer to the pull request using the external bug tracker.
cc @trozet
/retest
/test e2e-aws-ovn-fdp-qe
/test e2e-aws-ovn-fdp-qe
@jluhrsen: The following tests failed, say
@jechen0648 looking at the OCP-70667 failure: commit d31d171 adds the port range, and I believe this is causing the failure.
/hold
@asood-rh do we need to get something fixed u/s and update this d/s merge?
@jluhrsen It is just an automation script that needs to be fixed so that it passes. It is not a product bug.
thank you
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jluhrsen, trozet
Merged 36e5a1d into openshift:master
@jluhrsen: Jira Issue OCPBUGS-55098: All pull requests linked via external trackers have merged. Jira Issue OCPBUGS-55098 has been moved to the MODIFIED state.
[ART PR BUILD NOTIFIER] Distgit: ovn-kubernetes-base
[ART PR BUILD NOTIFIER] Distgit: ovn-kubernetes-microshift
[ART PR BUILD NOTIFIER] Distgit: ose-ovn-kubernetes