[release-4.18] OCPBUGS-48710, SDN-4930: Downstream Merge [01-28-2025] #2428
Signed-off-by: Flavio Fernandes <ffernandes@nvidia.com>
Signed-off-by: Flavio Fernandes <ffernandes@nvidia.com>
ShallowClone has to copy all factories.
Signed-off-by: Patryk Diak <pdiak@redhat.com>
Commit 6dda0b5 ("factory: Bump the event queue size to 1K.") increased the event queue size to 1K events. However, in combination with fe17136 ("factory: Reduce contention on informer locks.") which configures 201 internal informers this might end up using too much memory in cases when controllers cannot consume events as fast as they're queued by the kube API.

For each kubernetes API object type we consume:
  N_internal_informers x N_queues x N_events x sizeof(event)
memory. That currently translates to:
  N_internal_informers = 201
  N_queues = 15
  N_events = 1000
  sizeof(event) = 32B
  => ~92MB of memory per object type

Given that ovn-kubernetes processes need to be informed about multiple object types this can grow to a significantly large number when controllers that are supposed to consume events from the internal informer queues are slow.

Reduce the queue size, making it 100, in order to lower the worst case scenario memory usage:
  N_internal_informers = 201
  N_queues = 15
  N_events = 100
  sizeof(event) = 32B
  => ~9.2MB of memory per object type

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
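The arithmetic in the commit message above can be checked with a short Go sketch. The function name `worstCaseBytes` is hypothetical and is not part of ovn-kubernetes; the inputs are the figures quoted in the commit message. The "~92MB" and "~9.2MB" figures match when bytes are divided by 1024 x 1024.

```go
package main

import "fmt"

// worstCaseBytes is a hypothetical helper (not an ovn-kubernetes function).
// It returns the upper bound on queued-event memory for one Kubernetes
// object type: informers x queues x events per queue x bytes per event.
func worstCaseBytes(informers, queues, eventsPerQueue, eventSize int64) int64 {
	return informers * queues * eventsPerQueue * eventSize
}

func main() {
	const mib = 1024 * 1024
	// Compare the old queue size (1000) with the reduced one (100),
	// using the values quoted in the commit message.
	for _, queueSize := range []int64{1000, 100} {
		b := worstCaseBytes(201, 15, queueSize, 32)
		fmt.Printf("queue size %4d: %d bytes (~%.1f MiB)\n", queueSize, b, float64(b)/mib)
	}
}
```

Because the bound is linear in the queue size, the 10x smaller queue gives a 10x smaller worst case per object type.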
Signed-off-by: Patryk Diak <pdiak@redhat.com>
Previously, if a new NAD was added to an existing network after a pod referencing it, the pod would never start. This is fixed by reconciling pending pods when the secondary network controller reconciles a new NAD.
Signed-off-by: Patryk Diak <pdiak@redhat.com>
Fixes NPE seen at: openshift#2427 (comment)
Certain network types may not have a pod handler or retry framework for cluster manager.
Signed-off-by: Tim Rozet <trozet@redhat.com>
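The kind of guard this commit message describes might look like the sketch below. It is illustrative only: `networkController`, `retryFramework`, `retryPods`, and `requeuePod` are hypothetical names, not the actual ovn-kubernetes types, and the real fix may be structured differently. The point is that a controller for a network type without a pod handler leaves its retry framework nil, so callers have to check before using it.

```go
package main

import "fmt"

// retryFramework is a hypothetical stand-in for a pod retry queue.
type retryFramework struct{ requeued []string }

func (r *retryFramework) requeue(key string) { r.requeued = append(r.requeued, key) }

// networkController is a hypothetical controller. retryPods stays nil for
// network types that have no pod handler.
type networkController struct {
	name      string
	retryPods *retryFramework
}

// requeuePod requeues a pending pod and reports whether it did so.
// It returns false instead of dereferencing a nil controller or a nil
// retry framework.
func (c *networkController) requeuePod(key string) bool {
	if c == nil || c.retryPods == nil {
		return false
	}
	c.retryPods.requeue(key)
	return true
}

func main() {
	withHandler := &networkController{name: "with-pod-handler", retryPods: &retryFramework{}}
	withoutHandler := &networkController{name: "without-pod-handler"}
	for _, c := range []*networkController{withHandler, withoutHandler} {
		fmt.Printf("%s: requeued=%t\n", c.name, c.requeuePod("default/pending-pod"))
	}
}
```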
@openshift-cherrypick-robot: Ignoring requests to cherry-pick non-bug issues: SDN-4930
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/label backport-risk-assessed
/approve
not worried about these jobs as the failures did not seem related to this PR, but at least one is required and I think the others should pass anyway. let's re-run:
/test 4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-rt-upgrade
two flakes:
/retitle [release-4.18] OCPBUGS-48710, SDN-4930: Downstream Merge [01-28-2025]
@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-48710, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. This pull request references SDN-4930 which is a valid jira issue.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jluhrsen, openshift-cherrypick-robot, trozet
The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/label cherry-pick-approved
/test e2e-aws-ovn-single-node-techpreview
/test e2e-aws-ovn-single-node-techpreview
/hold cancel
@openshift-cherrypick-robot: The following tests failed, say
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Merged commit f3d89b2 into openshift:release-4.18
@openshift-cherrypick-robot: Jira Issue OCPBUGS-48710: All pull requests linked via external trackers have merged:
Jira Issue OCPBUGS-48710 has been moved to the MODIFIED state.
[ART PR BUILD NOTIFIER] Distgit: ovn-kubernetes-base
[ART PR BUILD NOTIFIER] Distgit: ovn-kubernetes-microshift
[ART PR BUILD NOTIFIER] Distgit: ose-ovn-kubernetes
Fix included in accepted release 4.18.0-0.nightly-2025-06-26-034047
Fix included in accepted release 4.18.0-0.nightly-2025-10-23-005402
Fix included in accepted release 4.18.0-0.nightly-2025-12-24-222251
This is an automated cherry-pick of #2427
/assign trozet