OCPBUGS-59645: [release-4.19] BGP related backports#2748
OCPBUGS-59645: [release-4.19] BGP related backports#2748openshift-merge-bot[bot] merged 9 commits intoopenshift:release-4.19from
Conversation
Some code (EgressSVC and BGP) in cluster-manager needs to know the gateway mode. Signed-off-by: Peng Liu <pliu@redhat.com> (cherry picked from commit bc5f08e)
frr-k8s will not listen to port 179, and BGP peering can only be established from OCP to external. Signed-off-by: Konstantinos Karampogias <karampok@gmail.com> (cherry picked from commit 401f7b4)
This should save up calls to the webhook and reduce latency when a NAD is updated to add labels or annotations. The root cause of the change is the OVNK BGP feature: when BGP is enabled for the cluster default network, reconfiguration might cause temporary disruptions. As part of this reconfiguration and necessary to complete it, OVNK depends on annotating an internal NAD. We want to avoid having to reach the webhook for this annotation because the temporary disruption might prevent it and in that case the reconfiguration won't complete. Another possibility would be to filter out from validation the specific internal NAD but this current approach might be more beneficial overall. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com> (cherry picked from commit 0dae0a0)
| ports: | ||
| - containerPort: 7572 | ||
| name: monitoring | ||
| name: frr-k8s-webhook-server |
There was a problem hiding this comment.
I think we'd still need - --webhook-port=9443 here as we are hostnetworking the pod and it needs to be a reserved one (for which I still have openshift/enhancements#1815 dangling)
There was a problem hiding this comment.
9123 you mean?
Also, you are saying passing --webhook-port arg would work for this PR?
There was a problem hiding this comment.
yup, it's supported by 4.19: https://github.com/openshift/frr/blob/release-4.19/cmd/main.go#L96
ead0114 to
4eea398
Compare
Instead of relying on the metrics port to understand when the webhook is ready, we try to hit the endpoint of the webhook itself, so the webhook pod is going to be ready only when it is effectively able to serve. This is aligned with the frr-k8s binary change where we add a new endpoint "healtz" to the webhook server. Also, pinning the webhook port parameter and not relying on the default, making the change more convenient. Signed-off-by: Federico Paolinelli <fpaoline@redhat.com> (cherry picked from commit acdd043)
Aligning to upstream and: - move the webhook deployment to hostnetworked, so that the api can still be served if an offending FRRConfiguration is applied - openshift only: change the webhook port to one in the allowed range - remove the metrics listening port, as no service monitor was deployed Signed-off-by: Federico Paolinelli <fpaoline@redhat.com> (cherry picked from commit 0b29885)
Aligning to upstream and bringing the deprecation of the disableMP flag and the introduction of a "dualStackAddressFamily" flag to bring the behavior back to allow backward compatibility. The default behavior was inconsistent in case of dual stack clusters, as frr was being configured to advertise both ip families over a single session, without being instructed properly to what next hop set for the ip family not corresponding to the ip family of the session. The dualStackAddressFamily flag is introduced to allow users relying on that behavior to keep working. Note that both flags are not documented nor supported d/s. More details in the upstream metallb issue metallb/metallb#2704 Signed-off-by: Federico Paolinelli <fpaoline@redhat.com> (cherry picked from commit 7056e67)
We need to run the webhook as hostnetworked, so it requires the privileged scc. Signed-off-by: Federico Paolinelli <fpaoline@redhat.com> (cherry picked from commit 5cc28c6)
4eea398 to
28d53ad
Compare
|
fyi there is this #2732 (not sure if we should close it or merge it first) |
We need this to be able to backport [1] after which we don't need it anymore. 1 metallb/frr-k8s@963d15c Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Thanks for letting me know. Let's keep both open but give this one preference. If we hit a blocker here we can merge the other one. |
|
/test ? |
|
@jcaamano: The following commands are available to trigger required jobs: The following commands are available to trigger optional jobs: Use DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/test e2e-metal-ipi-ovn-dualstack-bgp-local-gw-techpreview |
|
/test e2e-metal-ipi-ovn-dualstack-bgp-local-gw-techpreview |
…enabled This is necessary to enable assymetric traffic for advertised Layer2 networks running in local gateway mode. Signed-off-by: Patryk Diak <pdiak@redhat.com> (cherry picked from commit 942c9c2)
|
/retitle OCPBUGS-59643: [release-4.19] BGP related backports |
|
@jcaamano: This pull request references Jira Issue OCPBUGS-59643, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/retitle OCPBUGS-59645: [release-4.19] BGP related backports |
|
@jcaamano: This pull request references Jira Issue OCPBUGS-59645, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/hold cancel |
|
/retest |
|
/test hypershift-e2e-aks |
2 similar comments
|
/test hypershift-e2e-aks |
|
/test hypershift-e2e-aks |
|
/test hypershift-e2e-aks |
| - --namespace=$(NAMESPACE) | ||
| - --metrics-bind-address=:7572 | ||
| - --metrics-bind-address=0 | ||
| - --webhook-port=9123 |
There was a problem hiding this comment.
@fedepaol @jcaamano don't we need to add this here? https://github.com/openshift/enhancements/blob/master/dev-guide/host-port-registry.md
There was a problem hiding this comment.
yup, I have a dangling pr here, see my comment #2748 (comment)
Not sure who do I need to ask for merge though
There was a problem hiding this comment.
🤦 sorry, completely missed that.
|
/retest |
|
@jcaamano: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/test e2e-aws-ovn-single-node |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jcaamano, kyrtapz The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/label backport-risk-assessed |
fd80a4a
into
openshift:release-4.19
|
@jcaamano: Jira Issue OCPBUGS-59645: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-59645 has been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[ART PR BUILD NOTIFIER] Distgit: cluster-network-operator |
|
Fix included in accepted release 4.19.0-0.nightly-2025-07-28-070511 |
cherry-picked
/hold
waiting for #2714 adn/or #2735 unless we decide otherwise