[BUG] Node restart breaks CNI when kube-ovn is deployed with Multus #4845

Open

cnvergence opened this issue Dec 17, 2024 · 2 comments

Labels: bug (Something isn't working)

@cnvergence (Contributor)
Kube-OVN Version

latest

Kubernetes Version

v1.29.1

Operation-system/Kernel Version

Ubuntu 22.04

Description

Having deployed kube-ovn with Multus on a cluster configured with multiple external subnets, we sometimes run into an issue after a node restart: Multus with kube-ovn does not start properly, probably due to the race condition described in the referenced Multus issue.

The current workaround is to delete the Multus shim binary on the node, then restart the Multus pod and the ovn-ovs pod.
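
Roughly, the workaround looks like the sketch below. The shim path assumes a default thick-plugin Multus install, and the pod labels and namespace are assumptions; adjust them to your deployment.

```bash
# Assumed paths, labels, and namespace; adjust to your deployment.
NODE=worker-1

# 1. Remove the stale Multus shim binary from the node's CNI bin directory.
ssh "$NODE" sudo rm -f /opt/cni/bin/multus-shim

# 2. Restart the Multus pod running on that node.
kubectl -n kube-system delete pod -l app=multus \
  --field-selector spec.nodeName="$NODE"

# 3. Restart the ovn-ovs pod running on that node.
kubectl -n kube-system delete pod -l app=ovs \
  --field-selector spec.nodeName="$NODE"
```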

Steps To Reproduce

  1. Deploy kube-ovn with Multus on the cluster
  2. Create multiple external networks with NetworkAttachmentDefinitions (see the sketch after this list)
  3. Restart the node
  4. Observe that the ovn-ovs pod fails to start
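
For step 2, each external network is declared along the lines of the sketch below. The attachment name (extnet), master interface (eth1), and namespace are placeholders, and the IPAM socket path assumes a default kube-ovn installation.

```bash
# Illustrative NetworkAttachmentDefinition for one external subnet.
kubectl apply -f - <<'EOF'
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: extnet
  namespace: kube-system
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "macvlan",
      "master": "eth1",
      "mode": "bridge",
      "ipam": {
        "type": "kube-ovn",
        "server_socket": "/run/openvswitch/kube-ovn-daemon.sock",
        "provider": "extnet.kube-system"
      }
    }'
EOF
```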

Current Behavior

A node restart in a cluster with kube-ovn and Multus sometimes results in CNI failure.

Expected Behavior

A node restart should be handled gracefully by the kube-ovn/Multus stack, without breaking CNI.

@cnvergence added the bug (Something isn't working) label on Dec 17, 2024
@cnvergence changed the title from "[BUG]" to "[BUG] Node restart breaks CNI when kube-ovn is deployed with Multus" on Dec 17, 2024
@oilbeater (Collaborator)

@cnvergence, could you provide detailed error logs? The issue you mentioned in Multus should not cause Kube-OVN to fail to start. If Kube-OVN failed to start due to a node restart, we need to investigate if something went wrong on the Kube-OVN side.

@cnvergence (Contributor, Author)

cnvergence commented Dec 19, 2024

@oilbeater It does not happen all the time, only sometimes; I will try to collect the logs.
I wonder if anyone in the community has observed the same issue. Maybe there is a way to mitigate it from the kube-ovn side?
