
Antrea CNI not cleaned up after deletion #1659

Closed
jayunit100 opened this issue Dec 11, 2020 · 2 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/duplicate Indicates an issue is a duplicate of other open issue.

Comments

@jayunit100 (Contributor)

Describe the bug

The kubelet still seems to want to use Antrea as its CNI provider, even after Antrea is removed:

, error: exit status 1, failed to stop running pod ffa31025e18df8891f96a80b26cf800a371e8db798aff19fa24454ee487f2b6d: output: time="2020-12-11T14:09:00-05:00" level=fatal msg="stopping the pod sandbox \"ffa31025e18df8891f96a80b26cf800a371e8db798aff19fa24454ee487f2b6d\" failed: rpc error: code = Unknown desc = failed to destroy network for sandbox \"ffa31025e18df8891f96a80b26cf800a371e8db798aff19fa24454ee487f2b6d\": rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/antrea/cni.sock: connect: connection refused\""

To Reproduce

Uninstall Antrea and then attempt to install a different CNI provider. Containers will land on nodes (nodes continue to be marked as schedulable) even though there's no cni.sock.
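The leftover state can be spot-checked on a node with something like the following (a sketch only; the paths are assumptions based on Antrea's default manifest, not an official diagnostic):

```shell
# Sketch: after uninstalling Antrea, the container runtime may still pick up
# Antrea's CNI config from /etc/cni/net.d while the socket that config points
# at is gone. Paths below are assumptions based on Antrea's default manifest.
ls /etc/cni/net.d/ 2>/dev/null      # a leftover 10-antrea.conflist keeps Antrea "active"
if [ -S /var/run/antrea/cni.sock ]; then
    echo "antrea cni.sock present"
else
    echo "antrea cni.sock missing"  # yet the runtime keeps dialing it
fi
```

If the conflist is listed but the socket is missing, the node is in exactly the state described in the error above.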

Expected

After the Antrea CNI is gone, the kubelet should no longer be aware of any Antrea-related sockets.

(NOTE: This might just be a race condition where you need to wait some time before you can say you've successfully uninstalled the CNI; if so, maybe it's more a kubelet issue than an Antrea one.)

@jayunit100 jayunit100 added the kind/bug Categorizes issue or PR as related to a bug. label Dec 11, 2020
@antoninbas (Contributor)

This is a duplicate of #181, so I will close this issue.
We are aware of this. This is an issue which is common to most (all?) CNIs AFAIK. I do not believe there is a good way to automate this cleanup when you delete Antrea K8s resources. However, we could and should provide a mechanism for cleanup, like other CNIs do, e.g. in the form of a K8s Job.
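Until such a cleanup mechanism exists, the per-node work it would have to do might look roughly like this (a sketch only; every path, file name, and command below is an assumption based on Antrea's default manifest, so verify against your deployment before running anything):

```shell
#!/bin/sh
# Hypothetical per-node Antrea cleanup sketch -- NOT an official procedure.
# All paths and names are assumptions; confirm them for your Antrea version.
rm -f /etc/cni/net.d/10-antrea.conflist   # CNI config the kubelet/runtime reads
rm -rf /var/run/antrea                    # removes the stale cni.sock
rm -f /opt/cni/bin/antrea                 # CNI plugin binary (name assumed)
# Delete the OVS bridge, if the OVS tools are installed on the node
command -v ovs-vsctl >/dev/null 2>&1 && ovs-vsctl --if-exists del-br br-int
# Restart the kubelet so it stops dialing the old socket (systemd hosts)
command -v systemctl >/dev/null 2>&1 && systemctl restart kubelet
echo "antrea node cleanup attempted"
```

After something like this runs on each node, a different CNI should be able to take over, assuming it drops its own conflist into /etc/cni/net.d.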

@antoninbas antoninbas added the triage/duplicate Indicates an issue is a duplicate of other open issue. label Dec 11, 2020
@Kaushal28

So what are the steps for complete cleanup?
