
Some pods not working properly with containerd runtime and CNI plugin. #1982

Closed
spothound opened this issue May 5, 2022 · 3 comments

Comments

@spothound

What happened:
I am updating a development environment to use the containerd runtime as a test before updating my production environments. Sadly, although the cluster update works and I can run some pods with containerd, there are some core pods that are not working properly after the update. For example:

  1. Cert-manager service has the following error:
error retrieving resource lock kube-system/cert-manager-controller: Get "https://172.20.0.1:443/api/v1/namespaces/kube-system/configmaps/cert-manager-controller": dial tcp 172.20.0.1:443: i/o timeout
  2. kubed service has the following error:
Error: Get "https://172.20.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": dial tcp 172.20.0.1:443: i/o timeout

  3. nginx ingress controller has the following error:
W0503 11:08:48.909243       7 client_config.go:615] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0503 11:08:48.909416       7 main.go:221] "Creating API client" host="https://172.20.0.1:443"

So it seems like there is an internal networking problem in the cluster when running the containerd runtime.
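For reference, this is the kind of quick check I'm using to confirm that pod-to-API-server traffic is broken (the pod name and image are just examples; 172.20.0.1 is the kubernetes service cluster IP in my cluster):

# run a throwaway pod and hit the API server service IP directly
kubectl run net-debug --rm -it --restart=Never \
  --image=curlimages/curl --command -- \
  curl -vk --max-time 5 https://172.20.0.1:443/version
# on a healthy node this prints the API server version; on the broken nodes it times out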

What you expected to happen:

The mentioned services should work with containerd as they did with the Docker runtime, and the timeout errors above should not happen.

How to reproduce it (as minimally and precisely as possible):
Create an EKS cluster with the Docker runtime and the CNI plugin and try to run some of the mentioned services, then update the runtime to containerd by adding the --container-runtime containerd flag to the cluster bootstrap command (see the example below).
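For clarity, the bootstrap change is just the extra flag on the node user data (the cluster name below is a placeholder):

# worker node user data: same bootstrap call as before, plus the runtime flag
/etc/eks/bootstrap.sh my-eks-cluster --container-runtime containerd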

Anything else we need to know?:

Environment:

  • AWS Region: eu-west-1
  • Instance Type(s): m5a.xlarge
  • EKS Platform version: eks.5
  • Kubernetes version: 1.21
  • AMI Version: amazon-eks-ami release v20220429
  • Kernel: 5.4.188-104.359.amzn2.x86_64
  • Release information (run cat /etc/eks/release on a node):
BASE_AMI_ID="ami-040678b7c3f67e60e"
BUILD_TIME="Fri Apr 29 00:19:03 UTC 2022"
BUILD_KERNEL="5.4.188-104.359.amzn2.x86_64"
ARCH="x86_64

I have seen some closed issues on this topic:

But the info extracted from those didn't help. I have checked the nodes and they have the symlink:

lrwxrwxrwx 1 root root 31 May  3 08:42 /run/dockershim.sock -> /run/containerd/containerd.sock

So it shouldn't be the same problem as in those issues, but maybe it is something related.
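For what it's worth, this is how I'm confirming that the nodes are actually registered with containerd (plain kubectl; the node name is a placeholder):

# the CONTAINER-RUNTIME column should show containerd:// on the updated nodes
kubectl get nodes -o wide
kubectl describe node <node-name> | grep "Container Runtime Version"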

I have also detected that the AWS load balancer marks the nodes as out of service, as the kube-proxy endpoint seems to return connection refused when probed (see #911).
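That matches what I see when probing kube-proxy's health endpoint directly from a node (10256 is the default kube-proxy healthz port; adjust if yours is different):

# the load balancer health check targets the kube-proxy healthz port
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:10256/healthz
# a healthy node prints 200; on the affected nodes the connection is refused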

Any idea or suggestion on how to debug this issue?

Thanks!

@github-actions

github-actions bot commented May 9, 2022

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue, feel free to do so.

@fernandoTorresan

@spothound did you solve the problem?

We had the same issue yesterday using the latest EKS AMI release (amazon-eks-ami release v20220429), and it looks like there is something wrong with this AMI.

We had to roll back to a previous version (https://github.com/awslabs/amazon-eks-ami/releases/tag/v20220406) and everything started to work normally again.
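In case it helps anyone else, we pinned the node group back to that release by looking up its AMI ID from the public SSM parameter for the 1.21 AL2 image (the parameter path below is our assumption of the documented pattern; double-check the release name for your Kubernetes version and region):

# AMI ID of the v20220406 release of the EKS-optimized AL2 AMI for Kubernetes 1.21
aws ssm get-parameter \
  --name /aws/service/eks/optimized-ami/1.21/amazon-linux-2/amazon-eks-node-1.21-v20220406/image_id \
  --region eu-west-1 --query 'Parameter.Value' --output text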

@yongzhang

Same issue
