Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use default containerd socket path #724

Merged
merged 2 commits into from
Sep 17, 2021

Conversation

rafaelgaspar
Copy link
Contributor

Description of changes:

The new containerd option as runtime uses /run/dockershim.sock as the path for the socket, that is very confusing for people migrating to containerd just because Kubernetes is deprecating dockershim.

But the main issue is that the containerd doesn't live in the default place where it's expected to be, and that breaks for example Datadog integration. Even if I supply the correct path to CRI socket it doesn't realise it's a containerd socket and therefore doesn't collect containerd metrics.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@rafaelgaspar rafaelgaspar changed the title Use defualt containerd socket path Use default containerd socket path Aug 6, 2021
@rafaelgaspar rafaelgaspar force-pushed the fix-containerd-socket-path branch from fa4ee9e to 349cf74 Compare August 6, 2021 09:23
@rafaelgaspar
Copy link
Contributor Author

rafaelgaspar commented Aug 6, 2021

Testing this further it looks like the aws-cni-plugin still expects the socket to be in /run/dockershim.sock

@rafaelgaspar
Copy link
Contributor Author

Added a symlink in the bootstrap for aws-node(VPC CNI Plugin), or any other use case that still points to the /run/dockershim.sock.

@bubalush
Copy link

I have the same problem when up and running datadog agent in EKS with containerd.

@rafaelgaspar rafaelgaspar force-pushed the fix-containerd-socket-path branch from 9cebec2 to fd14027 Compare August 14, 2021 04:58
@rafaelgaspar
Copy link
Contributor Author

Rebased on upstream master, what would be the next steps to get this merged?

@derbauer97
Copy link

Thanks for the PR helped us fixing our Instana Monitoring. We told the AWS Support about that PR and they escalated it to the ServiceTeam. Hopefully it will be merged.

@ravisinha0506
Copy link
Contributor

Hi,

We are testing this changes and will merge this PR once this is done.

Thanks

@benmccown
Copy link

From my testing, you have your symbolic links backwards.

Your change:

ln -sf /run/containerd/containerd.sock /run/dockershim.sock

Man page for ln

NAME
       ln - make links between files

SYNOPSIS
       ln [OPTION]... [-T] TARGET LINK_NAME   (1st form)
       ln [OPTION]... TARGET                  (2nd form)
       ln [OPTION]... TARGET... DIRECTORY     (3rd form)
       ln [OPTION]... -t DIRECTORY TARGET...  (4th form)

Works for me:

ln -s /run/dockershim.sock /run/containerd/containerd.sock

@rafaelgaspar
Copy link
Contributor Author

rafaelgaspar commented Sep 4, 2021

From my testing, you have your symbolic links backwards.

Nope, that is the right order, the TARGET of the link is /run/containerd/containerd.sock and LINK_NAME is /run/dockershim.sock.

Please note that the PR also changes the contents of /etc/containerd/config.toml(actually the template that gets copied there) so that containerd actually creates the socket in the default path(/run/containerd/containerd.sock).

Maybe you are trying to just run this ln command on top of the current AMI, in that case you are right, because the current AMI changes de containerd config to open the socket at /run/dockershim.sock as a workaround to not break the AWS VPC CNI plugin.

Both ways would work for AWS VPC CNI plugin and other components that rely on that path for the socket, and maybe it's just my opinion, but I believe it would be a more elegant solution to keep containerd opening the socket on default path(as it is in this PR) and using the symbolic link for the workaround, instead of the other way around, and I believe this would provide a cleaner path later on when eventually the CNI plugin will have to be updated to support the default socket path.

@rafaelgaspar
Copy link
Contributor Author

Works for me:

ln -s /run/dockershim.sock /run/containerd/containerd.sock

If you want just commands to run on top of the current AMI to replicate what this PR does, this is what I have been running via cloud-init in our production clusters for the past month:

sed -i 's/dockershim.sock/containerd\/containerd.sock/' /etc/eks/containerd/containerd-config.toml
sed -i 's/dockershim.sock/containerd\/containerd.sock/' /etc/eks/containerd/kubelet-containerd.service
ln -sf /run/containerd/containerd.sock /run/dockershim.sock

/etc/eks/bootstrap.sh ...

Note that this 3 lines comes before executing the bootstrap.sh.

@abeer91
Copy link
Contributor

abeer91 commented Sep 17, 2021

@ravisinha0506 was able to test this internally for multiple versions and different scenarios.

@abeer91 abeer91 merged commit 9576786 into awslabs:master Sep 17, 2021
@benmccown
Copy link

From my testing, you have your symbolic links backwards.

Nope, that is the right order, the TARGET of the link is /run/containerd/containerd.sock and LINK_NAME is /run/dockershim.sock.

Please note that the PR also changes the contents of /etc/containerd/config.toml(actually the template that gets copied there) so that containerd actually creates the socket in the default path(/run/containerd/containerd.sock).

Maybe you are trying to just run this ln command on top of the current AMI, in that case you are right, because the current AMI changes de containerd config to open the socket at /run/dockershim.sock as a workaround to not break the AWS VPC CNI plugin.

Both ways would work for AWS VPC CNI plugin and other components that rely on that path for the socket, and maybe it's just my opinion, but I believe it would be a more elegant solution to keep containerd opening the socket on default path(as it is in this PR) and using the symbolic link for the workaround, instead of the other way around, and I believe this would provide a cleaner path later on when eventually the CNI plugin will have to be updated to support the default socket path.

My bad, I did not catch that you also edited config.toml. Your change makes sense in that context.

@chrissng
Copy link

chrissng commented Oct 11, 2021

This PR fixes the issue of missing metrics labels when containerd is used with amazon-eks-node-v20211004 AMIs.

But it is still broken for GPU-enabled AMIs (amazon-eks-gpu-node) as of v20211004. GPU instances running amazon-eks-gpu-node-1.21-v20211004 failed to join the cluster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants