MEP: CRI: Containerd by default, Buildkitd on demand #9639

Closed
afbjorklund opened this issue Nov 8, 2020 · 10 comments · Fixed by #13138
Labels
co/runtime/containerd co/runtime/crio CRIO related issues co/runtime/docker Issues specific to a docker runtime kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence.

Comments

@afbjorklund
Collaborator

afbjorklund commented Nov 8, 2020

Background:

Kubernetes is moving to deprecate the dockershim in k8s 1.20 and start using CRI by default (instead of Docker)

Docker no longer supports Boot2Docker or Machine; the upstream has been deprecated for years and is unmaintained.

Thus, we don't need to start Docker by default...

History:

https://kubernetes.io/blog/2016/07/rktnetes-brings-rkt-container-engine-to-kubernetes/
https://kubernetes.io/blog/2016/12/container-runtime-interface-cri-in-kubernetes/

1.5: "Kubelet does not yet use CRI by default, but we are actively working on making this happen. The first step is to re-integrate Docker with kubelet using CRI. In the 1.5 release, we extended kubelet to support CRI, and also added a built-in CRI shim for Docker. This allows kubelet to start the gRPC server on Docker’s behalf."

https://kubernetes.io/blog/2017/11/containerd-container-runtime-options-kubernetes/
https://kubernetes.io/blog/2018/05/24/kubernetes-containerd-integration-goes-ga/

1.10: "The containerd 1.1 integration uses the CRI plugin built into containerd; and the Docker 18.03 CE integration uses the dockershim."

Suggestion:

Instead, we should start a CRI (container runtime for Kubernetes) by default, and start the bigger daemon only when needed.

The default will be "containerd", with buildkitd socket-activated. The alternative will be "cri-o", with podman socket-activated.

This will result in a smaller footprint for users that "only" want to run Kubernetes, and not for instance build container images.

All previous features are still available, since the unix socket will be there and start the daemon when somebody accesses it.
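
The pairing above can be summarized in a small shell sketch. The function name `on_demand_socket` is hypothetical, and the socket paths are the upstream defaults for buildkitd and rootful podman; whether minikube would keep those paths is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: map the always-on CRI runtime to the socket of the
# larger daemon that is only started on first access.
# Paths are upstream defaults, assumed here rather than taken from minikube.
on_demand_socket() {
    case "$1" in
        containerd) echo "/run/buildkit/buildkitd.sock" ;;
        cri-o)      echo "/run/podman/podman.sock" ;;
        *)          echo "unknown runtime: $1" >&2; return 1 ;;
    esac
}

# With socket activation the socket file exists even while the daemon is
# stopped, so a client only needs to know the path.
sock=$(on_demand_socket containerd)
echo "buildctl --addr unix://$sock build ..."
```

The point of the sketch is that clients keep addressing the same unix socket; only the moment the daemon starts changes.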

Implications:

  1. Install CRI by default, including cri-tools (crictl)

  2. Don't start anything automatically at boot time

  3. Also deprecate the old Docker tcp port (2376)

  4. Complete the fork of the libmachine and drivers

Related issues:

MEP PR to follow

Note that for instance kind, microk8s and k3s already run with containerd by default. It is also a graduated CNCF project:

https://containerd.io/

We will continue to offer alternative runtimes (following the CRI/CNI specifications), even at considerable engineering effort.


Note: Local podman is "daemon-less", but remote podman runs as a service through systemd (so it works like dockerd).
Since we want to expose the container runtime to users, we want to run the remote versions. We also run as root, for now.

Users will need to upgrade to Docker 18.09 (or later) or Podman 2.0 (or later), if they want to access the service remotely.
Like before, there will be a client installed that can be accessed over SSH. The only downside is having to transfer the build context.
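
As a rough illustration of the remote access described above, the sketch below prints the environment setting each client would need. The helper name `remote_env`, the user names, and the IP address are placeholders; the podman socket path assumes the rootful default, and the `CONTAINER_HOST` form should be checked against the installed Podman version.

```shell
#!/bin/sh
# Hypothetical helper: print the environment setting a remote client would
# need to reach the daemon inside the VM over SSH.
remote_env() {
    client=$1 user=$2 host=$3
    case "$client" in
        # Docker 18.09 and later accept ssh:// URLs in DOCKER_HOST.
        docker) echo "DOCKER_HOST=ssh://$user@$host" ;;
        # Podman's remote client takes the socket path as part of the URL.
        podman) echo "CONTAINER_HOST=ssh://$user@$host/run/podman/podman.sock" ;;
        *)      echo "unsupported client: $client" >&2; return 1 ;;
    esac
}

# Placeholder address; substitute the output of "minikube ip".
remote_env docker docker 192.168.49.2
remote_env podman root 192.168.49.2
```

Because the client talks to the daemon through SSH, the build context is streamed over that connection, which is the cost mentioned above.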

@afbjorklund afbjorklund added kind/feature Categorizes issue or PR as related to a new feature. co/runtime/docker Issues specific to a docker runtime co/runtime/crio CRIO related issues co/runtime/containerd labels Nov 8, 2020
@afbjorklund
Collaborator Author

afbjorklund commented Nov 8, 2020

Unfortunately, dockerd (and docker-containerd) do not share image storage with stand-alone containerd:

moby/moby#38043
containerd/containerd#2987

This means that any images built will need to be transported from /var/lib/docker to /var/lib/containerd.


Same thing already happens today when using BuildKit to build, but that is a different story (--output).

There might be some performance implications because of this, but at least it can all be transferred locally...
Other distributions have other solutions to this issue, including tar + copying the images over the network.
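
A minimal sketch of the local transfer, assuming both `docker` and `ctr` are available inside the VM. The function name `copy_image_to_containerd` is hypothetical; the `k8s.io` default is the namespace containerd's CRI plugin reads images from.

```shell
#!/bin/sh
# Sketch: copy one image from dockerd's store into stand-alone containerd
# by streaming a tar archive between the two daemons.
copy_image_to_containerd() {
    image=$1
    namespace=${2:-k8s.io}   # namespace used by the containerd CRI plugin
    # "docker save" writes the archive to stdout; "-" makes ctr read stdin.
    docker save "$image" | sudo ctr --namespace="$namespace" images import -
}
```

Calling `copy_image_to_containerd local/test` would make the image visible to Kubernetes without going over the network, at the cost of storing the layers twice.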

One alternative could be to use buildkitd (instead of dockerd), with a containerd worker and image store.
This could be implemented as part of the needed minikube "build" abstraction mentioned earlier (#4868).

docker@minikube:~$ sudo buildkitd --containerd-worker true --oci-worker false
docker@minikube:~$ cat Dockerfile 
FROM busybox
RUN true
docker@minikube:~$ sudo buildctl build --frontend=dockerfile.v0 --local context=. --local dockerfile=. \
                                       --output type=image,name=local/test
[+] Building 11.4s (6/6) FINISHED                                                                                                                
 => [internal] load build definition from Dockerfile                                                                                        0.0s
 => => transferring dockerfile: 59B                                                                                                         0.0s
 => [internal] load .dockerignore                                                                                                           0.0s
 => => transferring context: 2B                                                                                                             0.0s
 => [internal] load metadata for docker.io/library/busybox:latest                                                                           6.3s
 => [1/2] FROM docker.io/library/busybox@sha256:a9286defaba7b3a519d585ba0e37d0b2cbee74ebfe590960b0b1d6a5e97d1e1d                            4.5s
 => => resolve docker.io/library/busybox@sha256:a9286defaba7b3a519d585ba0e37d0b2cbee74ebfe590960b0b1d6a5e97d1e1d                            0.0s
 => => sha256:a9286defaba7b3a519d585ba0e37d0b2cbee74ebfe590960b0b1d6a5e97d1e1d 2.08kB / 2.08kB                                              0.0s
 => => sha256:c9249fdf56138f0d929e2080ae98ee9cb2946f71498fc1484288e6a935b5e5bc 527B / 527B                                                  0.0s
 => => sha256:9758c28807f21c13d05c704821fdd56c0b9574912f9b916c65e1df3e6b8bc572 764.62kB / 764.62kB                                          2.4s
 => => sha256:f0b02e9d092d905d0d87a8455a1ae3e9bb47b4aa3dc125125ca5cd10d6441c9f 1.49kB / 1.49kB                                              0.0s
 => => unpacking docker.io/library/busybox@sha256:a9286defaba7b3a519d585ba0e37d0b2cbee74ebfe590960b0b1d6a5e97d1e1d                          0.1s
 => [2/2] RUN true                                                                                                                          0.2s
 => exporting to image                                                                                                                      0.2s
 => => exporting layers                                                                                                                     0.2s
 => => exporting manifest sha256:66aff0fa079bc2460957414dbd91064c3c518746d26323d39c62afad1bad23f2                                           0.0s
 => => exporting config sha256:b8b29efc4eb59c9f047d07a0c778ba8c116a3d4648d4b624db5c600fa109682f                                             0.0s
 => => naming to local/test                                                                                                                 0.0s
docker@minikube:~$ sudo ctr --namespace=buildkit images ls
REF        TYPE                                                 DIGEST                                                                  SIZE      PLATFORMS   LABELS 
local/test application/vnd.docker.distribution.manifest.v2+json sha256:66aff0fa079bc2460957414dbd91064c3c518746d26323d39c62afad1bad23f2 748.4 KiB linux/amd64 - 

For podman and cri-o the story is different since they are already both using the same /var/lib/containers.

One could include buildah too, if there was a particular use case for it not already covered by podman build.

@afbjorklund afbjorklund changed the title MEP: Containerd by default, Docker on demand MEP: CRI: Containerd by default, Docker on demand Nov 8, 2020
@sharifelgamal sharifelgamal added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Nov 17, 2020
@afbjorklund afbjorklund changed the title MEP: CRI: Containerd by default, Docker on demand MEP: CRI: Containerd by default, Buildkitd on demand Dec 12, 2020
@afbjorklund
Collaborator Author

I changed the title, since docker storage is not compatible with containerd storage

@afbjorklund
Collaborator Author

There will be an external cri-dockerd, so Docker will be supported for years yet

@afbjorklund
Collaborator Author

Building using buildctl is awkward to say the least, and running containers via ctr is similarly user-hostile.
So as long as we want to export the full features of the container runtime to the user, Docker/Moby is the better choice.

We can still offer containerd/buildkitd as a fully supported CRI option of course, but it is unlikely to be the default.
Other projects that do this (like kind or microk8s) don't offer the "machine" or "build" functionality that minikube does.

Getting away from docker-machine is still needed, but it is a slightly different topic than changing the default CRI.

Will probably split it out into a separate GitHub issue, for the "naked VM" (item 2) and the "fork machine" (item 4).

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 13, 2021
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 12, 2021
@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Apr 24, 2021
@AkihiroSuda
Member

@afbjorklund Is there an ETA of this?

@afbjorklund
Collaborator Author

> @afbjorklund Is there an ETA of this?

The consensus was that we should continue to have docker as the default for some more releases.

We need to have stable fallbacks for workflows originally based on docker-env, such as Skaffold.
But it should be available as a minikube alternative already today (--container-runtime=containerd)

And we keep on adding CI resources and tests, to make sure that containerd / buildkitd is tested.
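
For anyone who wants to try the containerd option mentioned above, a small wrapper could look like this. The function name `start_with_runtime` is hypothetical; the flag and the three runtime names are the ones minikube accepts.

```shell
#!/bin/sh
# Hypothetical wrapper: start minikube with an explicit container runtime,
# rejecting names that minikube does not accept.
start_with_runtime() {
    runtime=${1:-containerd}
    case "$runtime" in
        docker|containerd|cri-o) ;;
        *) echo "unknown runtime: $runtime" >&2; return 1 ;;
    esac
    minikube start --container-runtime="$runtime"
}
```

Running `start_with_runtime` with no argument selects containerd, while `start_with_runtime docker` keeps the current default.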

@afbjorklund
Collaborator Author

We will add Mirantis cri-dockerd, to keep supporting the Docker runtime also with Kubernetes 1.23+

@spowelljr spowelljr added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Dec 1, 2021
6 participants