Change cgroup driver from cgroupfs to systemd #6651

afbjorklund · 2020-02-15T11:55:51Z

The minikube iso is using systemd, so change the container runtime
to use the same cgroup manager instead of the default (cgroupfs).

Avoids kubeadm init message:

    [WARNING IsDockerSystemdCheck]:
        detected "cgroupfs" as the Docker cgroup driver.
        The recommended driver is "systemd".
        Please follow the guide at https://kubernetes.io/docs/setup/cri/

Also change the configuration for the containerd and cri-o runtimes.

Closes #4770
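
For readers unfamiliar with the setting, here is a minimal sketch of what selecting the systemd cgroup driver usually looks like for each runtime. It is not this PR's diff: the function names are hypothetical, the file paths in the comments are the common defaults, and the containerd key shown is the one used by the older version-1 config layout (newer releases use a different key, so check the installed version).

```shell
#!/bin/sh
# Sketch only: each function prints an example config fragment to stdout.
# Nothing here is copied from the minikube ISO; paths are the usual defaults.

# Docker: content for /etc/docker/daemon.json
docker_cgroup_config() {
  cat <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# CRI-O: fragment for /etc/crio/crio.conf
crio_cgroup_config() {
  cat <<'EOF'
[crio.runtime]
cgroup_manager = "systemd"
EOF
}

# containerd: fragment for /etc/containerd/config.toml
# (assumption: version-1 config layout; the key differs in newer releases)
containerd_cgroup_config() {
  cat <<'EOF'
[plugins.cri]
  systemd_cgroup = true
EOF
}
```

After restarting Docker, `docker info --format '{{.CgroupDriver}}'` should report which driver is actually in use.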

k8s-ci-robot · 2020-02-15T11:56:54Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: afbjorklund

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [afbjorklund]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

medyagh · 2020-02-16T03:02:55Z

/ok-to-test

minikube-pr-bot · 2020-02-16T03:10:21Z

Error: running mkcmp: exit status 1

afbjorklund · 2020-02-16T08:22:48Z

Seems like the crio restart is taking a really long time to complete.
Which is weird, since it wasn't running and nothing much changed...

Two minutes, just for the restart ?

I0216 09:21:31.255945   11439 ssh_runner.go:101] Run: sudo sysctl net.netfilter.nf_conntrack_count
I0216 09:21:31.265389   11439 ssh_runner.go:265] ! sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_count: No such file or directory
I0216 09:21:31.265679   11439 cruntime.go:172] couldn't verify netfilter by "sudo sysctl net.netfilter.nf_conntrack_count" which might be okay. error: sudo sysctl net.netfilter.nf_conntrack_count: Process exited with status 255
stdout:

stderr:
sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_count: No such file or directory
I0216 09:21:31.265824   11439 ssh_runner.go:101] Run: sudo modprobe br_netfilter
I0216 09:21:31.319895   11439 ssh_runner.go:101] Run: sudo sh -c "echo 1 > /proc/sys/net/ipv4/ip_forward"
I0216 09:21:31.331856   11439 ssh_runner.go:101] Run: sudo systemctl restart crio
I0216 09:23:31.699791   11439 ssh_runner.go:141] Completed: sudo systemctl restart crio: (2m0.36787738s)
I0216 09:23:31.700109   11439 ssh_runner.go:101] Run: crio --version
I0216 09:23:31.751381   11439 ssh_runner.go:265] > crio version 1.17.0

afbjorklund · 2020-02-16T08:51:43Z

Apparently systemd thinks that the network is offline, and crio service depends on it.

Feb 16 08:21:31 minikube systemd[1]: Starting Wait for Network to be Configured.

Feb 16 08:21:31 minikube systemd-networkd-wait-online[2345]: ignoring: lo

Feb 16 08:21:31 minikube systemd[1]: Starting CRI-O Auto Update Script...
Feb 16 08:21:32 minikube systemd[1]: Started CRI-O Auto Update Script.

Feb 16 08:23:32 minikube systemd[1]: systemd-networkd-wait-online.service: Main process exited, code=exited, status=1/FAILURE
Feb 16 08:23:32 minikube systemd[1]: systemd-networkd-wait-online.service: Failed with result 'exit-code'.
Feb 16 08:23:32 minikube systemd[1]: systemd-networkd-wait-online.service: Failed with result 'exit-code'.
Feb 16 08:23:32 minikube systemd[1]: Failed to start Wait for Network to be Configured.

Feb 16 08:23:32 minikube systemd[1]: Reached target Network is Online.

Feb 16 08:23:32 minikube systemd[1]: Starting Container Runtime Interface for OCI (CRI-O)...
Feb 16 08:23:32 minikube systemd[1]: Started Container Runtime Interface for OCI (CRI-O).

So we are hitting a 120 second systemd timeout, before it gives up on the service...

static bool arg_quiet = false;
static usec_t arg_timeout = 120 * USEC_PER_SEC;
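
The 120 seconds above is the default for `systemd-networkd-wait-online`, which also accepts a `--timeout=` option. As a rough sketch of how the wait could be shortened with a unit drop-in (an illustration, not something this PR does): the function name, the 10-second default, and the binary path are assumptions, and the path in particular varies between distributions, so compare it with `systemctl cat systemd-networkd-wait-online.service` first.

```shell
#!/bin/sh
# Sketch only: print a drop-in that overrides the wait-online command line.
# Assumptions: the binary path and the default of 10 seconds are illustrative.
wait_online_dropin() {
  timeout_secs="${1:-10}"
  cat <<EOF
[Service]
ExecStart=
ExecStart=/usr/lib/systemd/systemd-networkd-wait-online --timeout=${timeout_secs}
EOF
}
```

The output would go into a file such as `/etc/systemd/system/systemd-networkd-wait-online.service.d/timeout.conf` (hypothetical file name), followed by `sudo systemctl daemon-reload`. The empty `ExecStart=` line clears the original command before the replacement is set.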

afbjorklund · 2020-02-16T09:09:27Z

Apparently systemd is either buggy, or needs to be informed better about eth0 and eth1:

$ sudo networkctl
IDX LINK             TYPE               OPERATIONAL SETUP
  1 lo               loopback           carrier     unmanaged
  2 eth0             ether              routable    configuring
  3 eth1             ether              routable    configuring
  4 sit0             sit                off         unmanaged
  5 mybridge         bridge             routable    unmanaged
  6 veth11d292ea     ether              carrier     unmanaged
  7 vethf775b924     ether              carrier     unmanaged
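
The links to watch are the ones whose SETUP column never leaves "configuring". A small helper for picking those out of `networkctl` output might look like this; the function name is hypothetical, and it assumes the five-column layout shown above.

```shell
#!/bin/sh
# Read `networkctl` output on stdin and print the names of links whose
# SETUP column (5th field) is "configuring".
# ANSI color sequences are stripped first so they do not end up inside fields.
links_configuring() {
  esc=$(printf '\033')
  sed "s/${esc}\[[0-9;]*m//g" | awk '$5 == "configuring" { print $2 }'
}
```

Typical use would be `networkctl --no-pager --no-legend | links_configuring`.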

Possibly related to not liking the DHCP server much:

Feb 16 08:21:23 minikube systemd-networkd[2033]: eth0: Gained carrier
Feb 16 08:21:23 minikube systemd-networkd[2033]: eth1: DHCPv4 address 192.168.99.100/24
Feb 16 08:21:23 minikube systemd-networkd[2033]: eth0: DHCPv4 address 10.0.2.15/24 via 10.0.2.2
Feb 16 08:21:23 minikube systemd-networkd[2033]: eth1: DHCP: No gateway received from DHCP server: No data available

Possible workarounds: systemd/systemd#5154
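
Independent of what that issue recommends, systemd-networkd has a per-link `RequiredForOnline=` setting that tells `systemd-networkd-wait-online` not to wait for a given interface. The sketch below prints such a `.network` file. Treat it as an assumption-laden illustration: the function name is hypothetical, whether exempting a link fixes the "configuring" state seen here is untested, and the setting requires a systemd version that supports it.

```shell
#!/bin/sh
# Sketch only: print a .network file that keeps DHCPv4 on a link but marks it
# as not required for the "network online" check.
# Assumption: the interface passed in (e.g. eth1) is one that may be exempted.
network_not_required() {
  ifname="$1"
  cat <<EOF
[Match]
Name=${ifname}

[Link]
RequiredForOnline=no

[Network]
DHCP=ipv4
EOF
}
```

For example, `network_not_required eth1` could be written to a file such as `/etc/systemd/network/10-eth1.network` (hypothetical name), followed by a restart of `systemd-networkd`.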

afbjorklund · 2020-02-16T10:52:17Z

Opened #6655 about the startup being slow, I think that was the reason for the test failures ?

medyagh · 2020-02-16T18:44:56Z

@afbjorklund the kic docker tests are 70 min. They usually run in 20 min.

In podman also we explicitly specify the cgroup manager to be cgroupfs.

Does that mean we need to keep separate logic for VMs and containers?

Could we make everything use same type of cgroups?

afbjorklund · 2020-02-16T18:59:23Z

> Could we make everything use same type of cgroups?

As far as I know, kubernetes recommends using the same cgroup manager as the host OS.
But I heard there was some issues with it when running docker-in-docker, so I'm not sure...

Knee deep in systemd bugs already, even before it was trying to run inside a container.
But there should be some resources online, on how it can be achieved - maybe Red Hat ?

medyagh · 2020-02-17T10:47:40Z

> > Could we make everything use same type of cgroups?
>
> As far as I know, kubernetes recommends using the same cgroup manager as the host OS.
> But I heard there was some issues with it when running docker-in-docker, so I'm not sure...
>
> Knee deep in systemd bugs already, even before it was trying to run inside a container.
> But there should be some resources online, on how it can be achieved - maybe Red Hat ?

I wonder if there is a correlation between this PR and the docker tests running in 70 minutes (more than 3 times the normal 20 minutes).

afbjorklund · 2020-02-17T16:24:45Z

@medyagh : note that this PR only changes deploy/iso/minikube-iso

afbjorklund · 2020-02-23T19:05:32Z

@medyagh : did you find the issue for the slowdown ? probably not anything on the ISO, right ?

For me it seems like the "CI / docker_*" and "CI / podman_*" tests are always failing (timeout)

k8s-ci-robot added the "cncf-cla: yes" label Feb 15, 2020

afbjorklund requested a review from medyagh February 15, 2020 11:55

k8s-ci-robot added the "size/S" label Feb 15, 2020

k8s-ci-robot requested review from blueelvis and josedonizetti February 15, 2020 11:56

k8s-ci-robot added the "approved" label Feb 15, 2020

k8s-ci-robot added the "ok-to-test" label Feb 16, 2020

afbjorklund self-assigned this Feb 16, 2020

afbjorklund merged commit 5ee57d4 into kubernetes:master Feb 23, 2020

bharath-123 mentioned this pull request Dec 5, 2020

Use systemd as the cgroup driver for kubelet and CRI kubernetes/kops#10372

Closed

afbjorklund mentioned this pull request Jan 13, 2021

default the "cgroupDriver" setting of the kubelet to "systemd" kubernetes/kubeadm#2376

Closed
