
none driver on openEuler #8420

Closed
gaozhekang opened this issue Jun 9, 2020 · 19 comments
Labels: co/none-driver, kind/support, l/zh-CN, triage/needs-information

Comments

@gaozhekang

gaozhekang commented Jun 9, 2020

I ran minikube start --vm-driver=none in an arm64 VM and it failed with an error. I can confirm that docker-ce is installed and can run containers.

Running docker run hello-world prints:
Hello from Docker!
This message shows that your installation appears to be working correctly.

$ rpm -qa | grep docker
docker-ce-cli-19.03.11-3.el7.aarch64
docker-ce-19.03.11-3.el7.aarch64

$ rpm -qa | grep kubectl
kubectl-1.18.3-0.aarch64
kubeadm-1.18.3-0.aarch64
kubelet-1.18.3-0.aarch64

**Command required to reproduce the issue**: minikube start --vm-driver=none

**Full output of the failed command**: <details>
* minikube v1.11.0 on Openeuler 20.03 (arm64)
  - KUBECONFIG=/etc/kubernetes/admin.conf:config-demo:config-demo-2
* Using the none driver based on existing profile
* Starting control plane none minikube in cluster minikube
* Restarting existing none bare metal machine for "minikube" ...
* OS release is openEuler 20.03 (LTS)
* Preparing Kubernetes v1.18.3 on Docker 19.03.11 ...
! Unable to restart cluster, will reset it: getting k8s client: client config:  client config: context "minikube" does not exist
! initialization failed, will try again: run: /bin/bash -c "sudo env PATH=/var/lib/minikube/binaries/v1.18.3:$PATH kubeadm init --config /var/tmp/minikube/kubeadm.yaml --ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests,DirAvailable--var-lib-minikube,DirAvailable--var-lib-minikube-etcd,FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml,FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml,FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml,FileAvailable--etc-kubernetes-manifests-etcd.yaml,Port-10250,Swap": exit status 1
stdout:
[init] Using Kubernetes version: v1.18.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
......
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
......
</details>


**Output of the `minikube logs` command**: <details>


</details>

This looks like it is caused by the kubelet failing to run: systemctl status kubelet shows the kubelet service in the activating (auto-restart) state, with exit code 203.

$ journalctl -xeu kubelet
Jun 09 15:26:07 localhost.localdomain systemd[1]: kubelet.service: Main process exited, code=exited, status=203/EXEC
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- An ExecStart= process belonging to unit kubelet.service has exited.
--
-- The process' exit code is 'exited' and its exit status is 203.
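
Exit status 203/EXEC from systemd generally means the ExecStart binary could not be executed at all, e.g. it is missing, not executable, or built for a different architecture. A quick way to check (a sketch; the actual unit file and binary path may differ on your system):

$ systemctl cat kubelet      # shows the unit file and the exact ExecStart path
$ ls -l /usr/bin/kubelet     # assumed path; use the ExecStart path from above
$ file /usr/bin/kubelet      # should report an aarch64 binary on this machine
$ uname -m                   # compare with the binary's architecture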

**Operating system version**: openEuler 20.03 (LTS)
@gaozhekang gaozhekang added the l/zh-CN label Jun 9, 2020
@medyagh
Member

medyagh commented Jun 9, 2020

@gaozhekang while I don't know about openEuler, I'd like to know: is there a reason you chose the none driver over docker?

I wonder if you have tried out the newest Docker driver with the latest version of minikube?
You could try these as a normal user (not sudo):
minikube delete
minikube start --driver=docker

For more information on the docker driver, check out:
https://minikube.sigs.k8s.io/docs/drivers/docker/

@medyagh medyagh added the triage/needs-information, co/none-driver and kind/support labels Jun 9, 2020
@medyagh medyagh changed the title Cannot execute “minikube start --vm-driver=none” on openEuler none driver on openEuler Jun 9, 2020
@afbjorklund
Collaborator

@medyagh : afaik, we don't support arm64 with docker yet.

@gaozhekang
Author

As @afbjorklund said, when I tried to use --driver=docker, it showed that arm64 is not supported.

@medyagh
Member

medyagh commented Jun 10, 2020

As @afbjorklund said, when I tried to use --driver=docker, it showed that arm64 is not supported.

Ah, sorry about that. You are right, that suggestion wouldn't work.

@gaozhekang
Author

I reinstalled my environment and ran "minikube start --vm-driver=none --image-mirror-country=cn" again; it still reported the same error as before, but I found more info like this:

$ minikube start --vm-driver=none --image-mirror-country=cn
* minikube v1.11.0 on Openeuler 20.03 (arm64)
  - KUBECONFIG=/etc/kubernetes/admin.conf:config-demo:config-demo-2
* Using the none driver based on existing profile
* Starting control plane none minikube in cluster minikube
* Restarting existing none bare metal machine for "minikube" ...
* OS release is openEuler 20.03 (LTS)
* Preparing Kubernetes v1.18.3 on Docker 19.03.11 ...
! Unable to restart cluster, will reset it: getting k8s client: client config:  client config: context "minikube" does not exist
! initialization failed, will try again: run: /bin/bash -c "sudo env PATH=/var/lib/minikube/binaries/v1.18.3:$PATH kubeadm init --config /var/tmp/minikube/kubeadm.yaml --ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests,DirAvailable--var-lib-minikube,DirAvailable--var-lib-minikube-etcd,FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml,FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml,FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml,FileAvailable--etc-kubernetes-manifests-etcd.yaml,Port-10250,Swap": exit status 1
......
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
        Unfortunately, an error has occurred:
                timed out waiting for the condition

        This error is likely caused by:
                - The kubelet is not running
                - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

        If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
                - 'systemctl status kubelet'
                - 'journalctl -xeu kubelet'

        Additionally, a control plane component may have crashed or exited when started by the container runtime.
        To troubleshoot, list all containers using your preferred container runtimes CLI.

        Here is one example how you may list all Kubernetes containers running in docker:
                - 'docker ps -a | grep kube | grep -v pause'
                Once you have found the failing container, you can inspect its logs with:
                - 'docker logs CONTAINERID'


stderr:
W0610 11:12:19.072776 1110706 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
W0610 11:12:20.975400 1110706 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
W0610 11:12:20.976615 1110706 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

* Suggestion: Check output of 'journalctl -xeu kubelet', try passing --extra-config=kubelet.cgroup-driver=systemd to minikube start
* Related issue: https://github.com/kubernetes/minikube/issues/4172

According to the suggestion, I tried "minikube start --vm-driver=none --image-mirror-country=cn --extra-config=kubelet.cgroup-driver=systemd" and nothing changed.
And according to the output of "docker info", the cgroup driver is "cgroupfs", not "systemd".
Besides, I checked the status of the kubelet service: it is activating (auto-restart) and the exit code is 203.
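
One quick way to compare the two cgroup drivers side by side is something like this (a sketch; the kubelet file paths assume a kubeadm-style setup and may differ):

# What Docker is actually using:
$ docker info --format '{{.CgroupDriver}}'
# What the kubelet was told to use (paths assume a kubeadm-generated config):
$ grep -i cgroup /var/lib/kubelet/kubeadm-flags.env /var/lib/kubelet/config.yaml 2>/dev/null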

@medyagh
Member

medyagh commented Jun 10, 2020

@gaozhekang

two things I noticed
1-

The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get 
http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.

Are you using a VPN or firewall? It seems like you cannot hit localhost:10248.

2 - the cgroup driver does not seem to be the one that kubeadm wants

@gaozhekang
Author

Thanks. I have disabled firewalld and flushed iptables, so the firewall may not be the problem. I guess it may be because Google is not accessible in China, so I replaced minikube, kubelet, kubectl and kubeadm with versions from the aliyun repo. The kubelet service now starts running, and "journalctl -xeu kubelet" outputs:

Jun 11 10:22:28 localhost.localdomain kubelet[281340]: E0611 10:22:28.324222 281340 kubelet.go:2267] node "localhost.localdomain" not found
Jun 11 10:22:28 localhost.localdomain kubelet[281340]: E0611 10:22:28.368961 281340 event.go:269] Unable to write event: 'Post https://control-plane.minikube.internal:8443/api/v1/namespaces/default/events: dial tcp 192.168.122.123:8443: connect: connection refused' (may retry after sleeping)
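
For reference, the repo swap mentioned above can be done roughly like this (a sketch assuming the commonly used aliyun mirror layout and package names; the exact paths are not verified on openEuler):

$ sudo tee /etc/yum.repos.d/kubernetes.repo <<'EOF'
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-aarch64/
enabled=1
gpgcheck=0
EOF
$ sudo yum install -y kubelet-1.18.3 kubeadm-1.18.3 kubectl-1.18.3
$ sudo systemctl enable --now kubelet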

/etc/hosts is:

127.0.0.1    localhost localhost.localdomain localhost4 localhost4.localdomain
::1               localhost localhost.localdomain localhost6 localhost6.localdomain
192.168.122.123   server.example.com node1
192.168.122.121   client.example.com master
127.0.0.1    host.minikube.internal
192.168.122.123   control-plane.minikube.internal

"exec-opts": ["native.cgroupdriver=systemd"] is added to /etc/docker/daemon.json, and the cgroup warning disappeared.

While the port 10248 connection and the cgroup warning are solved, it still reports an error when running "minikube start --vm-driver=none --registry-mirror=https://registry.docker-cn.com --v=10":

[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
        Unfortunately, an error has occurred:
                timed out waiting for the condition

        This error is likely caused by:
                - The kubelet is not running
                - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

        If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
                - 'systemctl status kubelet'
                - 'journalctl -xeu kubelet'

        Additionally, a control plane component may have crashed or exited when started by the container runtime.
        To troubleshoot, list all containers using your preferred container runtimes CLI.

        Here is one example how you may list all Kubernetes containers running in docker:
                - 'docker ps -a | grep kube | grep -v pause'
                Once you have found the failing container, you can inspect its logs with:
                - 'docker logs CONTAINERID'


stderr:
W0610 10:32:08.535462 283702 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
W0610 10:32:10.916124 283702 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
W0610 11:12:20.916124 283702 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

@gaozhekang
Author

I tried the same approach in a CentOS 7.5 + x86_64 environment, and it reported the same error. So maybe this is a known bug?

@medyagh
Member

medyagh commented Jun 15, 2020

I haven't personally tried minikube with the arm arch yet, but I would like to have an integration test for this.

@gaozhekang have you tried the KVM driver? Maybe there will be luck with that one?

@afbjorklund
Collaborator

afbjorklund commented Jun 15, 2020

I haven't personally tried minikube with the arm arch yet,

Since we only support the "none" driver, the experience is pretty much the same as kubeadm.

Since nobody has mentioned SELinux yet, and this is CentOS, I suspect that it is #7905

Also, it is arm64, not arm, but that's another story.

KVM doesn't work, since we don't have an ARM ISO.

@afbjorklund
Collaborator

afbjorklund commented Jun 15, 2020

but I would like to have an integration test for this.

We would need some hardware for this, see #6280

But it would be nice to have some CentOS tests: #3552

@gaozhekang
Author

Thanks. I tried a few more times, and CentOS + x86_64 is OK now, but the same approach still fails on openEuler + arm64. By the way, my CentOS + x86_64 environment has a different network environment from the arm64 one, so I guess maybe there is some problem with the network?
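
To rule the network in or out, it may help to probe the two endpoints from the failing logs directly on the arm64 machine (a sketch; ports taken from the output above):

# kubelet health endpoint (the connection refused on 10248 seen earlier):
$ curl -sSL http://localhost:10248/healthz
# apiserver endpoint from the event error:
$ curl -k https://control-plane.minikube.internal:8443/healthz
# confirm something is actually listening on those ports:
$ sudo ss -ltnp | grep -E '10248|8443'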

@medyagh
Member

medyagh commented Jul 7, 2020

What is the error you get on the one with the different network? @gaozhekang

@gaozhekang
Author

The error is the 40s timeout and:

stderr:
W0610 10:32:08.535462 283702 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
W0610 10:32:10.916124 283702 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
W0610 11:12:20.916124 283702 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

@medyagh
Member

medyagh commented Jul 10, 2020

The error is 40s timeout and

stderr:
W0610 10:32:08.535462 283702 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
W0610 10:32:10.916124 283702 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
W0610 11:12:20.916124 283702 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

Those are just the W (warning) lines; do you mind pasting the whole logs?

@medyagh
Member

medyagh commented Jul 29, 2020

I haven't heard back from you; I wonder if you still have this issue?
Regrettably, there isn't enough information in this issue to make it actionable, and a long enough duration has passed, so this issue is likely difficult to replicate.

I will close this issue for now but please feel free to reopen whenever you feel ready to provide more information.

@medyagh medyagh closed this as completed Jul 29, 2020
@kevinzs2048

/assign

@kevinzs2048

This is Kevin from Linaro; I will cooperate with the Huawei folks to continue work on this.
We also have Arm64 machines which we can offer upstream as the Arm64 CI.

@kevinzs2048

@gaozhekang Hi, could you tell me how to install kubeadm/kubectl/kubelet on openEuler?
