v1.16-rc.1 regression w/ CRI-O: Get runtime version failed: rpc error: code = Unavailable #5323

Closed
tstromberg opened this issue Sep 12, 2019 · 4 comments · Fixed by #5338
Labels
area/kubernetes-versions Improving support for versions of Kubernetes co/runtime/crio CRIO related issues help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Comments

@tstromberg
Contributor

./out/minikube start --container-runtime=crio --kubernetes-version=v1.16.0-rc.1

causes:

[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.

Unfortunately, an error has occurred:
	timed out waiting for the condition

This error is likely caused by:
	- The kubelet is not running
	- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
	- 'systemctl status kubelet'
	- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
	- 'docker ps -a | grep kube | grep -v pause'
	Once you have found the failing container, you can inspect its logs with:
	- 'docker logs CONTAINERID'

: Process exited with status 1

😿  Sorry that minikube crashed. If this was unexpected, we would love to hear from you:
👉  https://github.com/kubernetes/minikube/issues/new/choose

minikube logs show issues, but no obvious root causes:

==> container status <==
CONTAINER           IMAGE               CREATED             STATE               NAME                ATTEMPT             POD ID

This does look interesting:

Sep 12 01:08:56 minikube kubelet[3529]: E0912 01:08:56.423423    3529 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dminikube&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
Sep 12 01:08:57 minikube kubelet[3529]: W0912 01:08:57.368453    3529 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {/var/run/crio/crio.sock 0  <nil>}: didn't receive server preface in time. Reconnecting...
Sep 12 01:08:57 minikube kubelet[3529]: W0912 01:08:57.368782    3529 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {/var/run/crio/crio.sock 0  <nil>}: didn't receive server preface in time. Reconnecting...
Sep 12 01:08:57 minikube kubelet[3529]: E0912 01:08:57.368954    3529 remote_runtime.go:81] Version from runtime service failed: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: timed out waiting for server handshake
Sep 12 01:08:57 minikube kubelet[3529]: E0912 01:08:57.369101    3529 kuberuntime_manager.go:193] Get runtime version failed: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: timed out waiting for server handshake
Sep 12 01:08:57 minikube kubelet[3529]: F0912 01:08:57.369124    3529 server.go:271] failed to run Kubelet: failed to create kubelet: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: timed out waiting for server handshake
Sep 12 01:08:57 minikube systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Sep 12 01:08:57 minikube systemd[1]: kubelet.service: Failed with result 'exit-code'.
Sep 12 01:08:58 minikube systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Sep 12 01:08:58 minikube systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 17.
Sep 12 01:08:58 minikube systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Sep 12 01:08:58 minikube systemd[1]: Started kubelet: The Kubernetes Node Agent.
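
When scanning a restart loop like the one above, it can help to pull out the line that actually terminated the kubelet. The kubelet lines follow the klog header layout (a severity letter `I`/`W`/`E`/`F`, then `MMDD`, time, thread id, `file:line]`, message), so the `F` entry is the fatal one. The following is a minimal sketch, not a minikube feature: the names `parse_line` and `first_fatal` are hypothetical, and the regular expression assumes the journald prefix looks exactly like the excerpt above.

```python
import re
from typing import Iterable, NamedTuple, Optional

# journald prefix ("... kubelet[3529]: ") followed by a klog header:
# severity letter, MMDD, hh:mm:ss.micros, thread id, file:line, "]", message.
KLOG_RE = re.compile(
    r"kubelet\[\d+\]: "
    r"(?P<sev>[IWEF])(?P<mmdd>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<tid>\d+) (?P<src>[^\s\]]+:\d+)\] (?P<msg>.*)"
)


class Entry(NamedTuple):
    severity: str  # one of "I", "W", "E", "F"
    source: str    # e.g. "server.go:271"
    message: str


def parse_line(line: str) -> Optional[Entry]:
    """Return the parsed kubelet entry, or None for non-kubelet lines."""
    m = KLOG_RE.search(line)
    if m is None:
        return None
    return Entry(m["sev"], m["src"], m["msg"])


def first_fatal(lines: Iterable[str]) -> Optional[Entry]:
    """Return the first entry logged at fatal (F) severity, if any."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.severity == "F":
            return entry
    return None
```

Fed the excerpt above, `first_fatal` would stop at the `server.go:271` line; the `systemd[1]` lines are skipped because they do not carry a klog header.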
@tstromberg tstromberg changed the title v1.16-rc.1 broken with crio: v1.16-rc.1 crio: It seems like the kubelet isn't running or healthy. Sep 12, 2019
@tstromberg tstromberg changed the title v1.16-rc.1 crio: It seems like the kubelet isn't running or healthy. v1.16-rc.1 crio: grpc: addrConn.createTransport failed to connect to {/var/run/crio/crio.sock 0 <nil>}: didn't receive server preface in time. Sep 12, 2019
@tstromberg
Contributor Author

Confirmed that this is specific to Kubernetes v1.16 rc1. It does not occur with beta 1.

Sep 12 01:15:10 minikube kubelet[4177]: W0912 01:15:10.872074 4177 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {/var/run/crio/crio.sock 0 <nil>}: didn't receive server preface in time. Reconnecting...
Sep 12 01:15:10 minikube kubelet[4177]: W0912 01:15:10.872273 4177 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {/var/run/crio/crio.sock 0 <nil>}: didn't receive server preface in time. Reconnecting...
Sep 12 01:15:10 minikube kubelet[4177]: E0912 01:15:10.872614 4177 remote_runtime.go:81] Version from runtime service failed: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: timed out waiting for server handshake
Sep 12 01:15:10 minikube kubelet[4177]: E0912 01:15:10.872803 4177 kuberuntime_manager.go:193] Get runtime version failed: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: timed out waiting for server handshake
Sep 12 01:15:10 minikube kubelet[4177]: F0912 01:15:10.872912 4177 server.go:271] failed to run Kubelet: failed to create kubelet: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: timed out waiting for server handshake

@tstromberg tstromberg added this to the v1.4.0 milestone Sep 12, 2019
@tstromberg tstromberg added area/kubernetes-versions Improving support for versions of Kubernetes help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Sep 12, 2019
@tstromberg tstromberg changed the title v1.16-rc.1 crio: grpc: addrConn.createTransport failed to connect to {/var/run/crio/crio.sock 0 <nil>}: didn't receive server preface in time. v1.16-rc.1 regression w/ CRI-O: Get runtime version failed: rpc error: code = Unavailable Sep 12, 2019
@afbjorklund
Collaborator

afbjorklund commented Sep 12, 2019

Possibly related to known grpc handshake issues. If so, setting GRPC_GO_REQUIRE_HANDSHAKE=off should help.

However, I can only find reports originating from crictl, not from the Kubernetes deployment itself.

grpc/grpc-go#2636
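
For context on the "didn't receive server preface in time" message: gRPC runs over HTTP/2, where the server's connection preface is a SETTINGS frame (RFC 7540, section 3.5). The sketch below is only an illustration of what that means on the wire, not how the kubelet or grpc-go performs the check. The function names are hypothetical, the timeout applies per socket operation rather than to the whole probe, and pointing it at `/var/run/crio/crio.sock` assumes you have permission to open that socket.

```python
import socket

# Fixed 24-byte string every HTTP/2 client sends first (RFC 7540, section 3.5).
CLIENT_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"
# Empty SETTINGS frame: 3-byte length 0, type 0x4, flags 0, stream id 0.
EMPTY_SETTINGS = b"\x00\x00\x00\x04\x00\x00\x00\x00\x00"
FRAME_TYPE_SETTINGS = 0x4
FRAME_HEADER_LEN = 9


def _read_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise ConnectionError if the peer closes."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before a full frame header")
        buf += chunk
    return buf


def server_sends_preface(sock: socket.socket, timeout: float = 2.0) -> bool:
    """Send the client preface, then report whether the first frame the
    server returns is a SETTINGS frame before a socket operation times out."""
    sock.settimeout(timeout)
    try:
        sock.sendall(CLIENT_PREFACE + EMPTY_SETTINGS)
        header = _read_exact(sock, FRAME_HEADER_LEN)
    except OSError:  # includes socket.timeout and ConnectionError
        return False
    return header[3] == FRAME_TYPE_SETTINGS  # byte 3 is the frame type


def probe_unix_socket(path: str, timeout: float = 2.0) -> bool:
    """Connect to a Unix socket and run the preface check against it."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        try:
            s.connect(path)
        except OSError:
            return False
        return server_sends_preface(s, timeout)
```

A `False` result from `probe_unix_socket("/var/run/crio/crio.sock")` would be consistent with the kubelet warnings above, but it does not by itself identify the cause.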

@afbjorklund
Collaborator

afbjorklund commented Sep 12, 2019

It looks like upgrading to CRI-O 1.15.2 should fix this issue. I will try it locally (the error reproduces here).

cri-o/cri-o@71c0b9e

cri-o/cri-o#2697

@afbjorklund afbjorklund added the co/runtime/crio CRIO related issues label Sep 12, 2019
@afbjorklund
Collaborator

afbjorklund commented Sep 12, 2019

Seems much happier: Your Kubernetes control-plane has initialized successfully!

🎁 Preparing Kubernetes v1.16.0-rc.1 on CRI-O 1.15.2 ...

🏄 Done! kubectl is now configured to use "minikube"
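
To check a captured `minikube start` transcript for the runtime version, the "Preparing Kubernetes ... on CRI-O ..." line can be parsed and compared against 1.15.2, the version suggested in this thread. This is a rough sketch: the helper names are hypothetical, and treating 1.15.2 as a minimum across all releases is an assumption, since other CRI-O release branches may have received the fix separately.

```python
import re
from typing import Optional, Tuple

Version = Tuple[int, int, int]

# Version suggested in the thread; using it as a lower bound is an assumption.
SUGGESTED_CRIO: Version = (1, 15, 2)

# Matches the status line printed in the transcript above.
PREPARING_RE = re.compile(
    r"Preparing Kubernetes (v\S+) on CRI-O (\d+)\.(\d+)\.(\d+)"
)


def crio_version_from_output(line: str) -> Optional[Version]:
    """Extract the CRI-O version tuple from a 'Preparing Kubernetes' line."""
    m = PREPARING_RE.search(line)
    if m is None:
        return None
    major, minor, patch = (int(part) for part in m.groups()[1:])
    return (major, minor, patch)


def meets_suggested_crio(version: Version) -> bool:
    """True when the version is at least the one suggested in the thread."""
    return version >= SUGGESTED_CRIO
```

Tuples compare element by element, so `(1, 15, 10)` correctly sorts after `(1, 15, 2)`, which a plain string comparison would get wrong.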
