
Kubeadm hosted install fails (kube-dns CrashLoopBackOff after 3m) help wanted #1337

Closed
pksec opened this issue Nov 10, 2017 · 3 comments

@pksec

pksec commented Nov 10, 2017

I used the Calico manifests and the default kubeadm setup instructions.

  • At first I thought it could be a problem with Kubernetes, but I was able to deploy Weave successfully.
  • I successfully tested the star-policy and advanced-policy demos a few months back; now they are not working. I have spent almost two days trying to find whether I missed something, and I think it is time to ask for help, as I cannot understand what is happening.
  • I am still new to Kubernetes and Calico, so please point out anything silly I may have missed.

Kubernetes issues (similar to this one, but they appear to have been fixed)

My Environment

Steps to Reproduce (for bugs)

https://docs.projectcalico.org/v2.6/getting-started/kubernetes/installation/hosted/kubeadm/

  1. sudo kubeadm init --service-cidr=10.96.0.0/12 --pod-network-cidr=192.168.0.0/16
  2. kubectl apply -f https://docs.projectcalico.org/v2.6/getting-started/kubernetes/installation/hosted/kubeadm/1.6/calico.yaml
  3. kubectl taint nodes --all node-role.kubernetes.io/master-
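
(A quick sanity check after step 2, not part of the official steps and only a sketch assuming the default labels in calico.yaml, is to confirm the calico-node pods come up and that a CNI config was written:)

kubectl get pods -n kube-system -l k8s-app=calico-node
ls /etc/cni/net.d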

Current Behavior

kubelet config

/etc/systemd/system/kubelet.service.d$ sudo nano 10-kubeadm.conf

[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_EXTRA_ARGS
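
(If this drop-in were edited, for example to tweak KUBELET_NETWORK_ARGS, systemd would need a daemon-reload and the kubelet a restart before the change takes effect; a minimal sketch:)

sudo systemctl daemon-reload
sudo systemctl restart kubelet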

pod status

kubectl get pods --all-namespaces

NAMESPACE     NAME                                       READY     STATUS             RESTARTS   AGE
kube-system   calico-etcd-d63hj                          1/1       Running            0          7m
kube-system   calico-kube-controllers-1449740419-v0gc7   1/1       Running            0          7m
kube-system   calico-node-8mstx                          2/2       Running            0          7m
kube-system   etcd-pka-sec                               1/1       Running            0          8m
kube-system   kube-apiserver-pka-sec                     1/1       Running            0          8m
kube-system   kube-controller-manager-pka-sec            1/1       Running            0          8m
kube-system   kube-dns-2617979913-9q70b                  2/3       CrashLoopBackOff   7          9m
kube-system   kube-proxy-xv1sl                           1/1       Running            0          9m
kube-system   kube-scheduler-pka-sec                     1/1       Running            0          8m

kube-dns crashes

kube-dns log

kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c kubedns

I1110 09:48:48.502946     427 dns.go:48] version: 1.14.4-2-g5584e04
I1110 09:48:48.503799     427 server.go:70] Using configuration read from directory: /kube-dns-config with period 10s
I1110 09:48:48.503838     427 server.go:113] FLAG: --alsologtostderr="false"
I1110 09:48:48.503848     427 server.go:113] FLAG: --config-dir="/kube-dns-config"
I1110 09:48:48.503855     427 server.go:113] FLAG: --config-map=""
I1110 09:48:48.503860     427 server.go:113] FLAG: --config-map-namespace="kube-system"
I1110 09:48:48.503864     427 server.go:113] FLAG: --config-period="10s"
I1110 09:48:48.503871     427 server.go:113] FLAG: --dns-bind-address="0.0.0.0"
I1110 09:48:48.503875     427 server.go:113] FLAG: --dns-port="10053"
I1110 09:48:48.503881     427 server.go:113] FLAG: --domain="cluster.local."
I1110 09:48:48.503892     427 server.go:113] FLAG: --federations=""
I1110 09:48:48.503901     427 server.go:113] FLAG: --healthz-port="8081"
I1110 09:48:48.503917     427 server.go:113] FLAG: --initial-sync-timeout="1m0s"
I1110 09:48:48.503926     427 server.go:113] FLAG: --kube-master-url=""
I1110 09:48:48.503936     427 server.go:113] FLAG: --kubecfg-file=""
I1110 09:48:48.503944     427 server.go:113] FLAG: --log-backtrace-at=":0"
I1110 09:48:48.503955     427 server.go:113] FLAG: --log-dir=""
I1110 09:48:48.503964     427 server.go:113] FLAG: --log-flush-frequency="5s"
I1110 09:48:48.503972     427 server.go:113] FLAG: --logtostderr="true"
I1110 09:48:48.503979     427 server.go:113] FLAG: --nameservers=""
I1110 09:48:48.503987     427 server.go:113] FLAG: --stderrthreshold="2"
I1110 09:48:48.503995     427 server.go:113] FLAG: --v="2"
I1110 09:48:48.504004     427 server.go:113] FLAG: --version="false"
I1110 09:48:48.504016     427 server.go:113] FLAG: --vmodule=""
I1110 09:48:48.504069     427 server.go:176] Starting SkyDNS server (0.0.0.0:10053)
I1110 09:48:48.504300     427 server.go:198] Skydns metrics enabled (/metrics:10055)
I1110 09:48:48.504315     427 dns.go:147] Starting endpointsController
I1110 09:48:48.504324     427 dns.go:150] Starting serviceController
I1110 09:48:48.504381     427 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I1110 09:48:48.504419     427 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I1110 09:48:49.004534     427 dns.go:174] Waiting for services and endpoints to be initialized from apiserver...
E1110 09:49:18.504958     427 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1110 09:49:18.504954     427 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I1110 09:49:19.004556     427 dns.go:174] Waiting for services and endpoints to be initialized from apiserver..
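
(The i/o timeouts above mean the kubedns container cannot reach the API server through the cluster IP 10.96.0.1. A quick check that the kubernetes Service and its Endpoints exist, added here as a diagnostic sketch:)

kubectl get svc kubernetes
kubectl get endpoints kubernetes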

kubectl describe pod kube-dns-2617979913-9q70b -n kube-system

Name:		kube-dns-2617979913-9q70b
Namespace:	kube-system
Node:		pka-sec/10.192.155.211
Start Time:	Fri, 10 Nov 2017 09:54:06 +0100
Labels:		k8s-app=kube-dns
		pod-template-hash=2617979913
Annotations:	kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kube-system","name":"kube-dns-2617979913","uid":"6c5c954a-c5f4-11e7-ab92-e018770a...
		scheduler.alpha.kubernetes.io/critical-pod=
Status:		Running
IP:		192.168.48.117
Created By:	ReplicaSet/kube-dns-2617979913
Controlled By:	ReplicaSet/kube-dns-2617979913
Containers:
  kubedns:
    Container ID:	docker://6061c4765a8bde9ce4b6e8cfcaf86f869a3c83e9a85948018b381c066e17b317
    Image:		gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.5
    Image ID:		docker-pullable://gcr.io/google_containers/k8s-dns-kube-dns-amd64@sha256:1a3fc069de481ae690188f6f1ba4664b5cc7760af37120f70c86505c79eea61d
    Ports:		10053/UDP, 10053/TCP, 10055/TCP
    Args:
      --domain=cluster.local.
      --dns-port=10053
      --config-dir=/kube-dns-config
      --v=2
    State:		Running
      Started:		Fri, 10 Nov 2017 09:55:09 +0100
    Last State:		Terminated
      Reason:		Error
      Exit Code:	255
      Started:		Fri, 10 Nov 2017 09:54:07 +0100
      Finished:		Fri, 10 Nov 2017 09:55:08 +0100
    Ready:		False
    Restart Count:	1
    Limits:
      memory:	170Mi
    Requests:
      cpu:	100m
      memory:	70Mi
    Liveness:	http-get http://:10054/healthcheck/kubedns delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:	http-get http://:8081/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
    Environment:
      PROMETHEUS_PORT:	10055
    Mounts:
      /kube-dns-config from kube-dns-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-hmg08 (ro)
  dnsmasq:
    Container ID:	docker://4e750303e28ebe6a13e70f26d1968129dc5816a1a0f32c2bc60a14a6cf88efd8
    Image:		gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.5
    Image ID:		docker-pullable://gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64@sha256:46b933bb70270c8a02fa6b6f87d440f6f1fce1a5a2a719e164f83f7b109f7544
    Ports:		53/UDP, 53/TCP
    Args:
      -v=2
      -logtostderr
      -configDir=/etc/k8s/dns/dnsmasq-nanny
      -restartDnsmasq=true
      --
      -k
      --cache-size=1000
      --log-facility=-
      --server=/cluster.local/127.0.0.1#10053
      --server=/in-addr.arpa/127.0.0.1#10053
      --server=/ip6.arpa/127.0.0.1#10053
    State:		Running
      Started:		Fri, 10 Nov 2017 09:54:08 +0100
    Ready:		True
    Restart Count:	0
    Requests:
      cpu:		150m
      memory:		20Mi
    Liveness:		http-get http://:10054/healthcheck/dnsmasq delay=60s timeout=5s period=10s #success=1 #failure=5
    Environment:	<none>
    Mounts:
      /etc/k8s/dns/dnsmasq-nanny from kube-dns-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-hmg08 (ro)
  sidecar:
    Container ID:	docker://62c6f4de220fe6f5e679448f4815cf6dd9a2932e5b57e8166193d55e5f94cfe1
    Image:		gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.5
    Image ID:		docker-pullable://gcr.io/google_containers/k8s-dns-sidecar-amd64@sha256:9aab42bf6a2a068b797fe7d91a5d8d915b10dbbc3d6f2b10492848debfba6044
    Port:		10054/TCP
    Args:
      --v=2
      --logtostderr
      --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,A
      --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,A
    State:		Running
      Started:		Fri, 10 Nov 2017 09:54:08 +0100
    Ready:		True
    Restart Count:	0
    Requests:
      cpu:		10m
      memory:		20Mi
    Liveness:		http-get http://:10054/metrics delay=60s timeout=5s period=10s #success=1 #failure=5
    Environment:	<none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-hmg08 (ro)
Conditions:
  Type		Status
  Initialized 	True 
  Ready 	False 
  PodScheduled 	True 
Volumes:
  kube-dns-config:
    Type:	ConfigMap (a volume populated by a ConfigMap)
    Name:	kube-dns
    Optional:	true
  kube-dns-token-hmg08:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	kube-dns-token-hmg08
    Optional:	false
QoS Class:	Burstable
Node-Selectors:	<none>
Tolerations:	CriticalAddonsOnly
		node-role.kubernetes.io/master:NoSchedule
		node.alpha.kubernetes.io/notReady:NoExecute for 300s
		node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath			Type	Reason			Message
  ---------	--------	-----	----			-------------			--------	------			-------
  3m		2m		8	default-scheduler					Warning	FailedScheduling	no nodes available to schedule pods
  1m		1m		1	default-scheduler					Normal	Scheduled		Successfully assigned kube-dns-2617979913-9q70b to pka-sec
  1m		1m		1	kubelet, pka-sec					Normal	SuccessfulMountVolume	MountVolume.SetUp succeeded for volume "kube-dns-config" 
  1m		1m		1	kubelet, pka-sec					Normal	SuccessfulMountVolume	MountVolume.SetUp succeeded for volume "kube-dns-token-hmg08" 
  1m		1m		1	kubelet, pka-sec	spec.containers{dnsmasq}	Normal	Pulled			Container image "gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.5" already present on machine
  1m		1m		1	kubelet, pka-sec	spec.containers{sidecar}	Normal	Created			Created container
  1m		1m		1	kubelet, pka-sec	spec.containers{sidecar}	Normal	Started			Started container
  1m		1m		1	kubelet, pka-sec	spec.containers{dnsmasq}	Normal	Created			Created container
  1m		1m		1	kubelet, pka-sec	spec.containers{dnsmasq}	Normal	Started			Started container
  1m		1m		1	kubelet, pka-sec	spec.containers{sidecar}	Normal	Pulled			Container image "gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.5" already present on machine
  1m		41s		2	kubelet, pka-sec	spec.containers{kubedns}	Normal	Pulled			Container image "gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.5" already present on machine
  1m		41s		2	kubelet, pka-sec	spec.containers{kubedns}	Normal	Created			Created container
  1m		41s		2	kubelet, pka-sec	spec.containers{kubedns}	Normal	Started			Started container
  1m		4s		11	kubelet, pka-sec	spec.containers{kubedns}	Warning	Unhealthy		Readiness probe failed: Get http://192.168.48.117:8081/readiness: dial tcp 192.168.48.117:8081: getsockopt: connection refused
  34s		4s		4	kubelet, pka-sec	spec.containers{dnsmasq}	Warning	Unhealthy		Liveness probe failed: HTTP probe failed with statuscode: 503

kubectl get nodes -o wide

NAME      STATUS    AGE       VERSION   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION
pka-sec   Ready     47m       v1.7.10   <none>        Ubuntu 16.04.3 LTS   4.4.0-98-generic
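
(Since traffic to the cluster IP 10.96.0.1 times out from inside the pod, one more thing worth checking on the node, added here as a sketch and assuming kube-proxy is running in its default iptables mode, is whether the service NAT rules are present:)

sudo iptables-save | grep 10.96.0.1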
@tmjd
Member

tmjd commented Nov 10, 2017

You should check the kube-apiserver logs and the kube-proxy logs (on the host where kube-dns is running) for errors and see if they point out any issues.
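
(For reference, using the pod names from the listing above, that would look something like the following; names will differ on other clusters:)

kubectl logs -n kube-system kube-apiserver-pka-sec
kubectl logs -n kube-system kube-proxy-xv1sl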

@oomichi

oomichi commented Feb 14, 2018

/assign oomichi

@oomichi

oomichi commented Feb 14, 2018

Unfortunately I am not a collaborator on this repository, so I cannot close this issue myself.
I think it can be closed because of kubernetes/website#7397
