This repository has been archived by the owner on Jan 20, 2022. It is now read-only.

ETCD TLS Bad Certificate #335

Open
bencodner opened this issue Jul 30, 2020 · 0 comments

I have run into this issue a few times now and am trying to understand what keeps causing it. I was previously running 1.13, but I did a fresh install in our dev environment to upgrade to v1.17.4, and everything had been running great until today.

KOPS:
Version 1.17.0-beta.1 (git-32af4ed9b)
----------------------------------------------------------------------------------------------------------------------
KUBECTL:
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.4", GitCommit:"8d8aa39598534325ad77120c120a22b3a990b5ea", GitTreeState:"clean", BuildDate:"2020-03-12T21:03:42Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}

Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.4", GitCommit:"8d8aa39598534325ad77120c120a22b3a990b5ea", GitTreeState:"clean", BuildDate:"2020-03-12T20:55:23Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}

----------------------------------------------------------------------------------------------------------------------
Image: kopeio/etcd-manager:3.0.20200116

I0730 16:38:32.187889    2613 controller.go:173] starting controller iteration
I0730 16:38:32.187922    2613 controller.go:269] I am leader with token "BxoWU5-sK8fgJ6ojmkfx-A"
2020-07-30 16:38:32.193214 I | embed: rejected connection from "10.10.10.95:38972" (error "remote error: tls: bad certificate", ServerName "etcd-events-a.internal.k8s-west2.redacted.net")
2020-07-30 16:38:33.193650 I | embed: rejected connection from "10.10.10.95:38974" (error "remote error: tls: bad certificate", ServerName "etcd-events-a.internal.k8s-west2.redacted.net")
2020-07-30 16:38:34.930768 I | embed: rejected connection from "10.10.10.95:38978" (error "remote error: tls: bad certificate", ServerName "etcd-events-a.internal.k8s-west2.redacted.net")
W0730 16:38:37.188592    2613 controller.go:675] unable to reach member etcdClusterPeerInfo{peer=peer{id:"etcd-events-a" endpoints:"10.10.10.95:3997" }, info=cluster_name:"etcd-events" node_configuration:<name:"etcd-events-a" peer_urls:"https://etcd-events-a.internal.k8s-west2.redacted.net:2381" client_urls:"https://etcd-events-a.internal.k8s-west2.redacted.net:4002" quarantined_client_urls:"https://etcd-events-a.internal.k8s-west2.redacted.net:3995" > etcd_state:<cluster:<desired_cluster_size:1 cluster_token:"etcd-cluster-token-etcd-events" nodes:<name:"etcd-events-a" peer_urls:"https://etcd-events-a.internal.k8s-west2.redacted.net:2381" client_urls:"https://0.0.0.0:4002" quarantined_client_urls:"http://0.0.0.0:3995" tls_enabled:true > > etcd_version:"3.3.13" > }: error building etcd client for https://etcd-events-a.internal.k8s-west2.redacted.net:4002: context deadline exceeded
I0730 16:38:37.188691    2613 controller.go:276] etcd cluster state: etcdClusterState
  members:
  peers:
    etcdClusterPeerInfo{peer=peer{id:"etcd-events-a" endpoints:"10.10.10.95:3997" }, info=cluster_name:"etcd-events" node_configuration:<name:"etcd-events-a" peer_urls:"https://etcd-events-a.internal.k8s-west2.redacted.net:2381" client_urls:"https://etcd-events-a.internal.k8s-west2.redacted.net:4002" quarantined_client_urls:"https://etcd-events-a.internal.k8s-west2.redacted.net:3995" > etcd_state:<cluster:<desired_cluster_size:1 cluster_token:"etcd-cluster-token-etcd-events" nodes:<name:"etcd-events-a" peer_urls:"https://etcd-events-a.internal.k8s-west2.redacted.net:2381" client_urls:"https://0.0.0.0:4002" quarantined_client_urls:"http://0.0.0.0:3995" tls_enabled:true > > etcd_version:"3.3.13" > }
I0730 16:38:37.188727    2613 controller.go:277] etcd cluster members: map[]
I0730 16:38:37.188742    2613 controller.go:615] sending member map to all peers: members:<name:"etcd-events-a" dns:"etcd-events-a.internal.k8s-west2.redacted.net" addresses:"10.10.10.95" > 
I0730 16:38:37.189042    2613 etcdserver.go:226] updating hosts: map[10.10.10.95:[etcd-events-a.internal.k8s-west2.redacted.net]]
I0730 16:38:37.189068    2613 hosts.go:84] hosts update: primary=map[10.10.10.95:[etcd-events-a.internal.k8s-west2.redacted.net]], fallbacks=map[etcd-events-a.internal.k8s-west2.redacted.net:[10.10.10.95 10.10.10.95]], final=map[10.10.10.95:[etcd-events-a.internal.k8s-west2.redacted.net]]
I0730 16:38:37.204747    2613 commands.go:22] not refreshing commands - TTL not hit
I0730 16:38:37.204774    2613 s3fs.go:220] Reading file "s3://k8s-west2.redacted.net-kops-store/k8s-west2.redacted.net/backups/etcd/events/control/etcd-cluster-created"
I0730 16:38:37.240942    2613 controller.go:369] spec member_count:1 etcd_version:"3.3.13" 
I0730 16:38:37.241108    2613 commands.go:25] refreshing commands
I0730 16:38:37.340952    2613 vfs.go:104] listed commands in s3://k8s-west2.redacted.net-kops-store/k8s-west2.redacted.net/backups/etcd/events/control: 0 commands
I0730 16:38:37.340985    2613 s3fs.go:220] Reading file "s3://k8s-west2.redacted.net-kops-store/k8s-west2.redacted.net/backups/etcd/events/control/etcd-cluster-spec"
W0730 16:38:37.353014    2613 controller.go:149] unexpected error running etcd cluster reconciliation loop: etcd has 0 members registered; must issue restore-backup command to proceed
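
To see whether the rejections come from a single client, the `rejected connection` lines can be tallied per source address. This is a minimal sketch rather than anything etcd-manager provides: the function name `rejected_by_source` is made up for illustration, and it assumes the log lines keep the `embed: rejected connection from "IP:PORT"` shape shown above.

```shell
# Count etcd "rejected connection" log lines per source IP.
# Reads the files given as arguments, or stdin when none are given.
# rejected_by_source is a hypothetical helper name, not an etcd tool.
rejected_by_source() {
  awk -F'"' '
    /embed: rejected connection from/ {
      split($2, addr, ":")   # $2 is the quoted "IP:PORT" peer address
      count[addr[1]]++
    }
    END {
      for (ip in count) print count[ip], ip
    }
  ' "$@"
}

# Example (host path taken from the varlogetcd mount of the events pod):
#   rejected_by_source /var/log/etcd-events.log
```

For the three rejection lines in the excerpt above, this prints `3 10.10.10.95`, i.e. every rejected connection originates from the master's own address.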
----------------------------------------------------------------------------------------------------------------------

EVENTS POD: 
Name:                 etcd-manager-events-ip-10-10-10-95.us-west-2.compute.internal
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 ip-10-10-10-95.us-west-2.compute.internal/10.10.10.95
Start Time:           Sun, 19 Apr 2020 11:49:47 -0600
Labels:               k8s-app=etcd-manager-events
Annotations:          kubernetes.io/config.hash: 8462a5a3729a329407ca0e6b37444ad5
                      kubernetes.io/config.mirror: 8462a5a3729a329407ca0e6b37444ad5
                      kubernetes.io/config.seen: 2020-04-19T17:49:46.926014141Z
                      kubernetes.io/config.source: file
                      scheduler.alpha.kubernetes.io/critical-pod: 
Status:               Running
IP:                   10.10.10.95
IPs:
  IP:           10.10.10.95
Controlled By:  Node/ip-10-10-10-95.us-west-2.compute.internal
Containers:
  etcd-manager:
    Container ID:  docker://b99646659d82a3f2845e3aaebb6b3b6730c406badfb5668fae732f37a8fca5a2
    Image:         kopeio/etcd-manager:3.0.20200116
    Image ID:      docker-pullable://kopeio/etcd-manager@sha256:eb72d0a120059598446e4ed45781e40ff79a3bbcaa5861ea1b3d0e72a6654af5
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c
      mkfifo /tmp/pipe; (tee -a /var/log/etcd.log < /tmp/pipe & ) ; exec /etcd-manager --backup-store=s3://k8s-west2.redacted.net-kops-store/k8s-west2.redacted.net/backups/etcd/events --client-urls=https://__name__:4002 --cluster-name=etcd-events --containerized=true --dns-suffix=.internal.k8s-west2.redacted.net --etcd-insecure=false --grpc-port=3997 --insecure=false --peer-urls=https://__name__:2381 --quarantine-client-urls=https://__name__:3995 --v=6 --volume-name-tag=k8s.io/etcd/events --volume-provider=aws --volume-tag=k8s.io/etcd/events --volume-tag=k8s.io/role/master=1 --volume-tag=kubernetes.io/cluster/k8s-west2.redacted.net=owned > /tmp/pipe 2>&1
    State:          Running
      Started:      Sun, 19 Apr 2020 11:47:56 -0600
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        200m
      memory:     100Mi
    Environment:  <none>
    Mounts:
      /etc/hosts from hosts (rw)
      /etc/kubernetes/pki/etcd-manager from pki (rw)
      /rootfs from rootfs (rw)
      /var/log/etcd.log from varlogetcd (rw)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  rootfs:
    Type:          HostPath (bare host directory volume)
    Path:          /
    HostPathType:  Directory
  hosts:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/hosts
    HostPathType:  File
  pki:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/pki/etcd-manager-events
    HostPathType:  DirectoryOrCreate
  varlogetcd:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/etcd-events.log
    HostPathType:  FileOrCreate
QoS Class:         Burstable
Node-Selectors:    <none>
Tolerations:       :NoExecute
                   CriticalAddonsOnly
Events:            <none>

----------------------------------------------------------------------------------------------------------------------

Cluster Validation: 
Validating cluster k8s-west2.redacted.net


VALIDATION ERRORS
KIND    NAME  MESSAGE
ComponentStatus etcd-0  component "etcd-0" is unhealthy
ComponentStatus etcd-1  component "etcd-1" is unhealthy

Validation Failed

----------------------------------------------------------------------------------------------------------------------
Host certs:
for i in /etc/kubernetes/pki/etcd-manager-events/*.crt; do openssl x509 -enddate -noout -in "$i"; done
notAfter=Jul 26 20:31:10 2029 GMT
notAfter=Mar 26 17:03:49 2029 GMT
notAfter=Apr 19 17:49:00 2021 GMT
notAfter=Apr 19 17:49:00 2021 GMT
notAfter=Jul 26 20:31:10 2029 GMT
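
To rule out plain expiry, each `notAfter` value can be turned into a number of days remaining at a given reference time. This is a rough sketch that assumes GNU `date` (for `-d`); the function name `days_until_expiry` is made up for illustration.

```shell
# Print whole days between a reference time and a certificate's notAfter value.
# $1: a line as printed by `openssl x509 -enddate -noout`, e.g.
#     "notAfter=Apr 19 17:49:00 2021 GMT"
# $2: optional reference time understood by GNU date (defaults to "now")
# days_until_expiry is a hypothetical helper name; GNU date is assumed.
days_until_expiry() {
  not_after=${1#notAfter=}            # strip the "notAfter=" prefix
  ref=${2:-now}
  end_s=$(date -u -d "$not_after" +%s) || return 1
  ref_s=$(date -u -d "$ref" +%s) || return 1
  echo $(( (end_s - ref_s) / 86400 )) # negative once the certificate has expired
}

# Example:
#   days_until_expiry "$(openssl x509 -enddate -noout -in some-cert.crt)"
```

Taking the log timestamp `2020-07-30 16:38:32` as UTC (an assumption, since the log lines carry no time zone), the two certificates expiring on Apr 19 2021 still had about 263 days left, so none of the listed certificates had expired when the errors were logged.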