Upgrade restarts all pods after a restore happens #1008
Comments
Was this cluster also created with etcd operator 0.2.5?
Oh, yes. I created it with operator 0.2.5.
Confirmed. I am able to recreate the issue on a Kubernetes 1.6.0 cluster with etcd-operator v0.2.5. Here are the steps:
Wait for the cluster to do disaster recovery and heal back to 3 members:
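For context, this kind of recovery can be forced and observed with plain kubectl. A minimal sketch, assuming the operator's usual pod labels (`app=etcd`, `etcd_cluster=<name>`) and a cluster named `example-etcd-cluster`; none of these names are taken from this thread:

```sh
# All names and labels below are assumptions (etcd-operator 0.2.x-era conventions).
CLUSTER=example-etcd-cluster

# Simulate losing the members so the operator has to do disaster recovery.
kubectl delete pods -l app=etcd,etcd_cluster="$CLUSTER"

# Watch until the operator heals the cluster back to 3 Running members.
kubectl get pods -l app=etcd,etcd_cluster="$CLUSTER" -w
```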
After the upgrade, the existing pods die and the operator does disaster recovery to restart all the pods. Operator logs during the cluster upgrade to 3.1.6; more readable logs here:
The logs for the etcd member:
The cluster TPR after the second disaster recovery:
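A rough sketch of how the upgrade can be triggered and how the operator logs and the cluster TPR can be captured; the `cluster` TPR kind, the `name=etcd-operator` label, and the object names here are assumptions:

```sh
# Bump spec.version to 3.1.6 on the cluster TPR to start the upgrade
# ("cluster" as a TPR kind and the object name are assumptions).
kubectl edit cluster example-etcd-cluster

# Follow the operator logs while the upgrade runs
# (the label name=etcd-operator is an assumption).
OPERATOR_POD=$(kubectl get pods -l name=etcd-operator \
  -o jsonpath='{.items[0].metadata.name}')
kubectl logs -f "$OPERATOR_POD"

# Dump the cluster TPR after the second disaster recovery finishes.
kubectl get cluster example-etcd-cluster -o yaml
```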
So after the upgrade to 3.1.6 there is a period of time when all pods are stable. Then the first pod to start terminating is
The logs for all 3 pods:
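To see which pod terminates first and to capture logs from all three members, something like the following works; the labels and the cluster name remain assumptions:

```sh
# Watch for the first pod to flip to Terminating after the upgrade settles.
kubectl get pods -l app=etcd,etcd_cluster=example-etcd-cluster -w

# Save the logs of each of the three member pods for comparison.
for pod in $(kubectl get pods -l app=etcd,etcd_cluster=example-etcd-cluster \
    -o jsonpath='{.items[*].metadata.name}'); do
  kubectl logs "$pod" > "$pod.log"
done
```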
The interesting part in all the above is when
After the etcd pod failed, I collected the logs, and they have one more line at the end:
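For a pod that has already crashed, the last lines it wrote are often only visible in the previous container's logs; a hedged sketch, with `<failed-etcd-pod>` as a placeholder name:

```sh
# <failed-etcd-pod> is a placeholder for the crashed member's pod name.
# If the container restarted inside the same pod, --previous shows the
# prior instance's logs, including the final lines before the crash.
kubectl logs --previous <failed-etcd-pod>

# Exit code and termination reason recorded by the kubelet.
kubectl describe pod <failed-etcd-pod>
```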
Hmm... might this be etcd related? How do you know this is an init container issue?
Actually, other pods that didn't have init containers also failed with the same error, so it's unlikely to be an init container issue.
@junghoahnsc
Thanks for the investigation!
Waiting for a fix on etcd-io/etcd#7834.
@junghoahnsc thanks for reporting. We confirmed this is an etcd issue, not an operator problem. Please track the issue here instead: etcd-io/etcd#7856
After a restore happens (triggered by deleting nodes for testing), whenever I try to upgrade, all pods die and are then restored. But when there was no restore beforehand, the upgrade just restarts the pods.
I tested with v0.2.5.
Steps:
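A hedged sketch of this kind of reproduction, assuming a cluster named `example-etcd-cluster` created from the 0.2.x `cluster` TPR; all names, labels, and the TPR kind are illustrative, not taken from the report:

```sh
CLUSTER=example-etcd-cluster

# 1. Force a restore by deleting the member pods, then wait for the
#    operator to recover the cluster back to full size.
kubectl delete pods -l app=etcd,etcd_cluster="$CLUSTER"
kubectl get pods -l app=etcd,etcd_cluster="$CLUSTER" -w

# 2. Trigger an upgrade by bumping spec.version on the cluster TPR.
kubectl edit cluster "$CLUSTER"

# 3. Watch the pods: after a prior restore all members terminate at once,
#    whereas a normal upgrade replaces them one at a time.
kubectl get pods -l app=etcd,etcd_cluster="$CLUSTER" -w
```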