Upgrade restarts all pods after a restore happens #1008
Comments
Was this cluster also created with etcd operator 0.2.5?
Oh, yes. I created it with operator 0.2.5.
Confirmed. I am able to recreate the issue on a Kubernetes 1.6.0 cluster with etcd-operator v0.2.5. Here are the steps:
Wait for the cluster to do disaster recovery and heal back to 3 members:
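For context, this kind of recovery can be forced and observed with plain kubectl. A minimal sketch, assuming the operator's usual pod labels (`app=etcd`, `etcd_cluster=<name>`) and a cluster named `example-etcd-cluster`; none of these names are taken from this thread:

```sh
# All names and labels below are assumptions (etcd-operator 0.2.x-era conventions).
CLUSTER=example-etcd-cluster

# Simulate losing the members so the operator has to do disaster recovery.
kubectl delete pods -l app=etcd,etcd_cluster="$CLUSTER"

# Watch until the operator heals the cluster back to 3 Running members.
kubectl get pods -l app=etcd,etcd_cluster="$CLUSTER" -w
```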
After the upgrade, the existing pods die and the operator does disaster recovery to restart all the pods. Operator logs during the cluster upgrade to 3.1.6; more readable logs here:
The logs for the etcd member:
The cluster TPR after the second disaster recovery:
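A rough sketch of how the upgrade can be triggered and how the operator logs and the cluster TPR can be captured; the `cluster` TPR kind, the `name=etcd-operator` label, and the object names here are assumptions:

```sh
# Bump spec.version to 3.1.6 on the cluster TPR to start the upgrade
# ("cluster" as a TPR kind and the object name are assumptions).
kubectl edit cluster example-etcd-cluster

# Follow the operator logs while the upgrade runs
# (the label name=etcd-operator is an assumption).
OPERATOR_POD=$(kubectl get pods -l name=etcd-operator \
  -o jsonpath='{.items[0].metadata.name}')
kubectl logs -f "$OPERATOR_POD"

# Dump the cluster TPR after the second disaster recovery finishes.
kubectl get cluster example-etcd-cluster -o yaml
```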
So after the upgrade to 3.1.6 there is a period of time when all pods are stable. Then the first pod to start terminating is
The logs for all 3 pods:
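To see which pod terminates first and to capture logs from all three members, something like the following works; the labels and the cluster name remain assumptions:

```sh
# Watch for the first pod to flip to Terminating after the upgrade settles.
kubectl get pods -l app=etcd,etcd_cluster=example-etcd-cluster -w

# Save the logs of each of the three member pods for comparison.
for pod in $(kubectl get pods -l app=etcd,etcd_cluster=example-etcd-cluster \
    -o jsonpath='{.items[*].metadata.name}'); do
  kubectl logs "$pod" > "$pod.log"
done
```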
The interesting part in all the above is when
After the etcd pod failed, I collected the logs, and they have one more line at the end:
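For a pod that has already crashed, the last lines it wrote are often only visible in the previous container's logs; a hedged sketch, with `<failed-etcd-pod>` as a placeholder name:

```sh
# <failed-etcd-pod> is a placeholder for the crashed member's pod name.
# If the container restarted inside the same pod, --previous shows the
# prior instance's logs, including the final lines before the crash.
kubectl logs --previous <failed-etcd-pod>

# Exit code and termination reason recorded by the kubelet.
kubectl describe pod <failed-etcd-pod>
```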
Hmm... might this be etcd related? How do you know this is an init container issue?
Actually, other pods that didn't have init containers also failed with the same error, so it's unlikely to be an init container issue.
@junghoahnsc
Thanks for the investigation!
Waiting for a fix on etcd-io/etcd#7834.
@junghoahnsc thanks for reporting. We confirmed this is an etcd issue, not an operator problem. Please track the issue here instead: etcd-io/etcd#7856
After a restore happens (triggered by deleting nodes for testing), whenever I try to upgrade, all pods die and are then restored. But when there was no restore beforehand, the upgrade just restarts the pods.
I tested with v0.2.5.
Steps:
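A hedged sketch of this kind of reproduction, assuming a cluster named `example-etcd-cluster` created from the 0.2.x `cluster` TPR; all names, labels, and the TPR kind are illustrative, not taken from the report:

```sh
CLUSTER=example-etcd-cluster

# 1. Force a restore by deleting the member pods, then wait for the
#    operator to recover the cluster back to full size.
kubectl delete pods -l app=etcd,etcd_cluster="$CLUSTER"
kubectl get pods -l app=etcd,etcd_cluster="$CLUSTER" -w

# 2. Trigger an upgrade by bumping spec.version on the cluster TPR.
kubectl edit cluster "$CLUSTER"

# 3. Watch the pods: after a prior restore all members terminate at once,
#    whereas a normal upgrade replaces them one at a time.
kubectl get pods -l app=etcd,etcd_cluster="$CLUSTER" -w
```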