etcdctl member add sometimes fails with 3.3.15 #11186
If this line is printed, the new member is added to the cluster.
Not sure what caused the warning message, but it is just a timeout, and the same request was then retried and succeeded. This type of warning message is probably newly added; v3.3.14 and v3.3.15 have a new client-side balancer.
The command exits with a failure though. If it is just a harmless warning, it should not do that, so you can distinguish it from a real error. The member add does take a couple of seconds when it prints this warning, so it is probably a timeout.
The member add command did not exit with an error. IIUC, the warning is from the retry logic in the client-side balancer. I agree it feels a little confusing, but it is not output printed by the member add command itself. cc @gyuho
Oh I see. The 'etcdctl member add' command actually has two parts: 1) it sends the MemberAdd RPC to the cluster, and 2) it fetches an up-to-date member list (via Get() and MemberList()) to print the new initial-cluster configuration. From your output it looks like 2) failed with a timeout.
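For readers following the thread, here is a minimal sketch of those two parts, assuming the Go clientv3 API; the import path, the placeholder key used for the sync read, and the function shape are illustrative, not the actual etcdctl source.

```go
package sketch

import (
	"context"
	"fmt"
	"time"

	"go.etcd.io/etcd/clientv3" // exact import path depends on the etcd release
)

// memberAdd sketches the two phases described above: (1) the MemberAdd RPC,
// then (2) a sync loop that keeps calling Get() and MemberList() until both
// are answered by the same member, so the printed cluster config is current.
func memberAdd(cli *clientv3.Client, peerURLs []string) error {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Phase 1: add the member. In the reports above this part succeeds.
	addResp, err := cli.MemberAdd(ctx, peerURLs)
	if err != nil {
		return err
	}

	// Phase 2: the sync loop that can hang until the command timeout.
	for {
		getResp, err := cli.Get(ctx, "_") // quorum read; only the response header matters
		if err != nil {
			return err
		}
		listResp, err := cli.MemberList(ctx)
		if err != nil {
			return err
		}
		if listResp.Header.MemberId == getResp.Header.MemberId {
			fmt.Printf("Member %x added to cluster %x\n", addResp.Member.ID, listResp.Header.ClusterId)
			return nil
		}
	}
}
```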
Went through the changelog and did not find anything that might lead to a regression in member configuration.
I ran the command in debug mode; unfortunately, there is no particularly interesting extra output:
Increasing the command and dial timeouts to 15s does not help either; the member add just hangs for 15 seconds then. Longer timeouts probably won't change anything: our current workaround is to do a member list 5 seconds after the failure, at which point the member is in the list and etcd can start up.
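A sketch of that workaround in Go, assuming the clientv3 API; the 5-second delay and the peer-URL matching just illustrate the approach described above.

```go
package sketch

import (
	"context"
	"time"

	"go.etcd.io/etcd/clientv3" // exact import path depends on the etcd release
)

// confirmAdded implements the workaround described above: after a failed
// member add, wait a few seconds and check MemberList to see whether the
// member actually made it into the cluster despite the reported error.
func confirmAdded(cli *clientv3.Client, peerURL string) (bool, error) {
	time.Sleep(5 * time.Second)

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	listResp, err := cli.MemberList(ctx)
	if err != nil {
		return false, err
	}
	for _, m := range listResp.Members {
		for _, u := range m.PeerURLs {
			if u == peerURL {
				return true, nil // the add went through; safe to start the new etcd
			}
		}
	}
	return false, nil
}
```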
I'm seeing this also on v3.4.1, and have pinned it down to an active retry loop that spins until the command timeout is reached.
Are all etcd members healthy in the cluster? This can be verified by checking the health of each endpoint. Reading the logic in the for loop, it keeps retrying Get() and MemberList() until these two API calls are served by the same member. This logic can be optimized, but it is correct (I cannot reproduce this issue with the new balancer). Judging from the warning message, maybe Get() timed out on a member, which caused the entire command to time out?
Yes, all healthy.
The new gRPC client balancer is using round-robin, no? In that case there is no guarantee that this will ever happen: each alternating Get and MemberList RPC rotates to the next member. I suspect the only reason this appears to work much of the time is that the entire flow is racing gRPC SubConn establishment in the client balancer, such that, in practice, all RPCs happen to be dispatched over a single ready SubConn because SubConns to other members are still being stood up.
On the other hand, we can definitely optimize the logic. Every time we do a Get(), we add the endpoint that served that request to a "synced" list. Once MemberList is served from a "synced" endpoint, we are done.
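A minimal sketch of that proposal, assuming the clientv3 API; names are illustrative and this is not committed code.

```go
package sketch

import (
	"context"

	"go.etcd.io/etcd/clientv3" // exact import path depends on the etcd release
)

// syncedMemberList sketches the optimization proposed above: remember every
// member that has served a quorum Get(), and accept a MemberList response as
// soon as it comes from any of those members, instead of requiring the very
// next call to land on the same member.
func syncedMemberList(ctx context.Context, cli *clientv3.Client) (*clientv3.MemberListResponse, error) {
	synced := make(map[uint64]bool)
	for {
		getResp, err := cli.Get(ctx, "_") // the key is irrelevant; only the header is used
		if err != nil {
			return nil, err
		}
		synced[getResp.Header.MemberId] = true

		listResp, err := cli.MemberList(ctx)
		if err != nil {
			return nil, err
		}
		if synced[listResp.Header.MemberId] {
			return listResp, nil // served by a member known to have seen the quorum read
		}
	}
}
```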
Doesn't seem like a balancer issue. The failover should happen for all RPCs, and it only happens when the client receives a transient (retriable) error. Maybe you issued a command to an isolated member?
The way etcdctl is written to wait for a particular member to serve a |
Ideally, |
To root-cause this, we might need to modify the condition @jgraettinger found so that it prints out the member IDs being compared, and see what is happening there: etcd/etcdctl/ctlv3/command/member_command.go, line 166 at d08bb07
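Something along these lines could serve as that instrumentation; this is a sketch assuming the clientv3 API, not the actual member_command.go contents.

```go
package sketch

import (
	"context"
	"fmt"
	"os"

	"go.etcd.io/etcd/clientv3" // exact import path depends on the etcd release
)

// debugSyncLoop adds the logging suggested above: print the header MemberIds
// being compared on every iteration, so we can see whether the loop is simply
// never hitting the same member twice in a row.
func debugSyncLoop(ctx context.Context, cli *clientv3.Client) error {
	for {
		getResp, err := cli.Get(ctx, "_")
		if err != nil {
			return err
		}
		listResp, err := cli.MemberList(ctx)
		if err != nil {
			return err
		}
		fmt.Fprintf(os.Stderr, "sync check: Get served by %x, MemberList served by %x\n",
			getResp.Header.MemberId, listResp.Header.MemberId)
		if listResp.Header.MemberId == getResp.Header.MemberId {
			return nil
		}
	}
}
```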
I suspect the best mitigation would be to do what @jingyih described in #11186 (comment), although without the root cause it's tough to be sure it will actually fix the problem. @jgraettinger Do you know how to reproduce it in a way we could replicate? Or alternatively, would you be willing to run a build of etcdctl with additional logging to help isolate the issue?
Here is my new theory. Originally the cluster has 1 member. Using
Oh, but in this case the check "listResp.Header.MemberId == resp.Header.MemberId" should always pass. So
In our design we had originally planned to round-robin on failover, but in our implementation we do round-robin across endpoints on a per-request basis. I think @jgraettinger's assessment was correct and @jingyih's proposed fix will work. Let's still validate that, but I'm pretty confident this is the problem.
Hi gang, I was able to reproduce this pretty consistently while testing this kustomize manifest for deploying v3.4 as a StatefulSet, when scaling the cluster up beyond the initial 3 members to 4, 5, 6, etc.: https://github.com/jgraettinger/gazette/tree/kustomize/kustomize/bases/etcd
I'm happy to run it again with an instrumented build; are you able to provide a docker image I can pull? Or, if this theory is right, I'd also bet it's easy to reproduce by adding a
I tried this:
output:
So
Let me send out a quick fix on the client side. Long term, I agree we should make
Quick fix: #11194.
We recently updated our etcd from 3.3.13 to 3.3.15, and since then
etcdctl member add
sometimes fails with the following error. It seems the member add actually worked fine, though, as an etcdctl member list shows the newly added member as unstarted.
Starting the etcd process then also works fine.
This behaviour seems to be new in 3.3.15; it did not happen in 3.3.13.