etcdctl member add sometimes fails with 3.3.15 #11186
If this line is printed, the new member is added to the cluster.
Not sure what caused the warning message, but it is just a timeout, and the same request was then retried and succeeded. This type of warning message is probably newly added; v3.3.14 and v3.3.15 have a new client-side balancer.
The command exits with a failure though. If it is just a harmless warning, it should not do that, so you can distinguish it from a real error. The member add does take a couple of seconds when it prints this warning, so it is probably a timeout.
The member add command did not exit with an error. IIUC, the warning is from the retry logic in the client-side balancer. I agree it feels a little confusing, but it is not output printed by the member add command itself. cc @gyuho
Oh I see. The 'etcdctl member add' command actually has two parts: 1) it sends the MemberAdd RPC to the cluster, and 2) it fetches an up-to-date member list (via Get() and MemberList()) to print the new initial-cluster configuration. From your output it looks like 2) failed with a timeout.
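For readers following the thread, here is a minimal sketch of those two parts, assuming the Go clientv3 API; the import path, the placeholder key used for the sync read, and the function shape are illustrative, not the actual etcdctl source.

```go
package sketch

import (
	"context"
	"fmt"
	"time"

	"go.etcd.io/etcd/clientv3" // exact import path depends on the etcd release
)

// memberAdd sketches the two phases described above: (1) the MemberAdd RPC,
// then (2) a sync loop that keeps calling Get() and MemberList() until both
// are answered by the same member, so the printed cluster config is current.
func memberAdd(cli *clientv3.Client, peerURLs []string) error {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Phase 1: add the member. In the reports above this part succeeds.
	addResp, err := cli.MemberAdd(ctx, peerURLs)
	if err != nil {
		return err
	}

	// Phase 2: the sync loop that can hang until the command timeout.
	for {
		getResp, err := cli.Get(ctx, "_") // quorum read; only the response header matters
		if err != nil {
			return err
		}
		listResp, err := cli.MemberList(ctx)
		if err != nil {
			return err
		}
		if listResp.Header.MemberId == getResp.Header.MemberId {
			fmt.Printf("Member %x added to cluster %x\n", addResp.Member.ID, listResp.Header.ClusterId)
			return nil
		}
	}
}
```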
Went through the changelog and did not find anything that might lead to a regression in member configuration.
I ran the command in debug mode; unfortunately, there is no particularly interesting extra output:
Increasing the command and dial timeouts to 15s does not help either; the member add just hangs for 15 seconds then. Longer timeouts probably won't change anything: our current workaround is to do a member list 5 seconds after the failure, at which point the member is in the list and etcd can start up.
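A sketch of that workaround in Go, assuming the clientv3 API; the 5-second delay and the peer-URL matching just illustrate the approach described above.

```go
package sketch

import (
	"context"
	"time"

	"go.etcd.io/etcd/clientv3" // exact import path depends on the etcd release
)

// confirmAdded implements the workaround described above: after a failed
// member add, wait a few seconds and check MemberList to see whether the
// member actually made it into the cluster despite the reported error.
func confirmAdded(cli *clientv3.Client, peerURL string) (bool, error) {
	time.Sleep(5 * time.Second)

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	listResp, err := cli.MemberList(ctx)
	if err != nil {
		return false, err
	}
	for _, m := range listResp.Members {
		for _, u := range m.PeerURLs {
			if u == peerURL {
				return true, nil // the add went through; safe to start the new etcd
			}
		}
	}
	return false, nil
}
```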
I'm seeing this also on v3.4.1, and have pinned it down to an active retry loop that spins until the command timeout is reached.
Are all etcd members healthy in the cluster? This can be verified by checking the health of each endpoint. Reading the logic in the for loop, it keeps retrying Get() and MemberList() until these two API calls are served by the same member. This logic can be optimized, but it is correct (I cannot reproduce this issue with the new balancer). Judging from the warning message, maybe Get() timed out on a member, which caused the entire command to time out?
Yes, all healthy.
The new gRPC client balancer is using round-robin, no? In that case there is no guarantee that this will ever happen: each alternating Get and MemberList RPC rotates to the next member. I suspect the only reason this appears to work much of the time is that the entire flow is racing gRPC SubConn establishment in the client balancer, such that, in practice, all RPCs happen to be dispatched over a single ready SubConn because SubConns to other members are still being stood up.
On the other hand, we can definitely optimize the logic. Every time we do a Get(), we add the endpoint that served that request to a "synced" list. Once MemberList is served from a "synced" endpoint, we are done.
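A minimal sketch of that proposal, assuming the clientv3 API; names are illustrative and this is not committed code.

```go
package sketch

import (
	"context"

	"go.etcd.io/etcd/clientv3" // exact import path depends on the etcd release
)

// syncedMemberList sketches the optimization proposed above: remember every
// member that has served a quorum Get(), and accept a MemberList response as
// soon as it comes from any of those members, instead of requiring the very
// next call to land on the same member.
func syncedMemberList(ctx context.Context, cli *clientv3.Client) (*clientv3.MemberListResponse, error) {
	synced := make(map[uint64]bool)
	for {
		getResp, err := cli.Get(ctx, "_") // the key is irrelevant; only the header is used
		if err != nil {
			return nil, err
		}
		synced[getResp.Header.MemberId] = true

		listResp, err := cli.MemberList(ctx)
		if err != nil {
			return nil, err
		}
		if synced[listResp.Header.MemberId] {
			return listResp, nil // served by a member known to have seen the quorum read
		}
	}
}
```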
Doesn't seem like a balancer issue. The failover should happen for all RPCs, and it only happens when the client receives a transient (retriable) error. Maybe you issued a command to an isolated member?
The way etcdctl is written to wait for a particular member to serve a |
Ideally, |
To root-cause this, we might need to modify the condition @jgraettinger found so that it prints out the member IDs being compared, and see what is happening there: etcd/etcdctl/ctlv3/command/member_command.go, line 166 at d08bb07
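Something along these lines could serve as that instrumentation; this is a sketch assuming the clientv3 API, not the actual member_command.go contents.

```go
package sketch

import (
	"context"
	"fmt"
	"os"

	"go.etcd.io/etcd/clientv3" // exact import path depends on the etcd release
)

// debugSyncLoop adds the logging suggested above: print the header MemberIds
// being compared on every iteration, so we can see whether the loop is simply
// never hitting the same member twice in a row.
func debugSyncLoop(ctx context.Context, cli *clientv3.Client) error {
	for {
		getResp, err := cli.Get(ctx, "_")
		if err != nil {
			return err
		}
		listResp, err := cli.MemberList(ctx)
		if err != nil {
			return err
		}
		fmt.Fprintf(os.Stderr, "sync check: Get served by %x, MemberList served by %x\n",
			getResp.Header.MemberId, listResp.Header.MemberId)
		if listResp.Header.MemberId == getResp.Header.MemberId {
			return nil
		}
	}
}
```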
I suspect the best mitigation would be to do what @jingyih described in #11186 (comment), although without the root cause it's tough to be sure it will actually fix the problem. @jgraettinger Do you know how to reproduce it in a way we could replicate? Or alternatively, would you be willing to run a build of etcdctl with additional logging to help isolate the issue?
Here is my new theory. Originally the cluster has 1 member. Using
Oh, but in this case the check "listResp.Header.MemberId == resp.Header.MemberId" should always pass. So
In our design we had originally planned to round-robin on failover, but in our implementation we do round-robin across endpoints on a per-request basis. I think @jgraettinger's assessment was correct and @jingyih's proposed fix will work. Let's still validate that, but I'm pretty confident this is the problem.
Hi gang, I was able to reproduce this pretty consistently while testing this kustomize manifest for deploying v3.4 as a StatefulSet, when scaling the cluster up beyond the initial 3 members to 4, 5, 6, etc.: https://github.com/jgraettinger/gazette/tree/kustomize/kustomize/bases/etcd
I'm happy to run it again with an instrumented build; are you able to provide a docker image I can pull? Or, if this theory is right, I'd also bet it's easy to reproduce by adding a
I tried this:
output:
So
Let me send out a quick fix on the client side. Long term, I agree we should make
Quick fix: #11194.
We recently updated our etcd from 3.3.13 to 3.3.15, and since then
etcdctl member add
sometimes fails with the following error. It seems the member add actually worked fine, though, as an etcdctl member list shows the newly added member as unstarted.
Starting the etcd process then also works fine.
This behaviour seems to be new in 3.3.15; it did not happen in 3.3.13.