etcd client with multiple endpoints fails certificate checks due to hostname mismatch #11180

Closed
jpbetz opened this issue Sep 24, 2019 · 2 comments · Fixed by #11184

jpbetz commented Sep 24, 2019

(c.f. kubernetes/kubernetes#83028)

For an etcd cluster using hostnames in TLS cert SANs, the etcd client only connects to the 1st etcd member in the endpoints list. Attempts to connect to the other members fail with:

W0924 14:42:29.368784 248333 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {https://member3.etcd.local:32379 0 }. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for member3.etcd.local, not member1.etcd.local"

This causes an etcd client to become disconnected from the etcd cluster when the 1st etcd member is unavailable.

This appears to occur with both etcd 3.3.13 (which uses the old load balancer code) and 3.1.15 (which uses the new gRPC load balancer).

Reproduction steps: https://github.com/jpbetz/etcd/blob/etcd-lb-dnsname-failover/reproduction.md

This can also be reproduced with etcdctl by stopping the 1st etcd and running:

$ etcdctl --cacert ${path-to-checkout-of-this-branch}/integration/fixtures/ca.crt --endpoints=https://member1.etcd.local:2379,https://member2.etcd.local:22379,https://member3.etcd.local:32379 get /

which outputs:

{"level":"warn","ts":"2019-09-24T15:52:25.256-0700","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-3a8f8d78-f6ba-4d72-b3d6-27cd6ce8c943/member1.etcd.local:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for member2.etcd.local, not member1.etcd.local""}
Error: context deadline exceeded
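In both log excerpts the name being checked stays `member1.etcd.local` no matter which member presented the certificate. As a rough sketch of what a per-endpoint check would need, the snippet below derives the hostname to verify from each endpoint URL. `serverNames` is a hypothetical helper written for illustration only; it is not part of etcd or gRPC, and it does not describe how the eventual fix was implemented.

```go
package main

import (
	"fmt"
	"net/url"
)

// serverNames is a hypothetical helper: it maps each endpoint URL to the
// hostname a TLS client would need to verify for that specific member.
func serverNames(endpoints []string) (map[string]string, error) {
	names := make(map[string]string, len(endpoints))
	for _, ep := range endpoints {
		u, err := url.Parse(ep)
		if err != nil {
			return nil, fmt.Errorf("parse endpoint %q: %w", ep, err)
		}
		names[ep] = u.Hostname() // host part without the port
	}
	return names, nil
}

func main() {
	// Endpoints taken from the etcdctl command above.
	endpoints := []string{
		"https://member1.etcd.local:2379",
		"https://member2.etcd.local:22379",
		"https://member3.etcd.local:32379",
	}
	names, err := serverNames(endpoints)
	if err != nil {
		panic(err)
	}
	for _, ep := range endpoints {
		fmt.Printf("%s -> verify certificate as %s\n", ep, names[ep])
	}
}
```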

xref: db61ee1

cc @gyuho @jingyih @liggitt @nerzhul @wenjiaswe

@jpbetz changed the title from "etcd client failover fails certificate check on hostname mismatch" to "etcd client with multiple endpoints fails certificate checks due to hostname mismatch" on Sep 24, 2019

gyuho commented Sep 24, 2019

c.f. kubernetes/kubernetes#83028

jpbetz commented Sep 25, 2019

Created grpc/grpc-go#3038 to discuss the issue with the gRPC team.
