concurrency.NewSession hang after etcd server is killed with SIGSTOP(19) #14631
Comments
If kill etcd leader with
I think it should have the same behaviour when killed with |
@ahrtr How about adding a configuration item that controls the timeout of gRPC calls when creating the etcd client? |
I might be completely wrong here, but I think the cause of this behavior is this option: |
Thanks, I changed |
I have no idea. I found an old issue that proposed exposing some of those options, but it went stale. Here is the issue: #13344 |
After SIGSTOP(19) of the etcd leader, won't a new leader be elected? |
Yes, the |
|
@huangjiao-heart Unfortunately you can't, as it is a private variable; you have to change it inside the package. To be clear, you have to literally open the file where the variable is hardcoded and change it manually, or make a fork and change it there. |
I'm still quite new to this, but adding a timeout to the LeaseGrant request seems to raise the appropriate error when the servers are killed. Would there be any negative repercussions if we did so? |
There are two cases:
I'm not sure about the reasoning around pausing the process. The process is frozen, which will impact all the ready connections. |
This bug can be replicated if the server is stopped by The freeze occurs during the creation of a new session, where client attempts to send a LeaseGrant request to the server. Because the grpc option I understand from the provided documentation that this option is set to minimise error responses from transient failures. In that case, I'd like to propose setting a timeout to the LeaseGrant gRPC request specifically during the creation of new sessions. I'm thinking of exposing the timeout duration as a configurable value (I'm thinking of naming it as |
@AngstyDuck Yes. gRPC has a background goroutine, the picker, that collects available connections for new transports. I think it's a good option for users who want to disable waitForReady. |
What happened?
concurrency.NewSession hangs after the etcd server is killed with SIGSTOP(19).

What did you expect to happen?
NewSession returns an error after the server is killed.

How can we reproduce it (as minimally and precisely as possible)?
Run main with the following code.
pidof etcd leader

Anything else we need to know?
If the etcd client is re-created after kill -19, it can return an error. However, in our application, the client is created at the beginning and stored for use throughout the whole lifecycle of the application.

Etcd version (please run commands below)
$ etcdctl version
paste output here

Relevant log output
No response