
Do not send a discovery request in the GrpcSubscriptionImpl destructor #4354

Closed
qiwzhang wants to merge 2 commits into envoyproxy:master from qiwzhang:grpc_mux_bug


qiwzhang commented Sep 6, 2018

Description:

This change tries to fix #4352

The root cause, with detailed logs, is described in #4167.
This PR reopens the closed PR #4178.

Problem:
When a GrpcSubscriptionImpl is deleted, its destructor tries to send one more request update. The newly added sds_api uses a GrpcSubscriptionImpl that is owned by a ListenerImpl. When the server is killed, GoogleAsyncClientThreadLocal is removed before ListenerManager is deleted. As a result, sds_api sends a Google gRPC request through an already-deleted GoogleAsyncClientThreadLocal, which causes the crash.

Ideal solution (suggested by htuch):
When GoogleAsyncClientThreadLocal is removed, somehow notify all of its clients (the GoogleAsyncClientImpl instances) not to send any more requests. The Google gRPC code is fairly complicated, so this is not easy to achieve, and the change could be large and high-risk.
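A minimal sketch of the notification idea, not Envoy code: `FakeThreadLocal` and `FakeClient` are hypothetical stand-ins for GoogleAsyncClientThreadLocal and GoogleAsyncClientImpl, and the registration and callback names are assumptions made up for illustration. The owner tells every registered client that it is going away, and a client that has been notified drops further sends instead of touching freed state.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

class FakeClient;

// Hypothetical stand-in for GoogleAsyncClientThreadLocal.
class FakeThreadLocal {
public:
  ~FakeThreadLocal(); // Notifies all registered clients; defined below.
  void registerClient(FakeClient* client) { clients_.push_back(client); }
  void unregisterClient(FakeClient* client) {
    clients_.erase(std::remove(clients_.begin(), clients_.end(), client), clients_.end());
  }
  void enqueue() { ++requests_; }
  std::size_t requests() const { return requests_; }

private:
  std::vector<FakeClient*> clients_;
  std::size_t requests_ = 0;
};

// Hypothetical stand-in for GoogleAsyncClientImpl.
class FakeClient {
public:
  explicit FakeClient(FakeThreadLocal& tls) : tls_(&tls) { tls_->registerClient(this); }
  ~FakeClient() {
    // Only unregister if the owner is still alive.
    if (tls_ != nullptr) {
      tls_->unregisterClient(this);
    }
  }
  // Called by the owner during its own destruction.
  void onThreadLocalDestroyed() { tls_ = nullptr; }
  // Returns true if the request was handed to the owner, false if it was dropped.
  bool send() {
    if (tls_ == nullptr) {
      return false;
    }
    tls_->enqueue();
    return true;
  }

private:
  FakeThreadLocal* tls_;
};

FakeThreadLocal::~FakeThreadLocal() {
  for (FakeClient* client : clients_) {
    client->onThreadLocalDestroyed();
  }
}

// Exercises the ordering from the bug report: the owner dies before its client.
bool sendAfterTeardownIsDropped() {
  auto tls = std::make_unique<FakeThreadLocal>();
  FakeClient client(*tls);
  const bool sent_while_alive = client.send();
  tls.reset(); // Owner is destroyed first, as during server shutdown.
  const bool sent_after_teardown = client.send();
  return sent_while_alive && !sent_after_teardown;
}
```

Even in this toy form the owner and its clients need two-way bookkeeping, which hints at why the real change was judged large and risky.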

Short-term solution:
When GrpcSubscriptionImpl is deleted, do not send the last request update. The change is simple and low-risk.
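The effect of the short-term fix can be sketched with a toy model. This is not the actual patch: `FakeMux` and `FakeSubscription` are hypothetical stand-ins, and the `send_on_destroy` flag exists only so the old and new behaviour can be compared side by side.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the gRPC mux: records every request it is asked to send.
struct FakeMux {
  std::vector<std::string> sent;
  void sendDiscoveryRequest(const std::string& what) { sent.push_back(what); }
};

// Hypothetical stand-in for GrpcSubscriptionImpl.
class FakeSubscription {
public:
  FakeSubscription(FakeMux& mux, bool send_on_destroy)
      : mux_(mux), send_on_destroy_(send_on_destroy) {
    mux_.sendDiscoveryRequest("initial request");
  }
  ~FakeSubscription() {
    // Old behaviour: one more request update on teardown. If the transport
    // behind the mux were already gone, this is where the crash would occur.
    // New behaviour: the destructor sends nothing.
    if (send_on_destroy_) {
      mux_.sendDiscoveryRequest("final update");
    }
  }

private:
  FakeMux& mux_;
  const bool send_on_destroy_;
};

// Returns how many requests the mux saw over the lifetime of one subscription.
std::size_t requestsOverLifetime(bool send_on_destroy) {
  FakeMux mux;
  {
    FakeSubscription subscription(mux, send_on_destroy);
  } // Subscription is destroyed here.
  return mux.sent.size();
}
```

In this model `requestsOverLifetime(true)` returns 2 and `requestsOverLifetime(false)` returns 1: with the fix, nothing is sent once destruction begins, so the teardown order of the underlying client no longer matters.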

Risk Level: Low

Testing: tested with
bazel test --runs_per_test=1000 //test/integration:sds_dynamic_integration_test

Docs Changes: No
Release Notes: No

Signed-off-by: Wayne Zhang <qiwzhang@google.com>

qiwzhang commented Sep 6, 2018

@htuch Please help to review this change.


qiwzhang commented Sep 6, 2018

Replaced with #4356

qiwzhang closed this Sep 6, 2018

Development

Successfully merging this pull request may close these issues.

sporadic timeout in sds_dynamic_integration_test
