Closed
Conversation
This commit is a revival of envoyproxy#13516 which aimed to solve envoyproxy#12389. That PR was merged and then reverted due to a deadlock bug, which is still present in this commit. The fix will be applied in a subsequent commit in order to make the resolution clearer to PR reviewers. Signed-off-by: Michael Puncel <mpuncel@squareup.com>
The bug was caused by not queuing callbacks in addReadyCb() while holding the lock. This means that you could make the decision to NOT run the callback, get a secret concurrently which would run all callbacks, and then add it to the queue at which point it would never run. This also changed the name of addReadyCb() to addHealthCheckingReadyCb() on HostImpl, and made the logic add a callback for the health checking transport socket. That's the socket that we actually want health checks to wait on which may be different from the data plane socket. Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Signed-off-by: Michael Puncel <mpuncel@squareup.com>
* main: (687 commits) ci: set build debug information from env (envoyproxy#17635) ext_authz: do the authentication even the direct response is set (envoyproxy#17546) upstream: various cleanups in connection pool code (envoyproxy#17644) owners: promote Dmitry to maintainer (envoyproxy#17642) quiche: client session supports creating bidi stream (envoyproxy#17543) Update HTTP/2 METADATA documentation. (envoyproxy#17637) ext_proc: Check validity of the :status header (envoyproxy#17596) test: add ASSERT indicating that gRPC stream has not been started yet (envoyproxy#17614) ensure that the inline cookie header will be folded correctly (envoyproxy#17560) cluster_manager: Make ClusterEntry a class instead of a struct (envoyproxy#17616) owners: make Raúl a Thrift senior extension maintainer (envoyproxy#17641) quiche: update QUICHE dependency (envoyproxy#17618) Delete mock for removed RouteEntry::perFilterConfig() method (envoyproxy#17623) REPO_LAYOUT.md: fix outdated link (envoyproxy#17626) hcm: forbid use of detection extensions with use_remote_addr/xff_num_trusted_hops (envoyproxy#17558) thrift proxy: add request shadowing support (envoyproxy#17544) ext_proc: Ensure that timer is always cancelled (envoyproxy#17569) Proposal: Add CachePolicy interface to allow for custom cache behavior (envoyproxy#17362) proto: fix verify to point at v3 only (envoyproxy#17622) api: move generic matcher proto to its own package (envoyproxy#17096) ...
Contributor
Author
|
This PR is just #16236 reopened. It went stale while I was on parental leave. I haven't quite finished addressing PR feedback there, I'll add more context later today. |
Contributor
Author
|
had to force push to fix some signatures so will open a new PR again |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Commit Message: Make health check loop wait for any required SDS secrets to be loaded before starting.
Additional Description: This should avoid an issue where Envoy might take over a minute to warm clusters with default settings when SDS and active health checking are used. The issue was caused by health checks starting before secrets are ready and then waiting for no_traffic_interval (default 60s) before trying again. This change makes health checks wait for SDS secrets to be ready before starting.
Risk Level: Medium
Testing: Concurrency test covers the bug that caused this to be reverted initially, and was tested locally with --runs_per_test 100, confirming test failures when the bug is present and none when it was fixed.
Docs Changes: None
Release Notes: Included
Platform Specific Features: None
Fixes #12389, #15977, #17529