Backport of cache: prevent goroutine leak in agent cache into release/1.13.x #15018
Merged
hc-github-team-consul-core merged 1 commit into release/1.13.x on Oct 17, 2022
3fb3c5e to b6fb21f
Backport
This PR is auto-generated from #14908 to be assessed for backporting due to the inclusion of the label backport/1.13.
The below text is copied from the body of the original PR.
Description
There is a bug in the error-handling code for the Agent cache subsystem, discovered by @boxofrad:
- NotifyCallback calls notifyBlockingQuery, which calls getWithIndex in a loop (which backs off on error, up to 1 minute)
- getWithIndex calls fetch if there’s no valid entry in the cache
- fetch starts a goroutine which calls Fetch on the cache-type, waits for a while (again with backoff up to 1 minute for errors) and then calls fetch to trigger a refresh

The end result is that every 1 minute notifyBlockingQuery spawns an ancestry of goroutines that essentially lives forever.

This PR ensures that the goroutine started by fetch cancels any prior goroutine spawned by the same line for the same key.

In isolated testing where a cache type was tweaked to indefinitely error, this patch prevented goroutine counts from skyrocketing.
Overview of commits