Fix CTS disposal in ProcessFetchInBackground that breaks cancellation and causes SemaphoreSlim convoy#6056
Closed
bgavrilMS wants to merge 2 commits into
Closed
Fix CTS disposal in ProcessFetchInBackground that breaks cancellation and causes SemaphoreSlim convoy#6056bgavrilMS wants to merge 2 commits into
bgavrilMS wants to merge 2 commits into
Conversation
… and causes SemaphoreSlim convoy The lambda passed to ProcessFetchInBackground uses 'using var' to create a linked CancellationTokenSource, but returns the Task without awaiting it. This causes the linked CTS to be disposed before the async operation completes, breaking the link to the parent cancellation token. As a result, WaitAsync on the static SemaphoreSlim(1,1) becomes permanently unkillable. When the token endpoint (IMDS) is temporarily unreachable, every proactive refresh background task becomes a permanent semaphore waiter, forming an unbounded convoy. Foreground threads that later need a token are blocked behind the convoy for hours. Fix: make the lambda async and await the inner call, so 'using var' disposal waits for the async operation to complete and parent cancellation propagates correctly through the linked token. Fixes all 4 affected locations introduced by PR #4471: - ClientCredentialRequest.cs - ManagedIdentityAuthRequest.cs - OnBehalfOfRequest.cs - CacheSilentStrategy.cs
Pattern tests proving the bug mechanism: non-async lambda with 'using var' on a linked CancellationTokenSource disposes it before the async operation completes, severing cancellation propagation through semaphore waits. - BugPattern test: demonstrates cancellation does NOT propagate (hangs) - FixPattern test: demonstrates cancellation DOES propagate (aborts promptly) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
gladjohn
approved these changes
Jun 10, 2026
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes a proactive background refresh cancellation bug caused by disposing a linked CancellationTokenSource before the returned async work completes (non-async lambda + using var). By making the background fetch lambdas async and awaiting the inner call, cancellation now properly propagates through the linked token, preventing long-lived uncancellable waits (e.g., semaphore convoy) when downstream endpoints are slow/unreachable.
Changes:
- Update 4 proactive-refresh call sites to use
asynclambdas andawaitso the linked CTS lifetime spans the async operation. - Add a regression test validating the C# language behavior difference between non-
asyncvsasynclambdas withusing varlinked CTS.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/client/Microsoft.Identity.Client/Internal/Requests/ClientCredentialRequest.cs | Make proactive refresh background fetch lambda async and await the inner token acquisition to preserve linked CTS lifetime. |
| src/client/Microsoft.Identity.Client/Internal/Requests/ManagedIdentityAuthRequest.cs | Same fix pattern for managed identity proactive refresh. |
| src/client/Microsoft.Identity.Client/Internal/Requests/OnBehalfOfRequest.cs | Same fix pattern for OBO proactive refresh. |
| src/client/Microsoft.Identity.Client/Internal/Requests/Silent/CacheSilentStrategy.cs | Same fix pattern for silent proactive refresh. |
| tests/Microsoft.Identity.Test.Unit/PublicApiTests/LinkedCancellationTokenTests.cs | Add regression tests demonstrating buggy vs fixed patterns for linked CTS disposal + cancellation propagation. |
Comment on lines
+72
to
+85
| // Act | ||
| Task backgroundTask = Task.Run(buggyLambda); | ||
|
|
||
| // Wait for the operation to start (semaphore wait is active) | ||
| await operationStarted.Task.ConfigureAwait(false); | ||
|
|
||
| // Small delay to guarantee the non-async lambda has returned and disposed the linked CTS. | ||
| // With RunContinuationsAsynchronously, we resume on a different thread, but the | ||
| // TaskRun thread may not have finished the `return` + `using var` disposal yet. | ||
| await Task.Delay(100).ConfigureAwait(false); | ||
|
|
||
| // Cancel the parent — this SHOULD propagate to the linked token... | ||
| parentCts.Cancel(); | ||
|
|
Comment on lines
+18
to
+20
| /// Pattern affected (4 files): | ||
| /// ClientCredentialRequest.cs, OboRequest.cs, SilentRequest.cs, ManagedIdentityAuthRequest.cs | ||
| /// |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #6053 — Background token refresh ignores cancellation, causing thread starvation via semaphore convoy.
The Bug
The lambda passed to
ProcessFetchInBackgroundusesusing varto create a linkedCancellationTokenSource, but returns theTaskwithout awaiting it. In a non-async method,using vardisposes at the end of the block (thereturnstatement), before the async operation completes. This severs the link to the parent cancellation token.Impact:
SemaphoreSlim.WaitAsync(cancellationToken)inManagedIdentityAuthRequestbecomes permanently uncancellable. When the token endpoint (IMDS) is temporarily unreachable, every proactive refresh background task becomes a permanent semaphore waiter → unbounded convoy → foreground threads blocked for hours.The Fix
Make the lambda
asyncandawaitthe inner call. The compiler-generated state machine keepsusing varalive until the awaited operation completes, so parent cancellation propagates correctly through the linked token.4 files fixed (all introduced by PR #4471):
ClientCredentialRequest.csManagedIdentityAuthRequest.csOnBehalfOfRequest.cs(OBO)CacheSilentStrategy.cs(Silent)Regression Test
Added
LinkedCancellationTokenTests.cswith two pattern tests:Attribution
Fix commit authored by @jayesh-a-shah (cherry-picked from PR #6054 where ADO builds were not triggering).