Adds caller cancellation token propagation in hedging and timeout strategies#3094
Conversation
martincostello
left a comment
There was a problem hiding this comment.
This looks a lot less involved than I thought it might have been.
Just a few comments.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3094 +/- ##
=======================================
Coverage 96.16% 96.16%
=======================================
Files 310 311 +1
Lines 7136 7139 +3
Branches 1005 1006 +1
=======================================
+ Hits 6862 6865 +3
Misses 221 221
Partials 53 53
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. |
|
CI failure is fixed by #3096. |
|
Some test fixes needed for |
|
Yeah just noticed, didn't know that was still a target — will fix. |
|
Didn't think it was worth doing the whole |
|
Thanks for your contribution @DaRosenberg - the changes from this pull request have been published as part of version 8.7.0 📦, which is now available from NuGet.org 🚀 |
Updated [Polly](https://github.com/App-vNext/Polly) from 8.6.6 to 8.7.0. <details> <summary>Release notes</summary> _Sourced from [Polly's releases](https://github.com/App-vNext/Polly/releases)._ ## 8.7.0 ## Highlights * Adds caller cancellation token propagation in hedging and timeout strategies by @DaRosenberg in App-vNext/Polly#3094 * Telemetry refactoring by @martincostello in App-vNext/Polly#2985 ## What's Changed * Update zizmor to 1.22.0 by @martincostello in App-vNext/Polly#2955 * Increase test timeout by @martincostello in App-vNext/Polly#2956 * Disable secrets-outside-env audit by @martincostello in App-vNext/Polly#2969 * Update zizmor to 1.23.1 by @martincostello in App-vNext/Polly#2970 * Update .NET NuGet packages by @martincostello in App-vNext/Polly#2982 * Add AGENTS.md by @martincostello in App-vNext/Polly#2983 * Fix typo in HTTP client integrations documentation by @alexravenna in App-vNext/Polly#2984 * Remove unused constant by @martincostello in App-vNext/Polly#2986 * Fix non-deterministic branch coverage in HedgingExecutionContext hedging delay tests by @Copilot in App-vNext/Polly#2997 * Bump GitHubActionsTestLogger to 3.0.2 by @martincostello in App-vNext/Polly#3000 * Bump actionlint to v1.7.12 by @martincostello in App-vNext/Polly#3006 * Bump sign by @martincostello in App-vNext/Polly#3008 * Move Public API baselines by @martincostello in App-vNext/Polly#3016 * Formatting tweaks by @martincostello in App-vNext/Polly#3017 * Formatting tweaks by @martincostello in App-vNext/Polly#3018 * Remove ZIZMOR_VERSION by @martincostello in App-vNext/Polly#3025 * Assert nullable has result by @martincostello in App-vNext/Polly#3028 * Update deprecated action input by @martincostello in App-vNext/Polly#3035 * Move dependabot to Friday by @martincostello in App-vNext/Polly#3044 * Fix tag comment by @martincostello in App-vNext/Polly#3045 * Fix dependabot group by @martincostello in App-vNext/Polly#3047 * Pin runner images by @martincostello in App-vNext/Polly#3065 * Bump Refit to 10.2.0 by @martincostello in App-vNext/Polly#3096 * Disable Azure deployments by @martincostello in App-vNext/Polly#3105 ## New Contributors * @alexravenna made their first contribution in App-vNext/Polly#2984 * @DaRosenberg made their first contribution in App-vNext/Polly#3094 **Full Changelog**: App-vNext/Polly@8.6.6...8.7.0 Commits viewable in [compare view](App-vNext/Polly@8.6.6...8.7.0). </details> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Updated [Polly](https://github.com/App-vNext/Polly) from 8.6.6 to 8.7.0. <details> <summary>Release notes</summary> _Sourced from [Polly's releases](https://github.com/App-vNext/Polly/releases)._ ## 8.7.0 ## Highlights * Adds caller cancellation token propagation in hedging and timeout strategies by @DaRosenberg in App-vNext/Polly#3094 * Telemetry refactoring by @martincostello in App-vNext/Polly#2985 ## What's Changed * Update zizmor to 1.22.0 by @martincostello in App-vNext/Polly#2955 * Increase test timeout by @martincostello in App-vNext/Polly#2956 * Disable secrets-outside-env audit by @martincostello in App-vNext/Polly#2969 * Update zizmor to 1.23.1 by @martincostello in App-vNext/Polly#2970 * Update .NET NuGet packages by @martincostello in App-vNext/Polly#2982 * Add AGENTS.md by @martincostello in App-vNext/Polly#2983 * Fix typo in HTTP client integrations documentation by @alexravenna in App-vNext/Polly#2984 * Remove unused constant by @martincostello in App-vNext/Polly#2986 * Fix non-deterministic branch coverage in HedgingExecutionContext hedging delay tests by @Copilot in App-vNext/Polly#2997 * Bump GitHubActionsTestLogger to 3.0.2 by @martincostello in App-vNext/Polly#3000 * Bump actionlint to v1.7.12 by @martincostello in App-vNext/Polly#3006 * Bump sign by @martincostello in App-vNext/Polly#3008 * Move Public API baselines by @martincostello in App-vNext/Polly#3016 * Formatting tweaks by @martincostello in App-vNext/Polly#3017 * Formatting tweaks by @martincostello in App-vNext/Polly#3018 * Remove ZIZMOR_VERSION by @martincostello in App-vNext/Polly#3025 * Assert nullable has result by @martincostello in App-vNext/Polly#3028 * Update deprecated action input by @martincostello in App-vNext/Polly#3035 * Move dependabot to Friday by @martincostello in App-vNext/Polly#3044 * Fix tag comment by @martincostello in App-vNext/Polly#3045 * Fix dependabot group by @martincostello in App-vNext/Polly#3047 * Pin runner images by @martincostello in App-vNext/Polly#3065 * Bump Refit to 10.2.0 by @martincostello in App-vNext/Polly#3096 * Disable Azure deployments by @martincostello in App-vNext/Polly#3105 ## New Contributors * @alexravenna made their first contribution in App-vNext/Polly#2984 * @DaRosenberg made their first contribution in App-vNext/Polly#3094 **Full Changelog**: App-vNext/Polly@8.6.6...8.7.0 Commits viewable in [compare view](App-vNext/Polly@8.6.6...8.7.0). </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Pull Request
The issue or feature being addressed
Fixes #3086 (Timeout strategy does not propagate the caller's
CancellationToken)When a resilience pipeline contains a strategy that substitutes the execution
CancellationTokenwith an internal one (timeout, and also hedging), a caller-initiated cancellation surfaces anOperationCanceledExceptionwhoseCancellationTokenis Polly's internal token rather than the caller's. This breaks the common pattern of letting caller cancellation pass through unchanged while wrapping other failures, because callers cannot reliably compareOperationCanceledException.CancellationTokento their own token.Details on the issue fix or feature implementation
Goal
Polly should throw an
OperationCanceledExceptioncarrying the caller's token if and only if the cancellation was actually caused by a cancellation request on that token — for any pipeline, regardless of which strategies it is composed of or how they are nested.Approach
A repo-wide audit shows that within
Polly.Coreexactly two strategies substitute the execution token: timeout (TimeoutResilienceStrategy) and hedging (TaskExecutionviaResilienceContext.InitializeFrom). Every other strategy and the pipeline plumbing only readcontext.CancellationToken, so they already emit the correct token at their own level.The fix therefore lives in those two strategies, via a small shared helper:
TimeoutRejectedException) is unchanged.Because each substituting strategy normalizes back to its own previous token, the behavior composes correctly through arbitrary nesting: an inner timeout rewrites to the mid-level token, the outer timeout rewrites that to the caller's token, and so on. The simplest case (
AddTimeoutonly) and deeply nested cases both end up with the caller's token.Design decisions and trade-offs
new OperationCanceledException(callerToken).TrySetStackTrace(), matching the existing convention inDelegatingComponent,CompositeComponent, and hedging's pre-execution cancellation check. We deliberately did not chain the original exception as anInnerException; it keeps the behavior consistent with the rest of the codebase, at the cost of not preserving the original deep stack trace in this specific path.Polly.Core) only. The legacy v7PolicyAPI uses a combined linked token and already documents (inAsyncTimeoutEngine/TimeoutEngine) that the token on the exception is not reliable for this determination. We left v7 untouched to avoid changing long-stable behavior.callerToken.IsCancellationRequested, read after the exception has been produced. Because aCancellationTokenis a monotonic latch with no record of when or why it fired, there is an inherent, unavoidable race: if a non-caller cause produces the exception and the caller then cancels within the small window before we inspect the token, the cancellation is attributed to the caller. We chose not to add machinery to fight this, for three reasons:previousToken.IsCancellationRequestedproxy, so this change does not make it worse (it only makes the resulting token more coherent).Tests
Issues(IssuesTests.CancellationTokenPropagation_3086.cs) with end-to-end coverage: the exact issue repro, timeout±retry in both orders, hedging, nested timeouts, a no-substitution baseline, and two "only if" guards (a real timeout still throwsTimeoutRejectedException; an unrelatedOperationCanceledExceptionis preserved when the caller token is not cancelled)All
Polly.Core.Testspass; branch and method coverage remain at 100% and the new code is fully covered.Confirm the following