Adds caller cancellation token propagation in hedging and timeout strategies #3094

Merged
martincostello merged 4 commits into App-vNext:main from OrgFlow:main
Jun 4, 2026
Conversation

@DaRosenberg

Contributor

Pull Request

The issue or feature being addressed

Fixes #3086 (Timeout strategy does not propagate the caller's CancellationToken)

When a resilience pipeline contains a strategy that substitutes the execution CancellationToken with an internal one (timeout, and also hedging), a caller-initiated cancellation surfaces an OperationCanceledException whose CancellationToken is Polly's internal token rather than the caller's. This breaks the common pattern of letting caller cancellation pass through unchanged while wrapping other failures, because callers cannot reliably compare OperationCanceledException.CancellationToken to their own token.
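The pattern described above can be sketched as follows. This is a minimal illustration, not code from the PR: `CallerPattern` and `RunAsync` are hypothetical names, and the operation delegate stands in for a call that goes through a resilience pipeline.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CallerPattern
{
    // Hypothetical wrapper: lets the caller's own cancellation pass through
    // unchanged and wraps every other failure.
    public static async Task<string> RunAsync(
        Func<CancellationToken, Task<string>> operation,
        CancellationToken callerToken)
    {
        try
        {
            return await operation(callerToken);
        }
        catch (OperationCanceledException ex) when (ex.CancellationToken == callerToken)
        {
            // The exception carries the caller's token: rethrow it untouched.
            throw;
        }
        catch (Exception ex)
        {
            // Anything else, including an OperationCanceledException that
            // carries some other token, is treated as a failure and wrapped.
            throw new InvalidOperationException("Operation failed.", ex);
        }
    }
}
```

If the operation throws an OperationCanceledException carrying an internal token instead of the caller's, the filter does not match and the cancellation is wrapped as a failure, which is the symptom this PR addresses.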

Details on the issue fix or feature implementation

Goal

Polly should throw an OperationCanceledException carrying the caller's token if and only if the cancellation was actually caused by a cancellation request on that token — for any pipeline, regardless of which strategies it is composed of or how they are nested.

Approach

A repo-wide audit shows that within Polly.Core exactly two strategies substitute the execution token: timeout (TimeoutResilienceStrategy) and hedging (TaskExecution via ResilienceContext.InitializeFrom). Every other strategy and the pipeline plumbing only read context.CancellationToken, so they already emit the correct token at their own level.

The fix therefore lives in those two strategies, via a small shared helper:

// Polly.Core/Utils/OutcomeUtilities.cs
public static Outcome<T> WithCallerCancellation<T>(this Outcome<T> outcome, CancellationToken callerToken)
{
    if (callerToken.IsCancellationRequested
        && outcome.Exception is OperationCanceledException oce
        && oce.CancellationToken != callerToken)
    {
        return Outcome.FromException<T>(new OperationCanceledException(callerToken).TrySetStackTrace());
    }

    return outcome;
}
  • Timeout applies it to the outcome it returns, using the token that was on the context before it substituted its own. The existing timeout-detection branch (which produces TimeoutRejectedException) is unchanged.
  • Hedging applies it at its accepted-outcome return sites, using the upstream token it already captures for the duration of the hedged execution.

Because each substituting strategy normalizes back to its own previous token, the behavior composes correctly through arbitrary nesting: an inner timeout rewrites to the mid-level token, the outer timeout rewrites that to the caller's token, and so on. The simplest case (AddTimeout only) and deeply nested cases both end up with the caller's token.
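The composition argument can be illustrated without Polly at all. The sketch below is a stand-in under stated assumptions: `RunLayer` is a hypothetical model of a token-substituting strategy (it uses a linked token source, which is not necessarily how the timeout or hedging strategies create their internal tokens), and each layer rewrites a mismatched cancellation back to the token it was given.

```csharp
using System;
using System.Threading;

public static class NestedNormalization
{
    // Hypothetical model of one token-substituting layer: runs `inner` with a
    // substituted token, then normalizes a mismatched cancellation back to the
    // token this layer itself received.
    public static void RunLayer(Action<CancellationToken> inner, CancellationToken previousToken)
    {
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(previousToken);
        try
        {
            inner(linked.Token);
        }
        catch (OperationCanceledException oce)
            when (previousToken.IsCancellationRequested && oce.CancellationToken != previousToken)
        {
            throw new OperationCanceledException(previousToken);
        }
    }

    // Nests two layers, cancels the caller's token from inside the innermost
    // callback, and reports whether the surfaced exception carries that token.
    public static bool CallerSeesOwnToken()
    {
        using var caller = new CancellationTokenSource();
        try
        {
            RunLayer(
                mid => RunLayer(
                    innermost =>
                    {
                        caller.Cancel();
                        innermost.ThrowIfCancellationRequested();
                    },
                    mid),
                caller.Token);
        }
        catch (OperationCanceledException oce)
        {
            return oce.CancellationToken == caller.Token;
        }

        return false;
    }
}
```

In this model the inner layer rewrites the exception to the mid-level token and the outer layer rewrites it again to the caller's token, so `CallerSeesOwnToken()` returns true.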

Design decisions and trade-offs

  • Surgical fix: We considered normalizing the exception once at the outermost pipeline execution boundary. That would also cover hypothetical third-party strategies that substitute the token, but it adds work to the universal execution hot path. We chose the surgical approach because it adds zero overhead to the common path (work happens only inside the two strategies, only on the cancellation branch) and provably covers every composition of the built-in strategies. The trade-off is that a custom strategy that substitutes the token would remain responsible for its own token handling.
  • Exception shape: When the token must be corrected we create a bare new OperationCanceledException(callerToken).TrySetStackTrace(), matching the existing convention in DelegatingComponent, CompositeComponent, and hedging's pre-execution cancellation check. We deliberately did not chain the original exception as an InnerException; this keeps the behavior consistent with the rest of the codebase, at the cost of not preserving the original deep stack trace in this specific path.
  • Scope: v8 (Polly.Core) only. The legacy v7 Policy API uses a combined linked token and already documents (in AsyncTimeoutEngine/TimeoutEngine) that the token on the exception is not reliable for this determination. We left v7 untouched to avoid changing long-stable behavior.
  • Deliberately ignoring a benign race: Detection relies on callerToken.IsCancellationRequested, read after the exception has been produced. Because a CancellationToken is a monotonic latch with no record of when or why it fired, there is an inherent, unavoidable race: if a non-caller cause produces the exception and the caller then cancels within the small window before we inspect the token, the cancellation is attributed to the caller. We chose not to add machinery to fight this, for three reasons:
    1. The timeout-vs-caller variant of this race is pre-existing — Polly's existing timeout classification already uses the same previousToken.IsCancellationRequested proxy, so this change does not make it worse (it only makes the resulting token more coherent).
    2. It cannot be fully eliminated: the token model exposes no causal/temporal information after the fact.
    3. In every misfire the caller's token genuinely is cancelled, so attributing the cancellation to the caller is defensible rather than harmful. The "only if" guarantee that matters in practice — not claiming caller cancellation when the caller's token was never cancelled — is fully preserved.
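The race and the "only if" guarantee can be made concrete with a small deterministic sketch. It is an illustration under assumptions: `Normalize` is a simplified stand-in for the PR's helper that works on a bare exception rather than Polly's Outcome<T>, and `AttributedToCaller` is a hypothetical driver that forces the ordering instead of relying on real timing.

```csharp
using System;
using System.Threading;

public static class CallerAttribution
{
    // Simplified stand-in for the helper shown earlier, operating on an
    // exception directly (an assumption made to keep the sketch self-contained).
    public static Exception Normalize(Exception exception, CancellationToken callerToken)
    {
        if (callerToken.IsCancellationRequested
            && exception is OperationCanceledException oce
            && oce.CancellationToken != callerToken)
        {
            return new OperationCanceledException(callerToken);
        }

        return exception;
    }

    // Returns true when the normalized exception carries the caller's token.
    public static bool AttributedToCaller(bool callerCancelsInWindow)
    {
        using var caller = new CancellationTokenSource();
        using var unrelated = new CancellationTokenSource();

        // A non-caller cause produces the exception first.
        unrelated.Cancel();
        Exception produced = new OperationCanceledException(unrelated.Token);

        // The race window: the caller may cancel before the token is inspected.
        if (callerCancelsInWindow)
        {
            caller.Cancel();
        }

        var normalized = (OperationCanceledException)Normalize(produced, caller.Token);
        return normalized.CancellationToken == caller.Token;
    }
}
```

`AttributedToCaller(true)` returns true even though the unrelated source fired first, which is the benign misattribution described above. `AttributedToCaller(false)` returns false: when the caller's token was never cancelled, the original exception is left alone.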

Tests

  • A regression test file under Issues (IssuesTests.CancellationTokenPropagation_3086.cs) with end-to-end coverage: the exact issue repro, timeout±retry in both orders, hedging, nested timeouts, a no-substitution baseline, and two "only if" guards (a real timeout still throws TimeoutRejectedException; an unrelated OperationCanceledException is preserved when the caller token is not cancelled)
  • Unit tests for the new helper covering all branches
  • Focused token-identity assertions added to the timeout and hedging strategy tests

All Polly.Core.Tests pass; branch and method coverage remain at 100% and the new code is fully covered.

Confirm the following

  • I started this PR by branching from the head of the default branch
  • I have targeted the PR to merge into the default branch
  • I have included unit tests for the issue/feature
  • I have successfully run a local build

@martincostello martincostello left a comment

Member


This looks a lot less involved than I thought it might have been.

Just a few comments.

Comment thread src/Polly.Core/Utils/OutcomeUtilities.cs Outdated
@codecov

codecov Bot commented Jun 4, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.16%. Comparing base (8d97812) to head (b367960).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3094   +/-   ##
=======================================
  Coverage   96.16%   96.16%           
=======================================
  Files         310      311    +1     
  Lines        7136     7139    +3     
  Branches     1005     1006    +1     
=======================================
+ Hits         6862     6865    +3     
  Misses        221      221           
  Partials       53       53           
Flag Coverage Δ
linux 96.16% <100.00%> (+<0.01%) ⬆️
macos 96.16% <100.00%> (+<0.01%) ⬆️
windows 96.15% <100.00%> (+<0.01%) ⬆️


@martincostello

Member

CI failure is fixed by #3096.

@martincostello

Member

Some test fixes are needed for net481, which doesn't have CancelAsync().

@DaRosenberg

Contributor Author

Yeah just noticed, didn't know that was still a target — will fix.

@DaRosenberg

Contributor Author

Didn't think it was worth doing the whole #if NET8_0_OR_GREATER in test code — let me know if you think that's worth it and I can add it.
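For reference, the conditional-compilation alternative mentioned here might look like the sketch below. It is not taken from the PR: `CancellationTestHelpers` and `CancelCompatAsync` are hypothetical names. The sketch relies on `CancellationTokenSource.CancelAsync()` being available from .NET 8 onward and falls back to the synchronous `Cancel()` elsewhere.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class CancellationTestHelpers
{
    // Hypothetical test helper: uses CancelAsync() where the target framework
    // provides it and the synchronous Cancel() on older targets such as net481.
    public static Task CancelCompatAsync(this CancellationTokenSource source)
    {
#if NET8_0_OR_GREATER
        return source.CancelAsync();
#else
        source.Cancel();
        return Task.CompletedTask;
#endif
    }
}
```

A test would then call `await cts.CancelCompatAsync();` on every target. Calling `Cancel()` unconditionally, as the comment above suggests was done, avoids the directive entirely.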

@martincostello martincostello added this to the v8.7.0 milestone Jun 4, 2026
@martincostello martincostello merged commit e984839 into App-vNext:main Jun 4, 2026
27 checks passed
@github-actions

Contributor

Thanks for your contribution @DaRosenberg - the changes from this pull request have been published as part of version 8.7.0 📦, which is now available from NuGet.org 🚀

IhateTrains pushed a commit to ParadoxGameConverters/ImperatorToCK3 that referenced this pull request Jun 11, 2026
Updated [Polly](https://github.com/App-vNext/Polly) from 8.6.6 to 8.7.0.

<details>
<summary>Release notes</summary>

_Sourced from [Polly's
releases](https://github.com/App-vNext/Polly/releases)._

## 8.7.0

## Highlights

* Adds caller cancellation token propagation in hedging and timeout
strategies by @​DaRosenberg in
App-vNext/Polly#3094
* Telemetry refactoring by @​martincostello in
App-vNext/Polly#2985

## What's Changed

* Update zizmor to 1.22.0 by @​martincostello in
App-vNext/Polly#2955
* Increase test timeout by @​martincostello in
App-vNext/Polly#2956
* Disable secrets-outside-env audit by @​martincostello in
App-vNext/Polly#2969
* Update zizmor to 1.23.1 by @​martincostello in
App-vNext/Polly#2970
* Update .NET NuGet packages by @​martincostello in
App-vNext/Polly#2982
* Add AGENTS.md by @​martincostello in
App-vNext/Polly#2983
* Fix typo in HTTP client integrations documentation by @​alexravenna in
App-vNext/Polly#2984
* Remove unused constant by @​martincostello in
App-vNext/Polly#2986
* Fix non-deterministic branch coverage in HedgingExecutionContext
hedging delay tests by @​Copilot in
App-vNext/Polly#2997
* Bump GitHubActionsTestLogger to 3.0.2 by @​martincostello in
App-vNext/Polly#3000
* Bump actionlint to v1.7.12 by @​martincostello in
App-vNext/Polly#3006
* Bump sign by @​martincostello in
App-vNext/Polly#3008
* Move Public API baselines by @​martincostello in
App-vNext/Polly#3016
* Formatting tweaks by @​martincostello in
App-vNext/Polly#3017
* Formatting tweaks by @​martincostello in
App-vNext/Polly#3018
* Remove ZIZMOR_VERSION by @​martincostello in
App-vNext/Polly#3025
* Assert nullable has result by @​martincostello in
App-vNext/Polly#3028
* Update deprecated action input by @​martincostello in
App-vNext/Polly#3035
* Move dependabot to Friday by @​martincostello in
App-vNext/Polly#3044
* Fix tag comment by @​martincostello in
App-vNext/Polly#3045
* Fix dependabot group by @​martincostello in
App-vNext/Polly#3047
* Pin runner images by @​martincostello in
App-vNext/Polly#3065
* Bump Refit to 10.2.0 by @​martincostello in
App-vNext/Polly#3096
* Disable Azure deployments by @​martincostello in
App-vNext/Polly#3105

## New Contributors

* @​alexravenna made their first contribution in
App-vNext/Polly#2984
* @​DaRosenberg made their first contribution in
App-vNext/Polly#3094

**Full Changelog**:
App-vNext/Polly@8.6.6...8.7.0


Commits viewable in [compare
view](App-vNext/Polly@8.6.6...8.7.0).
</details>


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
github-actions Bot pushed a commit to IntelliTect/EssentialCSharp.ListingManager that referenced this pull request Jun 11, 2026
Updated [Polly](https://github.com/App-vNext/Polly) from 8.6.6 to 8.7.0.


Development

Successfully merging this pull request may close these issues.

[Bug]: Timeout strategy does not propagate correct CancellationToken