Stop daemon-internal cancellations from leaking into CatchUpAsync (closes #284) #285

Merged: jeremydmiller merged 1 commit into main from fix/284-daemon-internal-cancellation-leak on May 18, 2026

@jeremydmiller

Member

Summary

  • GroupedProjectionExecution.processRangeAsync (and groupEventRangeAsync) now short-circuit OCE / wrapped-OCE in their generic catch when the shard's private _cancellation CTS has fired, instead of pushing the noise through ReportCriticalFailureAsync. Mirrors the existing guards in applyBatchOperationsToDatabaseAsync and SubscriptionExecutionBase.executeRange.
  • JasperFxAsyncDaemon.CatchUpAsync filters cancellation-shaped exceptions out of the aggregated recorder.States set when the inbound cancellation token was never cancelled (defense in depth, covers the AggregateException-of-OCE shape from the issue trace).
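
A minimal sketch of the first guard's shape, for readers who have not seen the existing pattern. Everything here is a hypothetical stand-in (`ShardExecutionSketch`, `ReportedFailure`, `IsCancellation`), not the JasperFx code: it assumes "wrapped OCE" means an `OperationCanceledException` somewhere in the `InnerException` chain, and it records the failure on a property instead of calling `ReportCriticalFailureAsync`.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a shard executor; not the JasperFx type.
public sealed class ShardExecutionSketch
{
    // The shard's private CTS, cancelled when the shard is stopped.
    private readonly CancellationTokenSource _cancellation = new();

    // Stands in for whatever the critical-failure reporter would receive.
    public Exception? ReportedFailure { get; private set; }

    public void Stop() => _cancellation.Cancel();

    public async Task ProcessRangeAsync(Func<CancellationToken, Task> buildBatch)
    {
        try
        {
            await buildBatch(_cancellation.Token);
        }
        catch (Exception e)
        {
            // The shard was stopped on purpose, so a cancellation here is
            // lifecycle noise: return without reporting a failure.
            if (_cancellation.IsCancellationRequested && IsCancellation(e))
            {
                return;
            }

            ReportedFailure = e;
        }
    }

    // True for an OCE, or for any exception with an OCE in its
    // InnerException chain (the assumed meaning of "wrapped OCE").
    public static bool IsCancellation(Exception? e)
    {
        for (var current = e; current != null; current = current.InnerException)
        {
            if (current is OperationCanceledException)
            {
                return true;
            }
        }

        return false;
    }
}
```

The guard checks both conditions on purpose: a cancellation-shaped exception thrown while the private CTS is still live is unexpected and is still reported.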

Why

Per #284 (JasperFx-side follow-up to JasperFx/marten#4462): Marten's ForceAllMartenDaemonActivityToCatchUpAsync test helper calls StopAllAsync() immediately before CatchUpAsync. Daemon-internal cancellation of an in-flight batch build (Npgsql 57014 surfaced through FetchProjectionStorageAsync) was being recorded onto ShardState.Exception, aggregated at JasperFxAsyncDaemon.cs:716, and re-thrown — making the helper surface benign internal-cancellation as a misleading "exceptions should be empty but had 1 item" assertion. The user-supplied CT never fired, so this was always internal-lifecycle noise.

Matches fix-path (2) + (3) from the issue's suggested fixes.

Test plan

  • dotnet build src/JasperFx.Events/JasperFx.Events.csproj — clean (216 pre-existing nullability warnings)
  • dotnet test src/EventTests/EventTests.csproj — 270/270 pass on net9.0 and net10.0
  • dotnet test src/EventStoreTests/EventStoreTests.csproj — 72/72 net9.0; 70/72 net10.0 (2 pre-existing RecentlyUsedCacheTests flakes unrelated to daemon code; pass in isolation on both main and this branch)
  • Once merged + new JasperFx.Events alpha is published, Marten side can unskip Bug_4441_force_catch_up_with_outbox.force_catch_up_invokes_message_batch_lifecycle_with_custom_outbox (currently pointed at marten#4462)

🤖 Generated with Claude Code

…oses #284)

GroupedProjectionExecution.processRangeAsync ignored the caller token and used
its private _cancellation CTS for batch builds. When that CTS fired mid-batch
(StopAllAsync / HardStopAsync / Dispose during a rebuild), the resulting
OperationCanceledException (or Npgsql 57014 wrapper) flowed into the generic
catch and was reported via ReportCriticalFailureAsync, which Recorder captured
onto a ShardState and CatchUpAsync re-threw as an AggregateException. Marten's
ForceAllMartenDaemonActivityToCatchUpAsync test helper then surfaced this as a
spurious "exceptions should be empty" failure even though no caller cancellation
ever happened.

Two-layer fix:

1. Add catch guards in GroupedProjectionExecution.processRangeAsync and
   groupEventRangeAsync that short-circuit when _cancellation.IsCancellationRequested
   is true, matching the existing pattern in applyBatchOperationsToDatabaseAsync
   and SubscriptionExecutionBase.executeRange.

2. Defense in depth in JasperFxAsyncDaemon.CatchUpAsync: drop OCE-shaped
   exceptions (including AggregateExceptions whose inners are all OCE) from
   the aggregated set when the inbound cancellation token was never cancelled.
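
The filter in step 2 could look roughly like the sketch below. `CatchUpFilterSketch`, `IsCancellationShaped`, and `Filter` are hypothetical names chosen for illustration; the real change operates on the recorder's shard states inside `CatchUpAsync`, which is not reproduced here.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Hypothetical helper; not part of JasperFx.Events.
public static class CatchUpFilterSketch
{
    // True for an OCE, or for an AggregateException whose flattened
    // inner exceptions are all OCEs.
    public static bool IsCancellationShaped(Exception e)
    {
        if (e is OperationCanceledException)
        {
            return true;
        }

        if (e is AggregateException aggregate)
        {
            var inners = aggregate.Flatten().InnerExceptions;
            return inners.Count > 0
                && inners.All(inner => inner is OperationCanceledException);
        }

        return false;
    }

    // Drops cancellation-shaped exceptions only when the caller's token
    // was never cancelled; a genuine caller cancellation is left intact.
    public static IReadOnlyList<Exception> Filter(
        IEnumerable<Exception> recorded,
        CancellationToken callerToken)
    {
        if (callerToken.IsCancellationRequested)
        {
            return recorded.ToList();
        }

        return recorded.Where(e => !IsCancellationShaped(e)).ToList();
    }
}
```

With `CancellationToken.None` as the caller token, a recorded set containing an `AggregateException` of a single `OperationCanceledException` plus an `InvalidOperationException` would come back with only the `InvalidOperationException`.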

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>