fix(batched-query): enlist FetchForExclusiveWriting item before awaiting BeginTransactionAsync (#4589) #4590

Merged
jeremydmiller merged 1 commit into JasperFx:master from steve-ziegler:fix/batchedquery-fetchforexclusivewriting-enlist-before-await
May 30, 2026

@steve-ziegler

Contributor

Fixes #4589.

What

Reorders the two BatchedQuery.FetchForExclusiveWriting<T> overloads (Guid and string-key) to call AddItem(handler) synchronously before the await Parent.BeginTransactionAsync(...).

Why

Detail in #4589. TL;DR:

  • The current code awaits BeginTransactionAsync first, then calls AddItem. Under concurrency, BeginTransactionAsync yields before AddItem runs.
  • Wolverine HTTP [Aggregate(LoadStyle = Exclusive)] codegen calls FetchForExclusiveWriting without immediately awaiting it (the Task is captured for later resolution), then await batchQuery.Execute(ct).
  • Under the race, Execute runs while _items.Count == 0, returns immediately, and the BatchQueryItem.Result is never populated. The awaiter on the captured Task wedges forever.
  • The non-exclusive FetchForWriting<T> overloads in this same file (lines 83-131) don't have the bug because they're synchronous — AddItem always runs before the caller gets the Task back.

The fix is the minimum-mechanical-change form of the same insight that already keeps the non-exclusive overloads correct: enlist first, await second.
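
To make the ordering problem concrete, here is a minimal, self-contained sketch of the race. It does not use Marten's real types: `ToyBatch`, `FetchAwaitFirst`, `FetchEnlistFirst`, and `RaceDemo` are hypothetical names invented for illustration, and a manually completed `TaskCompletionSource` stands in for the connection open inside `BeginTransactionAsync`. It assumes .NET 5 or later for the non-generic `TaskCompletionSource`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for BatchedQuery: only the enlist/execute mechanics are modeled.
public sealed class ToyBatch
{
    private readonly List<TaskCompletionSource<string>> _items = new();

    // Enlists an item and returns a Task that only Execute() can complete.
    public Task<string> AddItem()
    {
        var tcs = new TaskCompletionSource<string>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        _items.Add(tcs);
        return tcs.Task;
    }

    // Completes whatever is enlisted at the moment of the call; later items are never seen.
    public Task Execute()
    {
        foreach (var item in _items) item.TrySetResult("stream");
        _items.Clear();
        return Task.CompletedTask;
    }

    // Pre-fix shape: the method yields to the caller before the item is enlisted.
    public async Task<string> FetchAwaitFirst(Task beginTransaction)
    {
        await beginTransaction.ConfigureAwait(false);
        return await AddItem().ConfigureAwait(false);
    }

    // Post-fix shape: the item is enlisted synchronously, before the first await.
    public async Task<string> FetchEnlistFirst(Task beginTransaction)
    {
        var resultTask = AddItem();
        await beginTransaction.ConfigureAwait(false);
        return await resultTask.ConfigureAwait(false);
    }
}

public static class RaceDemo
{
    // Mimics the call site: capture the Task, run Execute, then await the Task.
    // Returns true if the captured Task completes, false if it is still pending after 500 ms.
    public static async Task<bool> CompletesAsync(bool enlistFirst)
    {
        var batch = new ToyBatch();

        // Simulates a BeginTransactionAsync that is still pending when Execute runs.
        var gate = new TaskCompletionSource(
            TaskCreationOptions.RunContinuationsAsynchronously);

        var fetch = enlistFirst
            ? batch.FetchEnlistFirst(gate.Task)
            : batch.FetchAwaitFirst(gate.Task);

        await batch.Execute();   // runs while the "transaction" is still opening
        gate.SetResult();        // the "transaction" finishes only afterwards

        var winner = await Task.WhenAny(fetch, Task.Delay(500));
        return winner == fetch;
    }
}
```

In this model, `RaceDemo.CompletesAsync(enlistFirst: false)` times out because `Execute` ran over an empty item list, while `RaceDemo.CompletesAsync(enlistFirst: true)` completes. The gate forces the unlucky interleaving deterministically; in the real code it depends on how quickly `BeginTransactionAsync` finishes.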

Diff

 public async Task<IEventStream<T>> FetchForExclusiveWriting<T>(Guid id) where T : class
 {
-    await Parent.BeginTransactionAsync(CancellationToken.None).ConfigureAwait(false);
-
+    // Enlist synchronously BEFORE the first await so the item is in _items
+    // by the time control returns to the caller. A subsequent Execute() is
+    // then guaranteed to see and process the item.
     _documentTypes.Add(typeof(IEvent));
     var plan = Parent.Events.As<EventStore>().FindFetchPlan<T, Guid>();
     if (plan.Lifecycle != ProjectionLifecycle.Live) { _documentTypes.Add(typeof(T)); }
     var handler = plan.BuildQueryHandler(Parent, id, true);
+    var resultTask = AddItem(handler);

-    return await AddItem(handler).ConfigureAwait(false);
+    await Parent.BeginTransactionAsync(CancellationToken.None).ConfigureAwait(false);
+    return await resultTask.ConfigureAwait(false);
 }

(Same shape for the string-key overload.)

The CancellationToken.None is preserved here to keep the public IBatchEvents.FetchForExclusiveWriting<T> interface unchanged. The fact that the caller's token is being dropped is a separate concern noted in the issue — happy to follow up with an interface-touching PR for that, or bundle here if you'd prefer.

Verification

Reproducer in the issue. A 20-parallel-loop concurrent test against a Wolverine HTTP [Aggregate(LoadStyle = Exclusive)] endpoint:

| Run | Result |
| --- | --- |
| Before fix | hung past 60s (12 of 20 widgets ever progress past HANDLER-ENTER) |
| After fix (N=20) | passes in 5s |
| After fix (N=50) | passes in 5s |

Risks

  • Behavior change is intentional and only affects the order of two operations on the local BatchedQuery instance. The transaction is still begun before Execute runs (Execute is only called by the caller AFTER all enlistment calls return, and our BeginTransactionAsync await completes before the method returns).
  • No interface signatures changed.
  • No new public API.
  • Companion non-exclusive FetchForWriting<T> overloads in the same file already use exactly this enlist-first pattern (they're sync, not async, which is why they don't have the bug) — the change here aligns the async overloads with the same correctness invariant.

Test coverage

Adding a fixture-level test for this would be ideal but requires Testcontainers + a Wolverine HTTP host to stage the codegen-flavored call site that exposes the race. I haven't included one in this PR to keep the surface minimal; happy to add a smaller-scope unit test that asserts AddItem-before-BeginTransactionAsync ordering via a stub Parent, or to wire up an integration test under MartenTests.AggregateHandlerWorkflow — let me know which (if either) you'd want before merge.
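
As a rough illustration of the smaller-scope option, the ordering check could look something like the sketch below. It is not written against Marten's actual `BatchedQuery` or `Parent` types: `StubBatch`, its call log, and `OrderingCheck` are hypothetical, and the stubbed method only mirrors the enlist-then-await shape of the fix. The idea is to hand the method a transaction task that never completes and inspect what ran synchronously before control returned to the caller. It assumes .NET 5 or later for the non-generic `TaskCompletionSource`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stub; it records call order instead of touching a database.
public sealed class StubBatch
{
    private readonly TaskCompletionSource<string> _result = new();

    // Log of calls in the order they actually happened.
    public List<string> Calls { get; } = new();

    private Task<string> AddItem()
    {
        Calls.Add("AddItem");
        return _result.Task;
    }

    private Task BeginTransactionAsync(Task pendingOpen)
    {
        Calls.Add("BeginTransactionAsync");
        return pendingOpen;
    }

    // Mirrors the post-fix shape: enlist first, await second.
    public async Task<string> FetchForExclusiveWriting(Task pendingOpen)
    {
        var resultTask = AddItem();
        await BeginTransactionAsync(pendingOpen).ConfigureAwait(false);
        return await resultTask.ConfigureAwait(false);
    }
}

public static class OrderingCheck
{
    // True when AddItem ran before BeginTransactionAsync, and both ran
    // synchronously before the method returned its Task to the caller.
    public static bool EnlistsBeforeFirstAwait()
    {
        var stub = new StubBatch();

        // A transaction that never finishes opening: nothing after the await can run.
        var neverCompletes = new TaskCompletionSource().Task;

        _ = stub.FetchForExclusiveWriting(neverCompletes);

        return stub.Calls.Count == 2
            && stub.Calls[0] == "AddItem"
            && stub.Calls[1] == "BeginTransactionAsync";
    }
}
```

A test of this kind only pins the statement order inside the method. It would not exercise the Wolverine codegen call site, so it complements an integration test rather than replacing one.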

…ing BeginTransactionAsync

`BatchedQuery.FetchForExclusiveWriting<T>` (both Guid and string-key
overloads) had `await Parent.BeginTransactionAsync(...)` as its first
statement, before calling `AddItem(handler)` to enlist the query item.
Under concurrency, `BeginTransactionAsync` does not complete
synchronously — `AutoClosingLifetime.StartAsync` performs a real
`NpgsqlConnection.OpenAsync` socket round-trip — so the method yielded
before `AddItem` ran.

Wolverine HTTP's `[Aggregate(LoadStyle = Exclusive)]` codegen calls the
method without immediately awaiting:

    var task = batch.Events.FetchForExclusiveWriting<T>(id);
    await batch.Execute(ct);
    var stream = await task;

Under the race, `Execute` ran with `_items.Count == 0` and returned
immediately, so `task` could never complete (`item.Result` is set
only by `Execute`). The `await` on `task` wedged forever.

The non-exclusive `FetchForWriting<T>` overloads in this same file
don't have the bug because they're synchronous — `AddItem` always
runs before the caller gets the Task back.

Fix: enlist via `AddItem` synchronously before any `await`, so the
item is in `_items` by the time control returns to the caller. Any
subsequent `Execute()` is then guaranteed to see and process it.

A 20-parallel-loop reproducer against a Wolverine HTTP
`[Aggregate(LoadStyle = Exclusive)]` endpoint hung past 60s before
this fix; passes in 5s with the fix. Full reproducer + diagnostic
trace in the linked issue.
jeremydmiller merged commit 6d96f64 into JasperFx:master on May 30, 2026
8 checks passed
Development

Successfully merging this pull request may close these issues.

Race in BatchedQuery.FetchForExclusiveWriting<T> wedges Wolverine HTTP [Aggregate(LoadStyle = Exclusive)] endpoints under concurrency
