Make TrackedSession bookkeeping thread-safe #3070

Merged: jeremydmiller merged 1 commit into JasperFx:main from alesdvorakcz:thread-safe-tracked-session-bookkeeping on Jun 11, 2026

@alesdvorakcz

What & why

When tracked sessions run with IncludeExternalTransports(), envelope records arrive concurrently from parallel transport listener threads (in our case, ~90 RabbitMQ queues on one host). EnvelopeHistory backed its records with an unsynchronized List<T>, so concurrent Add + LINQ iteration intermittently threw a NullReferenceException from inside the tracking bookkeeping itself. The exception surfaced through AssertNoExceptionsWereThrown() and failed whatever test happened to have an active session (~1 in 4 runs of a ~50-test integration suite for us):

System.AggregateException : One or more errors occurred. (Object reference not set to an instance of an object.)
---- System.NullReferenceException
   at System.Linq.Enumerable.TryGetLast[TSource](IEnumerable`1 source, Func`2 predicate, Boolean& found)
   at Wolverine.Tracking.EnvelopeHistory.RecordCrossApplication(EnvelopeRecord record)
   at Wolverine.Runtime.WolverineRuntime.Received(Envelope envelope)
   at Wolverine.Runtime.HandlerPipeline.TryDeserializeEnvelope(Envelope envelope)
   ...

TrackedSession's _statuses and _exceptions lists have the same unguarded add/enumerate pattern (Record(...) / LogException(...) append from listener threads while AllRecordsInOrder() / AssertNoExceptionsWereThrown() enumerate).

The fix

  • EnvelopeHistory: all members that touch _records now synchronize on a private lock; the Records property returns a snapshot array instead of the live list.
  • TrackedSession: _statuses and _exceptions are appended under a lock on the collection instance, and all read sites go through snapshot helpers.

This is test-support code on cold paths, so the locking has no practical overhead.
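
The lock-plus-snapshot pattern described in the two bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `RecordLog` and its members are hypothetical names, and plain strings stand in for `EnvelopeRecord`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a history type; not the Wolverine implementation.
public sealed class RecordLog
{
    private readonly object _lock = new();
    private readonly List<string> _records = new();

    // Writers append under the lock, so the list is never resized mid-read.
    public void Add(string record)
    {
        lock (_lock)
        {
            _records.Add(record);
        }
    }

    // Readers receive a copy taken under the lock; enumerating the copy
    // cannot race with later Add calls.
    public string[] Snapshot()
    {
        lock (_lock)
        {
            return _records.ToArray();
        }
    }

    // LINQ queries over the live list run entirely inside the lock.
    public int CountMatching(Func<string, bool> predicate)
    {
        lock (_lock)
        {
            return _records.Count(predicate);
        }
    }
}
```

A snapshot taken before a later `Add` keeps its original length, which is what makes it safe to enumerate outside the lock.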

Testing

  • New regression test envelope_history_thread_safety hammers RecordCrossApplication + the read paths from 8 threads. Against the previous implementation it fails deterministically (in under 150 ms) with the exact NullReferenceException from the report above; with this change it passes.
  • All 43 existing CoreTests.Tracking tests pass on net9.0 and net10.0.
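
For readers unfamiliar with this style of test, a multi-threaded stress harness of the kind described above might look roughly like the sketch below. It is not the `envelope_history_thread_safety` test itself: `StressSketch`, `Run`, and the integer records are hypothetical, and the guarded list is inlined so the example is self-contained.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class StressSketch
{
    // Runs `writers` threads that each append and query `perWriter` times.
    // Returns the final record count; throws if any thread failed.
    public static int Run(int writers = 8, int perWriter = 1_000)
    {
        var gate = new object();
        var records = new List<int>();
        var failures = new ConcurrentQueue<Exception>();
        using var start = new ManualResetEventSlim(false);

        var threads = Enumerable.Range(0, writers).Select(w => new Thread(() =>
        {
            start.Wait(); // hold every thread until all have been started
            try
            {
                for (var i = 0; i < perWriter; i++)
                {
                    lock (gate)
                    {
                        // Append plus LINQ read, the combination that raced.
                        records.Add(i);
                        var last = records.LastOrDefault(x => x == i);
                        if (last != i) throw new InvalidOperationException("lost write");
                    }
                }
            }
            catch (Exception e)
            {
                failures.Enqueue(e); // collected and rethrown on the caller's thread
            }
        })).ToArray();

        foreach (var t in threads) t.Start();
        start.Set();
        foreach (var t in threads) t.Join();

        if (!failures.IsEmpty) throw new AggregateException(failures);
        lock (gate) return records.Count;
    }
}
```

Removing the `lock (gate)` block would leave the list unsynchronized again, so the harness would be expected to fail intermittently rather than on every run.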

Fixes #3069

EnvelopeHistory backed its records with an unsynchronized List<T>, but with
IncludeExternalTransports() records arrive concurrently from parallel transport
listener threads (e.g. multiple Rabbit MQ queues). Concurrent Add + LINQ
iteration intermittently threw NullReferenceException from
RecordCrossApplication, surfacing through AssertNoExceptionsWereThrown and
failing whatever test had an active tracked session.

TrackedSession's _statuses and _exceptions lists had the same unguarded
add/enumerate pattern.

All EnvelopeHistory members now synchronize on a private lock (Records returns
a snapshot), and the session lists are guarded by locking with snapshot reads.
Includes a regression stress test that reproduces the NullReferenceException
deterministically against the previous implementation.
Copilot AI review requested due to automatic review settings June 11, 2026 08:42

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR hardens Wolverine tracking against concurrency issues when multiple transport listener threads record activity for the same envelope/session.

Changes:

  • Add synchronization around TrackedSession status/exception collections via snapshot reads + locked writes
  • Make EnvelopeHistory thread-safe by guarding _records with a private lock and returning record snapshots
  • Add a stress test reproducing concurrent listener recording

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

File | Description
src/Wolverine/Tracking/TrackedSession.cs | Avoid concurrent enumeration/modification by snapshotting and locking around exception/status collections
src/Wolverine/Tracking/EnvelopeHistory.cs | Serialize access to _records via a private lock; return snapshots to prevent concurrent iteration faults
src/Testing/CoreTests/Tracking/envelope_history_thread_safety.cs | Add a multi-writer stress test to validate thread safety

Comment on lines +169 to +183
private EnvelopeRecord[] statusesSnapshot()
{
    lock (_statuses)
    {
        return _statuses.ToArray();
    }
}

private Exception[] exceptionsSnapshot()
{
    lock (_exceptions)
    {
        return _exceptions.ToArray();
    }
}

Author

The camelCase is on purpose — the private methods in these files already use it
(lastOf, markLastCompleted, writeGrid, ...), so I just matched the surrounding code.
Can rename if the maintainers prefer PascalCase here.

Member

Copilot is lying. Our naming convention is camelCasing for private members:)

Comment on lines +171 to +174
lock (_statuses)
{
    return _statuses.ToArray();
}

Author

Locking on the collections themselves is safe here — they're private and never leave the
class. But I have no strong feelings about it, happy to add dedicated lock fields if that's
the style maintainers prefer.

Comment on lines 580 to 584
  var conditions = $"Conditions:\n{_conditions.Select(x => x.ToString())!.Join("\n")}";
  var activity = $"Activity:\n{AllRecordsInOrder().Select(x => x.ToString()).Join("\n")}";
- var exceptions = $"Exceptions:\n{_exceptions.Select(x => x.ToString()).Join("\n")}";
+ var exceptions = $"Exceptions:\n{exceptionsSnapshot().Select(x => x.ToString()).Join("\n")}";

  return $"{conditions}\n\n{activity}\\{exceptions}";

Author

Nice catch, but this PR only changed the line above it. Looks like it was meant to be \n\n.
I think it is out of scope for this PR, but if you want, I am happy to fix it.

Comment on lines +33 to +42
public IEnumerable<EnvelopeRecord> Records
{
    get
    {
        lock (_lock)
        {
            return _records.ToArray();
        }
    }
}

Author

Agreed, the signature could say this better. I can change it to EnvelopeRecord[] or IReadOnlyList<EnvelopeRecord> or leave it as is — whichever @jeremydmiller prefers.
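
To make the trade-off between those signatures concrete, here is a small sketch under stated assumptions: `HistorySketch` and both property names are hypothetical, strings stand in for `EnvelopeRecord`, and neither property is claimed to be what was merged.

```csharp
using System.Collections.Generic;

// Hypothetical type for comparing snapshot signatures; not Wolverine code.
public sealed class HistorySketch
{
    private readonly object _lock = new();
    private readonly List<string> _records = new();

    public void Add(string record)
    {
        lock (_lock) { _records.Add(record); }
    }

    // IEnumerable<T> compiles, but callers cannot tell from the signature
    // that they are holding a materialized copy rather than a live view.
    public IEnumerable<string> RecordsAsEnumerable
    {
        get { lock (_lock) { return _records.ToArray(); } }
    }

    // IReadOnlyList<T> advertises a fixed Count and indexer, which matches
    // snapshot semantics; arrays implement this interface directly.
    public IReadOnlyList<string> RecordsAsList
    {
        get { lock (_lock) { return _records.ToArray(); } }
    }
}
```

Both properties return the same array copy; only the declared type differs, so the choice is about what the signature communicates to callers.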

Development

Successfully merging this pull request may close these issues.

Intermittent NullReferenceException from EnvelopeHistory.RecordCrossApplication during tracked sessions — EnvelopeHistory is not thread-safe

3 participants