perf: reduce ConcurrentDictionary closure allocations in hot paths#5210
perf: reduce ConcurrentDictionary closure allocations in hot paths#5210
Conversation
Add TryGetValue fast paths before GetOrAdd calls to avoid closure allocations and unnecessary ConcurrentDictionary contention on the hot path. Key changes: - TestExecutionGuard: avoid allocating TaskCompletionSource when a test is already executing by checking TryGetValue first - EventReceiverOrchestrator: add TryGetValue fast paths before GetOrAdd for first-test-in-session/assembly/class event tasks and for assembly/class test count counters, avoiding closure allocations from captured state - HookDelegateBuilder: mark GetCachedGenericTypeDefinition lambda as static to prevent implicit closure allocation
There was a problem hiding this comment.
Code Review
Overall, this is a well-reasoned performance optimization PR. The changes are generally correct and the approach is sound. Here are my findings:
✅ — lambda
Clean, correct change. Adding is self-documenting intent and prevents any future accidental closure introduction if the lambda is modified. No concerns.
✅ — TryGetValue fast path
The fast path is correct. Worth noting the subtle thread-safety contract here: calls / before (in ), so any TCS retrieved via is guaranteed to be either:
- Still in-progress (await will block until completion), or
- Already completed (await returns immediately)
Either way, behaves correctly. The optimization is valid — it avoids allocating a on the common path when the test is already tracked.
✅ — First-test event task fast paths
This is the highest-impact change. The closures in do capture , //, and — so every call without the fast path allocates a new closure object on the heap. The bypass avoids this on the hot path (i.e., every test after the first in a session/assembly/class). Correct and well-targeted.
⚠️ Minor: PR description slightly overstates savings for counter dictionaries
In and , the pre-existing lambdas are . In C#, lambdas with no captures compile to cached static delegate fields — the delegate itself is never heap-allocated on repeated calls. So the bypass here does not save a closure/delegate allocation; it only saves the internal hash-table lookup and potential lock acquisition.
This is still a valid win under contention, but the PR description frames these as "closure allocation" savings when they're really "lock contention" savings. Not a correctness issue — just worth being precise for future readers.
🔍 — cold path concern (pre-existing, not introduced here)
The counter dicts (, ) are not cleared during , unlike the first-test task dicts. If is ever called more than once (e.g., for re-runs), the accumulates on top of existing values. This pre-dates this PR, but the refactored code now makes this more visible. Worth confirming the lifecycle contract — is always called exactly once, or should these counters also be reset?
Summary
The core optimization is sound and the implementation is correct. The -before- pattern is a well-established idiom for hot paths. The most impactful part is the first-test event closure avoidance, and the TCS avoidance is clean. The only non-trivial note is the potential double-count concern, which pre-exists this PR but is worth confirming.
There was a problem hiding this comment.
Code Review
Overall, this is a well-reasoned performance optimization PR. The changes are generally correct and the approach is sound. Here are my findings:
HookDelegateBuilder.cs — static lambda
Clean, correct change. Adding static is self-documenting intent and prevents any future accidental closure introduction if the lambda is modified. No concerns.
TestExecutionGuard.cs — TryGetValue fast path
The fast path is correct. Worth noting the subtle thread-safety contract here: ExecuteAndCompleteAsync calls SetResult/SetException before TryRemove (in finally), so any TCS retrieved via TryGetValue is guaranteed to be either in-progress (await blocks) or already completed (await returns immediately). Either way, WaitForExistingExecutionAsync behaves correctly.
The optimization is valid — it avoids allocating a TaskCompletionSource<bool> on the common path when the test is already tracked.
EventReceiverOrchestrator.cs — First-test event task fast paths
This is the highest-impact change. The GetOrAdd closures in InvokeFirstTestIn{Session,Assembly,Class}EventReceiversAsync do capture context, assemblyContext/classContext/sessionContext, and cancellationToken — so every call without the fast path allocates a new closure object on the heap. The TryGetValue bypass avoids this on the hot path (every test after the first in a session/assembly/class). Correct and well-targeted.
Minor: PR description slightly overstates savings for counter dictionaries
In InvokeLastTestIn{Assembly,Class}EventReceiversAsync and InitializeTestCounts, the pre-existing lambdas are static _ => new Counter(). In C#, static lambdas with no captures compile to cached static delegate fields — the delegate itself is never heap-allocated on repeated calls. So the TryGetValue bypass here does not save a closure/delegate allocation; it only saves the GetOrAdd internal hash-table lookup and potential lock acquisition under contention.
This is still a valid win, but the PR description frames these as "closure allocation" savings when they're really "lock contention" savings. Not a correctness issue — just worth being precise for future readers.
InitializeTestCounts — cold path concern (pre-existing, not introduced here)
The counter dicts (_assemblyTestCounts, _classTestCounts) are not cleared during InitializeTestCounts, unlike the first-test task dicts. If InitializeTestCounts is ever called more than once (e.g., re-runs), counter.Add(group.Count()) accumulates on top of existing values. This pre-dates this PR, but the refactored pattern makes it more visible. Worth confirming the lifecycle contract — is InitializeTestCounts always called exactly once, or should these counters also be reset?
Summary
The core optimization is sound and the implementation is correct. The TryGetValue-before-GetOrAdd pattern is a well-established idiom for ConcurrentDictionary hot paths. The most impactful change is the first-test event closure avoidance; the TestExecutionGuard TCS avoidance and the static lambda are clean wins too. The only non-trivial note is a pre-existing potential double-count concern in InitializeTestCounts worth confirming.
| return new ValueTask(existingTask); | ||
| } | ||
|
|
||
| var task = _firstTestInSessionTasks.GetOrAdd("session", |
There was a problem hiding this comment.
@thomhurst FYI there is also an overload for this, which internally does almost exactly the same, see ConcurrentDictionary.cs:1223.
var task = _firstTestInSessionTasks.GetOrAdd("session",
static (_, args) => InvokeFirstTestInSessionEventReceiversCoreAsync(args.Item1, args.Item2, args.Item2),
(context, sessionContext, cancellationToken));
return new ValueTask(task);[//]: # (dependabot-start)⚠️ **Dependabot is rebasing this PR**⚠️ Rebasing might not happen immediately, so don't worry if this takes some time. Note: if you make any changes to this PR yourself, they will take precedence over the rebase. --- [//]: # (dependabot-end) Updated [TUnit.Core](https://github.com/thomhurst/TUnit) from 1.19.57 to 1.21.6. <details> <summary>Release notes</summary> _Sourced from [TUnit.Core's releases](https://github.com/thomhurst/TUnit/releases)._ ## 1.21.6 <!-- Release notes generated using configuration in .github/release.yml at v1.21.6 --> ## What's Changed ### Other Changes * perf: replace object locks with Lock type for efficient synchronization by @thomhurst in thomhurst/TUnit#5219 * perf: parallelize test metadata collection for source-generated tests by @thomhurst in thomhurst/TUnit#5221 * perf: use GetOrAdd args overload to eliminate closure allocations in event receivers by @thomhurst in thomhurst/TUnit#5222 * perf: self-contained TestEntry<T> with consolidated switch invokers eliminates per-test JIT by @thomhurst in thomhurst/TUnit#5223 ### Dependencies * chore(deps): update tunit to 1.21.0 by @thomhurst in thomhurst/TUnit#5220 **Full Changelog**: thomhurst/TUnit@v1.21.0...v1.21.6 ## 1.21.0 <!-- Release notes generated using configuration in .github/release.yml at v1.21.0 --> ## What's Changed ### Other Changes * perf: reduce ConcurrentDictionary closure allocations in hot paths by @thomhurst in thomhurst/TUnit#5210 * perf: reduce async state machine overhead in test execution pipeline by @thomhurst in thomhurst/TUnit#5214 * perf: reduce allocations in EventReceiverOrchestrator and TestContextExtensions by @thomhurst in thomhurst/TUnit#5212 * perf: skip timeout machinery when no timeout configured by @thomhurst in thomhurst/TUnit#5211 * perf: reduce allocations and lock contention in ObjectTracker by @thomhurst in thomhurst/TUnit#5213 * Feat/numeric tolerance by @agray in thomhurst/TUnit#5110 * perf: remove unnecessary lock in ObjectTracker.TrackObjects by @thomhurst in thomhurst/TUnit#5217 * perf: eliminate async state machine in TestCoordinator.ExecuteTestAsync by @thomhurst in thomhurst/TUnit#5216 * perf: eliminate LINQ allocation in ObjectTracker.UntrackObjectsAsync by @thomhurst in thomhurst/TUnit#5215 * perf: consolidate module initializers into single .cctor via partial class by @thomhurst in thomhurst/TUnit#5218 ### Dependencies * chore(deps): update tunit to 1.20.0 by @thomhurst in thomhurst/TUnit#5205 * chore(deps): update dependency nunit3testadapter to 6.2.0 by @thomhurst in thomhurst/TUnit#5206 * chore(deps): update dependency cliwrap to 3.10.1 by @thomhurst in thomhurst/TUnit#5207 **Full Changelog**: thomhurst/TUnit@v1.20.0...v1.21.0 ## 1.20.0 <!-- Release notes generated using configuration in .github/release.yml at v1.20.0 --> ## What's Changed ### Other Changes * Fix inverted colors in HTML report ring chart due to locale-dependent decimal formatting by @Copilot in thomhurst/TUnit#5185 * Fix nullable warnings when using Member() on nullable properties by @Copilot in thomhurst/TUnit#5191 * Add CS8629 suppression and member access expression matching to IsNotNullAssertionSuppressor by @Copilot in thomhurst/TUnit#5201 * feat: add ConfigureAppHost hook to AspireFixture by @thomhurst in thomhurst/TUnit#5202 * Fix ConfigureTestConfiguration being invoked twice by @thomhurst in thomhurst/TUnit#5203 * Add IsEquivalentTo assertion for Memory<T> and ReadOnlyMemory<T> by @thomhurst in thomhurst/TUnit#5204 ### Dependencies * chore(deps): update dependency gitversion.tool to v6.6.2 by @thomhurst in thomhurst/TUnit#5181 * chore(deps): update dependency gitversion.msbuild to 6.6.2 by @thomhurst in thomhurst/TUnit#5180 * chore(deps): update tunit to 1.19.74 by @thomhurst in thomhurst/TUnit#5179 * chore(deps): update verify to 31.13.3 by @thomhurst in thomhurst/TUnit#5182 * chore(deps): update verify to 31.13.5 by @thomhurst in thomhurst/TUnit#5183 * chore(deps): update aspire to 13.1.3 by @thomhurst in thomhurst/TUnit#5189 * chore(deps): update dependency stackexchange.redis to 2.12.4 by @thomhurst in thomhurst/TUnit#5193 * chore(deps): update microsoft/setup-msbuild action to v3 by @thomhurst in thomhurst/TUnit#5197 **Full Changelog**: thomhurst/TUnit@v1.19.74...v1.20.0 ## 1.19.74 <!-- Release notes generated using configuration in .github/release.yml at v1.19.74 --> ## What's Changed ### Other Changes * feat: per-hook activity spans with method names by @thomhurst in thomhurst/TUnit#5159 * fix: add tooltip to truncated span names in HTML report by @thomhurst in thomhurst/TUnit#5164 * Use enum names instead of numeric values in test display names by @Copilot in thomhurst/TUnit#5178 * fix: resolve CS8920 when mocking interfaces whose members return static-abstract interfaces by @lucaxchaves in thomhurst/TUnit#5154 ### Dependencies * chore(deps): update tunit to 1.19.57 by @thomhurst in thomhurst/TUnit#5157 * chore(deps): update dependency gitversion.msbuild to 6.6.1 by @thomhurst in thomhurst/TUnit#5160 * chore(deps): update dependency gitversion.tool to v6.6.1 by @thomhurst in thomhurst/TUnit#5161 * chore(deps): update dependency polyfill to 9.20.0 by @thomhurst in thomhurst/TUnit#5163 * chore(deps): update dependency polyfill to 9.20.0 by @thomhurst in thomhurst/TUnit#5162 * chore(deps): update dependency polyfill to 9.21.0 by @thomhurst in thomhurst/TUnit#5166 * chore(deps): update dependency polyfill to 9.21.0 by @thomhurst in thomhurst/TUnit#5167 * chore(deps): update dependency polyfill to 9.22.0 by @thomhurst in thomhurst/TUnit#5168 * chore(deps): update dependency polyfill to 9.22.0 by @thomhurst in thomhurst/TUnit#5169 * chore(deps): update dependency coverlet.collector to 8.0.1 by @thomhurst in thomhurst/TUnit#5177 ## New Contributors * @lucaxchaves made their first contribution in thomhurst/TUnit#5154 **Full Changelog**: thomhurst/TUnit@v1.19.57...v1.19.74 Commits viewable in [compare view](thomhurst/TUnit@v1.19.57...v1.21.6). </details> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Updated [TUnit](https://github.com/thomhurst/TUnit) from 1.19.57 to 1.21.6. <details> <summary>Release notes</summary> _Sourced from [TUnit's releases](https://github.com/thomhurst/TUnit/releases)._ ## 1.21.6 <!-- Release notes generated using configuration in .github/release.yml at v1.21.6 --> ## What's Changed ### Other Changes * perf: replace object locks with Lock type for efficient synchronization by @thomhurst in thomhurst/TUnit#5219 * perf: parallelize test metadata collection for source-generated tests by @thomhurst in thomhurst/TUnit#5221 * perf: use GetOrAdd args overload to eliminate closure allocations in event receivers by @thomhurst in thomhurst/TUnit#5222 * perf: self-contained TestEntry<T> with consolidated switch invokers eliminates per-test JIT by @thomhurst in thomhurst/TUnit#5223 ### Dependencies * chore(deps): update tunit to 1.21.0 by @thomhurst in thomhurst/TUnit#5220 **Full Changelog**: thomhurst/TUnit@v1.21.0...v1.21.6 ## 1.21.0 <!-- Release notes generated using configuration in .github/release.yml at v1.21.0 --> ## What's Changed ### Other Changes * perf: reduce ConcurrentDictionary closure allocations in hot paths by @thomhurst in thomhurst/TUnit#5210 * perf: reduce async state machine overhead in test execution pipeline by @thomhurst in thomhurst/TUnit#5214 * perf: reduce allocations in EventReceiverOrchestrator and TestContextExtensions by @thomhurst in thomhurst/TUnit#5212 * perf: skip timeout machinery when no timeout configured by @thomhurst in thomhurst/TUnit#5211 * perf: reduce allocations and lock contention in ObjectTracker by @thomhurst in thomhurst/TUnit#5213 * Feat/numeric tolerance by @agray in thomhurst/TUnit#5110 * perf: remove unnecessary lock in ObjectTracker.TrackObjects by @thomhurst in thomhurst/TUnit#5217 * perf: eliminate async state machine in TestCoordinator.ExecuteTestAsync by @thomhurst in thomhurst/TUnit#5216 * perf: eliminate LINQ allocation in ObjectTracker.UntrackObjectsAsync by @thomhurst in thomhurst/TUnit#5215 * perf: consolidate module initializers into single .cctor via partial class by @thomhurst in thomhurst/TUnit#5218 ### Dependencies * chore(deps): update tunit to 1.20.0 by @thomhurst in thomhurst/TUnit#5205 * chore(deps): update dependency nunit3testadapter to 6.2.0 by @thomhurst in thomhurst/TUnit#5206 * chore(deps): update dependency cliwrap to 3.10.1 by @thomhurst in thomhurst/TUnit#5207 **Full Changelog**: thomhurst/TUnit@v1.20.0...v1.21.0 ## 1.20.0 <!-- Release notes generated using configuration in .github/release.yml at v1.20.0 --> ## What's Changed ### Other Changes * Fix inverted colors in HTML report ring chart due to locale-dependent decimal formatting by @Copilot in thomhurst/TUnit#5185 * Fix nullable warnings when using Member() on nullable properties by @Copilot in thomhurst/TUnit#5191 * Add CS8629 suppression and member access expression matching to IsNotNullAssertionSuppressor by @Copilot in thomhurst/TUnit#5201 * feat: add ConfigureAppHost hook to AspireFixture by @thomhurst in thomhurst/TUnit#5202 * Fix ConfigureTestConfiguration being invoked twice by @thomhurst in thomhurst/TUnit#5203 * Add IsEquivalentTo assertion for Memory<T> and ReadOnlyMemory<T> by @thomhurst in thomhurst/TUnit#5204 ### Dependencies * chore(deps): update dependency gitversion.tool to v6.6.2 by @thomhurst in thomhurst/TUnit#5181 * chore(deps): update dependency gitversion.msbuild to 6.6.2 by @thomhurst in thomhurst/TUnit#5180 * chore(deps): update tunit to 1.19.74 by @thomhurst in thomhurst/TUnit#5179 * chore(deps): update verify to 31.13.3 by @thomhurst in thomhurst/TUnit#5182 * chore(deps): update verify to 31.13.5 by @thomhurst in thomhurst/TUnit#5183 * chore(deps): update aspire to 13.1.3 by @thomhurst in thomhurst/TUnit#5189 * chore(deps): update dependency stackexchange.redis to 2.12.4 by @thomhurst in thomhurst/TUnit#5193 * chore(deps): update microsoft/setup-msbuild action to v3 by @thomhurst in thomhurst/TUnit#5197 **Full Changelog**: thomhurst/TUnit@v1.19.74...v1.20.0 ## 1.19.74 <!-- Release notes generated using configuration in .github/release.yml at v1.19.74 --> ## What's Changed ### Other Changes * feat: per-hook activity spans with method names by @thomhurst in thomhurst/TUnit#5159 * fix: add tooltip to truncated span names in HTML report by @thomhurst in thomhurst/TUnit#5164 * Use enum names instead of numeric values in test display names by @Copilot in thomhurst/TUnit#5178 * fix: resolve CS8920 when mocking interfaces whose members return static-abstract interfaces by @lucaxchaves in thomhurst/TUnit#5154 ### Dependencies * chore(deps): update tunit to 1.19.57 by @thomhurst in thomhurst/TUnit#5157 * chore(deps): update dependency gitversion.msbuild to 6.6.1 by @thomhurst in thomhurst/TUnit#5160 * chore(deps): update dependency gitversion.tool to v6.6.1 by @thomhurst in thomhurst/TUnit#5161 * chore(deps): update dependency polyfill to 9.20.0 by @thomhurst in thomhurst/TUnit#5163 * chore(deps): update dependency polyfill to 9.20.0 by @thomhurst in thomhurst/TUnit#5162 * chore(deps): update dependency polyfill to 9.21.0 by @thomhurst in thomhurst/TUnit#5166 * chore(deps): update dependency polyfill to 9.21.0 by @thomhurst in thomhurst/TUnit#5167 * chore(deps): update dependency polyfill to 9.22.0 by @thomhurst in thomhurst/TUnit#5168 * chore(deps): update dependency polyfill to 9.22.0 by @thomhurst in thomhurst/TUnit#5169 * chore(deps): update dependency coverlet.collector to 8.0.1 by @thomhurst in thomhurst/TUnit#5177 ## New Contributors * @lucaxchaves made their first contribution in thomhurst/TUnit#5154 **Full Changelog**: thomhurst/TUnit@v1.19.57...v1.19.74 Commits viewable in [compare view](thomhurst/TUnit@v1.19.57...v1.21.6). </details> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Summary
TryGetValuefast path beforeGetOrAddto avoid allocating aTaskCompletionSource<bool>on every call when a test is already executingTryGetValuefast paths beforeGetOrAddcalls for first-test-in-session/assembly/class tasks, avoiding closure allocations from capturedcontext,cancellationToken, and other state on the hot path. Also added fast paths for assembly/class test count counters to reduce lock contention (these usestaticlambdas so the savings are contention-related, not closure-related)GetCachedGenericTypeDefinitionlambda asstaticto prevent implicit closure allocationRationale
Profiling shows ~4.1% exclusive CPU in
ConcurrentDictionaryoperations (TryAddInternal,AcquireAllLocks). These changes reduce allocations and contention by:TryGetValuebeforeGetOrAddso closures capturing state are never created on the fast path (when key already exists)TaskCompletionSourceallocations in the execution guardstaticlambdas (already cached by the compiler, butGetOrAddstill acquires internal locks)staticlambdas where the lambda body doesn't need captured stateNo behavioral changes — only allocation and contention reduction. Thread safety guarantees are preserved since
TryGetValueis lock-free onConcurrentDictionaryand the subsequentGetOrAddhandles the race correctly.Test plan