perf: eliminate per-test closure + GetOrAdd factory alloc (#5710)#5727
perf: eliminate per-test closure + GetOrAdd factory alloc (#5710)#5727
Conversation
`TestEntry<T>.ToTestMetadata` previously allocated a closure over `this` (lambda capturing `InvokeBody`/`MethodIndex` and `CreateAttributes`/ `AttributeGroupIndex`) the first time each entry was materialized, and the cached delegates were then invoked per test. That closure was the top user-code self-time hit in CPU sampling (2.62% incl / 2.11% self on `<ToTestMetadata>b__80_0`). Forward the class-shared static delegates and their indexes directly onto `TestMetadata<T>` via new `IndexedInvokeBody`/`MethodIndex` and `IndexedAttributeFactory`/`AttributeGroupIndex` fields. `ExecutableTest<T>. InvokeTestAsync` and `TestMetadata.GetOrCreateAttributes` prefer the indexed form, falling back to the legacy delegates. Result: zero closure allocations on the TestEntry path — the JIT inlines the static dispatch and there is no `this`-capturing `<>c__DisplayClass` emitted. `AttributeFactory` loses `required` so callers can pick either form; existing reflection/test callers continue to set it unchanged. Snapshot tests updated for the three affected TFMs.
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 0 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Code Review
Summary
This is a well-motivated, targeted performance optimization that eliminates two per-test closure allocations in ToTestMetadata. The approach is sound — forwarding class-shared static delegates + indexes instead of wrapping them in a new closure each time. No existing review comments to address.
What Works Well
- Optimization is correct and meaningful. Replacing
(instance, args, ct) => InvokeBody(instance, MethodIndex, args, ct)and() => CreateAttributes(AttributeGroupIndex)closures with direct delegate+index forwarding is a legitimate win on the hot path (2.11% self-time is significant for a test framework's core loop). - Fallback compatibility preserved.
ExecutableTest\<T\>.InvokeTestAsyncandGetOrCreateAttributes()both retain the old path, so reflection mode and customTestMetadataconsumers are unaffected. _cachedInvokeTypedTest/_cachedAttributeFactorycleanup is correct. Since the new delegates are class-shared statics, there's nothing to cache per-entry — removing these fields is the right call.- Snapshot files updated correctly across all four targets.
Concerns
1. Removing required from AttributeFactory is a breaking public API change with no compile-time replacement
// Before
public required Func<Attribute[]> AttributeFactory { get; init; }
// After
public Func<Attribute[]>? AttributeFactory { get; init; }Any consumer that constructs TestMetadata now gets no compile-time guarantee that a factory is set. The error is deferred to GetOrCreateAttributes() at runtime. The new IndexedAttributeFactory is also optional, so it's easy to silently construct a broken TestMetadata that throws only when attributes are accessed.
Suggested improvement: Validate in the constructor or a factory method rather than relying on deferred runtime detection:
internal Attribute[] GetOrCreateAttributes()
{
if (_cachedAttributes is not null) return _cachedAttributes;
// Validate once on first access
if (IndexedAttributeFactory is null && AttributeFactory is null)
throw new InvalidOperationException($"No attribute factory configured for test '{TestName}'.");
return _cachedAttributes = IndexedAttributeFactory is not null
? IndexedAttributeFactory(AttributeGroupIndex)
: AttributeFactory!();
}Or, even better, consider an abstract factory method pattern that forces implementations to provide one.
2. Growing dual-protocol API surface on a public class
The PR adds four new public properties (IndexedAttributeFactory, AttributeGroupIndex, IndexedInvokeBody, MethodIndex) alongside the existing AttributeFactory and InvokeTypedTest. The class now has two parallel dispatch protocols — the legacy closure-based path and the new indexed path. This pattern will persist as a maintenance surface.
For TUnit's internal purposes this is fine since the source generator controls what gets set. But as a public API, third-party consumers see a confusing class where MethodIndex = 0 is both "not set" and "the first method". There's no way to distinguish them.
Architectural suggestion: If the indexed path is only ever set by the source generator and never by external code, consider making IndexedAttributeFactory, AttributeGroupIndex, IndexedInvokeBody, and MethodIndex internal instead of public. The only callers are TestEntry\<T\> (internal) and ExecutableTest\<T\> (internal). Making them internal would avoid expanding the public API surface and avoid the ambiguity.
3. Thread-safety of GetOrCreateAttributes() regressed slightly
The old code was:
return _cachedAttributes ??= AttributeFactory();The new code:
if (_cachedAttributes is not null) return _cachedAttributes;
var indexed = IndexedAttributeFactory;
if (indexed is not null) return _cachedAttributes = indexed(AttributeGroupIndex);
var factory = AttributeFactory ?? throw ...;
return _cachedAttributes = factory();Both are technically not thread-safe (multiple threads can pass the null check and all invoke the factory), but the old code was at least a single compound expression. The new multi-step form has slightly more read-then-write steps and is arguably easier to misread as safe. Neither form uses Interlocked or Volatile. If GetOrCreateAttributes() is called concurrently (e.g., parallel test discovery), the factory could be invoked multiple times. This is probably benign (since Attribute[] creation is idempotent), but worth a comment acknowledging the intentional trade-off.
Minor
- The three blank lines at the end of
TestMetadata1.csafterMethodIndex { get; init; }` are likely unintentional:+ public int MethodIndex { get; init; } + + +
Verdict
The performance goal is valid and the implementation achieves it correctly. The main architectural concern is whether these new properties should be internal rather than public — if they're only used by source-generated and engine-internal code, keeping them off the public API surface would avoid long-term maintenance burden and user confusion. The missing compile-time factory enforcement is the other actionable item. Otherwise, this is a clean, focused change worth merging.
- Restore `AttributeFactory` as `required` so external TestMetadata consumers keep compile-time enforcement; TestEntry supplies a static throwing sentinel since its metadata always resolves via the indexed factory. - Make `IndexedAttributeFactory`, `AttributeGroupIndex`, `IndexedInvokeBody`, and `MethodIndex` internal — only set/read inside TUnit.Core, so they no longer inflate the public API surface or leak the "MethodIndex = 0 means unset" ambiguity. - Replace the read-then-write in `GetOrCreateAttributes` with `Interlocked.CompareExchange` so the benign factory race publishes exactly one array without a lock. - Trim trailing blank lines in TestMetadata`1.cs and shrink PublicAPI snapshots accordingly.
|
Pushed 726910b addressing review feedback:
Source generator emits |
There was a problem hiding this comment.
Second-pass review — all prior feedback addressed ✓
The previous review raised four concerns; commit 726910b resolves every one of them cleanly. Here's the second-pass verdict:
What changed and why it matters
| Previous concern | Resolution |
|---|---|
required removed from AttributeFactory — silent runtime break |
Reverted to required; s_indexedAttributeFactoryPlaceholder satisfies the constraint at zero allocation |
| Four new properties were public, growing the API surface | All four (IndexedInvokeBody, MethodIndex, IndexedAttributeFactory, AttributeGroupIndex) are now internal |
_cachedAttributes write was not atomic |
Interlocked.CompareExchange with a documented benign-race comment |
| Trailing blank lines | Trimmed |
Assessment of the current implementation
s_indexedAttributeFactoryPlaceholder sentinel is the right call. Satisfying required with a static throwing delegate is an unconventional pattern, but the inline comment explains it well and the error message is actionable. Better than removing required or adding a factory method purely to paper over an internal implementation detail.
GetOrCreateAttributes() CAS pattern (Interlocked.CompareExchange(ref _cachedAttributes, produced, null) ?? produced) is correct:
- CAS wins → returns prior value (null) →
?? producedreturns the freshly built array (which is also now published) - CAS loses → returns the winner's non-null array directly
The comment acknowledging the double-invoke possibility under contention is appropriate. Since attribute arrays are immutable and factories are pure, the benign-race trade-off beats locking on a hot path.
Dual-protocol dispatch (IndexedInvokeBody preferred over InvokeTypedTest, IndexedAttributeFactory preferred over AttributeFactory) is clean precisely because both new fields are internal. External consumers never see the split, and within TUnit.Core the invariants are enforced by the single call site in TestEntry.ToTestMetadata. No further defensive layering is needed.
Null-forgiving ! in InvokeTestAsync for the legacy InvokeTypedTest! path is safe: ExecuteTestFactory already validates that at least one invoker is present before creating the factory, so reaching that line without a non-null InvokeTypedTest would require two independent bugs to align.
One minor observation (non-blocking)
AttributeGroupIndex and MethodIndex both default to 0. For metadata created via the reflection path or TestMetadataFactory.Create (where only AttributeFactory/InvokeTypedTest are set), those indexes are never read because IndexedAttributeFactory/IndexedInvokeBody remain null. This is correct, but if a reader is auditing later they might momentarily wonder whether a MethodIndex = 0 on a reflection-mode TestMetadata has a meaning. A brief note in the summary XML docs ("Unused unless IndexedInvokeBody is set") would help, though it's genuinely low-priority.
Verdict
Approve. The optimization is legitimate (2.1% self-time on 1013 tests is a real win at framework scale), the implementation is correct, the public API surface is unchanged, and every previous concern is fully resolved. This is ready to merge pending CI green.
- S2743 (ErrorProne): move `s_indexedAttributeFactoryPlaceholder` off the generic `TestEntry<T>` into a non-generic `TestEntrySentinel` holder so the placeholder delegate is shared across every closed generic instantiation instead of duplicated per closed type. - S2372 (BestPractice): extract the throwing path in `TestMetadata<T>.CreateExecutableTestFactory` into a `[DoesNotReturn]` helper so the property getter itself is throw-free while preserving the T-qualified diagnostic.
|
Fixed the 2 Codacy issues flagged on 1.
The 2.
The abstract base declares Neither fix is a suppression — both address the underlying complaint structurally. Verified green:
Commit: 883e5ba |
There was a problem hiding this comment.
Code Review
This is a clean, well-motivated performance optimization that's been iterated on responsibly through the commit history. The approach is sound and the implementation is correct. A few observations below.
What's working well
CAS pattern in GetOrCreateAttributes (TestMetadata.cs:80) is correct and idiomatic:
return Interlocked.CompareExchange(ref _cachedAttributes, produced, null) ?? produced;CAS returns the original value — null means this thread won and produced is now published; non-null means another thread won first and we should use their array. The ?? produced correctly handles both arms. The inline comment explaining the "benign race" is the right call given the subtlety.
TestEntrySentinel is the right fix for S2743. Hoisting the static-but-type-invariant delegate out of the generic class to avoid per-closed-type duplication is exactly the intended pattern.
[DoesNotReturn] ThrowMissingInvoker() is the correct idiom for making a property getter throw-free without changing the member kind (which would be a public API break). Well done.
Concerns
1. The sentinel's InvalidOperationException lacks T context
TestEntrySentinel.IndexedAttributeFactoryPlaceholder throws:
"TestEntry metadata must resolve attributes via IndexedAttributeFactory."If this is ever hit during a CI run or user debugging session, there's no indication of which test class triggered it. Since TestEntrySentinel is non-generic by design (S2743), it can't embed T — but the call site in GetOrCreateAttributes is on TestMetadata which does know TestClassType and TestName. A more actionable message would come from throwing at the call site instead:
// In GetOrCreateAttributes, after the null check:
var indexed = IndexedAttributeFactory;
if (indexed is not null)
return Interlocked.CompareExchange(ref _cachedAttributes, indexed(AttributeGroupIndex), null)
?? _cachedAttributes!;
// AttributeFactory is required, so if we got here without IndexedAttributeFactory it must be set
var produced = AttributeFactory();
return Interlocked.CompareExchange(ref _cachedAttributes, produced, null) ?? produced;The sentinel approach is clever but trades debuggability for compile-time constraint satisfaction. Alternatively, the message could be enriched at the throw site with enough context (TestName, TestClassType.Name) to be actionable.
2. Dual-dispatch adds permanent maintenance surface
InvokeTestAsync now has two code paths:
var indexed = _metadata.IndexedInvokeBody; // TestEntry path
if (indexed is not null) { ... return; }
await _metadata.InvokeTypedTest!(...); // reflection/generic pathAnd GetOrCreateAttributes mirrors the same fork. This is an inherent cost of keeping backward compatibility with reflection mode and per-method source-gen. It's documented, but any future change to invocation semantics (e.g., adding cancellation escalation, timeout wrapping) must be applied to both branches. Worth a comment on InvokeTestAsync to make the invariant explicit: "indexed path = TestEntry source-gen; legacy path = reflection/generic resolution" — similar to the comment already on GetOrCreateAttributes.
3. MethodIndex = 0 and AttributeGroupIndex = 0 silent defaults
Both internal int fields default to 0, which is a valid index. Since they're always set together with their companion delegate in TestEntry.ToTestMetadata(), this is safe today. But there's no compile-time enforcement of the pairing — someone could set IndexedInvokeBody in a unit test without setting MethodIndex and silently dispatch to method 0. Given these are internal, the blast radius is limited, but a brief comment pairing them ("must be set together with IndexedInvokeBody") would guard against future drift.
Summary
The optimization is correct and the iterative response to prior review feedback (API surface, thread safety, Codacy issues) is thorough. The main residual concern is debuggability of the sentinel path — if the engine ever misconfigures a TestMetadata and the sentinel throws, the error message won't point to the test that caused it. The dual-dispatch maintenance concern is inherent to the design and acceptable given the performance goal and good documentation.
Good work overall — this is a meaningful hot-path improvement with a clear before/after story.
Address third-pass PR review: - Throw at GetOrCreateAttributes() call site with TestName/TestClassType context when IndexedAttributeFactory is null and AttributeFactory is the sentinel — the sentinel delegate itself has no access to that context. - Relax the sentinel's fallback message (guarded path); CAS publication preserved. - Expand XML docs on MethodIndex/AttributeGroupIndex to note they are only read when the paired indexed delegate is non-null.
|
Third-pass review suggestions applied in 15f7479:
Build: OK. Snapshot tests ( |
There was a problem hiding this comment.
Code Review — Round 4
Summary
All blocking concerns from previous review rounds have been fully resolved. This PR is in good shape.
What was addressed since Round 3 (commit 15f74795)
Contextual sentinel error messages — GetOrCreateAttributes() now checks ReferenceEquals(AttributeFactory, TestEntrySentinel.IndexedAttributeFactoryPlaceholder) before invoking AttributeFactory and throws with TestName + TestClassType.FullName context. The sentinel's own fallback message is appropriately generic. This resolves the round-3 concern about the non-generic TestEntrySentinel having no test context on failure.
XML docs on MethodIndex/AttributeGroupIndex — Now state "Unused unless IndexedInvokeBody/IndexedAttributeFactory is set", addressing the round-2 and round-3 carryover request.
Remaining Minor Items (non-blocking)
InvokeTestAsync dispatch comment — Suggested in round 3 but not yet added:
var indexed = _metadata.IndexedInvokeBody;
if (indexed is not null)
{
// Indexed path: source-generated TestEntry (avoids per-test closure allocation)
await indexed((T)instance, _metadata.MethodIndex, Arguments, cancellationToken).ConfigureAwait(false);
return;
}
// Legacy path: reflection-based or generic source-gen fallback
await _metadata.InvokeTypedTest!((T)instance, Arguments, cancellationToken).ConfigureAwait(false);The two paths differ in origin and performance characteristics — a one-line comment per branch would help future maintainers without adding noise. Not a blocker.
Design Assessment
The overall approach is sound and the iterative review process caught all meaningful issues. Specifically:
- Thread safety: The
Interlocked.CompareExchange(ref _cachedAttributes, produced, null) ?? producedpattern is correct — the?? producedhandles the winning-thread case cleanly. - API surface: All four new properties (
IndexedInvokeBody,MethodIndex,IndexedAttributeFactory,AttributeGroupIndex) areinternal, so there is no public API churn. TestEntrySentinelpattern: Moving the static placeholder field out of the genericTestEntry<T>into a non-generic class correctly avoids the Codacy S2743 "static field in generic type" duplication issue. The pattern is unconventional but clearly scoped and documented.- Correctness: Both dispatch branches in
InvokeTestAsyncandGetOrCreateAttributes()cover all paths — source-gen indexed path and reflection/generic fallback — without regressions.
The optimization goal is valid (eliminates two heap closures per test per ToTestMetadata call) and the implementation achieves it without changing the public API or breaking existing functionality.
Verdict: Approved pending CI. The one remaining item (inline branch comments in InvokeTestAsync) is a minor suggestion and not a blocker.
|
Diagnosed the Windows CI failure on run 24889949190. Failing test: Root cause: Pre-existing timing flake. The test asserts total stopwatch duration < 1s after exercising Why unrelated to this PR: The diff touches only Re-running failed jobs. |
Closes #5710
Summary
CPU sampling (
dotnet-trace, 1013 tests, net10.0, Release) flagged<ToTestMetadata>b__80_0as the top user-code self-time hit at 2.62% inclusive / 2.11% self. The lambda insideTestEntry<T>.ToTestMetadatacaptures
this, so the compiler emits a<>c__DisplayClass80_0closure object on the first call for every entry, and the cached delegate is then invoked once per test. The matching() => CreateAttributes(AttributeGroupIndex)closure adds a second per-entry allocation and hot-path indirection.This change forwards the class-shared static delegates (
InvokeBody,CreateAttributes) and their per-test indexes (MethodIndex,AttributeGroupIndex) directly ontoTestMetadata<T>via new fields:TestMetadata<T>.IndexedInvokeBody+MethodIndexTestMetadata.IndexedAttributeFactory+AttributeGroupIndexExecutableTest<T>.InvokeTestAsyncandTestMetadata.GetOrCreateAttributesprefer the indexed form, falling back to the legacyInvokeTypedTest/AttributeFactorydelegates used by the per-method source-gen path and reflection mode.AttributeFactorylosesrequiredso each TestEntry-backed metadata only needs to populate one form; existing call sites that setAttributeFactory(reflection discovery, unit tests,TestMetadataFactory.Create) continue to work unchanged.Impact
ToTestMetadatacall on the TestEntry path — the<>c__DisplayClassis gone and the static class-level delegates flow through by reference.ExecutableTest<T>.InvokeTestAsync(indexed((T)instance, MethodIndex, args, ct)) with no captured state.GenericTestMetadata), and the per-method source-gen path (TestMetadataFactory.Create) are untouched and continue to setInvokeTypedTest/AttributeFactory.Verification
dotnet build TUnit.slnx -c Release— 0 errors, pre-existing warnings only.TUnit.Core.SourceGenerator.Tests— 116 passed, 1 skipped (pre-existing).TUnit.TestProject --treenode-filter "/*/*/BasicTests/*"— 3/3 passed (source-gen mode).TUnit.PublicAPI— 4/4 passed on net10; verified files for net9 / net8 / net472 updated consistently.Test plan
ToTestMetadatais no longer in the top self-time hits