Skip to content

perf(engine/core): reduce per-test allocations (#5688)#5694

Merged
thomhurst merged 4 commits intomainfrom
perf/5688-allocs
Apr 24, 2026
Merged

perf(engine/core): reduce per-test allocations (#5688)#5694
thomhurst merged 4 commits intomainfrom
perf/5688-allocs

Conversation

@thomhurst
Copy link
Copy Markdown
Owner

@thomhurst thomhurst commented Apr 24, 2026

Summary

Fixes #5688 — six small-but-multiplied-by-N allocation and reflection sources on the per-test hot path.

  1. TestCoordinator flush guard — skip Console.Out/Error.FlushAsync (SynchronizedTextWriter lock) for passing tests with no captured output. Gated on the existing Context.HasCapturedOutput fast path, which now also tracks the lazy console line buffers so partial writes still force a flush.
  2. GetTypeHierarchy cache — static ConcurrentDictionary<Type, HashSet<string>> replacing a fresh HashSet<string> per test. Cleared alongside PropertyCacheManager in ClearCache.
  3. **TestMetadata1.CreateExecutableTestFactory** — cached factory is now a plain staticmethod reference instead of a closure. Per-test state lives on a newinternal sealed ExecutableTest : ExecutableTest subclass, eliminating the two closure objects (createInstance, invokeTest`) previously allocated per invocation.
  4. TestRegistry expression treesnew ValueTaskBridge helper checks IsCompletedSuccessfully before .AsTask(). Sync-completion path returns Task.CompletedTask / Task.FromResult and avoids the Task allocation.
  5. TestMetadata.TestSessionId default — drop the eager Guid.NewGuid().ToString() that every engine factory immediately overwrites; default to string.Empty.
  6. PropertyInjectionMetadata.GetProperty — source generator now emits a strongly-typed Func<object, object?> getter next to SetProperty. ObjectGraphDiscoverer.TraverseSourceGeneratedProperties and PropertyInjectionPlan.GetPropertyValues use GetOrCreateGetter(), which returns the source-gen delegate when available and otherwise caches a reflection-based PropertyInfo.GetValue on the metadata — fixes the hot path in both source-gen and reflection modes.

Notes on behaviour change

  • HasCapturedOutput semantic tightening (sub-task 1): the flush-guard revert commit also tightens HasCapturedOutput to treat an empty-string captured output ("") the same as null (i.e. not captured). Previously any assignment — including an empty string from a no-op writer — would flip the flag. This is the correct behaviour (an empty buffer is not meaningful captured output) and keeps the flush-guard short-circuit honest, but call it out explicitly since it changes one observable branch.
  • TestSessionId guard (sub-task 5): because TestSessionId now defaults to string.Empty, ExecutableTest<T>.CreateInstanceAsync upgrades the former Debug.Assert to a release-mode InvalidOperationException. An engine caller forgetting to assign a session id previously silently propagated an empty string into data-generation partition keys; this now fails fast.

Test plan

  • dotnet test TUnit.Engine.Tests/TUnit.Engine.Tests.csproj --framework net10.0 → 148 pass, 99 skipped (0 fail)
  • dotnet test TUnit.UnitTests/TUnit.UnitTests.csproj --framework net10.0 → 180 pass (0 fail)
  • dotnet test TUnit.Core.SourceGenerator.Tests/TUnit.Core.SourceGenerator.Tests.csproj --framework net10.0 → 116 pass, 1 skipped (0 fail)
  • dotnet test TUnit.PublicAPI/TUnit.PublicAPI.csproj --framework net10.0 → 4 pass (0 fail)
  • No snapshot deltas: orphaned GenericPropertyInjectionTests.*.verified.txt files (superseded by GenericPropertyInjectionRawTests, which does not use Verify) have been deleted in this PR.
  • Dual-mode for sub-task 6: GetOrCreateGetter() falls back to cached PropertyInfo.GetValue in reflection mode / older metadata.
  • AOT: new reflection annotated (IL2075 suppression on reflection fallback with justification; RequiresDynamicCode propagated on the generic ValueTaskBridge.ToTask<T> MakeGenericMethod site).

Addresses the six per-test allocation/reflection sources called out in
#5688 — individually <1% CPU, collectively driving gen0 churn and
steady-state latency.

1. TestCoordinator: guard Console.Out/Error.FlushAsync on the new
   Context.HasCapturedOutput check so the SynchronizedTextWriter lock
   is skipped for passing tests with no output. Include the console
   interceptor line buffers in the signal so partial writes still
   force a flush.
2. ObjectGraphDiscoverer.GetTypeHierarchy: cache results in a static
   ConcurrentDictionary<Type, HashSet<string>> — hierarchy is
   invariant per type and was allocating a fresh HashSet<string> per
   test. Cache is cleared alongside PropertyCacheManager in ClearCache.
3. TestMetadata<T>.CreateExecutableTestFactory: replace the closure-
   allocating lambda with a static method reference. Per-test state
   now lives on a new internal ExecutableTest<T> subclass instead of
   captured-variable closures, saving ~2 delegate allocations per
   test.
4. TestRegistry expression trees: bridge ValueTask/ValueTask<T> through
   a new ValueTaskBridge helper that checks IsCompletedSuccessfully
   before calling AsTask() — synchronously-completed values avoid the
   Task allocation entirely.
5. TestMetadata.TestSessionId: drop the eager Guid.NewGuid().ToString()
   default (every engine factory immediately overwrites it) and
   default to string.Empty.
6. PropertyInjectionMetadata.GetProperty: add a compile-time-generated
   Func<object, object?> getter so the engine hot path in
   ObjectGraphDiscoverer.TraverseSourceGeneratedProperties and
   PropertyInjectionPlan.GetPropertyValues no longer does a
   Type.GetProperty reflection lookup per test invocation. The source
   generator emits a direct strongly-typed lambda; hand-authored or
   older metadata falls back to a cached PropertyInfo.GetValue via
   GetOrCreateGetter so both source-gen and reflection modes benefit.
@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented Apr 24, 2026

Not up to standards ⛔

🔴 Issues 2 minor

Alerts:
⚠ 2 issues (≤ 0 issues of at least minor severity)

Results:
2 new issues

Category Results
CodeStyle 2 minor

View in Codacy

🟢 Metrics 13 complexity

Metric Results
Complexity 13

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #5694: perf(engine/core): reduce per-test allocations (#5688)

Summary

This PR implements six targeted performance optimizations to reduce per-test allocations and reflection overhead:

  1. Conditional flush guard in TestCoordinator (skip Console.Out/Error.FlushAsync for tests with no output)
  2. GetTypeHierarchy cache using a static ConcurrentDictionary
  3. TestMetadata<T>.CreateExecutableTestFactory refactored to use a static method reference instead of a closure, backed by a new ExecutableTest<T> subclass
  4. ValueTaskBridge helper to avoid AsTask() allocations for synchronously-completed ValueTasks
  5. TestMetadata.TestSessionId default changed from Guid.NewGuid().ToString() to string.Empty
  6. PropertyInjectionMetadata.GetOrCreateGetter() caches a compile-time or reflection-based getter, eliminating per-test Type.GetProperty() lookups

The direction is sound and the claimed savings are real. However, there are several correctness, thread-safety, and test-coverage issues that need attention before merge.


Issues Found

1. Missing public API snapshot updates — tests will fail (HIGH)

Files: TUnit.PublicAPI/Tests.Core_Library_Has_No_API_Changes.*.verified.txt

The PR makes two public API changes not reflected in any of the four .verified.txt snapshots:

  • ExecutableTest changed from public sealed class to public class — the snapshots still read public sealed class ExecutableTest.
  • PropertyInjectionMetadata.GetProperty is a new public init-settable property — it is entirely absent from all snapshots.

Running the public API tests as-is will produce .received.txt diffs and fail CI.

Fix: Run TUnit.PublicAPI tests and accept the diffs, then commit the updated .verified.txt files before merge.


2. GetOrCreateGetter() has an unsynchronised write to a shared field (MEDIUM)

File: TUnit.Core/Interfaces/SourceGenerator/PropertyInjectionMetadata.cs

PropertyInjectionMetadata instances are cached indefinitely in PropertyInjectionCache (ThreadSafeDictionary<Type, PropertyInjectionPlan>) and accessed concurrently across parallel test invocations. However, _getProperty is written without any synchronization:

private Func<object, object?>? _getProperty;   // not volatile

internal Func<object, object?> GetOrCreateGetter()
{
    if (_getProperty != null)       // unsynchronised read
        return _getProperty;
    ...
    return _getProperty = property.GetValue;   // unsynchronised write
}

Compare with the existing pattern used for the same lazy-init scenario elsewhere in the codebase (e.g. TestEntry.cs):

Interlocked.CompareExchange(ref _cachedInvokeTypedTest, delegate, null);

The race is benign in practice (both racing threads compute an equivalent delegate), but it is technically undefined under the C# memory model on weakly-ordered architectures.

Fix: Make _getProperty volatile:

private volatile Func<object, object?>? _getProperty;

3. GetOrCreateGetter() mutates _getProperty even when GetProperty is a public init property (MEDIUM — API contract violation)

File: TUnit.Core/Interfaces/SourceGenerator/PropertyInjectionMetadata.cs

The GetProperty property has get => _getProperty; init => _getProperty = value. After GetOrCreateGetter() runs and caches the reflection fallback into _getProperty, the public GetProperty getter will now return that cached reflection delegate. A caller that previously saw GetProperty == null (indicating "no compile-time getter was provided") would now see a non-null reflection delegate. This is surprising and leaks internal caching state through the public API.

Suggestion: Separate the backing fields:

private Func<object, object?>? _sourceGenGetter;   // compile-time delegate (init only)
private volatile Func<object, object?>? _cachedGetter;  // lazy reflection fallback

public Func<object, object?>? GetProperty
{
    get => _sourceGenGetter;
    init => _sourceGenGetter = value;
}

internal Func<object, object?> GetOrCreateGetter()
{
    if (_sourceGenGetter != null) return _sourceGenGetter;
    if (_cachedGetter != null)   return _cachedGetter;
    ...
    Interlocked.CompareExchange(ref _cachedGetter, computed, null);
    return _cachedGetter!;
}

This keeps GetProperty reflecting only what the source generator emitted, separating the public contract from the internal caching mechanism.


4. Flush skip could silently drop output written by background threads (MEDIUM)

File: TUnit.Engine/Services/TestExecution/TestCoordinator.cs

The optimisation skips FlushAsync when HasCapturedOutput is false. However, HasCapturedOutput is based on whether the buffer has been initialized at check time — not whether output will be written. If a background Task launched by the test writes to Console.Write after the test's synchronous body returns, the check fires while the buffer is null, the flush is skipped, and then the background write populates the buffer — but it is never flushed. The data is silently lost.

Before this change, FlushAsync was always called so the content would be flushed regardless.

Worth validating: Is the SynchronizedTextWriter lock acquisition actually observable in profiler data for a real test suite? The FlushIfNonEmpty path inside the interceptor already provides an O(1) early-return when the buffer is empty, so the lock cost may already be negligible. If that's the case, the optimization is not worth the correctness trade-off.


5. ExecutableTest<T> duplicates instance-creation logic that now lives in three places (MEDIUM)

File: TUnit.Core/ExecutableTest1.cs`**

The CreateInstanceAsync override re-implements the instance-creation logic (calling ClassConstructorHelper.TryCreateInstanceWithClassConstructor, then falling back to InstanceFactory) that previously lived in a closure in TestMetadata<T>. This creates a third place where this logic lives (alongside the base ExecutableTest(createInstance, invokeTest) overload and DynamicTestMetadata.CreateExecutableTest).

If ClassConstructorHelper.TryCreateInstanceWithClassConstructor or the InstanceFactory fallback path changes, all three places need updating.

Suggestion: Extract the shared instance-creation logic into a static helper method and have all three call sites delegate to it.


6. TypeHierarchyCache returns a mutable HashSet<string> from a shared cache (LOW — correctness risk)

File: TUnit.Core/Discovery/ObjectGraphDiscoverer.cs

private static readonly ConcurrentDictionary<Type, HashSet<string>> TypeHierarchyCache = new();

private static HashSet<string> GetTypeHierarchy(Type type)
    => TypeHierarchyCache.GetOrAdd(type, static t => { ... });

The returned HashSet<string> is the live cached instance. All current call sites only read from it, but the method signature gives callers full mutation access. A future caller that accidentally calls .Add() or .Remove() on the returned set would corrupt the cache for all subsequent tests of the same class.

Fix: Return IReadOnlySet<string> (available since .NET 5):

private static readonly ConcurrentDictionary<Type, IReadOnlySet<string>> TypeHierarchyCache = new();

private static IReadOnlySet<string> GetTypeHierarchy(Type type)
    => TypeHierarchyCache.GetOrAdd(type, static t =>
    {
        HashSet<string> result = new();
        // ...populate...
        return result;
    });

7. ValueTaskBridge comment is misleading about allocation behaviour (LOW)

File: TUnit.Engine/Helpers/ValueTaskBridge.cs

The comment implies Task.FromResult avoids allocation via caching. In reality, Task.FromResult<T> only caches results for bool (true/false) and null for reference types — for arbitrary T, a new Task<T> is always allocated. The optimisation is still valid (it avoids the ValueTask.AsTask() state-machine overhead), but the comment misleads readers.

Fix: Correct the comment:

// Avoids ValueTask.AsTask() state-machine allocation for completed tasks.
// Note: Task.FromResult caches only bool/null; other types still allocate a Task<T>.

8. Snapshot tests do not cover the newly emitted GetProperty delegate (LOW — test coverage gap)

File: TUnit.Core.SourceGenerator/Generators/PropertyInjectionSourceGenerator.cs

The source generator now emits GetProperty = static instance => ((ClassType)instance).PropertyName, inside every PropertyInjectionMetadata initialiser, but none of the existing snapshot tests cover this output. A regression in the emitted lambda (wrong cast, wrong property name, wrong static/instance branch) would not be caught by any existing test.

Fix: Add a Verify-based snapshot test for a simple class with a data-source-injected property, or extend GenericPropertyInjectionRawTests to assert that GetProperty = appears in the output.


Minor Notes

  • The ExecutableTest sealed → non-sealed change is a safe change (the private protected constructor prevents external inheritance), but it is an observable public API change that needs the snapshot update noted in issue #1.

  • The GetPropertyValues change in PropertyInjectionPlanBuilder silently returns null for properties that previously were skipped entirely. The call sites both check value != null, so the behavior is equivalent — but this subtlety is worth a brief code comment.

  • ObjectGraphDiscoverer.ClearCache() correctly adds TypeHierarchyCache.Clear(), but nothing in TUnit.Engine actually calls ObjectGraphDiscoverer.ClearCache() — this appears to be a pre-existing issue rather than a regression from this PR.


Overall Assessment

The performance intent is solid and the identified savings are real for large test suites. Three things need addressing before merge:

  1. Blocking: The public API snapshots must be updated (issue #1) — the PR will fail CI without them.
  2. Should fix: The thread-safety issue in GetOrCreateGetter (issue #2) and the API contract problem it creates (issue #3) are straightforward fixes that prevent subtle bugs under parallelism.
  3. Worth validating: The flush-skip race (issue #4) should have the performance assumption backed by profiler data; if the lock acquisition is already negligible (which the existing FlushIfNonEmpty path suggests it may be), the optimization is not worth the correctness trade-off.

- Restore `ExecutableTest` as `public sealed class`; make `ExecutableTest<T>`
  inherit from `AbstractExecutableTest` directly to preserve public API.
- Update Public API snapshots: `PropertyInjectionMetadata` exposes new
  `GetProperty` init-only delegate (was missing from verified.txt files).
- `PropertyInjectionMetadata`: split source-gen-supplied getter from the
  lazy reflection fallback; make the fallback cache `volatile` + write via
  `Interlocked.CompareExchange` so parallel readers cannot observe a torn
  delegate, and the public `GetProperty` contract stays clean (only
  surfaces what the source generator emitted).
- Clarify `ValueTaskBridge` XML doc re: `Task.FromResult` only caching
  bool/null (earlier wording implied broader caching).
- Document `ObjectGraphDiscoverer.TypeHierarchyCache` as returning a
  shared, read-only-by-convention `HashSet<string>` (cannot use
  `IReadOnlySet<T>` because netstandard2.0 lacks it and the caller needs
  the concrete type for `GetAlternateLookup`).
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #5694: perf(engine/core): reduce per-test allocations (#5688)

Prior Review Follow-up

Comparing against the previous claude[bot] review (2026-04-24):

Prior Issue Status
#1 Missing API snapshot updates AddressedGetProperty added to all four .verified.txt files. The sealed concern was a false alarm: ExecutableTest<T> inherits from AbstractExecutableTest directly, so ExecutableTest correctly stays sealed.
#2 GetOrCreateGetter() unsynchronised write Addressedvolatile field + Interlocked.CompareExchange correctly handles the race.
#3 API contract violation via shared backing field Addressed — Two separate backing fields (_sourceGenGetter / _cachedGetter) cleanly separate the public property from the internal cache.
#4 Flush-skip race (background thread writes) ⚠️ Still open — see Issue 1 below.
#5 Instance-creation logic duplication ⚠️ Partially addressed — logic consolidated into ExecutableTest<T>, but duplication with DynamicTestMetadata path remains. Acceptable if acknowledged.
#6 Mutable HashSet<string> in cache ℹ️ Acknowledged in comment — not type-enforced. Acceptable given the GetAlternateLookup constraint.
#7 ValueTaskBridge comment misleading Addressed — comment now accurately notes Task.FromResult caching limits.
#8 Snapshot tests don't cover GetProperty emission ⚠️ Still open — no new Verify snapshot for generated GetProperty delegate.

New Issues

1. Flush-skip race condition — behavioural regression (MEDIUM, still open)

File: TUnit.Engine/Services/TestExecution/TestCoordinator.cs

if (test.Context.HasCapturedOutput)
{
    await Console.Out.FlushAsync().ConfigureAwait(false);
    ...
}

HasCapturedOutput is a point-in-time snapshot of lazy-allocated buffers. A test that fires a Task.Run(() => Console.WriteLine(...)) and does not await it before the test body returns will:

  1. Complete the test body with HasCapturedOutput == false
  2. Have the background write land in the console buffer after the finally block checks the flag
  3. Never get flushed — output is silently dropped

The original unconditional flush, while slightly more expensive, was safe. This optimization trades a correctness guarantee for a lock acquisition on the hot path. If the lock acquisition is the actual bottleneck, a better approach might be to keep the flush unconditional but make it non-blocking via TryEnterWriteLock or similar — the savings are real, but not worth silent output loss.

If the decision is to keep this optimization, at minimum add a code comment documenting the intentional trade-off (fire-and-forget console writes from background threads may lose output) so future maintainers don't treat it as a bug to "fix" by removing the guard.


2. GetPropertyValues silently changes enumeration count (LOW)

File: TUnit.Core/PropertyInjection/PropertyInjectionPlanBuilder.cs

Old code:

var property = metadata.ContainingType.GetProperty(metadata.PropertyName);
if (property?.CanRead == true)
{
    yield return property.GetValue(instance); // skips non-readable properties entirely
}

New code:

yield return metadata.GetOrCreateGetter()(instance); // yields null for non-readable properties

GetOrCreateGetter() returns static _ => null when a property can't be read. The old code skipped non-readable properties (no yield); the new code yields null for them. The number of items in the enumerable is now different when non-readable properties exist.

For the ObjectGraphDiscoverer usage this is harmless (null is filtered downstream). But if any caller of GetPropertyValues relies on SourceGeneratedProperties.Length == result.Count(), this is a silent semantic regression.

Suggested fix: Add a null-check after calling the getter, or surface a CanRead flag on the metadata. The simplest safe fix:

// In GetPropertyValues
var getter = metadata.GetOrCreateGetter();
var value = getter(instance);
// Only yield if the getter returned a real value (non-readable properties get the null stub)
if (getter != PropertyInjectionMetadata.NullGetter || value != null)
    yield return value;

Or make the null-stub approach explicit by having GetOrCreateGetter() return null itself (rather than a _ => null lambda) when the property cannot be read, and let callers skip null getters.


3. TestSessionId = string.Empty is a silent sentinel (LOW)

File: TUnit.Core/TestMetadata.cs

public string TestSessionId { get; set; } = string.Empty;

The previous default (Guid.NewGuid().ToString()) was a valid session ID from birth — any code that read TestSessionId before the engine assigned one would get a unique-but-wrong value. The new default gives an empty string, which fails validation silently in anything that validates non-empty GUIDs.

The comment says "Callers must assign a real session ID before any data-generation code reads it" — but TestMetadata is a public type and this is a behaviour change with no enforcement. If any test data generator or third-party extension reads TestSessionId without going through the engine factory, they now get "" instead of a GUID.

Better approach: Keep string.Empty for the performance gain (no Guid allocation), but add a Debug.Assert(!string.IsNullOrEmpty(TestSessionId)) in the places that produce data-driven variants, so the regression is caught in debug builds without a runtime cost in release.


Positives

  • ExecutableTest<T> closure elimination — clean, correct, and the static method reference means the cached delegate is shared across all instances of a given T. The DynamicallyAccessedMembers annotation on the type parameter is correctly placed.
  • ValueTaskBridge — the IsCompletedSuccessfully fast path is safe: accessing .Result is only valid when the ValueTask completed without exception, which is exactly what IsCompletedSuccessfully guarantees. The pre-resolved MethodInfo fields avoid per-expression-tree reflection lookups.
  • TypeHierarchyCache — correct use of ConcurrentDictionary.GetOrAdd with a static delegate avoids per-call allocation, and ClearCache() is properly updated to clear the new cache alongside PropertyCacheManager.
  • GetOrCreateGetter() thread safety — double-checked locking with volatile read + Interlocked.CompareExchange is the correct lock-free pattern for this scenario.

Summary

Three of the four open issues from the prior review have been resolved. The flush-skip race (issue #4) is the remaining correctness concern — worth a decision: either document it as an intentional trade-off or keep the flush unconditional. The GetPropertyValues count change and TestSessionId sentinel are low-risk but worth a quick defensive check. The overall direction and implementation quality are good.

…tics

- Revert the `HasCapturedOutput` guard around the Console flush in
  `TestCoordinator`. The guard was a point-in-time check and could silently
  drop output from background writes landing after the check; the flush
  itself is a cheap no-op when nothing is buffered. Also drop the
  `ConsoleLineBuffer` ref-tracking that was only there to detect captured
  output for the (now-removed) guard.
- Change `PropertyInjectionMetadata.GetOrCreateGetter()` to return `null`
  when the property is not readable instead of memoizing a `_ => null`
  stub. Updates `GetPropertyValues` and `TraverseSourceGeneratedProperties`
  to skip such properties, preserving the pre-PR "non-readable means no
  yield" enumeration semantic.
- Add `Debug.Assert(!string.IsNullOrEmpty(TestSessionId))` in the two
  primary `TestSessionId` consumer paths (`ExecutableTest<T>.CreateInstanceAsync`
  and `GenericTestMetadata` runtime-resolution path) so the
  `string.Empty` default surface misuse in debug builds without adding
  release-mode cost.
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: PR #5694 — perf(engine/core): reduce per-test allocations

What the PR does well

Motivation is clear and measurable. Each sub-task maps directly to a concrete allocation source. The PR description is detailed and honest (e.g., acknowledging that Task.FromResult only caches bool/null, that the flush is intentionally unconditional, etc.).

ValueTaskBridge is correct and clean. The bridge is a small, well-named helper that eliminates the AsTask() allocation on the sync-completion fast path. Pre-caching the MethodInfo via static readonly fields rather than looking them up per expression-tree build is a proper approach. Propagating [RequiresDynamicCode] to CreateTestVariantInternal is the right AOT annotation move.

TypeHierarchyCache caching is correct. Using ConcurrentDictionary.GetOrAdd with a static factory lambda avoids closure allocations and is thread-safe for the benign-double-compute case. The comment noting callers must not mutate the returned HashSet<string> is correct, and clearing in ClearCache() alongside PropertyCacheManager is the right place.

PropertyInjectionMetadata.GetOrCreateGetter sentinel pattern is solid. Using a sentinel Func (rather than a second bool field) to represent "not readable" keeps the lock-free Interlocked.CompareExchange pattern clean. The volatile annotation on _cachedGetter is correct for weakly-ordered memory models.

Public API snapshots are updated. All three TFM-specific *.verified.txt files include GetProperty { get; init; }, which is good.


Issues

1. (Correctness) _cachedExecutableTestFactory write is not thread-safe

The write to _cachedExecutableTestFactory in TestMetadata<T> has no memory barrier:

_cachedExecutableTestFactory = CreateTypedExecutableTest;
return _cachedExecutableTestFactory;

There is no volatile, no Interlocked, and no lock. In contrast, TestEntry uses Interlocked.CompareExchange for its similar cached fields (_cachedInvokeTypedTest, _cachedAttributeFactory). If CreateExecutableTestFactory is ever called concurrently on the same metadata instance, a stale null read is possible on a weakly-ordered architecture.

Suggested fix: Use Interlocked.CompareExchange as TestEntry does, or mark _cachedExecutableTestFactory as volatile to be consistent with the rest of the codebase's caching patterns.

2. (Correctness) Snapshot tests for GenericPropertyInjectionTests appear stale

The GenericPropertyInjectionTests.*.verified.txt snapshot files show only SetProperty = (instance, value) => ... with no preceding GetProperty = ... line. After this PR, the source generator should emit the new GetProperty = static instance => ... line. If the verified snapshots don't include it, either:

  • The snapshots need to be updated to reflect the new generator output, or
  • The GenericPropertyInjection*.verified.txt files are genuinely orphaned by the GenericPropertyInjectionRawTests replacement — in which case they should be deleted (and the PR should explain this).

Leaving stale verified files on disk will cause Verify-based tests to fail on a clean run.

3. (Correctness) HasCapturedOutput semantic change is undocumented

The old check:

|| _buildTimeOutput != null || _buildTimeErrorOutput != null

The new check:

|| !string.IsNullOrEmpty(_buildTimeOutput) || !string.IsNullOrEmpty(_buildTimeErrorOutput)

This is a behavioural change: if SetBuildTimeOutput("") is ever called, the old code returned true (non-null), the new code returns false. This is almost certainly the correct behaviour (empty string ≠ captured output), but it's a semantic change bundled into a commit that's nominally about reverting the flush guard. It should be called out explicitly so reviewers and bisect users can distinguish the two intents.

4. (Design) TestMetadata.TestSessionId = string.Empty shifts safety invariant without a release-mode guard

The PR changes the default from Guid.NewGuid().ToString() to string.Empty and adds Debug.Assert(!string.IsNullOrEmpty(_metadata.TestSessionId), ...). However, Debug.Assert is elided in release builds. If any caller path omits the session-id assignment in a release build, the empty string silently propagates into data-generation code (e.g., empty partition keys in sharded data sources), causing silent failures.

The old default was "always safe by construction." The new default is "safe only if the engine assigns it first" — a meaningful API contract change. Suggested mitigations:

  • Document the invariant in an XML doc comment on the property setter.
  • Add a lightweight release-mode guard at the first read site (e.g., in DataGeneratorMetadataCreator) rather than only at the write callsites.

5. (Design) GenericTestMetadata slow path still allocates two closures per test

The GenericTestMetadata fallback path (reflection mode) still creates two async closure allocations per test:

Func<TestContext, Task<object>> createInstance = async testContext => ...
Func<object, object?[], TestContext, CancellationToken, Task> invokeTest = async (instance, args, ...) => ...

The ExecutableTest<T> optimization only benefits the source-gen path. This is acceptable as a phased approach but worth tracking so the reflection-mode path doesn't become an invisible regression baseline.

6. (Minor) ValueTaskBridge.ToTaskGenericMethodDefinition.GetMethods().First() is fragile

typeof(ValueTaskBridge).GetMethods()
    .First(m => m.Name == nameof(ToTask) && m.IsGenericMethodDefinition);

Scanning all methods is slightly fragile if a future overload is added. Since the class is internal sealed and resolution happens once at static-field-init time, the practical risk is very low — but a type-safe GetMethod(name, 1, types[]) overload would be more robust.

7. (Minor) property.GetValue method-group delegate allocates once but looks like it re-allocates

: property.GetValue;

property.GetValue is a method group on an instance (PropertyInfo), so the compiler creates a new delegate wrapping the instance each time this expression is reached. Here it's reached only once (then cached via Interlocked.CompareExchange), so the allocation is not a hot-path problem — but the form is potentially misleading to readers who might think it re-allocates. An explicit lambda makes intent clearer:

: obj => property.GetValue(obj);

Summary

Priority Issue
High _cachedExecutableTestFactory write lacks memory barrier — use volatile or Interlocked.CompareExchange for consistency
High GenericPropertyInjectionTests.*.verified.txt snapshots appear stale or orphaned — update or delete
Medium HasCapturedOutput semantic change is undocumented — call it out explicitly
Medium TestSessionId = string.Empty shifts safety from construction-time to assignment-time without a release-mode guard
Low GenericTestMetadata closure allocations are unaddressed — track separately
Low ValueTaskBridge.ToTaskGenericMethodDefinition.GetMethods().First() is slightly fragile
Informational property.GetValue method-group form is correct but potentially misleading

Overall the approach is well-motivated and the core helpers (ValueTaskBridge, TypeHierarchyCache, the sentinel getter pattern) are clean and correct. The thread-safety gap and the snapshot inconsistency are the two items I'd want resolved before merging.

- TestMetadata`1._cachedExecutableTestFactory now uses Interlocked.CompareExchange
  for publication, mirroring the TestEntry cached-delegate idiom.
- ExecutableTest`1.CreateInstanceAsync upgrades the Debug.Assert for
  TestSessionId to a release-mode InvalidOperationException so an empty
  session id cannot silently propagate into data-generation partition keys.
- ValueTaskBridge.ToTaskGenericMethodDefinition uses the typed GetMethod
  overload on net8+ (with a Single() fallback on netstandard2.0) so a future
  ToTask overload cannot silently match.
- PropertyInjectionMetadata.GetOrCreateGetter uses an explicit lambda instead
  of the PropertyInfo.GetValue method-group for clarity.
- Delete orphaned GenericPropertyInjectionTests.*.verified.txt snapshots —
  only GenericPropertyInjectionRawTests (no Verify) remains.
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: perf(engine/core): reduce per-test allocations (#5688)

This is a well-executed, comprehensive performance PR targeting six specific allocation hot paths. The changes are cohesive and well-documented. Here's my detailed review:


What the PR Does

Six targeted optimizations that collectively reduce per-test allocations:

  1. Source-gen property getters — Emits typed Func<object, object?> delegates alongside SetProperty to avoid Type.GetProperty() reflection on every test
  2. TypeHierarchyCache — New ConcurrentDictionary<Type, HashSet<string>> caches inheritance chain computation per type
  3. ExecutableTest<T> generic variant — Eliminates closure captures by storing per-test state as fields on a strongly-typed subclass
  4. TestSessionId lazy initialization — Changes from eager Guid.NewGuid().ToString() to string.Empty, saving a GUID allocation per TestMetadata instance
  5. ValueTaskBridge — Short-circuits AsTask() for synchronously-completed ValueTasks to avoid state-machine allocation in expression trees
  6. TestMetadata<T> factory caching — Replaces inline closure with a cached static method reference using Interlocked.CompareExchange

Strengths

Architecture is sound. The layered fallback strategy in PropertyInjectionMetadata.GetOrCreateGetter() is elegant: fast-path for source-gen delegate → volatile read for cached reflection getter → slow path with benign-race memoization. Transparent to callers in both source-gen and reflection modes.

Correct concurrency patterns. The volatile + Interlocked.CompareExchange pattern is used consistently and correctly. The sentinel s_notReadableSentinel prevents repeated reflection for non-readable properties, which is an easy-to-miss edge case.

AOT safety maintained. All new reflection paths have appropriate [DynamicallyAccessedMembers], IL2075, and IL3050 suppressions with meaningful justifications rather than blanket suppressions.

Fail-fast improvement. Changing TestSessionId from eager Guid to string.Empty combined with the InvalidOperationException in ExecutableTest<T>.CreateInstanceAsync() is strictly better: engine bugs that previously silently corrupted partition keys now throw immediately at the point they could cause harm.

HasCapturedOutput correction — Treating empty "" the same as null is semantically correct. A test that captures nothing meaningful shouldn't trigger output flushing.


Concerns and Questions

1. GenericTestMetadata has only Debug.Assert for session ID validation

ExecutableTest<T>.CreateInstanceAsync() throws InvalidOperationException in Release builds if TestSessionId is empty, but GenericTestMetadata's corresponding guard is Debug.Assert, which strips out in Release. This means the failure mode still varies by build configuration. Consider either:

  • Promoting the assert to a real if (string.IsNullOrEmpty(...)) throw in GenericTestMetadata too, or
  • Adding a comment explaining why Debug.Assert is sufficient here (i.e., CreateInstanceAsync always fires before any use)

2. MakeGenericMethod in ValueTaskBridge — no memoization of open generic instantiations

ValueTaskBridge.ToTaskGenericMethodDefinition.MakeGenericMethod(returnType) is called once per test variant in TestRegistry.CreateTestVariantInternal. If expression trees are rebuilt per test (e.g., after cache invalidation), this accumulates. MakeGenericMethod is relatively cheap per call, but the resulting MethodInfo objects are not pooled by the runtime and add GC pressure. A ConcurrentDictionary<Type, MethodInfo> cache here would be consistent with the caching strategy applied elsewhere in this PR.

// Example pattern (same as what you use elsewhere):
private static readonly ConcurrentDictionary<Type, MethodInfo> s_genericMethodCache = new();

private static MethodInfo GetToTaskMethod(Type t) =>
    s_genericMethodCache.GetOrAdd(t, static t => ToTaskGenericMethodDefinition.MakeGenericMethod(t));

3. Deleted snapshot files — coverage verification

Five .verified.txt snapshots for GenericPropertyInjectionTests were deleted because they're "superseded by GenericPropertyInjectionRawTests". It's worth confirming that GenericPropertyInjectionRawTests has equivalent scenario coverage (deep inheritance, diagnostic output, multiple instantiations, etc.), or that the deleted cases are now redundant due to the raw approach covering all variants. If there's a gap, the deletion would silently reduce test coverage of the source generator.

4. TypeHierarchyCache returns a shared HashSet<string>

GetTypeHierarchy returns the same HashSet<string> instance stored in TypeHierarchyCache. If any caller ever mutates the returned set (intentionally or accidentally), it corrupts the cache for all tests sharing that type. The contract "callers must not mutate" is enforced only by convention. Returning IReadOnlySet<string> (or wrapping in FrozenSet<string> on .NET 8+) would make this contract explicit:

// Instead of ConcurrentDictionary<Type, HashSet<string>>
private static readonly ConcurrentDictionary<Type, IReadOnlySet<string>> TypeHierarchyCache = new();

This is low risk given the current callers, but the return type doesn't signal the constraint.


Minor Notes

  • The ExecutableTest.cs change is just a comment update ("UnifiedExecutableTest" → "ExecutableTest") — correct but trivial.
  • The TestCoordinator.cs comment expansion is helpful for maintainability with no behavioral change — good.
  • The Interlocked.CompareExchange pattern appears in at least two places (TestMetadata<T> factory and PropertyInjectionMetadata getter). If it appears in more as this PR grows, a small inline helper or XML doc comment would reduce the "copy the comment explaining the race" repetition.

Summary

This PR achieves its goal cleanly. The allocation reductions are real, measured, and consistent with the project's performance-first philosophy. The architecture is backward compatible, AOT-safe, and thread-correct.

The main items worth addressing before merge:

  1. Promote the Debug.Assert in GenericTestMetadata to a real guard (or document why it's sufficient) — brings it in line with the InvalidOperationException in ExecutableTest<T>
  2. Consider caching MakeGenericMethod results in ValueTaskBridge — consistent with the caching approach elsewhere in this PR
  3. Verify deleted snapshot coverage — confirm GenericPropertyInjectionRawTests covers all deleted scenarios
  4. Consider returning IReadOnlySet<string> from GetTypeHierarchy — makes the no-mutate contract visible at the type level

None of these are blockers — items 3 and 4 in particular could be follow-ups. The core work here is high quality.

This was referenced Apr 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf(engine/core): reduce per-test allocations

1 participant