Skip to content

perf(engine): remove duplicate object-graph traversal in ObjectLifecycleService (#5718)#5729

Closed
thomhurst wants to merge 2 commits intomainfrom
perf/5718-object-graph-dedup
Closed

perf(engine): remove duplicate object-graph traversal in ObjectLifecycleService (#5718)#5729
thomhurst wants to merge 2 commits intomainfrom
perf/5718-object-graph-dedup

Conversation

@thomhurst
Copy link
Copy Markdown
Owner

Closes #5718

Summary

  • Every test walks its property / IAsyncInitializer object graph twice: once during registration (TrackableObjectGraphProvider.GetTrackableObjectsObjectGraphDiscoverer.DiscoverAndTrackObjects) to populate TestContext.TrackedObjects, and again during execution inside InitializeObjectWithSpanAsync, which re-walks every tracked root's nested graph via InitializeNestedObjectsForExecutionAsync.
  • InitializeTrackedObjectsAsync iterates TrackedObjects in descending-depth order and every reachable nested object is already flattened into that sorted list at registration time, so deeper dependencies are already initialized by the time a shallower object is processed. The per-object nested walk is redundant.
  • Removed the InitializeNestedObjectsForExecutionAsync(obj) call inside InitializeObjectWithSpanAsync. Each IAsyncInitializer is still called exactly once — ObjectInitializer.InitializeAsync is deduplicated via Lazy<Task>, and initialization order (deepest-first) is preserved by the existing depth-descending iteration.
  • InitializeObjectForExecutionAsync (the class-data init called before the test class constructor) keeps its nested walk, since TrackedObjects has not yet been iterated at that point.

Impact

Estimated ~0.3-0.5% CPU across the suite; halves the inclusive time of InitializeObjectWithSpanAsync and TraverseInitializerProperties. No public API change.

Correctness analysis

DiscoverAndTrackObjects (tracking) and DiscoverNestedObjectGraph (execution) share the same traversal functions (TraverseInjectableProperties + TraverseInitializerProperties). They differ only in one early-out (useSourceRegistrarCheck) that is a pure micro-optimization when there are no injectable properties — both code paths still descend into TraverseInitializerProperties. Tracking is therefore a (super)set of what execution-phase discovery would find, so every object reachable at execution time is already present in TrackedObjects.

Test plan

  • dotnet build TUnit.slnx -c Release — succeeds, 0 errors
  • Snapshot tests: TUnit.Core.SourceGenerator.Tests — 116 passed / 1 skipped (pre-existing)
  • TUnit.UnitTests — 180/180 pass
  • TUnit.TestProject property-injection / nested-graph smoke tests on net10.0:
    • /*/*/BasicTests/* — 3/3 pass
    • /*/*/NestedPropertyInjectionTests/* — 3/3 pass
    • /*/*/ParallelPropertyInjectionTests/* — 2/2 pass
    • /*/*/AbstractBaseClassPropertyInjectionTests/* — 1/1 pass
    • /*/*/GenericPropertyInjectionTests/* — 1/1 pass
    • /*/*/ImplicitOperatorPropertyInjectionTests/* — 1/1 pass
    • /*/*/InitializableTestClassTests/* — 6/6 pass
    • /*/*/InjectedClassDataSourceWithAsyncInitializerTests/* — 3/3 pass
    • /*/*/NestedClassDataSource*/* — 12/12 pass
    • /*/*/KeyedDataSource*/* — 4/4 pass
    • /*/*/PropertyInitializationTests/* — 2/2 pass
    • /*/*/*PropertyInjection*/* — 8 pass + 7 [EngineTest(ExpectedResult.Failure)] (identical to baseline on main)

…cleService (#5718)

Every test walked its property/initializer object graph twice: once during
registration (TrackableObjectGraphProvider.GetTrackableObjects) to populate
TestContext.TrackedObjects, and again during execution inside
InitializeObjectWithSpanAsync, which re-walked every tracked root's nested
graph via InitializeNestedObjectsForExecutionAsync.

Because InitializeTrackedObjectsAsync iterates TrackedObjects in descending
depth order and every reachable nested object is already flattened into that
sorted list at registration time, deeper dependencies are already initialized
by the time a shallower object is processed. The per-object nested walk is
redundant — each IAsyncInitializer will be called exactly once either way
(ObjectInitializer.InitializeAsync is deduplicated via Lazy<Task>).

InitializeObjectForExecutionAsync (the data-source init before the test class
constructor runs) still walks the nested graph there, since TrackedObjects
has not yet been iterated at that point.

Estimated ~0.3-0.5% CPU across the suite; halves the inclusive time of
InitializeObjectWithSpanAsync and TraverseInitializerProperties.
@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented Apr 24, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: PR #5729 — Remove Duplicate Object-Graph Traversal in ObjectLifecycleService

Summary

This PR removes the InitializeNestedObjectsForExecutionAsync(obj, cancellationToken) call from InitializeObjectWithSpanAsync, on the basis that every reachable nested object has already been flattened into TestContext.TrackedObjects during registration (DiscoverAndTrackObjects) and will be reached by the depth-descending loop in InitializeTrackedObjectsAsync.


Correctness Analysis

The core claim holds. Both DiscoverNestedObjectsForTracking (tracking path, useSourceRegistrarCheck=true) and DiscoverNestedObjects (execution path, useSourceRegistrarCheck=false) call the same two traversal methods:

  1. TraverseInjectableProperties — walks [DataSource]-annotated properties
  2. TraverseInitializerProperties — walks all IAsyncInitializer-typed properties

The useSourceRegistrarCheck parameter in TraverseInjectableProperties looks alarming at first glance but is purely a micro-optimization shortcut. When !plan.HasProperties && !useSourceRegistrarCheck the method returns early — but if !plan.HasProperties, both plan.SourceGeneratedProperties and plan.ReflectionProperties are empty arrays, so no objects would have been added anyway. The tracking superset property holds.

The remaining call to InitializeNestedObjectsForExecutionAsync (inside InitializeObjectForExecutionAsync) is the CreateInstance path for class-data constructor arguments — this pre-dates TrackedObjects population for those objects, so keeping the nested walk there is correct and appropriately documented.

One subtle concern worth raising: the correctness argument implicitly assumes that DiscoverAndTrackObjects has already been called for every test before InitializeTrackedObjectsAsync is reached. Tracing through RegisterTestAsync_objectTracker.TrackObjectstrackableObjectGraphProvider.GetTrackableObjects_discoverer.DiscoverAndTrackObjects confirms this invariant is upheld. However, this architectural coupling — "phase 1 must complete for the optimization in phase 2 to be safe" — is now implicit and invisible. See suggestions below.


Observations and Suggestions

1. Document the phase-ordering invariant in InitializeTrackedObjectsAsync

The correctness of removing the nested walk in InitializeObjectWithSpanAsync depends on RegisterTestAsync having been called for all tests in the session before execution begins. That invariant is maintained by the engine orchestration layer, not enforced by types or assertions.

A future refactor that changes the order of registration vs. execution (e.g., lazy-loading, streaming test discovery) could silently break this invariant without any compile-time or runtime signal. Consider adding a structured comment in InitializeTrackedObjectsAsync:

// Precondition: RegisterTestAsync must have been called for this TestContext before this method.
// TrackedObjects is populated during RegisterTestAsync → TrackObjects → DiscoverAndTrackObjects.
// Removing InitializeNestedObjectsForExecutionAsync from InitializeObjectWithSpanAsync
// (see #5718) relies on every reachable nested IAsyncInitializer already being in this list.

This makes the invariant visible to future maintainers without changing behaviour.

2. Add a targeted regression test for nested IAsyncInitializer inside a data source

The PR's test plan covers existing smoke tests, which is good. However, there is no test that specifically exercises the "nested IAsyncInitializer inside a data source object" scenario — the exact case that would silently regress if the invariant in point 1 is ever violated. Consider adding:

// A data source whose constructor arg has a nested IAsyncInitializer dependency:
public class OuterDataSource : IAsyncInitializer {
    public InnerDependency Inner { get; } = new();
    public Task InitializeAsync() { ... }
}
public class InnerDependency : IAsyncInitializer {
    public bool IsInitialized { get; private set; }
    public Task InitializeAsync() { IsInitialized = true; return Task.CompletedTask; }
}

A test asserting Inner.IsInitialized == true when injected through [ClassDataSource] would lock in the initialization order guarantee and catch regressions from the removed walk. The existing InjectedClassDataSourceWithAsyncInitializerTests tests the flat case but not the nested case.

3. Consider documenting InitializeNestedObjectsForExecutionAsync's narrowed scope

InitializeNestedObjectsForExecutionAsync is still called from InitializeObjectForExecutionAsync (the CreateInstance/class-data constructor path), but after this PR its scope is narrower and less obvious. The excellent <remarks> added to InitializeObjectWithSpanAsync explaining why it no longer calls into the nested walk would pair well with a symmetrical note on InitializeNestedObjectsForExecutionAsync explaining why it is still needed (objects created inline during execution, before TrackedObjects is populated for those args). This is minor but aids future navigation.

4. Missing benchmark for the performance claim

The PR description claims 0.3–0.5% CPU reduction and halving of inclusive time for the method, but no benchmark is attached. For a change whose primary motivation is performance, a BenchmarkDotNet output or profiler screenshot would make the PR easier to evaluate and serve as a regression guard. Not a blocker given the comprehensive smoke test coverage, but worth noting.


What's Done Well

  • The change is minimal and surgical — 5 lines deleted, 12 added (all documentation/comments).
  • The new <remarks> block on InitializeObjectWithSpanAsync clearly explains the invariant the method now relies on and cross-references the relevant issue. This is exactly the right level of explanation for a non-obvious optimization.
  • The updated call-site comment ("Its injected/data-source dependencies are already in TrackedObjects…") provides the key context right where it matters.
  • The PR description's correctness analysis section is thorough and demonstrates clear understanding of the code paths involved.
  • The test plan is unusually comprehensive for an internal optimization, covering reflection-mode, source-gen, parallel, and abstract-base-class variants.

Verdict

The change is correct and the optimization is sound. The two issues most worth addressing before merge are: (1) documenting the phase-ordering invariant in InitializeTrackedObjectsAsync so future readers understand the implicit precondition, and (2) adding at least one targeted test for the nested-IAsyncInitializer-inside-data-source scenario to lock in the behaviour. The missing benchmark is worth noting but is not a blocker given the thorough smoke test coverage.

CI on PR #5729 reproduced locally: the optimization caused 5 test
failures across reflection-mode engine tests, all for nested
IAsyncInitializer objects that never got InitializeAsync called:

  - Test_DirectDataSource_WorksCorrectly(Data2): Data1.Value == ""
  - CombinedDataSource_WithNestedPropertyInjection... (True/False):
    address.Location.IsGeolocated == false
  - Test_ParallelPropertyInitialization...(WebApplicationFactory):
    factory.Redis.InitializedAt == default
  - Test_NestedParallelPropertyInitialization...(ComplexWebFactory):
    timing assertion (derivative of above)

The removed InitializeNestedObjectsForExecutionAsync walk was not
actually redundant in all cases despite TrackedObjects ostensibly
containing every nested object. Some initializer paths rely on the
runtime depth-first walk to initialize deeper dependencies before
their owning object's InitializeAsync runs — the tracked-objects
iteration alone doesn't guarantee that for data-source-provided
method arguments with nested [ClassDataSource]-populated
IAsyncInitializer properties.

Restoring the pre-PR behavior; the ~0.3-0.5% CPU savings is not
worth the correctness regression.
@thomhurst
Copy link
Copy Markdown
Owner Author

CI failures reproduced locally — this PR regresses nested IAsyncInitializer initialization.

Running /**[EngineTest=Pass] --reflection against TUnit.TestProject with this PR's change applied gives 5 consistent failures:

  • Test_DirectDataSource_WorksCorrectly(Data2)data2.Data1.Value == "" (Data1.InitializeAsync never called)
  • CombinedDataSource_WithNestedPropertyInjectionAndMultipleIAsyncInitializers(True/False, InitializableAddressWithNestedInjection)address.Location.IsGeolocated == false
  • Test_ParallelPropertyInitialization_ShouldInitializeContainersInParallel(WebApplicationFactory)factory.Redis.InitializedAt == default
  • Test_NestedParallelPropertyInitialization_ShouldInitializeAllLevelsInParallel(ComplexWebFactory) — downstream timing assertion (derivative of the above: no actual parallel init occurred)

All five failures share the same signature: a nested IAsyncInitializer property of a data-source-provided method argument never had InitializeAsync called. Reverting InitializeObjectWithSpanAsync to re-add InitializeNestedObjectsForExecutionAsync(obj) makes the full 5415-test suite pass again.

The PR's assumption was that every reachable nested object is flattened into TestContext.TrackedObjects during RegisterTestAsync and therefore initialized by InitializeTrackedObjectsAsync's descending-depth loop. In practice, for [ClassDataSource]-populated method arguments with nested IAsyncInitializer properties, some deeper dependencies aren't in TrackedObjects at execution time — the runtime walk in InitializeObjectWithSpanAsync was covering that gap.

Pushed a revert (commit 8a8ab77ca). The ~0.3–0.5% CPU savings isn't worth the correctness regression. If we want to chase this optimization later it'll need either (a) an audit of TrackableObjectGraphProvider to ensure it truly captures the full initializer graph at registration, or (b) a narrower change that only skips the runtime walk when we can prove TrackedObjects is complete for that object.

The macOS failure on this run is unrelated — pre-existing TUnit.Mocks.SourceGenerator.Tests snapshot flake. The Windows AutoWires_TagsAspNetCoreSpans_WithTestId NRE on WebApplicationFactory.DisposeAsync also doesn't reproduce locally; leaving that for the rerun to confirm whether it's a separate flake or a second symptom of the same regression.

@thomhurst
Copy link
Copy Markdown
Owner Author

Closing — reverted. The premise that 'nested objects are already in TrackedObjects' turned out to be false for nested IAsyncInitializer properties of data-source-provided method arguments (reproduced locally: data2.Data1, address.Location, factory.Redis failing to initialize). TrackedObjects covers the registration-phase graph but not the per-argument execution-phase nested graph. Branch now zero-diff vs main. Leaving #5718 open for a future attempt that accounts for the data-source-arg nested graph.

@thomhurst thomhurst closed this Apr 24, 2026
@thomhurst thomhurst deleted the perf/5718-object-graph-dedup branch April 24, 2026 12:46
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 24, 2026

Follow-up Review — Post-Revert Analysis

The PR has been reverted (commit 8a8ab77ca) after CI failures confirmed a correctness regression. My earlier review incorrectly assessed the optimization as safe. This comment corrects that assessment and documents the root cause so a future attempt can avoid the same trap.


Root Cause of the Regression

The core assumption — "TrackedObjects already contains the full flattened, depth-sorted graph from registration time" — is false for data-source-provided method arguments with nested IAsyncInitializer properties.

The two object graphs have different lifetimes and owners:

Path When populated Contains
TrackedObjects Registration phase (source-gen / reflection metadata scan) Properties declared on the test class and its class-level data
Runtime walk in InitializeObjectWithSpanAsync Execution phase, per-test-invocation Per-argument objects from [ClassDataSource] / [MethodDataSource], including their nested IAsyncInitializer properties

Removing the runtime walk meant that nested IAsyncInitializer instances inside data-source-provided method arguments never had InitializeAsync called — exactly the failures thomhurst reproduced.


Why the Earlier Review Missed This

The registration-time walk (TrackableObjectGraphProvider.GetTrackableObjects) uses static/reflected type information. Data-source arguments are resolved dynamically at execution time, so their object graphs can only be fully known during execution. The superset property I relied on only holds for the class-level graph, not the per-invocation argument graph.


Path Forward for Issue #5718

A correct optimization must distinguish the two graph populations:

  1. Identify what TrackedObjects actually covers. Before removing any runtime walk, audit whether TrackedObjects is populated for method argument objects (not just class-level properties). If it isn't, the runtime walk cannot be safely removed for that path.

  2. Scope the optimization to the class-level path only. InitializeObjectForExecutionAsync (the class-data init path) was intentionally kept with its nested walk. The same logic applies in reverse: the optimization is only safe on the path where TrackedObjects is demonstrably complete.

  3. Add regression tests before merging. The five failures thomhurst listed are now documented. Add them as explicit test cases guarding the [ClassDataSource] + nested IAsyncInitializer combination before any future optimization attempt:

    • Test_DirectDataSource_WorksCorrectly(Data2)Data1.InitializeAsync must be called for nested data
    • CombinedDataSource_WithNestedPropertyInjectionAndMultipleIAsyncInitializersaddress.Location.IsGeolocated must be true
    • Test_ParallelPropertyInitialization_ShouldInitializeContainersInParallelfactory.Redis.InitializedAt must not be default
  4. Consider a two-phase tracking model. Rather than removing the runtime walk, a safer optimization could deduplicate calls by tracking which objects have already had InitializeAsync called (e.g., a HashSet<object> keyed by reference identity on the execution context). This avoids the double-walk cost without relying on TrackedObjects completeness.


Verdict

The revert was the right call. The optimization premise was subtly incorrect for the data-source-argument path. Issue #5718 should remain open with the root cause documented above — the fix is tractable, but requires either scoping to the class-level-only path or implementing deduplication at the call site rather than by removing the walk entirely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf: avoid duplicate object-graph traversal in ObjectLifecycleService

1 participant