perf(engine): collapse async state-machine layers on hot test path (#5687)#5690
perf(engine): collapse async state-machine layers on hot test path (#5687)#5690
Conversation
…5687) Every test traversed 8+ stacked async state machines. Four of those layers were pure overhead wrappers with no-op fast paths that can short-circuit the state machine entirely. - TestCoordinator.ExecuteTestInternalAsync: fast-path past RetryHelper whenever retryLimit == 0, regardless of whether a timeout is configured. Timeout is already applied inside TestExecutor.ExecuteAsync per attempt, so the retry wrapper was pure overhead when no retries are requested. - TimeoutHelper.ExecuteWithTimeoutAsync: drop the outer async wrapper + `return true` lambda that only adapted Func<CT,Task> to Func<CT,Task<bool>>. Inline the core directly into the non-generic Task-returning method, which removes one state machine + closure allocation per timed test. - TestRunner.ExecuteTestAsync: stop wrapping ExecuteTestInternalAsync in an outer async method whose only job was to fulfil a TaskCompletionSource. Start the inner ValueTask, fast-path synchronous completion into the TCS, otherwise attach a synchronous continuation that mirrors the outcome. - TestCoordinator.DisposeTestInstanceWithSpanAsync: gate the OTEL RunWithSpanAsync wrapper on TUnitActivitySource.Source.HasListeners(). When no listener is attached, call DisposeTestInstanceCoreAsync directly and skip the Func<Task> closure + state machine.
There was a problem hiding this comment.
Code Review
This is a well-motivated and generally correct PR. The four changes are logically independent, each removing a real layer of async overhead on the hot test path. A few things worth discussing:
TestCoordinator.cs — retryLimit == 0 fast path (correctness fix bundled with perf)
This is actually a bug fix, not just a perf tweak. The old condition retryLimit == 0 && !testTimeout.HasValue incorrectly routed timeout-but-no-retry tests through the RetryHelper wrapper, which did nothing useful for them. The timeout was always applied inside TestExecutor.ExecuteAsync per-attempt, so the retry wrapper was pure overhead. The new condition is correct.
TimeoutHelper.cs — generic overload collapsed
Clean and correct. The <T> overload existed only to thread a return true through the lambda adapter; inlining it removes one state machine and closure allocation per timed test with no semantic change.
TestRunner.cs — TCS wrapper collapsed ⚠️
The approach is sound, but a few subtle points:
AsTask() allocation in the async path: ValueTask.AsTask() on an async ValueTask method (backed by an IValueTaskSource pool in .NET 6+) allocates a fresh Task<T>. In the non-sync path, you're trading the old wrapper state machine box for a new Task allocation — roughly equivalent cost. The sync fast-path (IsCompletedSuccessfully) is the real win here (zero allocation). Worth being aware of.
Cancellation state behavior change: The original catch-and-rethrow path did tcs.SetException(ex) for all exceptions including OperationCanceledException, leaving the TCS in Faulted state. The new ContinueWith does TrySetCanceled() for the IsCanceled case, leaving it in Canceled state. This means dedup callers (who await existingTcs.Task) get a proper task-cancellation exception rather than a faulted aggregate — actually more semantically correct, but it is a behavior change. Given that ExecuteTestInternalAsync has broad internal exception handling and almost never throws, this is low-risk in practice.
Discarded ContinueWith return value: The Task<bool> returned by ContinueWith is not observed. If the TCS is in an unexpected state (shouldn't happen given the logic), any resulting exception from TrySet* would be silently swallowed. Minor robustness concern, though unlikely to trigger.
An alternative that avoids the AsTask() allocation while keeping a similar structure:
// Option: keep async wrapper but avoid it in the sync fast-path
if (innerTask.IsCompletedSuccessfully)
{
tcs.SetResult(true);
return default;
}
// For the async path, fall back to the simple wrapper — the allocation cost
// is the same as the old approach, and you still win on sync-completion.
return WrapAsync(innerTask, tcs);
private static async ValueTask WrapAsync(ValueTask inner, TaskCompletionSource<bool> tcs)
{
try { await inner.ConfigureAwait(false); tcs.SetResult(true); }
catch (Exception ex) { tcs.SetException(ex); throw; }
}This preserves all the sync-path gains, avoids the AsTask() allocation, and restores the original faulted-vs-canceled semantics. The current approach is still correct — just worth considering if allocations on the async path matter.
TestCoordinator.cs — OTEL dispose guard
Correct. Worth noting: RunWithSpanAsync already calls HasListeners() internally (line 144 of TUnitActivitySource.cs), so the outer guard added here is redundant for correctness but avoids the Func<Task> closure allocation before that internal check fires. The PR's logic is right.
Overall
The changes are correct, the comments are clear, and the motivations are sound. The retryLimit fix may be the most impactful correctness change here. The AsTask() allocation and cancellation-state behavior change in TestRunner.cs are worth a second look, but neither is a blocker.
There was a problem hiding this comment.
Code Review
This is a well-motivated and generally correct PR. The four changes are logically independent, each removing a real layer of async overhead on the hot test path. A few things worth discussing:
TestCoordinator.cs — retryLimit == 0 fast path (correctness fix bundled with perf)
This is actually a bug fix, not just a perf tweak. The old condition retryLimit == 0 && !testTimeout.HasValue incorrectly routed timeout-but-no-retry tests through the RetryHelper wrapper, which did nothing useful for them. The timeout was always applied inside TestExecutor.ExecuteAsync per-attempt, so the retry wrapper was pure overhead. The new condition is correct.
TimeoutHelper.cs — generic overload collapsed
Clean and correct. The <T> overload existed only to thread a return true through the lambda adapter; inlining it removes one state machine and closure allocation per timed test with no semantic change.
TestRunner.cs — TCS wrapper collapsed ⚠️
The approach is sound, but a few subtle points:
AsTask() allocation in the async path: ValueTask.AsTask() on an async ValueTask method (backed by an IValueTaskSource pool in .NET 6+) allocates a fresh Task<T>. In the non-sync path, you're trading the old wrapper state machine box for a new Task allocation — roughly equivalent cost. The sync fast-path (IsCompletedSuccessfully) is the real win here (zero allocation). Worth being aware of.
Cancellation state behavior change: The original catch-and-rethrow path did tcs.SetException(ex) for all exceptions including OperationCanceledException, leaving the TCS in Faulted state. The new ContinueWith does TrySetCanceled() for the IsCanceled case, leaving it in Canceled state. This means dedup callers (who await existingTcs.Task) get a proper task-cancellation exception rather than a faulted aggregate — actually more semantically correct, but it is a behavior change. Given that ExecuteTestInternalAsync has broad internal exception handling and almost never throws, this is low-risk in practice.
Discarded ContinueWith return value: The Task<bool> returned by ContinueWith is not observed. If the TCS is in an unexpected state (shouldn't happen given the logic), any exception from TrySet* would be silently swallowed. Minor robustness concern.
An alternative that avoids the AsTask() allocation while keeping all sync-path gains:
```csharp
// Capture sync fast-path win, then fall back to simple async wrapper for the rest
if (innerTask.IsCompletedSuccessfully)
{
tcs.SetResult(true);
return default;
}
return WrapAsync(innerTask, tcs);
private static async ValueTask WrapAsync(ValueTask inner, TaskCompletionSource tcs)
{
try { await inner.ConfigureAwait(false); tcs.SetResult(true); }
catch (Exception ex) { tcs.SetException(ex); throw; }
}
```
This avoids the AsTask() allocation, restores the original faulted-vs-canceled exception semantics, and the continuation exception is surfaced naturally via the async throw. Still wins on the sync-completion path.
TestCoordinator.cs — OTEL dispose guard
Correct. Worth noting: RunWithSpanAsync already calls HasListeners() internally (line 144 of TUnitActivitySource.cs), so the outer guard here is redundant for correctness but avoids the Func<Task> closure allocation before that internal check fires. The motivation is right.
Overall
The changes are correct, the comments are clear, and the motivations are well-documented. The retryLimit condition fix may be the most impactful correctness improvement. The AsTask() allocation and cancellation-state behavior change in TestRunner.cs are worth a second look — neither is a blocker, but the alternative WrapAsync approach would sidestep both concerns cleanly.
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 0 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
Replace AsTask()+ContinueWith plumbing with a minimal async WrapAsync helper so the async path no longer allocates a Task via AsTask(). - No .AsTask() allocation on the async-completion path. - Restores original Faulted-vs-Canceled semantics (SetException mirrors all non-success outcomes; the outer ValueTask still observes failure because we rethrow after SetException). - Removes the discarded ContinueWith return value. - Sync fast path remains zero-alloc. CI failure on modularpipeline (ubuntu/macos) is a snapshot mismatch in TUnit.Mocks.SourceGenerator.Tests (Class_Implementing_Static_Abstract_Interface) that reproduces on main at the same commit — pre-existing, unrelated to this PR.
|
Addressed review feedback: WrapAsync refactor (1ae4da7)
CI Local |
There was a problem hiding this comment.
Code Review (follow-up)
The previous review concerns have all been addressed. Here's the updated assessment.
Previous concerns — resolved ✓
** — / allocation concern**: The PR has replaced with the pattern suggested in the last review. This eliminates the AsTask() allocation and the discarded continuation return value, and restores the original faulted-state semantics (tcs.SetException(ex) for all exceptions, including OperationCanceledException, rather than TrySetCanceled()). Good call.
Remaining observations
**Comment accuracy in **: The inline comment says "without allocating a Task via AsTask()". That's true, but an async ValueTask method still allocates a state machine object in the non-synchronous path (pooled on .NET 6+, but still allocated). The real win is the sync fast-path (IsCompletedSuccessfully) which avoids the state machine entirely. The comment slightly overstates what the async path saves — minor, but worth clarifying so readers don't have incorrect expectations about allocation on the non-sync path.
** return type**: Changing from async ValueTask to Task is correct and avoids the unnecessary state machine allocation. The caller already awaits it, so this is a transparent change. No issues.
#if NET placement: The fallthrough return DisposeTestInstanceCoreAsync(test) correctly sits outside the #if NET block, covering both NET and netstandard2.0 targets. This is a subtle layout to get right and it's done correctly.
Overall
This is a clean, well-motivated perf PR with strong correctness guarantees preserved. All four hot-path changes are independently correct and the previous review concerns have been addressed cleanly. The retryLimit == 0 condition change remains the most impactful correctness fix here — the sync fast-path in TestRunner is the most impactful allocation win. LGTM.
There was a problem hiding this comment.
Code Review (follow-up)
The previous review concerns have all been addressed. Here is the updated assessment.
Previous concerns — resolved
TestRunner.cs — ContinueWith / AsTask() allocation concern: The PR has replaced ContinueWith with the async ValueTask WrapAsync(ValueTask inner, ...) pattern suggested in the last review. This eliminates the AsTask() allocation and the discarded continuation return value, and restores the original faulted-state semantics (tcs.SetException(ex) for all exceptions, including OperationCanceledException, rather than TrySetCanceled()). Good call.
Remaining observations
Comment accuracy in ExecuteTestAsync: The inline comment says "without allocating a Task via AsTask()". That is true, but an async ValueTask method still allocates a state machine object in the non-synchronous path (pooled on .NET 6+, but still allocated). The real win is the sync fast-path (IsCompletedSuccessfully) which avoids the state machine entirely. The comment slightly overstates what the async path saves — minor, but worth clarifying so readers do not have incorrect expectations about allocations on the non-sync path.
DisposeTestInstanceWithSpanAsync return type: Changing from async ValueTask to Task is correct and avoids the unnecessary state machine allocation. The caller already awaits it, so this is a transparent change. No issues.
#if NET placement: The fallthrough return DisposeTestInstanceCoreAsync(test) correctly sits outside the #if NET block, covering both NET and netstandard2.0 targets. This is a subtle layout to get right and it is done correctly.
Overall
This is a clean, well-motivated performance PR with correctness guarantees preserved throughout. All four hot-path changes are independently correct and the previous review concerns have been addressed cleanly. The retryLimit == 0 condition change is the most impactful correctness fix (previously, timeout-but-no-retry tests were routed through RetryHelper for no reason). The sync fast-path in TestRunner is the most impactful allocation win. LGTM.
Summary
Closes #5687.
Every test traverses 8+ stacked async state machines; four of those layers are pure overhead wrappers with no-op fast paths. This PR short-circuits them on the hot path.
RetryHelper.ExecuteWithRetrywheneverretryLimit == 0, regardless of whether a timeout is configured. Timeout is already applied per-attempt insideTestExecutor.ExecuteAsync, so the retry wrapper was pure overhead for any test with only a timeout attribute. Previously reported at ~8.3% inclusive CPU.asyncmethod +return truelambda whose only job was to adaptFunc<CT, Task>toFunc<CT, Task<bool>>. Fold the core straight into the non-genericTask-returning entry point; eliminates one state machine + closure allocation per timed test. Previously ~4.7% inclusive CPU.ExecuteTestInternalAsyncin an outerasyncmethod whose only job was fulfilling aTaskCompletionSource. Start the innerValueTask, fast-path synchronous completion into the TCS with noTaskallocation, otherwise attach a synchronous continuation that mirrorsFaulted/Canceled/Completedonto the TCS. Previously ~11.9% inclusive / ~0% exclusive — 100% wrapper overhead.TUnitActivitySource.Source.HasListeners()before callingRunWithSpanAsyncfor test disposal, mirroring the gate already used inTestExecutor.ExecuteAsync. When no OTEL listener is attached, callDisposeTestInstanceCoreAsyncdirectly and skip the per-testFunc<Task>closure + state machine allocation.Test plan
dotnet build TUnit.Engine/TUnit.Engine.csproj— builds clean acrossnetstandard2.0;net8.0;net9.0;net10.0, 0 warnings/errors.dotnet test TUnit.UnitTests/TUnit.UnitTests.csproj --framework net10.0— 180/180 passed.dotnet test TUnit.Engine.Tests/TUnit.Engine.Tests.csproj --framework net10.0— 146/146 non-subprocess tests passed, 99 skipped. The 2 failures (VB.Test,FSharp.Test) reproduce onorigin/mainand are pre-existing environmental issues with the VB/F# subprocess test harness — not caused by this change.Correctness notes
TestExecutor.ExecuteAsync;RetryHelperstill wraps whenretryLimit > 0, so tests configured with both timeout and retry retain prior semantics.ContinueWithinTestRunnerruns withExecuteSynchronouslyand forwardsFaulted/Canceled/Completedstates onto the TCS; the returnedValueTask(asTask)surfaces the exception to the immediate awaiter.DisposeTestInstanceCoreAsyncswallows exceptions internally, so the span-gate change does not alter observable behavior.