perf(aspnetcore): prevent thread pool starvation during parallel WebApplicationTest server init by thomhurst · Pull Request #5329 · thomhurst/TUnit

thomhurst · 2026-03-31T18:35:33Z

Summary

WebApplicationFactory.Server is a synchronous property that blocks the calling thread while building the DI container and starting the host
When many tests start in parallel, each holds an async continuation thread for the full duration of app startup, exhausting the thread pool and causing cascading slowdowns
Fix offloads the synchronous Server access to Task.Run (freeing the async thread immediately) and caps concurrent builds with a SemaphoreSlim (ProcessorCount * 2) to prevent flooding the thread pool

Details

The degradation with more concurrent tests is classic thread pool starvation: N tests × synchronous server startup time = N threads held, hitting the min-thread threshold and triggering the 250ms-per-thread spin-up delay for any new work.

There is no public async initialization path in WebApplicationFactory<T> (no InitializeAsync equivalent), so Task.Run with a concurrency cap is the correct workaround.

Test Plan

All 39 existing TUnit.AspNetCore.NugetTester tests pass
Verify with a large parallel test suite that startup time no longer degrades with concurrency

claude

Code Review

Good fix for a real thread pool starvation problem. The diagnosis is correct — WebApplicationFactory<T>.Server is synchronous and blocks threads for the full host startup duration, which is painful under parallel test execution. The Task.Run + semaphore pattern is the right tool here.

Issues

1. Missing max count on `SemaphoreSlim` — documents wrong intent

// Current
private static readonly SemaphoreSlim _serverInitSemaphore = new(Environment.ProcessorCount * 2);

SemaphoreSlim(int) sets initialCount but leaves maxCount as int.MaxValue. This works correctly in practice (the try/finally guarantees balanced acquire/release), but it silently hides intent and would allow accidental over-release to push the count above ProcessorCount * 2. Use the two-argument form:

private static readonly SemaphoreSlim _serverInitSemaphore =
    new(Environment.ProcessorCount * 2, Environment.ProcessorCount * 2);

2. No cancellation token on `WaitAsync`

If a test is cancelled (e.g. timeout or user abort), the WaitAsync call will block indefinitely rather than respecting cancellation. The testContext parameter is already in scope:

await _serverInitSemaphore.WaitAsync(testContext.CancellationToken);

This is important for a test framework where fast, clean shutdown is a correctness concern.

3. Semaphore scope is per closed generic type, not global

Because _serverInitSemaphore is static on a generic class, C# creates a separate static per instantiation. Two test classes inheriting WebApplicationTest<FactoryA, ProgramA> and WebApplicationTest<FactoryB, ProgramB> each get their own ProcessorCount*2 semaphore. Total concurrent initializations = N factories × ProcessorCount × 2.

For most projects this is fine (single factory type), but if the goal is to globally cap server startups across all factories, a non-generic base class should hold the semaphore:

// In the non-generic WebApplicationTest base:
private static readonly SemaphoreSlim _serverInitSemaphore =
    new(Environment.ProcessorCount * 2, Environment.ProcessorCount * 2);

This is the more correct home for it anyway — the concurrency pressure is global, not per-factory-type.

Summary

The approach is sound and fixes a real problem. The SemaphoreSlim two-arg constructor and cancellation token are straightforward improvements. The generic-vs-base scoping question is worth deciding intentionally — moving the semaphore to the non-generic base gives global control, which better matches the stated goal of preventing thread pool starvation when many tests start in parallel (regardless of their factory type).

codacy-production · 2026-03-31T18:38:16Z

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 0 complexity

Metric Results

Complexity 0

View in Codacy

_{TIP This summary will be updated as you push new changes. Give us feedback}

… server init WebApplicationFactory.Server is a synchronous property that blocks the calling thread while building the DI container and starting the host. When many tests start in parallel, this holds async continuation threads for the full duration of app startup, exhausting the thread pool and causing cascading slowdowns as tests queue for available threads. Fix by offloading the synchronous Server access to Task.Run (freeing the async thread immediately) and capping concurrent builds with a SemaphoreSlim (ProcessorCount * 2) to prevent flooding the thread pool.

claude

Code Review

The PR correctly diagnoses and addresses a real thread pool starvation issue. The approach — using WaitAsync() to avoid blocking async continuation threads, combined with Task.Run to offload the synchronous work — is sound. Good fix overall.

A few things worth discussing:

1. `internal` visibility is unnecessary

internal static readonly SemaphoreSlim ServerInitSemaphore = ...

The comment says this is needed to "share across all generic instantiations of WebApplicationTest<TFactory, TEntryPoint>" — but that sharing happens automatically because it's a static field on the non-generic base class. internal only controls cross-assembly access, it doesn't affect the sharing behavior at all. This should be private static readonly unless there's a specific external consumer intended (there doesn't appear to be one).

2. `ProcessorCount * 2` is the wrong heuristic for app startup concurrency

ProcessorCount * 2 is a sensible cap for CPU-bound I/O multiplexing, but WebApplicationFactory startup is primarily I/O + DI container reflection work that doesn't scale with CPU count. On a 64-core CI machine this allows 128 concurrent app builds, which could still cause noticeable thread pool pressure. On a 2-core dev machine, you get 4 — which might be unnecessarily restrictive.

A small fixed ceiling (e.g., Math.Min(Environment.ProcessorCount * 2, 8)) or a constant like 4 with a comment explaining the rationale would be more defensible. The key property here is "don't flood the thread pool", not "saturate all cores".

3. Is `Task.Run` redundant given the semaphore?

With the semaphore limiting concurrency to N, at most N async continuations could reach _factory.Server at once. Task.Run moves the blocking from async state-machine threads to explicit thread pool work items — but those come from the same thread pool. The starvation risk with just the semaphore (and no Task.Run) would already be bounded to N blocked threads.

That said, Task.Run is still the right pattern here for two reasons:

It makes the blocking intent explicit (this is sync-over-async work)
It avoids a subtle class of deadlocks in single-threaded or custom synchronization contexts

Worth keeping, but a brief comment explaining why Task.Run matters in addition to the semaphore would help future readers.

4. Global semaphore across all factory types

The semaphore is shared across all factory types (all TFactory/TEntryPoint combinations). This means a test suite with two different WebApplicationFactory types (AppAFactory and AppBFactory) will compete for the same slots, even though their DI builds are fully independent.

A per-factory-type semaphore (e.g. a ConcurrentDictionary<Type, SemaphoreSlim> keyed on typeof(TFactory)) would give independent concurrency budgets. This is more complex, but prevents one factory type from starving another. For most projects this won't matter, but it's worth noting as a potential follow-up.

Summary: The fix is correct and addresses a real problem. The main actionable item is changing internal to private. The semaphore sizing and Task.Run comments are worth revisiting but not blocking.

…init slot If a test is cancelled while waiting for a semaphore slot, WaitAsync now propagates the cancellation rather than blocking indefinitely.

claude

Code Review (Follow-up)

The previous review raised four issues. Here is a status check on each, plus one new observation.

Addressed (good work)

Two-arg SemaphoreSlim — now uses new(Environment.ProcessorCount * 2, Environment.ProcessorCount * 2). This makes the max count explicit and prevents accidental over-release above the intended ceiling.

Cancellation token on WaitAsync — now passes testContext.Execution.CancellationToken. This is the correct fix; a test timeout or user-abort will now propagate cleanly through the semaphore wait.

Semaphore moved to non-generic base — ServerInitSemaphore now lives on the non-generic WebApplicationTest class. This means all factory types share the same global concurrency budget, which is the right call given the stated goal of preventing thread pool starvation across the entire test session.

Still open

1. `internal` visibility serves no purpose

internal static readonly SemaphoreSlim ServerInitSemaphore = ...

This was flagged in the previous review and has not been changed. The field is static on the non-generic base, so all subclasses (in the same assembly) share the same instance regardless of access modifier. internal only controls cross-assembly visibility; there is no known external consumer. This should be private static readonly to keep encapsulation tight.

2. `ProcessorCount * 2` sizing rationale remains unjustified

WebApplicationFactory startup is dominated by DI container reflection, host boot, and I/O — not by CPU work. On a 64-core CI machine this allows 128 concurrent app builds, which may still cause meaningful thread pool pressure. On a dual-core dev laptop it allows 4, which could serialize test startup more than necessary.

A tighter fixed upper bound is more defensible:

// Cap at a value that prevents thread pool exhaustion without unnecessarily
// serializing independent server builds. ProcessorCount * 2 is used to
// scale slightly with available parallelism, but capped to avoid flooding
// high-core-count machines.
internal static readonly SemaphoreSlim ServerInitSemaphore =
    new(Math.Min(Environment.ProcessorCount * 2, 8), Math.Min(Environment.ProcessorCount * 2, 8));

Alternatively, a documented constant (e.g. 4) would make the intent clearer. As written, reviewers and future maintainers have no signal for why ProcessorCount * 2 was chosen over any other value.

Net assessment

The three core issues from the previous round are resolved. The fix is correct and functional. The remaining items (internal visibility and sizing documentation) are minor polish but worth addressing before merge — particularly the private vs internal change, which is a pure improvement with no downside.

… server builds at 8 - Change ServerInitSemaphore from internal to protected: private would break subclass access, internal was broader than needed - Cap concurrent builds at Min(ProcessorCount*2, 8): startup is reflection/I/O-bound so ProcessorCount alone over-provisions on high-core-count CI machines

claude

Code Review

This PR addresses a real and well-understood problem: WebApplicationFactory.Server is a blocking synchronous property, and hitting it from many concurrent async continuations causes thread pool starvation. The diagnosis is accurate and the general approach (Task.Run to free the async thread, SemaphoreSlim to cap concurrency) is a legitimate workaround for the absence of an async initialization path in WebApplicationFactory<T>.

That said, there are several issues worth discussing before merging.

Issue 1: The semaphore is shared across ALL `WebApplicationTest` subclasses globally

ServerInitSemaphore is a static field on the non-generic base class WebApplicationTest. This means tests for completely unrelated applications all compete for the same _maxConcurrentServerInits slots simultaneously.

This could actually make the starvation problem worse in a suite with many different TEntryPoint types, since all factory builds are serialised through one global semaphore even when they have no contention with each other. It also makes the cap harder to reason about.

A more targeted approach would be a per-entry-point semaphore keyed on TEntryPoint (or on the factory type), stored in a ConcurrentDictionary<Type, SemaphoreSlim> on the generic WebApplicationTest<TFactory, TEntryPoint>:

// One semaphore per concrete entry point type, not one for all.
private static readonly SemaphoreSlim _serverInitSemaphore =
    new(MaxConcurrent, MaxConcurrent);

Because WebApplicationTest<TFactory, TEntryPoint> is a distinct generic instantiation per TEntryPoint, a static field there is already scoped to each distinct entry point — which is the correct granularity.

Issue 2: `protected static` exposes implementation detail unnecessarily

ServerInitSemaphore is protected static, making it part of the public API surface that subclasses can observe and acquire outside the controlled code path. There is no documented reason for subclasses to hold or release this semaphore themselves.

Since the semaphore is only used inside InitializeFactoryAsync, it should be private static:

private static readonly SemaphoreSlim ServerInitSemaphore = ...

If future extensibility is desired, a virtual / protected hook method would be a better API boundary.

Issue 3: The cap value `Math.Min(ProcessorCount * 2, 8)` is an undifferentiated magic constant

The comment explains the reasoning (reflection/I/O-bound, not CPU-bound), but the value 8 will behave differently on a 2-core CI machine (cap = 4) versus a 16-core developer machine (cap = 8). This creates an asymmetry: the fix may not be observable in environments where contention is highest.

Some options worth considering:

A const int or static readonly field with a more descriptive name (e.g., MaxConcurrentServerInits) would at least make the intent clearer.
An environment variable override (e.g., TUNIT_MAX_CONCURRENT_FACTORY_INITS) would let CI environments tune this without a code change.
Defaulting higher and documenting the rationale for users who hit the limit in both directions would be more robust than a blanket cap of 8.

Issue 4: `Task.Run` does not pass a `CancellationToken`

The cancellation token from testContext.Execution.CancellationToken is correctly used for the semaphore wait, but not forwarded into Task.Run:

await Task.Run(() => _ = _factory.Server); // no cancellation token

If the test run is cancelled between acquiring the semaphore and completing Server access, the task will continue running on a thread pool thread, holding the semaphore slot, until Server finishes building — potentially delaying teardown.

Passing the token is straightforward:

await Task.Run(() => _ = _factory.Server, testContext.Execution.CancellationToken);

Note: this will not abort the synchronous work mid-flight (it would only prevent the task from starting if already cancelled), but it at least participates correctly in cooperative cancellation.

Issue 5: Semaphore is held during `Task.Run` but not during factory creation

The semaphore is acquired after _factory is created via GetIsolatedFactory(...) / WithWebHostBuilder(...). That call itself may also do synchronous work (configuration builders, etc.). If the intent is to prevent thread pool starvation during all synchronous startup work, the semaphore scope may need to start earlier. If the intent is specifically for Server access, the current scoping is fine — but that should be made explicit in a comment.

Minor: Correctness of `_ = _factory.Server` for starvation prevention

Accessing _factory.Server (on TracedWebApplicationFactory) delegates to _inner.Server. The WebApplicationFactory<T>.Server property lazily builds and starts the host. The Task.Run wrapping ensures this blocking work happens on a thread pool thread rather than an async continuation thread, which is the correct approach. No issues here — just noting it for completeness.

Summary

The problem being solved is real and the approach is sound. The main concerns are:

Correctness: The global semaphore serialises unrelated factories — move to per-TEntryPoint scope (use static on the generic subclass).
API hygiene: Make ServerInitSemaphore private static.
Cancellation: Forward the CancellationToken to Task.Run.
Magic constant: Consider a named constant or env-var override for the cap.

Items 1, 2, and 3 are the most important to address before merging.

…build Passes the test cancellation token to Task.Run so that if a test is already cancelled when the slot is acquired, the task is not started. Also adds a comment clarifying the semaphore guards only Server access, not the factory configuration above it.

claude

Code Review (Round 5 — Follow-up)

Good progress across the previous rounds. Let me run through each outstanding item against the latest diff.

Resolved since last review

Two-arg SemaphoreSlim with explicit max count — done.
Cancellation token on WaitAsync — done.
Cancellation token forwarded to Task.Run — done.
Semaphore placed on non-generic base WebApplicationTest — done.
Math.Min(Environment.ProcessorCount * 2, 8) cap with explanatory comment — done.

Still open: semaphore visibility should be `private protected`, not `protected`

The latest commit message explains that private was rejected because the generic subclass WebApplicationTest<TFactory, TEntryPoint> needs access to the field. That reasoning is correct — private fields on a base class are not accessible from derived classes.

However, protected goes too far in the other direction: it becomes part of the public API surface for any user-defined subclass outside this assembly. A user extending WebApplicationTest<TFactory, TEntryPoint> can now call ServerInitSemaphore.Release() or ServerInitSemaphore.WaitAsync(...) from their test class, bypassing the controlled code path entirely.

The right access modifier here is private protected:

private protected static readonly SemaphoreSlim ServerInitSemaphore =
    new(_maxConcurrentServerInits, _maxConcurrentServerInits);

private protected means "accessible to derived classes, but only within this assembly." It satisfies the subclass access requirement while keeping the field invisible to external consumers. This is exactly the encapsulation boundary needed here.

Similarly, _maxConcurrentServerInits should be private static readonly (the leading underscore suggests it's already intended as private — just confirm the modifier matches).

Net assessment

All functional and correctness concerns from previous reviews have been addressed. The remaining item — protected → private protected — is a pure API hygiene improvement with no behavioral change, but it matters for long-term maintainability: once protected ships in a public NuGet package, removing or restricting it is a breaking change.

protected would expose ServerInitSemaphore to user-defined subclasses outside this assembly once shipped as a NuGet package. private protected restricts access to derived classes within this assembly only, which is the correct boundary since WebApplicationTest has an internal constructor.

Updated [TUnit](https://github.com/thomhurst/TUnit) from 1.23.7 to 1.24.13. <details> <summary>Release notes</summary> _Sourced from [TUnit's releases](https://github.com/thomhurst/TUnit/releases)._ ## 1.24.13  ## What's Changed ### Other Changes * perf(mocks): optimize MockEngine for lower allocation and faster verification by @thomhurst in thomhurst/TUnit#5319 * Remove defunct `UseTestingPlatformProtocol` reference for vscode by @erwinkramer in thomhurst/TUnit#5328 * perf(aspnetcore): prevent thread pool starvation during parallel WebApplicationTest server init by @thomhurst in thomhurst/TUnit#5329 * fix TUnit0073 for when type from from another assembly by @SimonCropp in thomhurst/TUnit#5322 * Fix implicit conversion operators bypassed in property injection casts by @Copilot in thomhurst/TUnit#5317 * fix(mocks): skip non-virtual 'new' methods when discovering mockable members by @thomhurst in thomhurst/TUnit#5330 * feat(mocks): IFoo.Mock() discovery with generic fallback and ORP resolution by @thomhurst in thomhurst/TUnit#5327 ### Dependencies * chore(deps): update tunit to 1.24.0 by @thomhurst in thomhurst/TUnit#5315 * chore(deps): update aspire to 13.2.1 by @thomhurst in thomhurst/TUnit#5323 * chore(deps): update verify to 31.14.0 by @thomhurst in thomhurst/TUnit#5325 ## New Contributors * @erwinkramer made their first contribution in thomhurst/TUnit#5328 **Full Changelog**: thomhurst/TUnit@v1.24.0...v1.24.13 ## 1.24.0  ## What's Changed ### Other Changes * perf: optimize TUnit.Mocks hot paths by @thomhurst in thomhurst/TUnit#5304 * fix: resolve System.Memory version conflict on .NET Framework (net462) by @thomhurst in thomhurst/TUnit#5303 * fix: resolve CS0460/CS0122/CS0115 when mocking concrete classes from external assemblies by @thomhurst in thomhurst/TUnit#5310 * feat(mocks): parameterless Returns() and ReturnsAsync() for async methods by @thomhurst in thomhurst/TUnit#5309 * Fix typo in NUnit manual migration guide by @aa-ko in thomhurst/TUnit#5312 * refactor(mocks): unify Mock.Of<T>() and Mock.OfPartial<T>() into single API by @thomhurst in thomhurst/TUnit#5311 * refactor(mocks): clean up Mock API surface by @thomhurst in thomhurst/TUnit#5314 * refactor(mocks): remove generic/untyped overloads from public API by @thomhurst in thomhurst/TUnit#5313 ### Dependencies * chore(deps): update tunit to 1.23.7 by @thomhurst in thomhurst/TUnit#5305 * chore(deps): update dependency mockolate to 2.1.1 by @thomhurst in thomhurst/TUnit#5307 ## New Contributors * @aa-ko made their first contribution in thomhurst/TUnit#5312 **Full Changelog**: thomhurst/TUnit@v1.23.7...v1.24.0 Commits viewable in [compare view](thomhurst/TUnit@v1.23.7...v1.24.13). </details> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=TUnit&package-manager=nuget&previous-version=1.23.7&new-version=1.24.13)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

thomhurst temporarily deployed to Pull Requests March 31, 2026 18:35 — with GitHub Actions Inactive