feat: timeout diagnostics with stack traces and deadlock detection#4947
feat: timeout diagnostics with stack traces and deadlock detection#4947
Conversation
There was a problem hiding this comment.
Code Review
This PR adds valuable diagnostic information to timeout exceptions. The intent is great — knowing why a test timed out is genuinely useful. However, there are some fundamental issues with the current implementation that will make it misleading rather than helpful.
|
|
||
| try | ||
| { | ||
| stackTrace = Environment.StackTrace; |
There was a problem hiding this comment.
Critical: Environment.StackTrace captures the wrong call stack
Environment.StackTrace only captures the stack of the current thread — the timeout handler continuation — not the stack of the executionTask that actually hung.
By the time BuildTimeoutDiagnosticsMessage is called (after Task.WhenAny returns + grace period expires), the current thread's stack looks roughly like:
TimeoutDiagnostics.BuildTimeoutDiagnosticsMessage
TimeoutHelper.ExecuteWithTimeoutAsync
... async continuation frames ...
The blocking patterns you're scanning for (Monitor.Enter, SemaphoreSlim.Wait, etc.) would appear on an entirely different thread — the one that's actually stuck — or may not appear on any active thread at all (if the task is awaiting a TaskCompletionSource with no active thread holding it).
The diagnostic output will label this the "Timeout Stack Trace" and claim to detect deadlocks, but it's showing the timeout infrastructure's own call stack rather than the user's blocked code.
Better approach: Capture Environment.StackTrace at the start of the task delegate (or use AsyncLocal<string> to propagate it), then include that in the timeout message. For example:
// Inside the test executor, before starting the task:
var capturedStack = Environment.StackTrace;
// Store in context, include in timeout message if it firesAlternatively, for IsFaulted tasks the existing code already captures the exception — that's the most useful diagnostic for those cases. For still-running tasks on another thread, there's no safe public API to capture their stack without a debugger.
| ("AutoResetEvent.WaitOne", "AutoResetEvent.WaitOne was called. The event may never be signaled."), | ||
| ("Task.Wait()", "Task.Wait() (synchronous) was called inside an async context. This can cause deadlocks. Use 'await' instead."), | ||
| (".Result", "Task.Result was accessed synchronously. This can cause deadlocks in async contexts. Use 'await' instead."), | ||
| (".GetAwaiter().GetResult()", "GetAwaiter().GetResult() was called synchronously. This can cause deadlocks in async contexts. Use 'await' instead."), |
There was a problem hiding this comment.
Pattern ".GetAwaiter().GetResult()" will never match any real .NET stack trace
In .NET, Environment.StackTrace produces one line per method call frame. When user code writes task.GetAwaiter().GetResult(), this appears as separate frames:
at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
at UserCode.BlockingMethod() in ...
The chained string ".GetAwaiter().GetResult()" (combining two separate call sites into one substring) will never appear on a single line of any stack trace. This pattern is effectively dead code.
Fix: Match only the final call:
("GetAwaiter", "GetAwaiter().GetResult() was called synchronously. This can cause deadlocks in async contexts. Use 'await' instead."),or match "GetResult" (more specific to the blocking action).
| ("ManualResetEvent.WaitOne", "ManualResetEvent.WaitOne was called. The event may never be signaled."), | ||
| ("AutoResetEvent.WaitOne", "AutoResetEvent.WaitOne was called. The event may never be signaled."), | ||
| ("Task.Wait()", "Task.Wait() (synchronous) was called inside an async context. This can cause deadlocks. Use 'await' instead."), | ||
| (".Result", "Task.Result was accessed synchronously. This can cause deadlocks in async contexts. Use 'await' instead."), |
There was a problem hiding this comment.
Pattern ".Result" does not match how Task.Result appears in stack traces
When user code accesses task.Result, .NET represents this in stack traces as a property getter call: Task'1.get_Result(). The literal substring ".Result" (with leading dot) does not match get_Result — they share no common substring.
This pattern will not detect the anti-pattern it is designed to catch.
Fix:
("get_Result", "Task.Result was accessed synchronously. This can cause deadlocks in async contexts. Use 'await' instead."),| ("SemaphoreSlim.Wait()", "SemaphoreSlim.Wait() (synchronous) was called. Consider using SemaphoreSlim.WaitAsync() instead."), | ||
| ("ManualResetEvent.WaitOne", "ManualResetEvent.WaitOne was called. The event may never be signaled."), | ||
| ("AutoResetEvent.WaitOne", "AutoResetEvent.WaitOne was called. The event may never be signaled."), | ||
| ("Task.Wait()", "Task.Wait() (synchronous) was called inside an async context. This can cause deadlocks. Use 'await' instead."), |
There was a problem hiding this comment.
Patterns "Task.Wait()" and "SemaphoreSlim.Wait()" only match zero-argument overloads
.NET's Environment.StackTrace includes parameter type names in each frame, e.g.:
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.SemaphoreSlim.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
The pattern "Task.Wait()" (with empty parens) only matches the parameterless Task.Wait() overload. The most commonly blocking overloads — Wait(int), Wait(CancellationToken), Wait(int, CancellationToken) — will not be detected because their signatures don't contain "()" as a substring.
Same applies to "SemaphoreSlim.Wait()" on line 18.
Fix: Drop the () from these patterns:
("SemaphoreSlim.Wait", "SemaphoreSlim.Wait() (synchronous) was called. Consider using SemaphoreSlim.WaitAsync() instead."),
// ...
("Task.Wait", "Task.Wait() (synchronous) was called inside an async context. This can cause deadlocks. Use 'await' instead."),50cd1c6 to
27d3d92
Compare
27d3d92 to
886638a
Compare
Code reviewNo issues found. Checked for bugs and CLAUDE.md compliance. The implementation is clean and well-structured. A few observations worth noting:
Overall a solid diagnostic improvement for issue #4888. |
886638a to
e53777e
Compare
When a test or hook times out, the exception message now includes: - The current stack trace at timeout time for debugging context - The faulted task's status and exception details (if available) - Detection of common deadlock patterns (Monitor.Enter, SemaphoreSlim.Wait, Task.Result, GetAwaiter().GetResult(), etc.) with actionable hints Closes #4888
- Rename header to "Timeout Handler Stack Trace" and add disclaimer that this is the handler's stack, not the blocked test's stack - Fix deadlock pattern matching to use actual stack frame signatures: "get_Result" instead of ".Result", "TaskAwaiter" instead of ".GetAwaiter().GetResult()", remove "()" from "Task.Wait" and "SemaphoreSlim.Wait" to match all overloads
e53777e to
b027d1d
Compare
Closes #4888. Adds TimeoutDiagnostics with deadlock pattern detection.