Deliver exceptions to FuncEvalFrame and DebuggerU2MCatchHandlerFrame when they are higher than the top managed frame in stack #118015

eterekhin · 2025-07-24T10:55:52Z

Hey, folks!

This PR fixes a crash caused by a func-eval abort that is issued before the evaluated method actually starts running. At that moment no managed frames related to the eval are on the stack yet (please see the stack trace below).

[0x0]   coreclr!SfiInit+0x1f1   0x5f5ab98c00   0x7ffa8804a1de   
[0x1]   System_Private_CoreLib!System.Runtime.ExceptionServices.InternalCalls.RhpSfiInit(System.Runtime.StackFrameIterator ByRef, Void*, Boolean, Boolean*)+0x7e   0x5f5ab98ff0   0x7ffa88012a46   
[0x2]   System_Private_CoreLib!System.Runtime.EH.DispatchEx(System.Runtime.StackFrameIterator ByRef, ExInfo ByRef)+0xc6   0x5f5ab990e0   0x7ffa880126c9   
[0x3]   System_Private_CoreLib!System.Runtime.EH.RhThrowEx(System.Object, ExInfo ByRef)+0x49   0x5f5ab99220   0x7ffa89e5d043   
[0x4]   coreclr!CallDescrWorkerInternal+0x83   0x5f5ab99250   0x7ffa89952820   
[0x5]   coreclr!CallDescrWorkerWithHandler+0x130   0x5f5ab99290   0x7ffa899537bc   
[0x6]   coreclr!DispatchCallSimple+0x26c   0x5f5ab992f0   0x7ffa89d4e638   
[0x7]   coreclr!DispatchManagedException+0x388   0x5f5ab99480   0x7ffa89d4e247   
[0x8]   coreclr!DispatchManagedException+0x67   0x5f5ab9aa80   0x7ffa89c0e72f   
[0x9]   coreclr!Thread::HandleThreadAbort+0x1df   0x5f5ab9b000   0x7ffa8995280e   
[0xa]   coreclr!CallDescrWorkerWithHandler+0x11e   0x5f5ab9b190   0x7ffa899533bb   
[0xb]   coreclr!MethodDescCallSite::CallTargetWorker+0xb8b   0x5f5ab9b1f0   0x7ffa893cfea2   
[0xc]   coreclr!MethodDescCallSite::CallWithValueTypes_RetArgSlot+0x32   0x5f5ab9b9e0   0x7ffa893d7d2f   
[0xd]   coreclr!`FuncEvalWrapper'::`3'::__Body::Run+0x7f   0x5f5ab9ba10   0x7ffa893d24f7   
[0xe]   coreclr!FuncEvalWrapper+0x97   0x5f5ab9ba70   0x7ffa893d17df   
[0xf]   coreclr!DoNormalFuncEval+0x9af   0x5f5ab9bb30   0x7ffa893d3357   
[0x10]   coreclr!GCProtectArgsAndDoNormalFuncEval+0x657   0x5f5ab9c2d0   0x7ffa893d1acd   <-- func eval abort exception handler here
[0x11]   coreclr!FuncEvalHijackRealWorker+0x8d   0x5f5ab9ca60   0x7ffa893d996a   
[0x12]   coreclr!FuncEvalHijackWorker+0x50a   0x5f5ab9cef0   0x7ffa893eb4bd   
[0x13]   coreclr!FuncEvalHijack+0xd   0x5f5ab9d250   0xcccccccc

The debugger calls ICorDebugEval::CallFunction, ICorDebugProcess::Continue and ICorDebugEval::Abort in sequence.
If the method being evaluated is already running when Abort is performed, the ThreadAbortException propagates to the exception handler installed in GCProtectArgsAndDoNormalFuncEval. If it is not running yet, the exception is unwound to the frame where the debugger was stopped, bypassing that handler, which crashes the process during the first exception pass here
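The difference between the two cases can be illustrated with a toy model. This is only a sketch: the `Frame` type, the `FrameKind` enum and both helper functions are hypothetical and do not mirror coreclr data structures; only the frame names are taken from the stack trace above. It assumes, as a simplification, that dispatch begins at the first managed frame and searches outward from there.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model types; not coreclr structures.
enum class FrameKind { Native, NativeWithCatch, Managed };

struct Frame {
    std::string name;
    FrameKind kind;
};

using Stack = std::vector<Frame>;  // index 0 is the innermost frame

// Search outward from `start` for a frame that owns a native catch.
// Returns its index, or -1 if no handler is found.
int find_handler(const Stack& stack, std::size_t start) {
    for (std::size_t i = start; i < stack.size(); ++i) {
        if (stack[i].kind == FrameKind::NativeWithCatch) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Simplified dispatch: skip to the first managed frame, then search outward.
int dispatch_from_first_managed(const Stack& stack) {
    for (std::size_t i = 0; i < stack.size(); ++i) {
        if (stack[i].kind == FrameKind::Managed) {
            return find_handler(stack, i);
        }
    }
    return -1;
}

// Abort arrives before the evaluated method runs: no managed eval frame yet.
Stack abort_before_target_runs() {
    return {
        {"Thread::HandleThreadAbort", FrameKind::Native},
        {"MethodDescCallSite::CallTargetWorker", FrameKind::Native},
        {"GCProtectArgsAndDoNormalFuncEval", FrameKind::NativeWithCatch},
        {"FuncEvalHijack", FrameKind::Native},
        {"Repro.Program.Main", FrameKind::Managed},
    };
}

// Abort arrives while the evaluated method is already running.
Stack abort_while_target_runs() {
    return {
        {"Repro.Program.SomeFunc", FrameKind::Managed},
        {"MethodDescCallSite::CallTargetWorker", FrameKind::Native},
        {"GCProtectArgsAndDoNormalFuncEval", FrameKind::NativeWithCatch},
        {"FuncEvalHijack", FrameKind::Native},
        {"Repro.Program.Main", FrameKind::Managed},
    };
}
```

In this model, `dispatch_from_first_managed(abort_while_target_runs())` finds the handler frame, while `dispatch_from_first_managed(abort_before_target_runs())` starts at `Repro.Program.Main`, which is already past the handler, and returns -1.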

It reproduces on a .NET 9 installation on all OSes and on main; .NET 8 works fine for me.

This problem arises when the runtime executes the class constructor before the call, so it takes some time to reach the call itself.

In this PR I check for this situation in SfiInit and set the pfIsExceptionIntercepted flag, so that we skip the SfiNext calls and go straight to CallCatchFunclet. I also added a check for DebuggerU2MCatchHandlerFrame, because I suspect the same issue can occur for it as well.

@janvorli, may I ask you to review it, please? I have probably missed some pieces :) I am also not sure the tests will be green.


janvorli · 2025-07-24T11:56:30Z

@eterekhin thank you for looking into this issue and creating a fix! I think the fix should be made differently though. For example, pfIsExceptionIntercepted has a special use for exception interception by the debugger, and I think reusing it for a different purpose may cause trouble. I am currently working on a fix for processing unhandled exceptions, and this problem is in the same bucket. I want to fix it in a unified manner.
Do you happen to have a repro project that I can use to test it?

eterekhin · 2025-07-24T14:25:02Z

@janvorli, thank you! It's great that you are going to fix this case!
Unfortunately the repro is very random. I will try to find stable repro steps within a couple of days and let you know.

eterekhin · 2025-07-27T12:48:04Z

@janvorli, hello! I've found a stable repro (Windows x64, .NET 9):

Open the NotCaughtThreadAbortReproProject\NotCaughtThreadAbortReproProject folder in VS Code
Set a breakpoint at line 18 of Program.cs
Start debugging and wait until the debugger stops at the breakpoint
Evaluate "SomeFunc()" in the Watch panel

After waiting 6 seconds I see the following (please see the screenshot):

before operation
after operation
Fatal error. Internal CLR error. (0x80131506)
   at System.Runtime.EH.DispatchEx(System.Runtime.StackFrameIterator ByRef, ExInfo ByRef)
   at System.Runtime.EH.RhThrowEx(System.Object, ExInfo ByRef)
   at Repro.Program.Main(System.String[])
The target process exited with code -2146233082 (0x80131506) while evaluating the function 'Repro.Program.SomeFunc'.

NotCaughtThreadAbortReproProject.zip

janvorli · 2025-08-04T12:19:17Z

@eterekhin thank you! I'll use it to verify my changes.

@eterekhin

There is a problem with thread abort in funceval when there is no managed frame on the stack between the abort point and the `FuncEvalFrame`. That can happen, for example, when funceval invokes a static method on a type whose static constructor has not run yet and takes a long time to complete. The cause is that when EH is called to propagate the `ThreadAbortException`, it starts at the first managed frame and therefore skips the try/catch in the funceval native code. This change fixes it by using `RaiseTheExceptionInternalOnly` to raise the `ThreadAbortException` in `Thread::HandleThreadAbort`. `Thread::HandleThreadAbort` is always called by native code that has a native catch or (on Windows) ends up calling `ProcessCLRException`. I had originally changed `Thread::HandleThreadAbort` to call `DispatchManagedException`, but this issue shows that this is problematic. I have verified that the repro provided by @eterekhin in the issue report no longer crashes the process with a failfast, but reports the funceval as timed out, as expected. Close dotnet#118015
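The idea of raising the exception from native code so that the enclosing native catch sees it can be illustrated with plain C++ exceptions. This is a minimal sketch under stated assumptions, not the runtime's implementation: `AbortRequested`, `poll_for_abort` and `run_eval` are hypothetical names standing in for `ThreadAbortException`, the abort check and the funceval wrapper, and the sketch says nothing about how `RaiseTheExceptionInternalOnly` works internally.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for ThreadAbortException; not a runtime type.
struct AbortRequested : std::runtime_error {
    AbortRequested() : std::runtime_error("abort requested") {}
};

// Stand-in for the native abort check that runs before the target method
// starts, e.g. while a slow static constructor is still executing.
void poll_for_abort(bool abort_pending) {
    if (abort_pending) {
        // Raised from native code, so unwinding begins at this frame and
        // passes through every enclosing native try/catch.
        throw AbortRequested{};
    }
}

// Stand-in for the funceval wrapper that owns the native catch.
std::string run_eval(bool abort_before_call) {
    try {
        poll_for_abort(abort_before_call);
        return "completed";  // the target method would run here
    } catch (const AbortRequested&) {
        return "aborted";    // the eval is reported as aborted, no crash
    }
}
```

Because the throw originates below the wrapper's `try` block, the catch is reached regardless of whether any frame of the evaluated method exists yet, which is the property the commit message relies on.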

Deliver exceptions to FuncEvalFrame and DebuggerU2MCatchHandlerFrame when they are higher than the top managed frame in stack (38ae5ec)

github-actions bot added the needs-area-label label Jul 24, 2025

dotnet-policy-service bot added the community-contribution label Jul 24, 2025

teo-tsirpanis added the area-ExceptionHandling-coreclr label and removed the needs-area-label label Jul 24, 2025

This was referenced Jul 24, 2025

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open

System.Diagnostics.Tests.ProcessTests.TestCheckChildProcessUserAndGroupIds fails on Alpine jobs with "Operation not permitted" #117811

Closed

janvorli mentioned this pull request Aug 4, 2025

Fix thread abort issue with funceval #118354

Merged

janvorli closed this in #118354 Aug 4, 2025

github-actions bot locked and limited conversation to collaborators Sep 4, 2025
