SafeSocketHandle.CloseAsIs hanging in finalizer thread #40301
Comments
Tagging subscribers to this area: @dotnet/ncl |
cc: @tmds |
Smaller repro for dotnet/runtime#40301
Did my best to narrow this down to a smaller problem. Created a branch where you can see the delta between our unit tests passing and hanging. This commit shows that simply enabling one new test to run on Linux causes this hang. To repro this do the following:
> git clone https://github.com/jaredpar/roslyn -o jaredpar
> cd roslyn
> git checkout -B repro jaredpar/repro/pipe-hang
> cd src/Compilers/Server/VBCSCompilerServerTests
> dotnet build
> dotnet msbuild /t:Test
Couple notes:
A brief description of what this particular test is doing:
|
Since .NET 5, Aborting the on-going runtime/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SafeSocketHandle.Unix.cs Lines 196 to 235 in be197ad
Note that async |
I'm currently trying to figure that out, but my gut feel is that spinning in the finalizer thread is a bad idea in any case, and we should probably rethink the way |
Really wish I could've found a simpler repro here that would more clearly identify the operation occurring. Spent several hours on this but was unable to do so. More than happy to help out with any questions on the current repro. |
Looks like the issue is not caused by outstanding blocking calls. I have a smaller repro now: Here is what is happening in my understanding:
I have an idea for a PR that would remove the spinning. For some reason the issue does not happen with 3.1. |
This should not be happening in the product anymore. After filing the bug I was able to find a few other places these were leaking, made a push to get rid of them, and I believe I got them all at this point. It's possible it's still happening in the tests though. |
👍 Makes sense.
The spinning deals with a race condition that occurs as follows:
If Thread B does not call TryUnblockSocket again (which it does now by spinning), the operation on thread A is not aborted. |
@tmds I may miss something important but: |
When the |
Anton, I think we could |
@tmds what will happen to those ongoing operations if we close the underlying handle? I think we really need to distinguish between the finalizer and a normal Dispose. It already indicates a user bug, if a finalizer is being executed. Instead of being 100% correct handling blocking operations I would only aim for preventing handle leak in those cases. I believe the way
I would prefer to find a way to solve this in Sockets instead of just changing the way NamedPipes are utilizing sockets. As an alternative, we can document that user code should never |
With 3.1 on-going operations didn't ref the
In case of a finalizer, we may assume there are no on-going operations, because those would prevent the |
What I mean is that in a finalizer (instead of waiting for the handle to be released) we may want to force-close the handle, independently of the ref-count/release mechanism.
Unfortunately, this is not true. In this particular case we are in the finalizer of the Edit: on the other hand, this should be a very rare corner-case. We expect blocking socket operations to be triggered from managed |
I just noticed this comment in
Edit: I found #37873 now, which answers my question. Still hoping we can find a way to remove the spinning instead of working around this in |
I wonder if the concern of indefinite spinning (or blocking |
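Just an illustration of the bigger issue we have here. It's this easy to create a finalizer thread hang now:
using System;
using System.Net.Sockets;
using System.Runtime.CompilerServices;

class Program
{
    static void Main(string[] args)
    {
        CreateSocket();
        Console.WriteLine("Dying!");
        GC.Collect();
    }

    // Not inlined, so the Socket is unreachable once this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateSocket()
    {
        Socket socket = new Socket(SocketType.Stream, ProtocolType.Tcp);
        bool dummy = false;
        // AddRef with no matching DangerousRelease: the ref count never drops.
        socket.SafeHandle.DangerousAddRef(ref dummy);
    }
}
|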
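For contrast with the repro above, here is a minimal sketch of the balanced pattern, where every `DangerousAddRef` is paired with a `DangerousRelease` in a `finally` block. The class and method names (`BalancedRefExample`, `UseHandleSafely`) are made up for illustration and do not come from the runtime or from this thread.

```csharp
using System;
using System.Net.Sockets;

static class BalancedRefExample
{
    // Returns true if the handle was add-ref'ed; the ref is always released.
    public static bool UseHandleSafely(Socket socket)
    {
        bool addedRef = false;
        try
        {
            socket.SafeHandle.DangerousAddRef(ref addedRef);
            // ... work with socket.SafeHandle.DangerousGetHandle() here ...
            return addedRef;
        }
        finally
        {
            // Only release when the AddRef actually succeeded.
            if (addedRef)
            {
                socket.SafeHandle.DangerousRelease();
            }
        }
    }

    static void Main()
    {
        using Socket socket = new Socket(SocketType.Stream, ProtocolType.Tcp);
        Console.WriteLine(UseHandleSafely(socket));
    }
}
```

With the reference released, the handle's ref count can reach zero on `Dispose`, so nothing is left for the finalizer to wait on.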
This is definitely related to the fix we made in dotnet/corefx#38499. Before that we didn't really have a socket finalizer, so we would not have hit this issue. That said, we cannot undo the change in dotnet/corefx#38499 (at least not without making other, extensive changes). The reasoning is explained here: #29327 (comment) I think the best fix for 5.0 is to simply not try to cancel outstanding operations when the Socket finalizer is called, as was suggested above. It's possible that this could lead to some unexpected behavior in weird corner cases, but I think those corner cases are restricted to the case where you do a DangerousAddRef without a corresponding DangerousRelease. Since we are currently hanging the finalizer thread in that case, we can probably live with a few quirky corner cases for now. I do think we should revisit all of this in the future. Ideally we shouldn't need to have a finalizer on Socket. But changing that would take some effort and research etc. |
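The idea of skipping cancellation on the finalizer path can be sketched roughly as follows. `SketchHandle` is a hypothetical stand-in, not the real `SafeSocketHandle`, and the "abort" step is reduced to a flag; the only point is that the finalizer path releases the resource without trying to abort operations, waiting, or spinning.

```csharp
using System;

// Hypothetical stand-in for a socket-like resource; not the real SafeSocketHandle.
sealed class SketchHandle : IDisposable
{
    public bool AbortAttempted { get; private set; }
    public bool Closed { get; private set; }

    public void Dispose()
    {
        Close(disposing: true);
        GC.SuppressFinalize(this);
    }

    ~SketchHandle() => Close(disposing: false);

    // Internal (for illustration) so both paths can be exercised directly.
    internal void Close(bool disposing)
    {
        if (Closed) return;
        if (disposing)
        {
            // Explicit Dispose: try to abort on-going operations first.
            AbortAttempted = true;
        }
        // Finalizer path: no abort attempt, no waiting; just release.
        Closed = true;
    }
}

static class SketchProgram
{
    static void Main()
    {
        var disposed = new SketchHandle();
        disposed.Dispose();
        Console.WriteLine($"Dispose: abort={disposed.AbortAttempted}, closed={disposed.Closed}");

        var finalized = new SketchHandle();
        finalized.Close(disposing: false); // simulates the finalizer path
        Console.WriteLine($"Finalizer path: abort={finalized.AbortAttempted}, closed={finalized.Closed}");
    }
}
```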
The spinning was introduced as part of dotnet/corefx#38804. In that PR, Because previously there were no out-standing references, the handle was released as soon as the Due to the use of The spinning was added to deal with the race described in #40301 (comment). But I did not consider someone other than Based on what was said I think we are safe to not call |
🎉 |
This issue comes from investigating test failures on this Roslyn PR dotnet/roslyn#46510
At the conclusion of running the unit tests for the VBCSCompiler server, the xUnit process will refuse to exit. The xUnit output will indicate that the tests have completed running but the process itself will not exit. Attaching the debugger to the xUnit process shows two threads of note that are still running:
GC Finalizer
.NET Sockets
The VBCSCompiler server makes heavy use of named pipes. Looking through the Socket on the finalizer thread I can confirm it's a Unix Domain socket related to the named pipes the compiler is creating (the path in the end point matches the paths we create in the tests).
Unfortunately after a day of debugging I have not been able to narrow this problem down any further:
NamedPipeServerStream.Dispose on any instance that was hung in a WaitForConnectionAsync call.
None of these has had any impact though. I've also been unsuccessful in constructing a more concise repro. 😦
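For readers unfamiliar with the pattern, disposing a server that is stuck in `WaitForConnectionAsync` might look roughly like the sketch below. It assumes the wait is also canceled through a `CancellationToken` before `Dispose` (the report only mentions `Dispose`), and the names `PipeDisposeSketch`, `CancelThenDisposeAsync` and the pipe name are hypothetical.

```csharp
using System;
using System.IO.Pipes;
using System.Threading;
using System.Threading.Tasks;

static class PipeDisposeSketch
{
    // Starts a wait for a client that never arrives, cancels it,
    // then disposes the server. Returns true if the wait was canceled.
    public static async Task<bool> CancelThenDisposeAsync(string pipeName)
    {
        var server = new NamedPipeServerStream(
            pipeName, PipeDirection.InOut, 1,
            PipeTransmissionMode.Byte, PipeOptions.Asynchronous);
        using var cts = new CancellationTokenSource();

        Task wait = server.WaitForConnectionAsync(cts.Token);
        cts.Cancel();

        bool canceled = false;
        try
        {
            await wait;
        }
        catch (OperationCanceledException)
        {
            canceled = true;
        }
        finally
        {
            // Dispose explicitly so the underlying handle is not left to the finalizer.
            server.Dispose();
        }
        return canceled;
    }

    static async Task Main()
    {
        string name = "sketch-pipe-" + Guid.NewGuid().ToString("N");
        Console.WriteLine(await CancelThenDisposeAsync(name));
    }
}
```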
More than happy to provide any info to make tracking this down easier.
Repro Information: