Slow thread suspension in busy loop #94767
Comments
Tagging subscribers to this area: @mangod9
Issue Details
I have encountered a case where thread suspension before a GC takes multiple seconds (up to 70 seconds in the worst case). A minimal repro can be found here: https://gist.github.com/szehetner/47515ee0f28e2ca9d4990d60ac230a07
This starts a thread in a busy spin loop and triggers GCs. In a lot of iterations the GC takes multiple seconds. PerfView confirms that the time is spent during thread suspension.
Tested with .NET 7 and .NET 8 on x64 Windows - suspension times seem to be lower in .NET 8, but still significant.
My understanding is that it shouldn't be possible for user code to prevent thread suspension for such a long time. Is this an issue in the runtime? Or is there something I can do to speed up the suspension without giving up control of the thread (so without Thread.Yield() or Thread.Sleep())?
|
cc @VSadov |
Interesting. I will take a look. |
It looks a lot like a case of a tight loop with a very fast call in it. I can reproduce this. The behavior is also sensitive to the OS (Win11 seems less affected than Win10) and depends on the suspension implementation (NativeAOT seems more robust).
|
This should not generally be happening, and this situation is fairly uncommon. The scenario here is one of the hardest cases for suspension. Since there is a call in the loop, suspension is supposed to catch the thread when it returns from the call, but since the call does literally nothing, the window of opportunity is extremely short. Such loops typically do not run for long, since they are clearly wasting CPU, but this one does... Then, depending on OS API latencies and how suspension performs the retries, the problem could be amplified to take seconds or minutes to suspend. Since NativeAOT performs better here, it will be worth looking into what is happening in CoreCLR.
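To make the loop shape concrete, here is a minimal sketch of the pattern described above. It is not the code from the linked gist: `SuspensionSketch`, `DoNothing`, and `TimeInducedGc` are hypothetical names chosen for illustration, and whether a long pause actually shows up depends on the runtime version and OS.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical illustration; not taken from the repro gist.
public static class SuspensionSketch
{
    private static volatile bool s_stop;

    // A call that does almost nothing. NoInlining keeps it a real call,
    // so the loop has the "tight loop with a very fast call" shape.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void DoNothing() { }

    // Spins on DoNothing() on a background thread, induces one GC,
    // and returns how long GC.Collect() took as seen by the caller.
    public static TimeSpan TimeInducedGc()
    {
        s_stop = false;
        var spinner = new Thread(() =>
        {
            while (!s_stop)
            {
                DoNothing();
            }
        });
        spinner.IsBackground = true;
        spinner.Start();

        Thread.Sleep(100); // give the spinner time to get going

        var sw = Stopwatch.StartNew();
        GC.Collect(); // blocking induced GC; the elapsed time includes suspension
        sw.Stop();

        s_stop = true;
        spinner.Join();
        return sw.Elapsed;
    }
}
```

Calling `SuspensionSketch.TimeInducedGc()` repeatedly and logging the result is one way to see whether a given runtime and OS combination is affected.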
As a temporary workaround, making a polling call could help. There is no specific API to do a suspension poll, but many OS/interop services do a poll. For example:

    int someCounter = 0;
    while (!_cts.IsCancellationRequested)
    {
        _idleStrategy.Idle(0);

        // call Thread.Yield() once in a few iterations.
        //
        // if (someCounter++ % 1024 == 0)
        //     Thread.Yield();

        // or call some cheap OS API for the side effect of a suspension poll
        _ = Environment.TickCount;
    }
|
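The snippet above depends on fields (`_cts`, `_idleStrategy`) from the repro. A self-contained sketch of the same idea, using the periodic `Thread.Yield()` variant, might look like the following. `PollingSpinLoop`, `Run`, and the `doWork` delegate are hypothetical names, and the interval of 1024 iterations is simply the value from the comment, not a tuned recommendation.

```csharp
using System;
using System.Threading;

// Hypothetical helper; the names are illustrative only.
public static class PollingSpinLoop
{
    // Spins until cancellation, yielding once every 1024 iterations so the
    // thread regularly passes through a call that can act as a suspension poll.
    // Returns the number of iterations completed.
    public static long Run(CancellationToken token, Action doWork)
    {
        long iterations = 0;
        while (!token.IsCancellationRequested)
        {
            doWork();

            // Periodic yield, as in the commented-out variant above.
            if ((iterations++ & 1023) == 0)
            {
                Thread.Yield();
            }
        }
        return iterations;
    }
}
```

A call such as `PollingSpinLoop.Run(cts.Token, () => { })` with a `CancellationTokenSource` that times out exercises the loop. If yielding is not acceptable, the `Thread.Yield()` call can be swapped for `_ = Environment.TickCount;`, the other option suggested in the comment.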
Thanks for the explanation. Using Environment.TickCount looks promising, I will test this further. To provide some background on the busy spinning loop: Our actual system uses https://github.com/AdaptiveConsulting/Aeron.NET for receiving messages. At the core of that is a loop to poll for incoming messages. Keeping this thread spinning is a deliberate choice to achieve lower latency at the cost of wasted CPU cycles. |
In my understanding, if this is the case, you don't want the GC to suspend this thread, do you? I think in such a latency-sensitive scenario, it might be better to separate the application into a busy state and an idle state, disable GC completely in the former, and only run GCs in the latter. |
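One way to approximate the "disable GC during the busy state" idea is the runtime's no-GC region API (`GC.TryStartNoGCRegion` / `GC.EndNoGCRegion`). The sketch below is an assumption about how that suggestion could be applied, not something proposed in the thread. `NoGcBusyPhase` and its parameters are hypothetical names.

```csharp
using System;
using System.Runtime;

// Hypothetical wrapper around the no-GC region API.
public static class NoGcBusyPhase
{
    // Runs busyWork with GC suppressed if the runtime can reserve
    // allocationBudgetBytes up front. Returns true if the region was entered.
    public static bool Run(long allocationBudgetBytes, Action busyWork)
    {
        bool entered = false;
        try
        {
            entered = GC.TryStartNoGCRegion(allocationBudgetBytes);
            busyWork();
        }
        finally
        {
            // The region ends early if allocations exceed the budget or a GC
            // is induced; EndNoGCRegion would throw in that case, so check first.
            if (entered && GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
            {
                GC.EndNoGCRegion();
            }
        }
        return entered;
    }
}
```

The allocation budget is bounded, so this only suits busy phases that allocate a small, predictable amount of memory. A loop that spins indefinitely would still need an idle phase in which collections can run.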
With #95565 and #94767 changes, the repro scenario sees suspension times in the sub-millisecond range, which is below the benchmark's sensitivity. See: #95565 (comment). Theoretically it may still be possible to observe a difficult-to-suspend loop, but such a scenario would be hard to construct even intentionally. I think we can close this issue now as addressed. |