Account for availability of multiple processor groups on Windows 11+ #68639
Fixes #67180.
|
What should we be doing around APIs like: While these have been incomplete since machines could have more than 64 processors, they've at least been consistent with the runtime defaults, and would only end up being inconsistent if you specified these environment variables to opt in to a larger number of processors. With this change, it seems these APIs will now be problematic if, for example, code is roundtripping the value, e.g. read the affinity, temporarily change it, set it back to what it was before. |
@stephentoub That is an interesting question; however, I have doubts regarding claimed consistency at present. For instance, in some runs on an 80-core machine (with no environment variables set) I see the following output:

// 16
Console.WriteLine(Environment.ProcessorCount);
// 0xffffffffffffffff
Console.WriteLine($"0x{Process.GetCurrentProcess().ProcessorAffinity:x}");

Please note that this change does not opt in to use a larger number of processors. Instead, it reflects the changes introduced by Windows 11. For compatibility reasons regarding old affinity APIs, Windows 11 assigns a "primary processor group" to all processes and threads, which should reduce the number of broken apps. Of course, there is no guarantee. |
Returning 0xffffffffffffffff when the ProcessorCount returns 16 seems like a bug. But regardless my concern isn't if ProcessorAffinity returns a bitset representing a larger number of cores than is available, but rather ProcessorAffinity only allowing you to specify up to 64. If ProcessorCount actually returned 80 in your example, and by default the process was affinitized to all 80 cores, then someone doing At a minimum, we should validate that ProcessorAffinity and friends are consistent with themselves, e.g. if the getter is returning the value for a single processor group, then setting it should affect the same processor group, and leave other processor group affinities untouched, in which case my example wouldn't be problematic. |
|
I do not think the old affinity API that we use allows that.
How frequently is this pattern used in real apps? |
So normally apps use

var self = Process.GetCurrentProcess();
self.ProcessorAffinity = new IntPtr((1L << limitCoresTo) - 1);

Then implementing your suggestion:
would actually break those apps. For example, instead of limiting the process to 2 cores, it would limit the process to 2+16 cores. Am I missing something? |
Are you sure? Today Process.ProcessorAffinity on Windows just calls SetProcessAffinityMask, which is documented as:
That suggests it would already have the behavior of limiting it to 2+16 cores, on what I assume is your example system of an 80 core machine. Either that, or throwing an exception. I don't have an 80 core system on which to test. |
That means if the CLR runtime starts assigning thread affinities (controlled by |
Ok, in that case we're back to the issue I described, of roundtripping the value as in various of those examples downgrading how many cores the process will actually be able to use. |
The roundtripping issue cannot be solved using existing managed API, such as |
It's not just temporarily downgrading. It's reading the mask to know what's currently affinitized, tweaking the value, and writing it back. Most of the links above fit that pattern, and I found those in just a few minutes of searching. |
Oh, I was thinking about restoring the original affinity. Tweaking the affinity will continue to work with the caveat I mentioned. There are many issues in those code links though. For instance, the first link has the following code, which will not work as intended if

private static IntPtr FixAffinity(IntPtr processorAffinity)
{
    int cpuMask = (1 << Environment.ProcessorCount) - 1; |
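The condition in the comment above is cut off, so the following is only one way the quoted expression can misbehave, not necessarily the one the author meant. In C#, the shift count for an int operand is masked to its low 5 bits, so (1 << n) - 1 stops producing an n-bit mask once n reaches 32. The class and method names below are hypothetical, and SaturatingMask is an illustrative alternative rather than anything proposed in the thread.

```csharp
using System;

public static class AffinityMaskSketch
{
    // Mirrors the quoted expression, with the processor count passed in
    // explicitly instead of read from Environment.ProcessorCount.
    public static int NaiveMask(int processorCount) => (1 << processorCount) - 1;

    // Hypothetical alternative: build the mask in 64 bits and saturate at
    // 64 processors, the most a single affinity mask can describe.
    public static ulong SaturatingMask(int processorCount)
    {
        if (processorCount <= 0) return 0UL;
        return processorCount >= 64 ? ulong.MaxValue : (1UL << processorCount) - 1;
    }
}
```

With 16 processors both methods yield 0xFFFF. With 32 the naive version yields 0, and with 80 it yields 0xFFFF again, because 80 & 31 is 16. Even the 64-bit mask covers only one processor group, so it does not address the multi-group concern discussed here.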
Can you elaborate? If setting Processor.ProcessAffinity results in scoping down to just a single processor group, then someone doing something like: |
Yes, this affinitizes the process to a single processor group: currentProcess.ProcessorAffinity = currentProcess.ProcessorAffinity; Not sure what we could do here without requesting code changes. I think treating the full mask "magically" is not an option. |
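To make the roundtrip concern concrete, here is a toy model of the behavior described in this comment. It does not call any Windows or .NET affinity API; ToyProcess and its members are hypothetical, and the 64 + 16 group split is an assumption borrowed from the "2+16" example earlier in the thread. The getter can only report the primary group's mask, and the setter leaves the process on that one group.

```csharp
using System;
using System.Numerics;

// Toy model only: not the Windows API and not the runtime's implementation.
public sealed class ToyProcess
{
    private readonly ulong[] _groupMasks; // one 64-bit mask per processor group
    private readonly int _primaryGroup;

    public ToyProcess(ulong[] groupMasks, int primaryGroup)
    {
        _groupMasks = (ulong[])groupMasks.Clone();
        _primaryGroup = primaryGroup;
    }

    public ulong ProcessorAffinity
    {
        // Only the primary group's mask fits in the single 64-bit value.
        get => _groupMasks[_primaryGroup];
        // Assumed setter behavior, per the comment above: the process ends
        // up affinitized to the primary group only.
        set
        {
            for (int i = 0; i < _groupMasks.Length; i++)
                _groupMasks[i] = i == _primaryGroup ? value : 0UL;
        }
    }

    // Total processors the modeled process may run on, across all groups.
    public int UsableProcessors()
    {
        int total = 0;
        foreach (ulong mask in _groupMasks)
            total += BitOperations.PopCount(mask);
        return total;
    }
}
```

In this model, a process created with new ToyProcess(new[] { ulong.MaxValue, 0xFFFFUL }, 0) starts with 80 usable processors. Reading ProcessorAffinity and writing the same value back leaves it with 64, which is the silent downgrade under discussion.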
@Maoni0 @kouvel @mangod9 To minimize the impact, we may enable assigning threads to processor groups only when both the Do you think we should minimize the impact and avoid the business of explicitly setting thread affinities unless requested as described above? |
Why does that happen?
On an OS that schedules threads across processor groups (or some other similar abstraction), I would expect this setting to be disabled because the OS knows much better where to schedule threads. Should this setting be disabled by default on Win11 when there are multiple CPU groups (similarly to Linux)?
There are cases where the thread-spreading that the CLR does is somewhat effective, but it's not great and it generally [edit] is more effective with server GC due to explicitly assigned thread affinities for GC threads, where the scheduler appears to otherwise schedule threads across CPU groups in round-robin fashion (not sure if that has been fixed). I don't know if there is a generic solution, but a possibility may be to change the defaults for those boolean config switches to |
Is there another way to implement the method that avoids that, e.g. a newer function it could call? |
That error is reported by
@kunalspathak has made some performance measurements on an Ampere machine. The performance was better when thread affinities were assigned by the CLR runtime. That machine has a single NUMA node and we have not done extensive testing though.
The |
I think there is no good way other than iterate over all threads in the process and reset their affinities. |
It sounds like
I figure it would only help to avoid the affinity API issue when not using server GC because GC threads are also affinitized. I don't think it's worth doing the above just to avoid the affinity API error. Also due to the scheduling issue when there are multiple CPU groups and server GC is enabled, it would probably make sense to have As for the |
Yes.
Correct.
I am afraid that may lead to surprising and non-deterministic behavior in scenarios where an app attempts to restrict its affinity. Regarding your idea of having three values for the switch, I think another possible option is to keep |
It's also a bit awkward that the affinity can be restricted or expanded dynamically, and to honor it the CPU group info would need to be updated, and thread affinities and thread group affinities reassigned. For the way the affinity APIs work currently, it seems like it would only make sense when using multiple CPU groups is disabled. So perhaps another option for now is to have a way to disable using multiple CPU groups and expect that they be disabled in those scenarios. Then affinity changes when using multiple CPU groups could throw an exception. Also wonder if the CPU set APIs could be used to change affinities instead. Not sure how they would interoperate with thread affinities and thread group affinities though. |
@stephentoub @kouvel Thank you for your very valuable feedback! Do you think we need a meeting to discuss this further? Note that we have wanted to start using all processors on Windows by default for a long time (#13465). Windows 11 makes that easier for us by no longer constraining processes to a single processor group. The behavior of some apps may change on Windows 11 with or without this PR. As @kouvel noticed, even without changes, setting a process’s affinity in the middle of the process’s lifetime may not work as intended. For instance, we neither update the As a workaround, users may force running a given process on a single processor group with I would like to have the opportunity to receive user feedback on this change, so please let me know if I should proceed with this PR and/or schedule a meeting to discuss possible alternatives. |
It seems there are two modes of failure here:
It's the latter mode that concerns me most (although the silently-doing-something-different-than-intended aspect of the first also concerns me), in particular because folks are only going to face this on certain hardware. Why don't we have a short meeting to discuss. I don't want to block progress here, and it'd be great if we could come up with a mitigation. If we proceed with setting the affinity by default such that any call to Process.ProcessorAffinity is going to throw on some hardware, we need to change the implementation of ProcessorAffinity in some way. |
Another option may be to not affinitize GC threads and not set group affinity for other CLR-created threads by default on Win11, though that would probably need more extensive testing. With threads not affinitized explicitly, the scheduling issue would likely go away, and |
Wouldn't that negatively affect server GC performance? It would be unfortunate to sacrifice performance of most apps for the sake of a rarely used feature. |
I'm skeptical. I don't know the reason why server GC threads are affinitized. Maybe it has something to do with pre-Win11 scheduler behavior. Maybe that reason doesn't hold anymore with the Win11 scheduler changes. |
@AntonLapounov - any update on this PR? Is there anything left? |
I have split off the clean-up part and merged. The rest cannot be merged without breaking existing apps that set the process affinity mask on Windows. Setting affinities for threads would break existing apps, not setting them would degrade performance. Another possible option may be to use CPU Sets API to set 'soft' affinities for threads, which I am still investigating. |
Closing this PR as we will likely use a different approach. |
Do you think we will be doing anything in .NET 7? |
On Windows 11+ and Windows Server 2022+, a process is no longer restricted to a single processor group by default. If more than one processor group is available to the process (a non-affinitized process on Windows 11+), default to using multiple processor groups; otherwise, default to using a single processor group. This default behavior may be overridden by the DOTNET_GCCpuGroup and DOTNET_Thread_UseAllCpuGroups configuration values.

Fix comments according to dotnet/coreclr#3896 (comment).
Fixes #67180.