Allow overriding processor count via configuration setting #52492
Tagging subscribers to this area: @tannergooding
// - GetCurrentProcessCpuCount() on Unixes tries to take into account cgroups CPU quota limits where applicable
processorCount = GetCurrentProcessCpuCount();
}
processorCount = GetCurrentProcessCpuCount();
Here and elsewhere the CanEnableThreadUseAllCpuGroups check is moved inside GetCurrentProcessCpuCount().
src/coreclr/gc/gc.cpp
Outdated
#ifndef TARGET_WINDOWS
// Limit the GC heaps to the number of processors available in the system.
nhp = min (nhp, GCToOSInterface::GetTotalProcessorCount());
#endif // !TARGET_WINDOWS
This clipping was needed by the previous version of the code. The number of GC heaps is still limited by MAX_SUPPORTED_CPUS (always) and process_affinity_set->Count() (if affinitizing).
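To illustrate the two remaining limits, here is a minimal sketch. It is not the code in gc.cpp: `ClampHeapCountSketch` and `kMaxSupportedCpusSketch` are hypothetical names, and the constant is only a placeholder for whatever value MAX_SUPPORTED_CPUS has on a given platform.

```cpp
#include <algorithm>
#include <cassert>

// Placeholder for MAX_SUPPORTED_CPUS; the actual value is an assumption here.
constexpr int kMaxSupportedCpusSketch = 1024;

// Hypothetical helper: applies the two limits named in the comment above.
// affinitySetCount stands in for process_affinity_set->Count().
int ClampHeapCountSketch(int requestedHeaps, bool affinitize, int affinitySetCount)
{
    // Limit 1 (always): never exceed the supported CPU maximum.
    int nhp = std::min(requestedHeaps, kMaxSupportedCpusSketch);

    // Limit 2 (only when affinitizing): never exceed the affinity set size.
    if (affinitize)
        nhp = std::min(nhp, affinitySetCount);

    return nhp;
}
```

With these two limits in place, a separate clip against the total processor count no longer changes the result when heaps are affinitized.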
{
    m_enableGCCPUGroups = TRUE;
    m_threadUseAllCpuGroups = CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_Thread_UseAllCpuGroups) != 0;
    m_threadAssignCpuGroups = CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_Thread_AssignCpuGroups) != 0;
No functional changes in this function. Simplifying the code.
if (IsInitialized())
    return;

if (InterlockedCompareExchange(&m_initialization, 1, 0) == 0)
{
    InitCPUGroupInfo();
    m_initialization = -1;
    VolatileStore(&m_initialization, -1L);
Reading/writing m_initialization without any barriers seemed like a potential race condition, so I am changing the code to use VolatileLoad/VolatileStore. There should be no performance-critical code path calling this function.
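For readers less familiar with the pattern, the same idea can be sketched in portable C++ with `std::atomic`. This is an analogy, not the runtime's code: the `...Sketch` names are hypothetical, acquire/release ordering is used as a stand-in for VolatileLoad/VolatileStore, and the waiting loop for threads that lose the race is an assumption.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-ins for the CPUGroupInfo state; not the runtime's code.
std::atomic<long> s_initialization{0};  // 0 = not started, 1 = in progress, -1 = done
int s_initCalls = 0;                    // how many times the init body ran

void InitCpuGroupInfoSketch()
{
    ++s_initCalls;  // placeholder for the real one-time work
}

bool IsInitializedSketch()
{
    // Acquire load: pairs with the release store in EnsureInitializedSketch.
    return s_initialization.load(std::memory_order_acquire) == -1;
}

void EnsureInitializedSketch()
{
    if (IsInitializedSketch())
        return;

    long expected = 0;
    if (s_initialization.compare_exchange_strong(expected, 1))
    {
        InitCpuGroupInfoSketch();
        // Release store: publishes everything the init body wrote.
        s_initialization.store(-1, std::memory_order_release);
    }
    else
    {
        // Assumption: threads that lose the race wait for the winner.
        while (!IsInitializedSketch())
            std::this_thread::yield();
    }
}
```

The release store guarantees that a thread observing `-1` through the acquire load also observes the data written during initialization, which a plain read and write do not guarantee.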
    return m_threadUseAllCpuGroups;
}

/*static*/ BOOL CPUGroupInfo::CanAssignCpuGroupsToThreads()
{
    LIMITED_METHOD_CONTRACT;
    _ASSERTE(m_enableGCCPUGroups || !m_threadAssignCpuGroups);
Some places in the old code used
if (CPUGroupInfo::CanEnableGCCPUGroups() && CPUGroupInfo::CanEnableThreadUseAllCpuGroups())
while other places used
if (CPUGroupInfo::CanEnableThreadUseAllCpuGroups())
Those conditions are equivalent: we set the second flag only when the first flag is set. For that reason I simplified the code everywhere and added this assert.
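The equivalence can be checked with a small standalone sketch. The struct and function names below are hypothetical; the only thing taken from the PR is the invariant that the second flag is set only when the first one is.

```cpp
#include <cassert>

// Hypothetical mirror of the two flags discussed above.
struct CpuGroupFlagsSketch
{
    bool enableGCCPUGroups = false;
    bool threadUseAllCpuGroups = false;
};

// The second flag is only ever set inside the branch that sets the first.
CpuGroupFlagsSketch MakeFlagsSketch(bool cpuGroupsEnabled, bool useAllConfigValue)
{
    CpuGroupFlagsSketch flags;
    if (cpuGroupsEnabled)
    {
        flags.enableGCCPUGroups = true;
        flags.threadUseAllCpuGroups = useAllConfigValue;
    }
    return flags;
}

// Old style: test both flags.
bool OldCombinedCheckSketch(const CpuGroupFlagsSketch& f)
{
    return f.enableGCCPUGroups && f.threadUseAllCpuGroups;
}

// New style: assert the invariant, then test only the second flag.
bool SimplifiedCheckSketch(const CpuGroupFlagsSketch& f)
{
    assert(f.enableGCCPUGroups || !f.threadUseAllCpuGroups);
    return f.threadUseAllCpuGroups;
}
```

For every combination of inputs to `MakeFlagsSketch`, both checks return the same value, so the shorter condition is safe as long as the assert holds.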
    return m_threadAssignCpuGroups;
}
#endif // HOST_WINDOWS

extern SYSTEM_INFO g_SystemInfo;

int GetTotalProcessorCount()
This function has been refactored from GCToOSInterface::GetTotalProcessorCount so we can use it in vm\appdomain.cpp.
// For server GC this value indicates the number of GC heaps used in circular order to allocate sized
// ref handles. It must not exceed the array size allocated by the handle table (see getNumberOfSlots
// in objecthandle.cpp). We might want to use GetNumberOfHeaps if it were accessible here.
m_iNumberOfProcessors = min(GetCurrentProcessCpuCount(), GetTotalProcessorCount());
Note the comment in getNumberOfSlots:
runtime/src/coreclr/gc/objecthandle.cpp
Lines 525 to 536 in ebd695f
int getNumberOfSlots()
{
    WRAPPER_NO_CONTRACT;
    // when Ref_Initialize called, IGCHeap::GetNumberOfHeaps() is still 0, so use #procs as a workaround
    // it is legal since even if later #heaps < #procs we create handles by thread home heap
    // and just have extra unused slots in HandleTableBuckets, which does not take a lot of space
    if (!IsServerHeap())
        return 1;
    return GCToOSInterface::GetTotalProcessorCount();
}
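A rough sketch of why the `min` is needed once the processor count can be overridden: the slot array is sized by the total processor count, so the circular index must stay below that size even if the configured count is larger. `HandleSlotsSketch` and its members are hypothetical and only model the indexing, not the handle table itself.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model: slots sized by total processors, indexed circularly.
struct HandleSlotsSketch
{
    std::vector<int> slots;   // one counter per slot, sized like getNumberOfSlots()
    int numberOfProcessors;   // bound used for circular allocation
    int next = 0;             // next slot to hand out

    HandleSlotsSketch(int currentProcessCpuCount, int totalProcessorCount)
        : slots(totalProcessorCount, 0),
          // Clamp so the circular index can never run past the array.
          numberOfProcessors(std::min(currentProcessCpuCount, totalProcessorCount))
    {
    }

    // Returns the slot index used for this allocation.
    int NextSlot()
    {
        int index = next;
        next = (next + 1) % numberOfProcessors;
        ++slots[index];  // always in range because of the clamp above
        return index;
    }
};
```

Without the clamp, an override larger than the machine's processor count would let the index reach past the end of the array.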
I am puzzled why I am hitting this linker error on macOS:
The |
The reason is that mscordbi doesn't use PAL, but relies on symbols exported by mscordaccore instead. So you'll need to add the PAL_GetTotalCpuCount to the exported symbols of mscordaccore (in https://github.com/dotnet/runtime/blob/main/src/coreclr/dlls/mscordac/mscordac_unixexports.src)
cc @dotnet/gc for the GC related changes.
Any additional feedback?
I'll take a look as well
Do you plan to add some tests to validate this, or will that be a separate PR?
There will be another PR to respect the job CPU limit. I can include a test that will test both settings.
LGTM, thank you!
LGTM, thanks!
Thank you all for the feedback and the great suggestion that allowed us to reduce code duplication.
Introduce the DOTNET_PROCESSOR_COUNT configuration setting that allows overriding the number of processors available to the process. Move the GetCurrentProcessCpuCount function from GCToOSInterface to GCToEEInterface to reduce code duplication. Fixes #48094.