
Enable Environment.ProcessorCount to return total machine processors in containers #48094

Closed
richlander opened this issue Feb 10, 2021 · 14 comments · Fixed by #52492

Comments

@richlander
Member

richlander commented Feb 10, 2021

There has been significant discussion and feedback about the value of Environment.ProcessorCount presented in containers. We know that there are scenarios that are underserved by the behavior of this API. We haven't exposed a new API because we haven't developed enough insight on what exactly it would do (as simple as it may seem).

This discussion relates to scenarios where the --cpus or related arguments are used with docker run. There are other ways of controlling CPU usage with Docker, such as CPU affinity, that are not related to this proposal.

  • Current behavior: Environment.ProcessorCount returns the --cpus value rounded up to the nearest integer. For example, --cpus 0.4 results in a return value of 1 (a minimal example follows this list).
  • Proposed OPT-IN behavior: Environment.ProcessorCount returns the value specified by an environment variable.
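For illustration, here is a minimal console program that prints the value; when run in a container started with something like docker run --cpus 0.4, the current behavior makes it print 1 rather than the machine's total processor count:

```csharp
using System;

class ShowProcessorCount
{
    static void Main()
    {
        // With the current behavior, a fractional --cpus limit is rounded up,
        // so a container limited to 0.4 CPUs reports 1 here instead of the
        // machine's total processor count.
        Console.WriteLine($"Environment.ProcessorCount = {Environment.ProcessorCount}");
    }
}
```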

In terms of the opt-in mode, for now I'm proposing we expose an environment variable: DOTNET_PROCESSOR_COUNT

The value is converted to an integer and returned as the new value of Environment.ProcessorCount. It must be > 0 and <= the actual processor count. If those requirements are not met, the actual processor count is returned. A sketch of this validation appears after the notes below.

Notes:

  • There will be no check that the runtime is running in a container. This new ENV will be generic.
  • This value is expected to affect Environment.ProcessorCount for managed code, and any analogues to it in the native runtime code.
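A minimal sketch of the proposed opt-in check, written here as standalone C# purely for illustration (the real change would live inside the runtime, and the helper name is hypothetical):

```csharp
using System;

static class ProcessorCountOverride
{
    // Hypothetical helper mirroring the proposed rules: honor
    // DOTNET_PROCESSOR_COUNT when it parses to an integer that is > 0 and
    // <= the actual processor count; otherwise fall back to the actual count.
    public static int GetEffectiveProcessorCount()
    {
        int actual = Environment.ProcessorCount; // stands in for the actual count
        string raw = Environment.GetEnvironmentVariable("DOTNET_PROCESSOR_COUNT");

        if (int.TryParse(raw, out int requested) && requested > 0 && requested <= actual)
        {
            return requested;
        }

        return actual;
    }
}
```

For example, setting DOTNET_PROCESSOR_COUNT=4 on a 16-processor machine would yield 4, while an unset, zero, negative, or non-integer value would fall back to the actual count.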

Assuming the information we collect shows this behavior providing significant benefit, we may add a new API that exposes it. If you like this idea, please help motivate it with usage information.

We previously (and extensively) researched offering the unclipped behavior as the default value for this property. The challenge is that some subsystems always work best when honoring the clipped value, while other subsystems behave anywhere from neutrally to (potentially much) better when honoring the unclipped value. That distinction is the challenge: we have to decide which value each subsystem should honor, and whether we need to expose more configuration options to further tune behavior. We need more findings from the real world to help us on this journey. Provided we have compelling data, multiple avenues are open on this topic.

Related issues:

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged label (New issue has not been triaged by the area owner) Feb 10, 2021
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@stephentoub
Member

stephentoub commented Feb 10, 2021

There are lots of places influenced directly or indirectly by the result of ProcessorCount. Is changing it globally for all consumers going to have the desired effect and something someone can reason about? As you said, different subsystems benefit differently from the behaviors. Or is the idea it's just another knob someone can try to see if it helps performance in aggregate?

@richlander
Member Author

Great question, @stephentoub. The goal of this is to enable experimentation. For now, you'd get the performance in aggregate, as you call out. Assuming we get adequate and compelling information, we may offer something different.

@stephentoub
Member

To be clear, I'm fine with us adding the environment variable.

Just trying to understand what that "adequate and compelling information" would look like. Given that this is a global switch, what would cause us to think we needed to offer something different? Certainly I can imagine the simple case of "I flipped the switch and my perf got better", and we get enough of that signal to decide to change the default (though we already hit issues with that, as you call out). What would feedback look like that would cause us to consider something more fine-grained? Someone flips the switch, gets slightly better perf, takes traces and shares them with us, and that influences our understanding? Something else?

@richlander
Member Author

richlander commented Feb 10, 2021

I think there is near zero chance we'll change the default on this property. The experience of doing that last time was bad for everyone involved, and nothing has changed to suggest a different outcome (other than that people would be less tolerant of trying it a second time). I know you probably know that. I'm just wanting to send a clear message to anyone reading this that changing the default isn't on the table.

Certainly I can imagine the simple case of "I flipped the switch and my perf got better", and we get enough of that signal

That's it. If we get interesting information, then we can potentially offer additional customization under a different configuration switch or something else. We're about to start our container performance testing; I have a doc on that which I intend to publish. As part of that, we should dust off our investigation from #622. We produced a lot of useful data (mostly trapped in email) that suggests there is "a there, there" with "max procs" enabled. We just didn't get to the point of productizing it, for a variety of reasons (almost entirely schedule problems). This could all lead to exposing one or more new APIs, configuration switches, or something else.

@jkotas was directly involved in those investigations. I'm curious whether he has a similar view: that the investigation data suggested there is a different performance profile enabled by max procs (or at least higher procs than the default) that would be beneficial for some apps. In fairness, I need to re-read those mails to ensure that my memory of that investigation is sound.

@jkotas
Member

jkotas commented Feb 10, 2021

Agree. We know that there are likely some workloads where this setting can be used to optimize performance. This setting will allow us to find out more data about it and unblock the workloads where the default policy works particularly poorly.

This setting can be used as a mitigation for the #47427 breaking change.

Certainly I can imagine the simple case of "I flipped the switch and my perf got better"

Yes, we have seen some of that signal, e.g. in #11933 (comment), where the container had to be given an artificially high quota to make the runtime allow a sufficient level of parallelism.

Both Go and Java have an equivalent configuration option, which suggests that it is something generally useful.

@kouvel kouvel removed the untriaged label (New issue has not been triaged by the area owner) Feb 25, 2021
@kouvel kouvel added this to the 6.0.0 milestone Feb 25, 2021
@AntonLapounov
Member

AntonLapounov commented Apr 24, 2021

The proposal text implies we want to override the value returned by Environment.ProcessorCount without affecting the native runtime, which does not use managed properties. Is that the true intention here? I doubt Java's -XX:ActiveProcessorCount works that way. As a side note, Java does not seem to limit the option value by the total number of processors available either.

Does the "actual processor count" in the proposal text refer to the total number of processors available, disregarding affinity, cgroup, and other limitations? (Otherwise, we will not be able to mitigate the #47427 breaking change.)

DOTNET_ProcessorCount would be a more consistent configuration setting name. We use the underscore in configuration setting names mostly for groups of related settings, e.g., Thread_* or ThreadPool_*.

@jkotas
Member

jkotas commented Apr 25, 2021

The proposal text implies we want to override the value returned by Environment.ProcessorCount without affecting the native runtime

The intention of this proposal is to use the configured count everywhere the runtime uses processor count.

The split between the native and managed implementations is accidental and changes over time. As you have pointed out, it would not make sense to say that the configuration should only apply to the parts that happen to be implemented in managed code at the moment.

Java does not seem to limit the option value by the total number of processors available either

I agree that limiting it like this is an unnecessary restriction. It makes sense to respect any reasonable positive value, say up to 0xFFFF.
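A rough sketch of the relaxed check being suggested here, assuming the value no longer has to be clamped to the machine's processor count (the 0xFFFF upper bound is the informal limit mentioned above, and the helper remains hypothetical):

```csharp
using System;

static class RelaxedProcessorCountOverride
{
    // Variant of the earlier sketch: accept any positive value up to 0xFFFF
    // instead of rejecting values above the machine's processor count.
    public static int GetEffectiveProcessorCount()
    {
        string raw = Environment.GetEnvironmentVariable("DOTNET_PROCESSOR_COUNT");

        if (int.TryParse(raw, out int requested) && requested > 0 && requested <= 0xFFFF)
        {
            return requested;
        }

        return Environment.ProcessorCount;
    }
}
```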

DOTNET_ProcessorCount would be a more consistent configuration setting name

Depends what we want to be consistent with. The intentionally designed .NET Core DOTNET_* environment variables are all upper case; a few examples from many: DOTNET_ROOT, DOTNET_MULTILEVEL_LOOKUP, DOTNET_SYSTEM_GLOBALIZATION_INVARIANT, or DOTNET_SYSTEM_NET_SOCKETS_THREAD_COUNT. The runtime environment variables in the CoreCLR configuration system inherited from .NET Framework are typically Pascal case.

@AntonLapounov
Member

Thank you, that makes more sense. The proposal text has to be clarified accordingly.

Depends what we want to be consistent with.

Very few settings in clrconfigvalues.h, besides profiler ones, follow ALL_CAPS style. If there is no global consistency, it might be a good idea to keep at least local consistency. All mentioned DOTNET_ALL_CAPS settings are defined elsewhere.

@jkotas
Member

jkotas commented Apr 25, 2021

besides profiler ones, follow ALL_CAPS style

And besides DOTNET_MODIFIABLE_ASSEMBLIES, which was the most recent addition. Also, the vast majority of the settings in clrconfigvalues.h are internal, undocumented settings.

From the customer's point of view, it does not matter which file defines the configuration setting in the implementation. We should optimize for consistency from the customer's point of view.

@AntonLapounov
Member

AntonLapounov commented Apr 26, 2021

We should optimize for consistency from the customer's point of view.

Then it makes sense to review officially documented run-time configuration settings:

4 configuration settings follow the CORECLR_ALL_CAPS style:
CORECLR_ENABLE_PROFILING
CORECLR_PROFILER
CORECLR_PROFILER_PATH
CORECLR_PROFILER_PATH_64
5 configuration settings follow the DOTNET_NAMESPACE_ALL_CAPS style:
DOTNET_SYSTEM_GLOBALIZATION_INVARIANT
DOTNET_SYSTEM_GLOBALIZATION_USENLS
DOTNET_SYSTEM_NET_HTTP_SOCKETSHTTPHANDLER_HTTP2SUPPORT
DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER
DOTNET_SYSTEM_NET_HTTP_SOCKETSHTTPHANDLER_ALLOWLATIN1HEADERS

In this category the underscores come from translating the dots in the corresponding runtimeconfig.json setting names: System.Net.Http.UseSocketsHttpHandler → DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER.

29 configuration settings follow the COMPlus_PascalCase style:
COMPlus_TieredCompilation
COMPlus_TC_QuickJit
COMPlus_TC_QuickJitForLoops
COMPlus_ReadyToRun
COMPlus_EnableDiagnostics
COMPlus_PerfMapEnabled
COMPlus_PerfMapIgnoreSignal
COMPlus_gcServer
COMPlus_gcConcurrent
COMPlus_GCHeapCount
COMPlus_GCHeapAffinitizeMask
COMPlus_GCHeapAffinitizeRanges
COMPlus_GCCpuGroup
COMPlus_GCNoAffinitize
COMPlus_GCHeapHardLimit
COMPlus_GCHeapHardLimitPercent
COMPLUS_GCHeapHardLimitSOH
COMPLUS_GCHeapHardLimitLOH
COMPLUS_GCHeapHardLimitPOH
COMPLUS_GCHeapHardLimitSOHPercent
COMPLUS_GCHeapHardLimitLOHPercent
COMPLUS_GCHeapHardLimitPOHPercent
COMPlus_GCHighMemPercent
COMPlus_GCRetainVM
COMPlus_GCLargePages
COMPlus_gcAllowVeryLargeObjects
COMPlus_GCLOHThreshold
COMPlus_GCName
COMPlus_Thread_UseAllCpuGroups

Arguably, adding DOTNET_PROCESSOR_COUNT and DOTNET_MODIFIABLE_ASSEMBLIES would not increase consistency from the customer's point of view.

@jkotas
Member

jkotas commented Apr 26, 2021

See #13691 (comment)

@richlander
Member Author

I added a new "Notes" section and a comment about affecting the native runtime.

We could use DOTNET_SYSTEM_ENVIRONMENT_PROCESSORCOUNT. That works for me.

@ghost ghost added the in-pr label (There is an active PR which will close this issue when it is merged) May 8, 2021
@ghost ghost removed the in-pr label (There is an active PR which will close this issue when it is merged) May 11, 2021
@richlander
Member Author

See #53149 for a demonstration of this change.

@ghost ghost locked as resolved and limited conversation to collaborators Jun 23, 2021