Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename Sys.CPU_CORES to more explict Sys.CPU_LOGICAL_CORES #27856

Closed
wants to merge 2 commits into from

Conversation

juliohm
Copy link
Contributor

@juliohm juliohm commented Jun 29, 2018

This pull request renames Sys.CPU_CORES to the more explicit name Sys.CPU_LOGICAL_CORES, and the corresponding env variable JULIA_CPU_CORES to JULIA_CPU_LOGICAL_CORES.

I've experienced cases of users on Windows using CPU_CORES thinking of physical cores, and then having their simulations running 1000x slower because hyperthreading is not supported sometimes.

The changes also leave the door open for a future Sys.CPU_PHYSICAL_CORES with the actual physical cores.

Please let me know if something is missing or if the changes are not appropriate.

@ararslan
Copy link
Member

Seems like we could just better document that Sys.CPU_CORES is the number of logical cores.

@juliohm
Copy link
Contributor Author

juliohm commented Jun 29, 2018

Sounds like an alternative, but do you think it would solve the misinterpretation effectively? And if we need to add the actual physical cores in the future, how should it be called?

@StefanKarpinski
Copy link
Member

Calling it Sys.CPU_LOGICAL_CORES seems reasonable to me. Between that and a hypothetical Sys.CPU_PHYSICAL_CORES, you have an upper and lower bound on the amount of actual concurrency you can possibly hope to get.

@nalimilan
Copy link
Member

Can you look at what terminology other software use?

One advantage of keeping CPU_CORES is that it reflects the number of threads we use by default for OpenBLAS (IIRC), so it's a reasonable default for users. If it doesn't work in some cases, maybe it just needs more documentation. BTW, 1000× really sounds like a lot, do you have details on that workflow?

@juliohm
Copy link
Contributor Author

juliohm commented Jun 29, 2018

Thank you all for the comments. I like the shorter name CPU_CORES as well, however different software use it with different meaning. Having seen the efforts in Julia to make names more explicit, I thought this one would also be a good candidate to fix. The current documentation already mentions logical cores, but I think not many people in the sciences are aware of this differentiation logical vs. physical cores and the impacts of hyperthreading to performance.

Regarding the slow down, I think I've exaggerated the number, sorry. I could only find an old issue with a 10x slowdown that we took a while to figure out. After back and forth with many hypothesis falsified it was just a misinterpretation of the variable name: #16570

@nalimilan
Copy link
Member

Doesn't #16570 indicate that FFTW should use different defaults on Windows? Skimming that discussion, it seems the user hasn't explicitly used CPU_CORES, right?

@juliohm
Copy link
Contributor Author

juliohm commented Jun 29, 2018

In that code, I was setting the number of threads to CPU_CORES by default for the user because I had misinterpreted it as physical cores. This number of threads was then fed to FFTW inside of my package.

I fixed the issue later by depending on Hwloc.jl which uses the terminology "Core" for physical cores and "PUs" for logical units:

import Hwloc
topology = Hwloc.topology_load()
counts = Hwloc.histmap(topology)
ncores = counts[:Core]
npus = counts[:PU]
println("This machine has $ncores cores and $npus PUs (processing units)")

The CpuId.jl package uses cpucores for physical cores and cpucores_total for logical cores. Just based on these two examples, we see how the terminology varies considerably.

@StefanKarpinski
Copy link
Member

having their simulations running 1000x slower because hyperthreading is not supported sometimes.

This doesn't really make much sense to me:

  • If hyperthreading isn't supported then "physical cores" = "logical cores".
  • A 1000x slowdown from oversubscribing the number of physical cores seems unlikely.

@juliohm
Copy link
Contributor Author

juliohm commented Jun 30, 2018 via email

@juliohm
Copy link
Contributor Author

juliohm commented Jul 1, 2018

Based on the positive hands in @StefanKarpinski 's comment, I think most agree the change would improve clarity.

The build is failing because I am using the macro @deprecate incorrectly. Could you please advise?

@deprecate Sys.CPU_CORES Sys.CPU_LOGICAL_CORES

@fredrikekre
Copy link
Member

@eval Sys @deprecate_binding CPU_CORES CPU_LOGICAL_CORES false

@@ -1732,6 +1732,9 @@ end
@deprecate mapfoldl(f, op, v0, itr) mapfoldl(f, op, itr; init=v0)
@deprecate mapfoldr(f, op, v0, itr) mapfoldr(f, op, itr; init=v0)

# PR #27856
@eval Sys @deprecate_binding CPU_CORES CPU_LOGICAL_CORES false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fredrikekre did as you suggested but the CI build is saying that @deprecate_binding is not defined

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Base.@deprecate_binding

@juliohm
Copy link
Contributor Author

juliohm commented Jul 2, 2018

The error in Travis now is UndefVarError(:CPU_LOGICAL_CORES). Could you please advise on other potential issues in this PR?

@KristofferC
Copy link
Member

It seems the variable CPU_CORES is not created until the __init__ function is run, which hasn't happened yet when the deprecation macro is run.

@StefanKarpinski
Copy link
Member

Workflow I use:

  • just do the rename and push that and make sure everything passes
  • add the deprecation code in a separate commit

In this case it seems like the deprecation may need to be done in Base’s bootstrap code.

@juliohm
Copy link
Contributor Author

juliohm commented Jul 4, 2018

The build is passing without the deprecation warning. Can you advise on how to re-add the deprecation in Base's bootstrap code?

@juliohm
Copy link
Contributor Author

juliohm commented Jul 5, 2018

Or if you prefer to merge and add the deprecation warning on a separate PR, that works too.

@juliohm
Copy link
Contributor Author

juliohm commented Jul 10, 2018

Are you still interested in this change? The PR is ready.

Copy link
Member

@ararslan ararslan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not be merged without a deprecation

@StefanKarpinski
Copy link
Member

That deprecation was trickier than expected. Although values that are set on startup always are.

@juliohm
Copy link
Contributor Author

juliohm commented Jul 11, 2018

Thanks for the addition, the build is passing, should be ready to merge 👍

@StefanKarpinski
Copy link
Member

Resolved the conflict, letting CI run again, we can merge as soon as anything passes.

@StefanKarpinski
Copy link
Member

Wrong button, sorry.

@JeffBezanson
Copy link
Member

I agree we should eventually have CPU_LOGICAL_CORES and CPU_PHYSICAL_CORES. But, I also think it's ok to keep CPU_CORES as it is now (as an alias for CPU_LOGICAL_CORES), and just document that its meaning may change. For example, Stefan suggested that we might want to set CPU_CORES to logical or physical based on the architecture, basically picking a default for you.

@juliohm
Copy link
Contributor Author

juliohm commented Jul 12, 2018

I don't know if that improves the situation. Picking defaults is always a tricky task, and the name CPU_CORES does not suggest something that is platform-dependent.

Besides, the chance of users adopting this platform-dependent default when they should have been explicit about what they wanted are big. Perhaps another name that is not CPU_CORES could be used for this platform-dependent default? Sys.MAX_THREADS?

@JeffBezanson
Copy link
Member

I would be ok with making CPU_CORES refer to physical cores, and using MAX_THREADS for fake cores.

@juliohm
Copy link
Contributor Author

juliohm commented Jul 12, 2018

So the proposal is to rename the current CPU_CORES to MAX_THREADS and leave the name CPU_CORES free for future implementation?

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Jul 12, 2018

CPU_CORES does not suggest something that is platform-dependent.

No one is suggesting that it be platform-dependent, the suggestion is that it be hardware-dependent... and it seems fairly obvious that CPU_CORES would depend on your hardware.

@juliohm
Copy link
Contributor Author

juliohm commented Jul 12, 2018 via email

@JeffBezanson
Copy link
Member

JeffBezanson commented Jul 12, 2018

that supports hyperthreading and can do four logical cores without performance hits

Does such a thing exist?

If I install Linux versus Windows

No, as Stefan said this would be hardware-dependent, not OS-dependent.

I just don't want writing CPU_LOGICAL_CORES to be my only option for getting a core count.

@juliohm
Copy link
Contributor Author

juliohm commented Jul 12, 2018

Sorry for the naive question, isn't hyperthreading technology implemented at the software level?

@JeffBezanson
Copy link
Member

Not entirely; it also requires a hardware feature that exposes a different number of CPUs to the OS.

@juliohm
Copy link
Contributor Author

juliohm commented Jul 12, 2018 via email

@JeffBezanson
Copy link
Member

No, the number of physical cores and hyper-threaded cores are both determined entirely by the hardware. The number of hyper-threads supported by your CPU does not change when you change your OS.

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Jul 12, 2018

The trouble is that naively people expect CPU_CORES to tell them how many cores there are and how many threads they should use. Both of those questions are subtler than they used to be—there is no one correct answer to either—and they may not be the same. You seem to have gotten burned by one interpretation, but there's no reason that the other meaning won't burn you just as badly.

@juliohm
Copy link
Contributor Author

juliohm commented Jul 12, 2018 via email

@StefanKarpinski
Copy link
Member

It is natural to interpret CPU_CORES as cpu cores, which is not what the Julia implementation is doing.

The point is that it depends on what the definition of what a "cpu core" is—and this is not clear-cut. You want to change it to mean "physical cores" based on a a single experience on one piece of hardware with one specific workload for which having as many threads as there are logical cores happened to cause suboptimal performance. On the same hardware, there are other workloads where having as many threads as there are logical cores will lead to optimal performance and only having as there are physical cores will lead to highly suboptimal performance. On yet other hardware (e.g. Xeon Phi), it is impossible to get optimal performance if the number of threads is limited to the number of physical cores: you can only get full usage of the hardware with hyper-threading.

Jeff's proposal is (as I understand it) to leave CPU_CORES as is but document that may change in the future and is defined to give you some value between the number of logical cores and physical cores that we believe to be a pretty good guess at how many threads you want to use for CPU-intensive workloads on your hardware. The current definition as logical cores is good for some machines (Xeon Phi) but bad for other machines, so this should be improved. In addition, we can expose the number of logical and physical cores separately.

@StefanKarpinski StefanKarpinski added the triage This should be discussed on a triage call label Jul 12, 2018
@JeffBezanson
Copy link
Member

+1 for CPU_CORES and CPU_THREADS.

@juliohm
Copy link
Contributor Author

juliohm commented Jul 12, 2018 via email

@juliohm
Copy link
Contributor Author

juliohm commented Jul 12, 2018 via email

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Jul 12, 2018

Based on triage discussion, "CPU cores" is most commonly used to mean "physical cores" whereas "CPU threads" is often used to mean "logical cores"—i.e. number of hardware threads (Intel is not entirely consistent about this, but this is the trend). Accordingly, an alternate proposal is to rename CPU_CORES to CPU_THREADS now and potentially reintroduce CPU_CORES in the future to give the number of physical cores. It should be noted that this number may not be as useful as people believe it to be. The number of threads that a process should use is a different matter, but is generally some n between CPU_CORES ≤ n ≤ CPU_THREADS. I'll update the PR accordingly.

@juliohm
Copy link
Contributor Author

juliohm commented Jul 12, 2018 via email

@StefanKarpinski
Copy link
Member

Got into some weird situation with GitHub remotes and it was easier to make a new PR: #28091.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants