Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add BLAS.get_num_threads #36360

Merged
merged 30 commits into from
Jun 30, 2020
Merged

add BLAS.get_num_threads #36360

merged 30 commits into from
Jun 30, 2020

Conversation

jw3126
Copy link
Contributor

@jw3126 jw3126 commented Jun 19, 2020

From slack, also came up on discourse

Copy link
Contributor

@mcabbott mcabbott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be nice to have. I think the obvious use is

old = get_num_threads()
@threads for i in 1:100
    # use *
end
set_num_threads(old)

If it's possible to have Julia without one of the libraries handled by set_num_threads, then I think this should do nothing, and not give an error.

stdlib/LinearAlgebra/src/blas.jl Outdated Show resolved Hide resolved
stdlib/LinearAlgebra/test/blas.jl Outdated Show resolved Hide resolved
stdlib/LinearAlgebra/src/blas.jl Outdated Show resolved Hide resolved
@ViralBShah ViralBShah added the linear algebra Linear algebra label Jun 19, 2020
Comment on lines 164 to 166
ret = nothing
@warn "Could not get number of BLAS threads. Returning `$ret` instead." maxlog=1
return ret
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's nice that the current code ties the implementation and the warning message. But I fear other core devs might not like "unnecessary" dynamism (as @OkonSamuel pointed out). I think it might be safe to just do:

Suggested change
ret = nothing
@warn "Could not get number of BLAS threads. Returning `$ret` instead." maxlog=1
return ret
@warn "Could not get number of BLAS threads. Returning `nothing` instead." maxlog=1
return nothing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we go with returning nothing, should there be any warnings at all?

I thought these two were complementary approaches, either return something clearly marking the failure to read how many threads, or else return an integer & warn that it may not be correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a warning won't do any hurt.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is used as

old = get_num_threads()
# ...
set_num_threads(old)

then the end-user (which may not be who writes the function) cannot notice something fishy is going on unless there is a warning.

Also, I'm guessing this code pass is rarely used. So, I guess it's OK to make things verbose until somebody complains. It's also possible for a user to suppress the warning.

Copy link
Contributor

@mcabbott mcabbott Jun 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is a performance optimisation, those tend to give up silently. (Although I wish more of them had @debug statements.) That said I agree it shouldn't be triggered often.

If there is to be a warning, just one is enough, not two. Edit -- actually the above save/reset might emit 3 warnings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the question is if it is OK to consider changing number of threads as pure optimization. It changes the result slightly:

julia> A = randn(300, 300); B = randn(300, 300);

julia> LinearAlgebra.BLAS.set_num_threads(1)

julia> C1 = @btime $A * $B;
  1.112 ms (2 allocations: 703.20 KiB)

julia> LinearAlgebra.BLAS.set_num_threads(2)

julia> C2 = @btime $A * $B;
  650.866 μs (2 allocations: 703.20 KiB)

julia> extrema(C1 .- C2)
(-3.552713678800501e-14, 4.263256414560601e-14)

I think it is worth notifying the user that something unexpected is going on.

Copy link
Contributor

@mcabbott mcabbott Jun 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps one warning when set fails, and nothing on get, is the right way to go?

(The behaviour of get_num_threads is unchanged; surely adding a one-time warning isn't considered a breaking change.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I guess you mean to merge the two warnings for macOS? But, since this code path would be rarely exercised, I think it's better to go with simplicity than a nicely formatted warning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think we should not overengineer this rare case. About silent fails of performance optimization, I think it makes a difference if an API level function fails at its one job or if some "optimization detail" burried deep in a long algorithm fails.

@tkf tkf requested review from dkarrasch and andreasnoack June 20, 2020 07:35
Copy link
Member

@tkf tkf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can refactor this a bit more.

stdlib/LinearAlgebra/src/blas.jl Outdated Show resolved Hide resolved
stdlib/LinearAlgebra/src/blas.jl Outdated Show resolved Hide resolved
stdlib/LinearAlgebra/src/blas.jl Outdated Show resolved Hide resolved
jw3126 and others added 3 commits June 23, 2020 07:50
"""
set_num_threads(n)
set_num_threads(n::Integer)
set_num_threads(::Nothing)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an unusual API, I would prefer set_num_threads() instead of set_num_threads(nothing).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to allow the pattern

default = BLAS.get_num_threads() # returns nothing on exotic platforms
BLAS.set_num_threads(1)
# do stuff
BLAS.set_num_threads(default)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will still work if you make the signature set_num_threads(n=nothing)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me allowing nothing is a hack to allow the above pattern on strange platforms. It is not a thing I would encourage or that I think needs more convenient syntax.

Copy link
Contributor Author

@jw3126 jw3126 Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine if you have code like set_num_threads() it is more likely you forgot to pass the number of threads, than that you really want to invoke the nothing method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should define set_num_threads() = set_num_threads(nothing). It would send a wrong message that set_num_threads(nothing) is somehow a reasonable default. But it's not. It is the last resort that exists only for supporting the rollback use case.

But this is not clear from the current docstring. I think it's better to clarify this.

Copy link
Contributor

@mcabbott mcabbott Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps clearest to show the pattern which motivates this:

Set the number of threads the BLAS library should use.

Also accepts `nothing`, in which case it tries to set set the default number of threads.
On exotic variants of BLAS, `nothing` may be returned by ` get_num_threads()`.
Thus the following pattern may fail to set the number of threads, but will not give an error:

old = get_num_threads()
set_num_threads(1)
@threads for i in 1:10
    # single-threaded BLAS calls
end
set_num_threads(old)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fredrikekre Does the argument here makes sense? It'd be nice if you can have a look at the docstring.

stdlib/LinearAlgebra/test/blas.jl Show resolved Hide resolved
stdlib/LinearAlgebra/test/blas.jl Outdated Show resolved Hide resolved
jw3126 and others added 2 commits June 23, 2020 11:59
@tkf tkf added the needs news A NEWS entry is required for this change label Jun 23, 2020
@tkf
Copy link
Member

tkf commented Jun 23, 2020

We need a news item for BLAS.get_num_threads. I think it's good to go.

@mcabbott
Copy link
Contributor

Has anyone tried this with Apple's BLAS?

I tried & failed to install this today, make USE_SYSTEM_BLAS=1 USE_SYSTEM_LAPACK=0 led to various errors. Paging @dlfivefifty perhaps -- you once said on Discourse that you used this?

@dlfivefifty
Copy link
Contributor

I stopped using Apple BLAS around Julia v0.7: I think MKL is faster

@tkf tkf removed the needs news A NEWS entry is required for this change label Jun 24, 2020
@jw3126
Copy link
Contributor Author

jw3126 commented Jun 26, 2020

Can somebody rerun the failing CI?

@StefanKarpinski
Copy link
Member

I tried restarting but it's not doing my bidding even though I'm logged in. Hitting the rebuild button just gives me a bunch of info about the PR and doesn't do anything. @staticfloat, any idea what's up with that?

@staticfloat
Copy link
Member

If you click rebuild and it shows you a screen like this:

image

That means that the build is queued, but hasn't started yet. You can see at the top instead of builds it says buildrequests. Eventually, the build will run and the status here on GH will change. If you click on the macos64 build here, you can see that it finished about 5 hours ago.

@jw3126
Copy link
Contributor Author

jw3126 commented Jun 30, 2020

Can this be merged?

@tkf tkf merged commit b8110f8 into JuliaLang:master Jun 30, 2020
@mcabbott
Copy link
Contributor

mcabbott commented Jul 2, 2020

This is late but notice (1) tests will fail if get_num_threads() returns nothing, as it claims to sometimes do, and (2) there seems to be no test which runs the set_num_threads(::Nothing) method.

default = BLAS.get_num_threads()
@test default isa Int

@jw3126
Copy link
Contributor Author

jw3126 commented Jul 2, 2020

This is late but notice (1) tests will fail if get_num_threads() returns nothing,

Arguably thats a desirable property. It means that the BLAS library is not properly supported.

(2) there seems to be no test which runs the set_num_threads(::Nothing) method.

The exact number of threads set this way is an implementation detail and should not be tested. But it makes sense, that we just call the method and test that julia does not e.g. crash.

@mcabbott
Copy link
Contributor

mcabbott commented Jul 2, 2020

It's documented to fail gracefully on weird BLAS libraries, so I would say they ought to pass tests.

@jw3126
Copy link
Contributor Author

jw3126 commented Jul 2, 2020

My view is that get_num_threads returning nothing is a bug even if it is a documented bug. But if somebody really wants to add such a test I can also live with that.

simeonschaub pushed a commit to simeonschaub/julia that referenced this pull request Aug 11, 2020
Co-authored-by: Okon Samuel <[email protected]>
Co-authored-by: Takafumi Arakaki <[email protected]>
!!! compat "Julia 1.6"
`get_num_threads` requires at least Julia 1.6.
"""
get_num_threads(;_blas=guess_vendor())::Union{Int, Nothing} = _get_num_threads()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why _blas isn't passed to _get_num_threads?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reporting, I will fix it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.