-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multithread processing in KPM #68
Multithread processing in KPM #68
Conversation
Fantastic @fernandopenaranda ! Can you do a squash of your commits to clean up the commit tree? |
src/KPM.jl
Outdated
@@ -164,24 +168,25 @@ function iterateKPM!(ket0::A, ket1::A, adjh::Adjoint, (center, halfwidth)) where | |||
nzv = nonzeros(h) | |||
rv = rowvals(h) | |||
T = eltype(S) | |||
μ = μ´ = zero(T) | |||
#μ = μ´ = zero(T) | |||
μ = μ´ = zeros(T,Threads.nthreads()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm, this is dangerous. Since zeros
returns an array, this line makes μ
and μ´
point to the same array. I suspect this is not what you intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right! Let me fix it
src/KPM.jl
Outdated
@@ -106,10 +110,10 @@ function addmomentaKPM!(b::KPMBuilder{<:AbstractMatrix,<:AbstractSparseMatrix}, | |||
fill!(ket0, zero(eltype(ket0))) | |||
mul!(ket1, A´, ket) | |||
μlist[1] += proj(ket1, ket) | |||
ProgressMeter.next!(pmeter; showvalues = ()) | |||
#ProgressMeter.next!(pmeter; showvalues = ()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do remove the comments if you don't want the code
Can you explain why ProgressMeter interferes with this? I seem to recall that ProgressMeter can play nicely with threads now, it might be worth a look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there is no conflict with threads. However, progressmeter slowed a bit the calculations. This is why I commented it. Let me go over it again I guess it is likely that this extra time is negligible for large systems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe useful: https://discourse.julialang.org/t/progressmeter-and-threads/11537
It seems to suggest there is a proper way to do things without locking the threads (didn't read it carefully)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I'll explore and implement it. This looks like the correct way to do it.
@inbounds begin | ||
tmp = α * ket1[col, k] - ket0[col, k] | ||
tmp= α * ket1[col, k] - ket0[col, k] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Watch the whitespace
- free of race conditions
@fernandopenaranda , what's the conclusion on ProgressMeter? I saw you restored it. Does it slow down the threaded loop? |
No, since the pmeter acts above the threaded loops |
The aim of this PR is to enhance the performance of the Kernel Polynomial Method (KPM) momenta calculation already built on KPM.jl, by using the recently implemented multithreading approach available on julia v1.3+ (see #https://julialang.org/blog/2019/07/multithreading/).
The calculation of KPM momenta doesn't allow for an easy parallelisation scheme due to the sequential nature of the momentum generator algorithm. Although this might suggest that the parallelisation at this level is not a viable option, it is indeed possible to distribute on multiple threads the matrix operations for each step of the momenta calculation (see
iterateKPM()
). Typically, the overhead of distributing tasks over all available threads N times (being N the order of the KPM expansion) leads to a significant fraction of the total resource consumption, which favours serial over parallel processing. However, this might not be the case when dealing with large systems when lots of random vectors are required.In this situation the global calculation can benefit from a multithread processing with negligible overhead. We can see this in the following system size (D) vs time graph: the blue dots are computed using the multithreaded strategy built in this PR over 4 threads, whereas those in orange refer to an external (higher level) distribution on 4 different cores.
Multithreading over multicore distribution is preferable as memory allocation is largely reduced in the former since the system constructor variables are shared and, thus, allocated only once.
Important remark: Even for large systems a small D/N ratio might favour multi-core distribution over multithreading. This has been taken into account since this PR is compatible with the multicore parallelisation at a higher level as noted by @pablosanjose.
Usage
By default the number of Julia threads is set to 1 so in order to benefit from this multithreading scheme it is required to set the desired number of julia threads x. This is done setting the following environment variable:
JULIA_NUM_THREADS=x ./julia
.