Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken KPM scalar optimization, use 5-arg mul! instead #72

Merged
merged 5 commits into from
Jun 17, 2020

Conversation

pablosanjose
Copy link
Owner

The clever scalar version of the KPM iteration inherited from Elsa was broken. Since Julia 1.3 we have 5-argument mul! that implements a similar scalar algorithm. It is also multithreaded to boot. This PR uses such facility, and fixes KPM in the process (I used the computation in https://discourse.julialang.org/t/kernel-polynomial-method/34240/3 as reference). A side effect is that the code is again much clearer and closer to the original algorithm.

@codecov-commenter
Copy link

codecov-commenter commented Jun 17, 2020

Codecov Report

Merging #72 into master will increase coverage by 0.52%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #72      +/-   ##
==========================================
+ Coverage   56.20%   56.72%   +0.52%     
==========================================
  Files          15       15              
  Lines        2288     2267      -21     
==========================================
  Hits         1286     1286              
+ Misses       1002      981      -21     
Impacted Files Coverage Δ
src/KPM.jl 0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6ef6de5...bc004f3. Read the comment docs.

threading

cleanup

transpose mul

fix for A != I

use thread buffers only when needed

fix

commented BLAS version

activate BLAS codepath
@pablosanjose
Copy link
Owner Author

It turns out the mul! approach relies on BLAS threads (of course...). After experimenting with the help of @fernandopenaranda, we cannot seem to beat BLAS with Threads.@threads (let alone MKL, which appears to be twice as fast), so the mul! method in this PR seems like the optimal choice, currently. The last commit includes the old (but corrected) scalar code, commented out, in case it is of use in the future.

more stringent test

comment [skip ci]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants