
performance regression for the product between a sparse matrix and a BitArray. #28011

Closed
Alexander-Barth opened this issue Jul 9, 2018 · 14 comments
Labels
performance Must go faster priority This should be addressed urgently regression Regression in behavior compared to a previous version sparse Sparse arrays

Comments

@Alexander-Barth

In Julia 0.7.0-beta.206 I see a performance regression relative to Julia 0.6.3 for the product of a sparse matrix and a BitArray. This regression does not affect arrays created by e.g. ones(Bool,n), only BitArrays. It seems that the sparse matrix is being treated as a full matrix.

using SparseArrays
using LinearAlgebra
n = 1000
@time sparse(1.0I,n,n) * trues(n);

33.780663 seconds (116.22 M allocations: 2.102 GiB, 0.36% gc time)

julia 0.6.3:

n = 1000
@time sparse(1.0I,n,n) * trues(n)

0.000031 seconds (14 allocations: 32.313 KiB)

Julia Version 0.7.0-beta.206
Commit b6f9244d6b (2018-07-08 19:42 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i3-3120M CPU @ 2.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, ivybridge)
@mbauman mbauman added sparse Sparse arrays performance Must go faster regression Regression in behavior compared to a previous version labels Jul 9, 2018
@mbauman
Member

mbauman commented Jul 9, 2018

It looks like the requisite sparse optimization methods are still there, but they're not layering into the dispatch tables appropriately.

We have a mul!(C::StridedVecOrMat, A::SparseMatrixCSC, B::StridedVecOrMat) method, but that's losing to mul!(y::AbstractVector, A::AbstractVecOrMat, x::AbstractVector).

Edit: interestingly, both of these methods existed in 0.6 (named A_mul_B!), but sorted in the opposite order. I'm not entirely sure which order is correct.
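
A quick way to check which method dispatch actually selects is @which from InteractiveUtils (a reduced example along the lines of the report above):

using SparseArrays, LinearAlgebra, InteractiveUtils

A = sparse(1.0I, 10, 10)
b = trues(10)
y = zeros(10)

# Reports the mul! method that dispatch selects for this argument combination.
@which mul!(y, A, b)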

@martinholters
Member

Shouldn't that raise an ambiguity error?

@Sacha0
Member

Sacha0 commented Jul 11, 2018

Shouldn't that raise an ambiguity error?

Possibly relevant: #27408

@Alexander-Barth
Author

With version 0.7.0-beta2.12 (2018-07-15 15:57 UTC, Commit a878341) I get the following results from @btime:

n       0.6.4         0.7.0-beta2.12
100     1.535 μs      131.648 μs
1000    11.571 μs     12.698 ms
10000   101.084 μs    1.267 s

The runtime seems to scale like n² in 0.7.0-beta2.12 while it scales like n in Julia 0.6.4.
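
For reference, a minimal script along these lines reproduces the table (assuming BenchmarkTools; the exact numbers will of course depend on the machine):

using BenchmarkTools, LinearAlgebra, SparseArrays

for n in (100, 1_000, 10_000)
    A = sparse(1.0I, n, n)
    b = trues(n)
    # Interpolate A and b so the timing excludes global-variable overhead.
    t = @belapsed $A * $b
    println("n = ", n, ": ", t, " s")
end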

@KristofferC
Member

KristofferC commented Jul 18, 2018

0.6:

julia> trues(20) isa StridedVecOrMat
true

0.7:

julia> trues(20) isa StridedVecOrMat
false

So it is not a matter of one method losing to the other; the sparse method does not match at all.

Changed in #25858 (cc @mbauman).

@mbauman
Member

mbauman commented Jul 18, 2018

Oh, duh. Thanks. Looks like we need to add generic implementations, then… or it may be possible that the strided implementations don't actually require stridedness.
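
For concreteness, the CSC kernel only needs element access on the destination and the right-hand side, so a signature taking AbstractVector would be enough. A sketch (spmatvec! is a hypothetical name, not the actual method in SparseArrays/src/linalg.jl):

using SparseArrays

# Sketch of a CSC mat-vec that only assumes AbstractVector for y and x;
# it walks each column's stored entries once, so it is O(nnz(A)).
function spmatvec!(y::AbstractVector, A::SparseMatrixCSC, x::AbstractVector)
    fill!(y, zero(eltype(y)))
    rows = rowvals(A)
    vals = nonzeros(A)
    for j in 1:size(A, 2)
        xj = x[j]
        for k in nzrange(A, j)
            y[rows[k]] += vals[k] * xj
        end
    end
    return y
end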

@Alexander-Barth
Author

Alexander-Barth commented Feb 20, 2019

Here are some updated benchmarks, in case they are helpful (on a different machine):

In Julia 1.1:

using BenchmarkTools, LinearAlgebra, SparseArrays
n = 1000;
@btime sparse(1.0I,n,n) * trues(n);
#  4.691 ms (7 allocations: 32.09 KiB)

n = 10000;
@btime sparse(1.0I,n,n) * trues(n);
#  488.196 ms (11 allocations: 314.28 KiB)

In Julia 0.6.4:

julia> n = 1000;
julia> @btime sparse(1.0I,n,n) * trues(n);
  5.760 μs (10 allocations: 32.16 KiB)

julia> n = 10000;
julia> @btime sparse(1.0I,n,n) * trues(n);
  53.479 μs (14 allocations: 314.34 KiB)

The performance problem is still there.

@StefanKarpinski StefanKarpinski added the priority This should be addressed urgently label Feb 20, 2019
@StefanKarpinski StefanKarpinski added this to the 1.2 milestone Feb 20, 2019
@StefanKarpinski
Member

I've put this on the 1.2 milestone—this should really be addressed as it's a huge regression. It's bad enough that a simple, low-risk fix would be a candidate for backporting to 1.1 and maybe even 1.0.

@mbauman
Member

mbauman commented Feb 20, 2019

So the sparse implementations can totally be widened to AbstractArray — they needn't be strided. In other words, I think we can do s/Strided/Abstract/g on the entire SparseArrays/src/linalg.jl file.

Unfortunately that rapidly lands you in ambiguity hell.
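
One way to surface that fallout is the stdlib ambiguity check, run after such a widening (a sketch; the actual list depends on the Julia version):

using Test, LinearAlgebra, SparseArrays

# detect_ambiguities lists pairs of methods in these modules that are
# mutually ambiguous; widening the sparse signatures to AbstractArray
# would be expected to lengthen this list considerably.
for (m1, m2) in detect_ambiguities(LinearAlgebra, SparseArrays)
    println(m1, "\n  vs ", m2, "\n")
end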

@KristofferC
Member

I'm moving this away from the milestone. While it is a regression, it occurred in 0.7 and I don't think it should block 1.2.

@KristofferC KristofferC modified the milestones: 1.2, 1.x May 6, 2019
@ViralBShah
Member

I observe on 1.3 master:

julia> @time sparse(1.0I,n,n) * trues(n);
  0.006774 seconds (12 allocations: 32.266 KiB)

versioninfo()
Julia Version 1.3.0-DEV.449
Commit 6f5cefb1d2 (2019-06-26 20:49 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.6.0)
  CPU: Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)


Please reopen if still an issue.

@Alexander-Barth
Author

For your information, it seems that a sparse matrix times a BitArray uses the generic matmul, while a sparse matrix times a Vector{Bool} uses the sparse matrix multiplication.

julia> @btime sparse(1.0I,n,n) * trues(n);
  10.693 ms (7 allocations: 32.09 KiB)

julia> @btime sparse(1.0I,n,n) * ones(Bool,n);
  8.331 μs (6 allocations: 32.92 KiB)

For me, it is quite easy to use a Vector{Bool} instead of a BitArray, but I did not expect this difference in run time (also on Julia 1.3.0, Linux).
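
A simple workaround in the meantime is to copy the BitVector into a Vector{Bool} before the product (a sketch; the copy itself is only O(n)):

using SparseArrays, LinearAlgebra

n = 10_000
A = sparse(1.0I, n, n)
b = trues(n)

# Copying the BitVector into a Vector{Bool} is O(n) and routes the
# product through the sparse kernel instead of the generic matmul.
y = A * Vector{Bool}(b)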

@KristofferC
Member

BitArrays are slow to index, so a vector of Bools would likely still be faster.
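
A rough micro-benchmark of the indexing gap (assuming BenchmarkTools and Random; numbers will vary by machine):

using BenchmarkTools, Random

n = 10_000
idx = rand(1:n, n)
bitvec  = bitrand(n)              # BitVector
boolvec = Vector{Bool}(bitvec)    # plain Vector{Bool}

# Scalar indexing: each BitVector access does shift/mask bit arithmetic,
# while a Vector{Bool} access is a plain byte load.
@btime sum(i -> $bitvec[i],  $idx);
@btime sum(i -> $boolvec[i], $idx);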

@Alexander-Barth
Author

Ok, thanks for the information!😀

Alexander-Barth added a commit to gher-uliege/DIVAnd.jl that referenced this issue Apr 6, 2020