Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specialize copyto! for triangular matrices #52730

Merged
merged 6 commits into from
May 16, 2024
Merged

Specialize copyto! for triangular matrices #52730

merged 6 commits into from
May 16, 2024

Conversation

jishnub
Copy link
Contributor

@jishnub jishnub commented Jan 4, 2024

This provides a performance boost in copying a triangular matrix to a StridedMatrix, which is a common operation (e.g. in broadcasting or in Matrix(::UpperTriangular)). The main improvement is improved cache locality for strided triangular matrices by fusing the loops.

On master

julia> U = UpperTriangular(rand(4000,4000));

julia> @btime Matrix($U);
  64.649 ms (3 allocations: 122.07 MiB)

This PR

julia> @btime Matrix($U);
  48.332 ms (3 allocations: 122.07 MiB)

@jishnub jishnub added linear algebra Linear algebra arrays [a, r, r, a, y, s] labels Jan 4, 2024
@jishnub
Copy link
Contributor Author

jishnub commented Jan 4, 2024

From the test failure, it looks like we may need something like #52528 before this to avoid accessing uninitialized indices in tril!/triu!

@jishnub jishnub marked this pull request as draft February 11, 2024 07:25
@dkarrasch
Copy link
Member

What's up with this PR? Seems to be in good shape, doesn't it?

@vtjnash
Copy link
Member

vtjnash commented Feb 12, 2024

I don't see a test failure here? Was that meant on a different PR? The copyto! generic code now tries to call isassigned / Base._unsetindex! to generically deal with uninitialized data in all copies, so maybe it is closer here?

@jishnub
Copy link
Contributor Author

jishnub commented Feb 12, 2024

I think we need to fix copyto! for Adjoint/Transpose before this, as that currently uses adjoint!/transpose! which doesn't deal with uninitialized indices well.

One workaround for this PR might be to use copytrito! instead of copyto! on the parent for the non-strided case, which will always work, but would often lead to performance regressions.

@jishnub jishnub marked this pull request as ready for review May 16, 2024 07:03
@jishnub jishnub merged commit c614cb6 into master May 16, 2024
7 checks passed
@jishnub jishnub deleted the jishnub/tricopyto branch May 16, 2024 17:50
@odow
Copy link
Contributor

odow commented May 21, 2024

I think this broke JuMP:

julia> using JuMP, LinearAlgebra

julia> model = Model();

julia> @variable(model, x[1:2, 1:2], Symmetric)
2×2 Symmetric{VariableRef, Matrix{VariableRef}}:
 x[1,1]  x[1,2]
 x[1,2]  x[2,2]

julia> Matrix(UpperTriangular(x))
ERROR: UndefRefError: access to undefined reference
Stacktrace:
  [1] getindex
    @ ./essentials.jl:895 [inlined]
  [2] getindex
    @ ./array.jl:910 [inlined]
  [3] _zero
    @ ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/LinearAlgebra/src/dense.jl:113 [inlined]
  [4] triu!(M::Matrix{AffExpr}, k::Int64)
    @ LinearAlgebra ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/LinearAlgebra/src/dense.jl:145
  [5] triu!
    @ ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/LinearAlgebra/src/generic.jl:509 [inlined]
  [6] _copyto!
    @ ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/LinearAlgebra/src/triangular.jl:528 [inlined]
  [7] copyto!
    @ ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/LinearAlgebra/src/triangular.jl:522 [inlined]
  [8] copyto_axcheck!
    @ ./abstractarray.jl:1175 [inlined]
  [9] Matrix{AffExpr}(x::UpperTriangular{VariableRef, Symmetric{VariableRef, Matrix{VariableRef}}})
    @ Base ./array.jl:606
 [10] (Matrix)(x::UpperTriangular{VariableRef, Symmetric{VariableRef, Matrix{VariableRef}}})
    @ MutableArithmetics ~/.julia/packages/MutableArithmetics/iovKe/src/dispatch.jl:563
 [11] top-level scope
    @ REPL[42]:1

@jishnub
Copy link
Contributor Author

jishnub commented May 21, 2024

The issue seems to be with triu!, as the following fails on v1.10.3 as well:

julia> M = Matrix{AffExpr}(undef, 2, 2)
2×2 Matrix{AffExpr}:
 #undef  #undef
 #undef  #undef

julia> triu!(M);
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getindex
   @ ./essentials.jl:14 [inlined]
 [2] triu!(M::Matrix{AffExpr}, k::Int64)
   @ LinearAlgebra ~/.julia/juliaup/julia-1.10.3+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/dense.jl:139
 [3] triu!(M::Matrix{AffExpr})
   @ LinearAlgebra ~/.julia/juliaup/julia-1.10.3+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/generic.jl:435
 [4] top-level scope
   @ REPL[12]:1

We had actually made triu! more conservative in #52528, where types may now opt into the haszero trait:

julia> LinearAlgebra.haszero(AffExpr)
false

julia> zero(AffExpr)
0

in which case the zero on the type will be used instead of the value. Extending haszero seems to resolve this:

julia> LinearAlgebra.haszero(::Type{AffExpr}) = true

julia> Matrix(UpperTriangular(x))
2×2 Matrix{AffExpr}:
 x[1,1]  x[1,2]
 0       x[2,2]

I wonder if we need to be more conservative with haszero to avoid such breakages, and require types to opt out? This would make the behavior consistent with v1.10.3.

@odow
Copy link
Contributor

odow commented May 21, 2024

and require types to opt out?

We should make the behavior consistent with v1.10, and allow types to opt-in to a more efficient implementation via the trait. We should try to avoid breaking changes in minor Julia releases (although I get that LinearAlgebra is particularly prone to these issues, and we've worked around things before).

lazarusA pushed a commit to lazarusA/julia that referenced this pull request Jul 12, 2024
This provides a performance boost in copying a triangular matrix to a
`StridedMatrix`, which is a common operation (e.g. in broadcasting or in
`Matrix(::UpperTriangular)`). The main improvement is improved cache
locality for strided triangular matrices by fusing the loops.

On master
```julia
julia> U = UpperTriangular(rand(4000,4000));

julia> @Btime Matrix($U);
  64.649 ms (3 allocations: 122.07 MiB)
```
This PR
```julia
julia> @Btime Matrix($U);
  48.332 ms (3 allocations: 122.07 MiB)
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants