make SparseMatrixCSC and SparseVector work on non-numerical values #30580

abraunst · 2019-01-04T00:56:28Z

This addresses #30573. It essentially amounts to replacing != 0 with iszero in sparsematrix.jl and sparsevector.jl. For a type T to work, it needs to define both zero(T) and zero(x::T). Is it possible/reasonable to add a fallback definition zero(x::T) where T = zero(T) somewhere (where)?

tkf · 2019-01-04T02:10:46Z

> Is it possible/reasonable to add a fallback definition zero(x::T) where T = zero(T) somewhere (where)?

Approach 3 using "AbstractMatrixCSC" I noted in #30173 (comment) could be used to alter the behavior for zero(T) for sparse arrays.

One approach could be to define (say):

absent(::AbstractMatrixCSC{T}) where T = zero(T)
isabsentvalue(::AbstractMatrixCSC, x) = iszero(x)

so that users can overload SparseArrays.absent etc. for their types.

abraunst · 2019-01-04T08:38:10Z

> > Is it possible/reasonable to add a fallback definition zero(x::T) where T = zero(T) somewhere (where)?
>
> Approach 3 using "AbstractMatrixCSC" I noted in #30173 (comment) could be used to alter the behavior for zero(T) for sparse arrays.
>
> One approach could be to define (say):
>
> absent(::AbstractMatrixCSC{T}) where T = zero(T)
> isabsentvalue(::AbstractMatrixCSC, x) = iszero(x)
>
> so that users can overload SparseArrays.absent etc. for their types.

Sure, or the absent/background value could be stored in the matrix itself (one might like to use the same type with two different absent values). This seems nice.
However, I don't know whether a non-zero background value would work out of the box (I suspect not).

andreasnoack · 2019-01-04T09:11:42Z

Considering a non-zero absent value can be considered separately from this issue. To avoid delaying the improvement of this PR, I'll suggest that we discuss it elsewhere. I believe it is already covered by existing issues.

ViralBShah · 2019-01-04T17:01:34Z

Let's rebase on master and try another go at the CI, and hope we pick up CI fixes.

abraunst · 2019-01-04T17:29:24Z

> Let's rebase on master and try another go at the CI, and hope we pick up CI fixes.

Sure, thanks @ViralBShah. At first sight errors seem related though (even if I don't understand exactly how). Question: does it make sense to add the following lines somewhere:

import Base.zero
zero(T::DataType) = throw(MethodError(zero, (T,)))
zero(x::T) where T = zero(T)

The rationale is that whenever a type has zero(T), it seems reasonable to assume by default that zero(x::T) == zero(T). The definition of zero(T::DataType) is there to avoid an infinite loop. If yes, where should it go?

ViralBShah · 2019-01-04T17:43:57Z

For now, in order to get tests to pass, let's add it to the sparse matrix code. But before merging, we do need to find the right place to add it to (assuming everything else works out)

abraunst · 2019-01-04T22:31:44Z

> For now, in order to get tests to pass, let's add it to the sparse matrix code. But before merging, we do need to find the right place to add it to (assuming everything else works out)

Rethinking, the alternatives I see are:

1. leave it out (present state of the PR), forcing the user of sparse to implement both zero(Tv) and zero(v::Tv);
2. use v != zero(Tv) instead of !iszero(v);
3. add zero(x::Tv) where Tv = zero(Tv) as suggested above.

Maybe 2) is the least disruptive?

ViralBShah · 2019-01-05T03:17:36Z

I like option 2 as well. That would mean the user of a new type has to define at most zero(Tv).

abraunst · 2019-01-05T10:49:27Z

Errr... please advise. With this PR, it will be forbidden to make a SparseMatrixCSC{Tv, Ti} with Tv==Matrix, because there is no zero(::Type{Matrix}) (and there is no good way to define it). There is at least one test of recursive transpose and adjoint that uses these constructs. I could replace Matrix with SMatrix (which is what I've tried to do in the PR), but I didn't realize that StaticArrays is not in the stdlib. So what should I replace it with?

BTW this uncovers yet another type (Matrix) without zero that seems to be in use as Tv in sparse matrices...

abraunst · 2019-01-06T13:39:14Z

> Errr... please advise. With this PR, it will be forbidden to make a SparseMatrixCSC{Tv, Ti} with Tv==Matrix, because there is no zero(::Type{Matrix}) (and there is no good way to define it). There is at least one test of recursive transpose and adjoint that uses these constructs. I could replace Matrix with SMatrix (which is what I've tried to do in the PR), but I didn't realize that StaticArrays is not in the stdlib. So what should I replace it with?
>
> BTW this uncovers yet another type (Matrix) without zero that seems to be in use as Tv in sparse matrices...

EDIT: I commented out the test for the time being and all checks pass now.

andreasnoack · 2019-01-07T09:41:46Z

Following up on https://github.com/JuliaLang/julia/issues/30573#issuecomment-451469404 here. I now remember the previous discussion and I'm pretty sure we kept using t != 0 to avoid excluding element types without a zero such as String and Symbol. It's quite unfortunate that this then happens to be wrong for cases that define a zero which is different from 0, such as matrices.

Any of the proposals in #30580 (comment) will exclude String and Symbol and make it correct for types with a zero, so we'll have to make a choice. I'm in favor of reserving SparseMatrixCSC for types with zero and directing the non-algebraic use cases to different data structures. E.g. I guess NDSparse in IndexedTables might often be fine.

abraunst · 2019-01-07T10:54:35Z

> Following up on #30573 (comment) here. I now remember the previous discussion and I'm pretty sure we kept using t != 0 to avoid excluding element types without a zero such as String and Symbol. It's quite unfortunate that this then happens to be wrong for cases that define a zero which is different from 0, such as matrices.
>
> Any of the proposals in #30580 (comment) will exclude String and Symbol and make it correct for types with a zero, so we'll have to make a choice. I'm in favor of reserving SparseMatrixCSC for types with zero and directing the non-algebraic use cases to different data structures. E.g. I guess NDSparse in IndexedTables might often be fine.

Note that Matrix is one such type for which zero can't really be defined. This leaves only StaticArray (that I'm aware of -- are there others?). This is why I was initially a bit skeptical. In my opinion the most consistent way out is to store the "background" value in SparseMatrixCSC (defaulting to zero(Tv)). This would have many advantages in itself (e.g. rendering the output of f.(S) more reasonable etc.) and help support these heterogeneous types.

abraunst · 2019-01-07T11:26:44Z

> > #30580 (comment) will exclude String and Symbol and make it correct for types with a zero, so we'll have to make a choice. I'm in favor of reserving SparseMatrixCSC for types with zero and directing the non-algebraic use cases to different data structures. E.g. I guess NDSparse in IndexedTables might often be fine.
>
> Note that Matrix is one such type for which zero can't really be defined. This leaves only StaticArray (that I'm aware of -- are there others?). This is why I was initially a bit skeptical. In my opinion the most consistent way out is to store the "background" value in SparseMatrixCSC (defaulting to zero(Tv)). This would have many advantages in itself (e.g. rendering the output of f.(S) more reasonable etc.) and help support these heterogeneous types.

I found #10410 addressing this, but no associated PR. I think it makes perfect sense, especially for purely numerical matrices. I don't really know how hard it would be to implement, but even a partial implementation would help with non-numerical types. In any case I agree that it is something that should be discussed separately, sorry.

StefanKarpinski · 2019-01-07T14:23:53Z

> Note that Matrix is one of such types for which zero can't really be defined.

You mean without a specific element type? With an element type a matrix of zeros of that element type is the correct matrix zero element.

abraunst · 2019-01-07T14:36:32Z

> > Note that Matrix is one of such types for which zero can't really be defined.
>
> You mean without a specific element type? With an element type a matrix of zeros of that element type is the correct matrix zero element.

I don't think so, even knowing the element type, because it would depend on the size.

andreasnoack · 2019-01-07T14:36:41Z

I think the point is that zero(Matrix{T}) can't be defined because the size isn't known.

@abraunst Why is it that you are using t -> t != zero(Tv) instead of !iszero? Both exclude matrices with Symbol and String elements but the latter might avoid the creation of zero(Tv). That is e.g. the case for BigInt/BigFloat where the creation of zero(BigX) is costly.
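
A minimal sketch of the difference under discussion, using only Base functions (the variable names are made up for illustration): comparing a matrix with the scalar 0 never reports it as zero, while iszero inspects the elements and needs neither zero(Tv) nor the block size.

```julia
Z = zeros(2, 2)

# A matrix is never == the scalar 0, so the old `x != 0` check
# treats an all-zero block as a nonzero entry.
old_check = Z != 0

# iszero looks at the elements; no zero(Matrix{Float64}) is required.
new_check = !iszero(Z)

# For BigInt, iszero can answer without building a zero(BigInt) to compare with.
big_is_zero = iszero(big(0))

println((old_check, new_check, big_is_zero))
```

Here old_check is true (the block would be kept as a "nonzero") while new_check is false.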

abraunst · 2019-01-07T14:41:39Z

> I think the point is that zero(Matrix{T}) can't be defined because the size isn't known.
>
> @abraunst Why is it that you are using t -> t != zero(Tv) instead of !iszero? Both exclude matrices with Symbol and String elements but the latter might avoid the creation of zero(Tv). That is e.g. the case for BigInt/BigFloat where the creation of zero(BigX) is costly.

The rationale was to avoid forcing the user to define both zero(T) and zero(x::T) for the type T to work as a value type, and to rely solely on the former.

EDIT: That said, it is a matter of rolling back one commit, because that is what I started with. One could maybe add #30580 (comment) to make a default zero(x::T).

StefanKarpinski · 2019-01-07T14:44:41Z

> because it would depend on the size

Duh, of course.

mbauman · 2019-01-07T16:32:56Z

I like consistently using iszero here, too. Heck, I'd title this issue "consistently use iszero to check for zero elements in sparse arrays"; I think that'd be completely unobjectionable. We currently do use a mix of x == 0 and iszero(x), so standardizing is a huge improvement in and of itself. Ref. #24790, which this goes a long way towards fixing!

I didn't quite catch the issue that precipitated the problem with iszero at first. Am I correct in understanding that you don't like iszero because:

- its fallback is defined as iszero(x) = x == zero(x),
- but SparseArrays also regularly tries to call zero(eltype(S)),
- thus a custom type T needs to define both zero(::Type{T}) and zero(::T) to fully work as an element of a sparse array?

I think that's an okay situation for now. It maintains the property that we can construct SparseArrays of Array, but we just cannot access any of the non-stored elements.

We can try to improve zero in the future, or simply document this situation better. Let's do that separately.
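
To make the two-method requirement concrete, here is a hedged sketch with a hypothetical element type Money (not part of the PR or of any package). It assumes only what is stated above: the iszero fallback calls zero(x), and reading a non-stored entry calls zero(eltype(S)). Details may differ between SparseArrays versions.

```julia
using SparseArrays

# Hypothetical element type, used only for illustration.
struct Money
    cents::Int
end

# Needed when SparseArrays asks for zero(eltype(S)), e.g. for a non-stored entry.
Base.zero(::Type{Money}) = Money(0)
# Needed by the generic fallback iszero(x) = x == zero(x).
Base.zero(::Money) = Money(0)

# Two stored entries on the diagonal; the off-diagonal entries are not stored.
S = sparse([1, 2], [1, 2], [Money(5), Money(7)])

stored = S[1, 1]       # a stored value
background = S[1, 2]   # non-stored entry, produced via zero(Money)
```

With only zero(::Type{Money}) defined, indexing would still work, but iszero(Money(0)) would raise a MethodError because the fallback has no zero(::Money) to call.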

abraunst · 2019-01-07T17:21:00Z

> I like consistently using iszero here, too. Heck, I'd title this issue "consistently use iszero to check for zero elements in sparse arrays"; I think that'd be completely unobjectionable. We currently do use a mix of x == 0 and iszero(x), so standardizing is a huge improvement in and of itself. Ref. #24790, which this goes a long way towards fixing!
>
> I didn't quite catch the issue that precipitated the problem with iszero at first. Am I correct in understanding that you don't like iszero because:
>
> - its fallback is defined as iszero(x) = x == zero(x),
> - but SparseArrays also regularly tries to call zero(eltype(S)),
> - thus a custom type T needs to define both zero(::Type{T}) and zero(::T) to fully work as an element of a sparse array?

Yes, exactly. I worried that this PR eliminates functionality (the ability to construct arrays of Strings, for instance) and makes it especially cumbersome to build arrays of new types (since you have to define two methods). But I think you are right (also because of the potential cost of zero(T) that @andreasnoack raised).

> I think that's an okay situation for now. It maintains the property that we can construct SparseArrays of Array, but we just cannot access any of the non-stored elements.
>
> We can try to improve zero in the future, or simply document this situation better. Let's do that separately.

Okay, I'll roll back the commits!

EDIT: done

KlausC · 2019-01-17T13:55:51Z

I think replacing the zero checks A == 0 with iszero(A) is the proper way to continue.
For example, it makes it possible to decide whether a matrix is a zero matrix without constructing a zero matrix of the appropriate size.
The construction of a structural zero value is more convoluted and should not be included in this PR.

The case Tv<:AbstractMatrix would be very handy for storing block-structured matrices. There, the size of an element matrix depends in general on the indices at which it is stored. So, to support the most general case, it would be useful to have a function like structural_zero(S::SparseMatrixType, row, col), which would fall back to zero(eltype(S)) in the simple case. SparseMatrixType would be an extension of SparseMatrixCSC that knows the sizes of its elements.
Alternatively, block-structured matrices could use the element type Tv = Union{AbstractMatrix{T},UniformScaling{T}}, which allows defining zero(Tv) = zero(T)*I without needing to know the size of the structural zero element matrices.

KlausC · 2019-02-03T17:12:36Z

What do you think about adding:

Base.zero(::Type) = missing
Base.zero(x::T) where T = zero(T)
Base.iszero(x) = coalesce(x == zero(x), false)

andreasnoack · 2019-03-15T14:27:58Z

Was reminded of this when looking into Graph Algorithms in the Language of Linear Algebra. With the changes in this PR and similar changes to SparseArrays/src/linalg.jl (@abraunst why didn't you update that one as well?) it seems possible to utilize the existing sparse matmul for the semi-ring approach of the book. E.g. Algebraic Bellman-Ford:

# Number wrapper whose "addition" is the function A and whose "multiplication" is the function M
struct SemiRing{T,A,M} <: Number
    val::T
end
import Base: +, *
(+)(x::SemiRing{T,A,M}, y::SemiRing{T,A,M}) where {T,A,M} = SemiRing{T,A,M}(A(x.val, y.val))
(*)(x::SemiRing{T,A,M}, y::SemiRing{T,A,M}) where {T,A,M} = SemiRing{T,A,M}(M(x.val, y.val))
Base.promote_rule(::Type{SemiRing{T,A,M}}, ::Type{SemiRing{S,A,M}}) where {T,S,A,M} = SemiRing{promote_type(T,S),A,M}
# With A = min the additive identity is typemax(T); with M = + the multiplicative identity is zero(T)
Base.zero(::Type{SemiRing{T,min,M}}) where {T,M} = SemiRing{T,min,M}(typemax(T))
Base.one(::Type{SemiRing{T,A,+}})    where {T,A} = SemiRing{T,A,+}(zero(T))
Base.iszero(x::SemiRing{T,A,M}) where {T,A,M} = x == zero(SemiRing{T,A,M})
Base.isone(x::SemiRing{T,A,M}) where {T,A,M} = x == one(SemiRing{T,A,M})
Base.conj(x::SemiRing{<:Real}) = x

and then the example from Figure 24.6 in Algorithms becomes (notice that ∞ is the "additive" identity in SemiRing{Float64,min,+}):

julia> ∞ = Inf
Inf

julia> A = sparse(SemiRing{Float64,min,+}[0 10 5 ∞ ∞
                                          ∞  0 ∞ 1 ∞
                                          ∞  3 0 9 2
                                          ∞  ∞ ∞ 0 4
                                          7  ∞ ∞ 6 0]);

julia> d = SemiRing{Float64,min,+}[0,∞,∞,∞,∞];

julia> d = A'd
5-element Array{SemiRing{Float64,min,+},1}:
  SemiRing{Float64,min,+}(0.0)
  SemiRing{Float64,min,+}(10.0)
  SemiRing{Float64,min,+}(5.0)
  SemiRing{Float64,min,+}(Inf)
  SemiRing{Float64,min,+}(Inf)

julia> d = A'd
5-element Array{SemiRing{Float64,min,+},1}:
  SemiRing{Float64,min,+}(0.0)
  SemiRing{Float64,min,+}(8.0)
  SemiRing{Float64,min,+}(5.0)
  SemiRing{Float64,min,+}(11.0)
  SemiRing{Float64,min,+}(7.0)

julia> d = A'd
5-element Array{SemiRing{Float64,min,+},1}:
 SemiRing{Float64,min,+}(0.0)
 SemiRing{Float64,min,+}(8.0)
 SemiRing{Float64,min,+}(5.0)
 SemiRing{Float64,min,+}(9.0)
 SemiRing{Float64,min,+}(7.0)
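
As a cross-check of the output above, the same relaxation can be written by hand on a dense Float64 matrix, without the SemiRing type or the sparse matmul. This is only a sketch: relax and W are names made up here, and the step simply computes d_new[j] = min over i of (d[i] + W[i, j]), which is what A'd means in the (min, +) semiring.

```julia
∞ = Inf

# Edge weights copied from the example above; ∞ marks a missing edge.
W = [0 10 5 ∞ ∞
     ∞  0 ∞ 1 ∞
     ∞  3 0 9 2
     ∞  ∞ ∞ 0 4
     7  ∞ ∞ 6 0]

# One Bellman-Ford relaxation step in min-plus form.
relax(W, d) = [minimum(d[i] + W[i, j] for i in axes(W, 1)) for j in axes(W, 2)]

d0 = [0.0, ∞, ∞, ∞, ∞]
d1 = relax(W, d0)
d2 = relax(W, d1)
d3 = relax(W, d2)

println(d3)
```

The three steps give the same distances as the SemiRing version: 0, 8, 5, 9, 7 after the third relaxation.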

abraunst · 2019-03-15T17:03:52Z

> Was reminded of this when looking into Graph Algorithms in the Language of Linear Algebra. With the changes in this PR and similar changes to SparseArrays/src/linalg.jl it seems possible to utilize the existing sparse matmul for the semi-ring approach of the book. E.g. Algebraic Bellman-Ford

This is nice at least as a pedagogical tool 👍

> (@abraunst why didn't you update that one as well?)

If I remember correctly, the changes in linalg.jl involve requiring ~~zero(::Tv)~~ zero(::Type{Tv}) in addition to iszero(::Tv): this forbids the use of matrices as Tv, which apparently is a useful pattern. But going through something like SemiRing seems really appealing...

KlausC · 2019-03-16T11:39:59Z

I think this PR should be rebased onto the current state.
Some extra changes are needed to handle the following patterns:
isequal ... zero(Tv)...
!= zero(...
== 0

ViralBShah · 2019-06-17T02:59:22Z

Let's rebase and get this in if there is still interest.

andreasnoack · 2019-06-18T14:13:52Z

I've resolved the conflicts. There haven't been any objections to the proposal here so I'll merge. We can handle/discuss remaining details regarding zero/iszero elsewhere if needed.

ChrisRackauckas · 2019-06-18T14:16:26Z

Awesome!

shashi · 2021-11-10T01:34:44Z

There still seems to be != zero(v) instead of !iszero(v) at https://github.com/JuliaLang/julia/blob/master/stdlib/SparseArrays/src/sparsevector.jl#L1742 and on other lines in that file. Was this intentional?

ViralBShah added the sparse Sparse arrays label Jan 4, 2019

abraunst force-pushed the zero branch from 0b0424a to 2c3917c Compare January 4, 2019 17:18

abraunst added 4 commits January 5, 2019 08:22

make SparseMatrixCSC and SparseVector work on non-numerical values

d9b7cd3

remove tests for sparse structures with TV=String as it is no longer supported

4c90401

define zero(T) and zero(x::T) for user defined types in tests

3146307

fix test

e51650f

abraunst force-pushed the zero branch from 5689977 to 87b4532 Compare January 5, 2019 07:23

abraunst mentioned this pull request Jan 7, 2019

sprand and SparseMatrixCSC constructor simplifications #30617

Merged

abraunst changed the title ~~[WIP] make SparseMatrixCSC and SparseVector work on non-numerical values~~ make SparseMatrixCSC and SparseVector work on non-numerical values Jan 7, 2019

abraunst force-pushed the zero branch from 3b91d24 to e51650f Compare January 7, 2019 17:22

mbauman mentioned this pull request May 14, 2019

Replace != 0 with !iszero in sparse matrix constructions #32019

Closed

andreasnoack added 2 commits June 17, 2019 11:46

Merge branch 'master' into zero

232cc7a

Remove recently added test of sparse matrix with string elements

37a0d28

andreasnoack merged commit da39287 into JuliaLang:master Jun 18, 2019

abraunst deleted the zero branch December 14, 2019 10:46

mbauman mentioned this pull request Aug 31, 2020

Too restrictive setindex! for triangular matrices JuliaLang/LinearAlgebra.jl#761

Open
