SparseMatrix and SparseArray don't consitently define zero elements and allowed element types #24790

KlausC · 2017-11-26T11:53:21Z

When I first worked with sparse matrices, I was convinced, that the element type of those objects were AbstractFloat types. I am easily convinced, that all Number types make sense as well. Nevertheless there there seems to be a sloppy idea of which element types are actually intended.

The constructors of sparse arrays accept any element type. But you can call certain functions on sparse arrays only, if the element type supports certain methods. So the element types are de facto restricted by a kind of duck typing. The methods, which I refer to, include != 0, == 0, != zero, == zero, isequal, iszero for the question, if an element to be processed is zero or not.
I see, that all those alternatives lead to the same result for Numbers (with the exception of == and isequal, which deserves a separate consideration in the case of ±0.0).
But if I want to use a different element type Tv, I have to support this type with a consistent behaviour for all of those methods. That is far more inconvenient than just have to define zero(::Type{Tv}), what would be sufficient in the case of a consistent use within the SparseArrays module.

Using iszero(x) would be a proper choice. It falls back to x == zero(typeof(x)). Comparison with 0 is good for if x isa Number, but leads into a blind alley, for other types, because all elements would be stored silently (even if zero is defined).

Here is a list of the current implementation (v0.7 2017-11-24) which illustrates the diversity in handling zeros:
How (and where) is defined, if an entry is considered a zero?

check expression	source	function	comments
`v != 0`	sparsevector.jl:387	convert	from dense
`v != 0`	sparsevector.jl:292	setindex!	but only, if index is new
`v != 0`	sparsematrix.jl:2324	setindex!	(but only, if index is new)
`v != zero(Tv)`	sparsematrix.jl:2598	setindex!
nothing	sparseivector.jl:751	getindex(x,::AbstractUnitange)	new sparse vector constructed, but no check
`x != 0`	sparsevector.jl:1975	dropzeros!
`x != 0`	sparsematrix.jl:1227	dropzeros!
`x != 0`	sparsevector.jl:724	findnz
`x != 0`	sparsematrix.jl:1303	findnz
`x != 0`	sparsematrix.jl:1786	findz
`x != 0`	sparsematrix.jl:3189	is_hermsym
`x != 0`	sparsematrix.jl:3227	istriu
`x != 0`	sparsematrix.jl:3246	istril
`x != 0`	sparsematrix.jl:1564,1567,1574,1577	==
`convert(eltype(A), x) != zero(eltype(A))`	sparsematrix.jl:2033	Base.fill!
`v != zero(v)`	sparsevector.jl:1590,:1689	A_mul_B!(::Number, ...)
`v == zero(T)`	sparsematrix.jl:1641	Base._mapreduce
`isequal(f(zero(Tv)), zero(Tv)`	sparsematrix.jl:1744	_mapreducecols!
`iszero(A.nzval)`	sparsematrix.jl:1513	Base.iszero
`iszero(x)` sparsematrix.jl:1520	Base.isone
`isequal(v, zero(T))`	sparsematrix.jl:3472	Base.hash
`x == 0`	higherorderfns.jl:139	_iszero
`Base.iszero(x::Number)`	higherorderfns.jl:140	_iszero
`Base.iszero(x::AbstractArray)`	higherorderfns.jl:141	_iszero

How is zero element created?

expression	source	function	comments
`zero(eltype(A))`	sparsevector.jl:688	find(p, A)
`zero(Tv)`	sparsevector:751	getindex(x,::Integer)
`zero(T)`	sparsematrix:1890,1892	getindex(x,::Integer)
`zero(Tv)`	sparsematrix.jl:2594,2598,2616,2623	setindex!
`zero(eltype(S))`	sparsematrix.jl:73	count
`zero(eltype(S))`	sparsematrix.jl:1262	find
`zeros(Tv, m, n)`	sparsematrix.jl:377	convert	(to dense - if sparse array has no structural zeros, not used)
`zero(T)`	sparsematrix.jl:1598,1620,1629,1631,1639	Base._mapreduce
`zero(T)`	sparsematrix.jl:1708	Base._mapreducedim!
`zero(T)`	sparsematrix.jl:1743	_mapreducecols!
`zero(T)`	sparsematrix.jl:1810	_findr
`zeros(Tv,m)`	sparsematrix.jl:3373	sortSparseMatrixCSC!
`zeros(eltype(A))`	higherorderfns.jl:142,143	_zeros_eltypes

Inconsitencies

setindex!(A, -0.0, ind...) behavior depends on ind is structural zero or not

Example where the result of getindex depends on the pre-history (before the last setindex!):

julia> B = spzeros(2)
2-element SparseVector{Float64,Int64} with 0 stored entries
julia> B[1] = 1;
julia> B[1] = -0.0; B[2] = -0.0;
julia> B[1], B[2]
(-0.0, 0.0)

Proposal:

Change in sparse/*jl all the checks if x is zero by the call iszero(x).
Amend the documentation with a hint, that the element types of SparseArrays need to define the `zero(::Type{T}) method.
Change the values of all AbstractFloat zero values -0.0 to +0.0 of the same type, before they are stored into nzvals.
Change the documentation about getindex(::AbstractSparseArray{T},...) to mention, that all returned zeros have positive signs for T<:Number.
Define a fallback method Base.zero(x) = zero(typeof(x)) and Base.zero(::Type{T}) where T = error("..."). That would be in accordance to current doc of zero.

The text was updated successfully, but these errors were encountered:

KristofferC · 2017-11-26T11:56:47Z

Could you give a code snippet as an example where these "inconsistencies" lead to an inconsistent result?

ViralBShah · 2017-11-26T13:21:42Z

The issue needs a better title! Would be nice to have consistency.

KlausC · 2017-11-26T16:54:44Z

Another code snippet, demonstrating the need (v0.6.1 and v0.7-DEV):

julia> using Base.Dates

julia> S = sparse([ Second(0); Second(1)])
2-element SparseVector{Base.Dates.Second,Int64} with 2 stored entries:
  [1]  =  0 seconds
  [2]  =  1 second
julia> zero(Second(1))
0 seconds
julia> iszero(Second(0))
true
julia> Second(0) == 0
false

Generally it is not possible to take advantage of the storage economy of sparse arrays for dimensionful or unitful element types.

KristofferC · 2017-11-26T17:02:15Z

Ref #9325

KlausC · 2017-11-26T17:29:20Z

@KristofferC : I see that this issue has bee discussed 3 years ago. The main argument then, not to change anything (as I undestood), was performance.
I cannot verify that with the current version of Julia. The timing of x == 0 and iszero(x) does not seem to differ for x isa Union{Float64,BigInt,BigFloat}.
Maybe it is time now to review the arguments.

yurivish · 2018-06-13T06:12:50Z

I ran into this while giving a demonstration of Julia to a friend. We had a custom struct type with zero and iszero defined and wanted to make a sparse matrix out of a dense matrix containing a custom type.

Due to the != 0 check, there was no way for us to communicate our zero value to the sparse matrix code, and the zeros were thus represented explicitly in the sparse matrix:

julia> struct MyType
           val::Bool
       end

julia> Base.zero(::Type{MyType}) = MyType(false)

julia> Base.iszero(x::MyType) = x.val == false

julia> sparse([MyType(true), MyType(false)])
2-element SparseVector{MyType,Int64} with 2 stored entries:
  [1]  =  MyType(true)
  [2]  =  MyType(false)

ekinakyurek · 2018-10-26T17:28:52Z

I had same issue, is there a progress on this issue ? What are the challenges ?

KristofferC · 2018-10-26T17:43:20Z

In best case, changing some x == 0 to iszero(x).

Lecrapouille · 2019-09-21T09:19:15Z

@yurivish I made a PR on your problem. Now returning:

julia> sparse([MyType(true), MyType(false)])
2-element SparseVector{MyType,Int64} with 1 stored entry:
  [1]  =  MyType(true)

Is it ok for you ?

yurivish · 2019-09-21T13:07:06Z

Thanks! That is the behavior I think is right in this case.

Lecrapouille · 2020-02-05T19:05:50Z

@KlausC do you think my commit has a chance to be merged?

KlausC · 2020-02-06T10:15:20Z

@Lecrapouille, which PR do you mean exactly? You know I have been trying to clean-up some of the inconsistencies in SparseMatrixCSC for a while, but found unexpected resistance. So my answer to this question is not important. You should ask people with permissions to merge.

The bugs you point out in issue https://github.com/JuliaLang/julia/issues/33332 are still not fixed. I support your fixes mentioned there, but lost overview of which is the PR providing the fix.

BTW, the example you mention here: #24790 (comment) seems to work for me without your PR merged. (not in v"1.2" but in v"1.5.0")

fredrikekre added the sparse Sparse arrays label Nov 26, 2017

KlausC changed the title ~~SparseMatrix and SparseArray have unclear design~~ SparseMatrix and SparseArray don't consitently define zero elements and allowed element types Nov 26, 2017

fredrikekre mentioned this issue Mar 6, 2018

improve performance for Matrix -> SparseMatrix #26334

Merged

mbauman mentioned this issue Jan 7, 2019

make SparseMatrixCSC and SparseVector work on non-numerical values #30580

Merged

KlausC closed this as completed Feb 5, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SparseMatrix and SparseArray don't consitently define zero elements and allowed element types #24790

SparseMatrix and SparseArray don't consitently define zero elements and allowed element types #24790

KlausC commented Nov 26, 2017 •

edited

Loading

KristofferC commented Nov 26, 2017

ViralBShah commented Nov 26, 2017

KlausC commented Nov 26, 2017

KristofferC commented Nov 26, 2017

KlausC commented Nov 26, 2017 •

edited

Loading

yurivish commented Jun 13, 2018

ekinakyurek commented Oct 26, 2018

KristofferC commented Oct 26, 2018

Lecrapouille commented Sep 21, 2019

yurivish commented Sep 21, 2019

Lecrapouille commented Feb 5, 2020

KlausC commented Feb 6, 2020 •

edited

Loading

SparseMatrix and SparseArray don't consitently define zero elements and allowed element types #24790

SparseMatrix and SparseArray don't consitently define zero elements and allowed element types #24790

Comments

KlausC commented Nov 26, 2017 • edited Loading

KristofferC commented Nov 26, 2017

ViralBShah commented Nov 26, 2017

KlausC commented Nov 26, 2017

KristofferC commented Nov 26, 2017

KlausC commented Nov 26, 2017 • edited Loading

yurivish commented Jun 13, 2018

ekinakyurek commented Oct 26, 2018

KristofferC commented Oct 26, 2018

Lecrapouille commented Sep 21, 2019

yurivish commented Sep 21, 2019

Lecrapouille commented Feb 5, 2020

KlausC commented Feb 6, 2020 • edited Loading

KlausC commented Nov 26, 2017 •

edited

Loading

KlausC commented Nov 26, 2017 •

edited

Loading

KlausC commented Feb 6, 2020 •

edited

Loading