Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix sparse matrix nits, and make sparse matrix construction fail if zero(Tv) not defined #11408

Merged
merged 1 commit into from
May 30, 2015

Conversation

simonster
Copy link
Member

No description provided.

A = zeros(Tv, S.m, S.n)
# Handle cases where zero(Tv) is not defined but the array is dense.
# (Should we really worry about this?)
A = length(S) == nnz(S) ? Array(Tv, S.m, S.n) : zeros(Tv, S.m, S.n)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was one of the issues reported in #9917 so I "fixed" it, but I'm not sure we should even allow creating a SparseMatrix of a type without zero defined.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems reasonable.

@tkelman tkelman added the sparse Sparse arrays label May 23, 2015
@simonster simonster changed the title Fix sparse matrix nits, fixes #9917, fixes #11032 Fix sparse matrix nits, and make sparse matrix construction fail if zero(Tv) not defined May 23, 2015
@simonster
Copy link
Member Author

Added an error in the SparseMatrixCSC constructor if the value type has no defined zero. That part is potentially breaking and should not be backported.

@IainNZ
Copy link
Member

IainNZ commented May 23, 2015

Related PR, that I closed without merging: #9325

@ViralBShah
Copy link
Member

@lindahua Could you give some examples? It would seem that it should be straightforward to provide a zero(T) method. We barely scratched some of the ideas of generalizing sparse matrix operations to graph operations here:

http://www.mit.edu/~kepner/pubs/JuliaSemiring_HPEC2013_Paper.pdf

It could very well be that if you care about having some of these things working, you should provide zero(T), but it doesn't need to be mandatory in case you are only doing operations where zero(T) is unlikely to get invoked.

@tkelman
Copy link
Contributor

tkelman commented May 25, 2015

I have sparse matrices with Symbol or Expr or Any element types every now and then, where it feels very very wrong to be defining zero(::Any) or zero(::Symbol) etc if I don't actually have to.

@ScottPJones
Copy link
Contributor

Has anybody thought about making this more general? Instead of looking for zero(),
look for an emptyvalue() method. For T <: Number, it would still be zero(), but for
T <: AbstractString, it could be T(""), and the generic fallback for T <: Any, it would be Void.
I think for numeric vectors and matrices, you'd still get exactly what you get now, but this would
make sparse vectors and matrices useable (and efficient) for non-numeric data, without having people do strange things like define zero(::Any) or zero(::AbstractString).

@lindahua
Copy link
Contributor

This is actually about a more fundamental problem than just the zero method.

It is about the concept of sparse array (or sparse matrix). We have two options:


  • Consider sparse arrays within the numerical context. This is what most of the methods in the Sparse module were designed for. With Tv being a non-numerical value, you won't be able to use most of these functions (that's not just about the zero method, they also use + and *, etc).

Implementing this concept is very simple -- just have Tv <: Number (you don't have to check whether Tv implements zero or not).


  • Consider sparse arrays as a sorted map from indexes to values, kind of like a Dict{NTuple{N, Ti}, Tv}, but with a more compact internal representation. With this concept, you can put whatever you want as Tv. However, it makes no sense to call any numerical methods on the array (which requires not only zero, but also numerical operators being defined on Tv).

The only method that I can conceive that may be made to work for such arrays is full (by adding zero or emptyvalue, etc). Actually, there is a more flexible way to make this work:

function full(A::SparseMatrixCSC{T}, emptyval::T)
# ...
end

full{T<:Number}(A::SparseMatrixCSC{T}) = full(A, zero(T));

You don't need to define emptyvalue for everything, which, IMO, is not really feasible. There are plenty of types, for which it is unclear what is an empty value. So if you want to call full on a non-numerical sparse array, just supply what you think is a proper emptyvalue depending on your context.

I guess @ViralBShah might be interested in using a sparse matrix as a graph. In those cases, we don't need to have zero at all. The general algorithmic pattern for graph theoretical algorithms is as follows:

for s in vertices(g)
     # ...
     for t in neighbors(g, s)  
     end
end 

For such type of algorithms, what you really need is just a way to quickly give you the list of nonzeros of a certain column (or row), and you don't need to bother with those that are not stored. (With the CSC format, this can be efficiently done).

@lindahua
Copy link
Contributor

It is very important to note that the primary purpose of sparse storage is to make it possible to have algorithms with complexity O(nnz) instead of O(n). Scanning over all elements (stored or not) defeats this purpose.

The only method that needs to scan all elements is full, which I already provided a solution that does not require a zero method on Tv.

@lindahua lindahua mentioned this pull request May 26, 2015
6 tasks
@ScottPJones
Copy link
Contributor

For non-numerical applications, I'd think the primary purpose of sparse storage is to save memory, accepting O(log n) (n = num not null) time instead of O(1) to find a particular field in a row...

@ViralBShah
Copy link
Member

Would you not want to use a hash table in such cases?

@tkelman
Copy link
Contributor

tkelman commented May 26, 2015

The only method that I can conceive that may be made to work for such arrays is full

Indexing and traversal of the stored values works fine too. There are several contexts where I'm putting objects in a sparse matrix that are numberlike containers but not <: Number, or are functions that evaluate to numbers but are not themselves numbers (Jacobian and Hessian matrices, etc). These will inevitably not work as completely as Tv <: Number will, but they're useful nonetheless.

@simonster
Copy link
Member Author

So it sounds like we don't want to restrict SparseMatrixCSC to types with zero, at least for the moment. Shall I go ahead and merge just the first commit here?

@tkelman
Copy link
Contributor

tkelman commented May 29, 2015

2fedc27 looks fine and non-controversial

simonster added a commit that referenced this pull request May 30, 2015
Fix sparse matrix nits, and make sparse matrix construction fail if zero(Tv) not defined
@simonster simonster merged commit d8ca7a2 into master May 30, 2015
@simonster simonster deleted the sjk/sparse-fixes branch May 30, 2015 03:48
simonster added a commit that referenced this pull request Jun 4, 2015
(cherry picked from commit 2fedc27)
ref #11408

Conflicts:
	base/sparse/sparsematrix.jl
	test/sparsedir/sparse.jl
@tkelman
Copy link
Contributor

tkelman commented Jun 4, 2015

backported in c8804e8

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sparse Sparse arrays
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants