map on sparse arrays does not work with non-numeric data #19561
What version of Julia are you using? On 0.5.0 this works. |
master. |
The new […]

```julia
fofzeros = f(zero(eltype(A)), zero(eltype(B)))
fpreszeros = fofzeros == zero(fofzeros)
```

This check requires that (1) the input array eltypes provide […] The first requirement is fundamental: a sparse matrix […] The second requirement is relaxable: we need only the result of the hypothetical […] Both examples above violate the second requirement and would work given […] Thoughts? Best! |
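To make the quoted check concrete, here is a minimal sketch, assuming Julia 1.x (where sparse arrays live in the `SparseArrays` stdlib rather than in Base). The helper name `preserves_zeros` is hypothetical and only wraps the two lines quoted above; it is not the actual code in Base.

```julia
using SparseArrays  # assumption: Julia 1.x stdlib, not the 0.5/0.6 Base layout

# Hypothetical helper wrapping the two-line check quoted above.
function preserves_zeros(f, A, B)
    fofzeros = f(zero(eltype(A)), zero(eltype(B)))
    return fofzeros == zero(fofzeros)
end

A = sparse([1.0 0.0; 0.0 2.0])
B = sparse([0.0 3.0; 0.0 4.0])

preserves_zeros(+, A, B)                     # true
preserves_zeros((x, y) -> x + y + 1, A, B)   # false

# With a non-numeric result the check itself fails: zero(::String) has
# no method, so zero(fofzeros) throws a MethodError.
threw = try
    preserves_zeros((x, y) -> "a", A, B)
    false
catch err
    err isa MethodError
end
```

The last case shows why a function returning non-numeric values cannot get past this check as written: the comparison needs a `zero` method for the result, not just for the input eltypes.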
alternatively wouldn't defining |
If you only ever do operations on the stored entries (or their locations), then it doesn't always matter whether the unstored entries have a realizable value in the same element type as the stored entries. I've found sparse matrices of symbols or functions to be useful on occasion as a data structure, where it was useful to have them in CSC format, do concatenations or similar operations. We should just add |
Absolutely,
+1 :). Best! |
But how would |
In both cases above, |
Looked closer and I'm mistaken. The zero preservation check discussed above works as written in these cases. Rather, checking whether output entry |
That thought was silly. |
Using sparse arrays/vectors for non-algebraic data (types that do not define |
For example, what does |
If you just want a general "sparse" associative "array" from |
And if you want to map the nonzero elements of a sparse array to non-numeric values, you should use |
Dict isn't CSC. It's useful on occasion to have a sparse array that has the same index iteration structure as a numeric SparseMatrixCSC, with effectively undefined non-stored entries. Linear algebra operations won't work on them, but many array manipulation operations will. |
Lots of things are occasionally useful but lead to poor library design. We normally try to avoid bad type puns in Base. |
Could you give a concrete example where nonzeros(A) would not suffice? |
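For reference, a small sketch of the `nonzeros` approach being discussed, assuming Julia 1.x with the `SparseArrays` stdlib. The matrix and the labeling function are made up for illustration. It maps only the stored entries to non-numeric values and pairs them with their coordinates, without ever asking for a zero of the output type.

```julia
using SparseArrays  # assumption: Julia 1.x stdlib

# Illustrative 3×3 matrix with three stored entries on the diagonal.
A = sparse([1, 2, 3], [1, 2, 3], [10.0, 20.0, 30.0])

# Map only the stored values; the result is a plain Vector{String}.
labels = map(x -> "v$(Int(x))", nonzeros(A))

# findnz returns coordinates in the same stored (column-major) order as
# nonzeros, so the labels can be paired with their positions.
rows, cols, _ = findnz(A)
entries = Dict(zip(zip(rows, cols), labels))
```

The result here is a vector plus a dictionary, not a SparseMatrixCSC, which is the trade-off raised in the replies about keeping the CSC iteration structure.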
We allow non-numeric or heterogeneously typed dense arrays. Allowing the same for sparse matrices is a feature that it would be a regression to lose. Not being able to do linear algebra on some array types doesn't always make supporting them a pun or bad library design. For this particular example of map, I agree nonzeros is mostly good enough. Will have to check whether the assumption that |
The definition of sparsity here is "mostly zero", which is what makes it seem like a pun, and rather different from an Associative type (a partial function), which is what you seem to want to use it for. |
The generalization would make more sense if we allowed an arbitrary default value, but that could cause trouble with code expecting "sparse" to mean "default zero". Maybe we should have a DefaultArray type with an arbitrary default value, and have SparseMatrixCSC be a subtype? |
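A rough sketch of what an array with an arbitrary default value could look like, assuming Julia 1.x. Everything here is hypothetical: the struct layout, the Dict-backed storage (chosen for brevity; it is not CSC), and the `map_default` helper are illustrations only, not a proposed design for Base.

```julia
# Hypothetical type: unstored entries read as `default` rather than zero.
struct DefaultArray{T,N} <: AbstractArray{T,N}
    default::T
    stored::Dict{NTuple{N,Int},T}
    dims::NTuple{N,Int}
end

Base.size(A::DefaultArray) = A.dims
Base.getindex(A::DefaultArray{T,N}, I::Vararg{Int,N}) where {T,N} =
    get(A.stored, I, A.default)

# Hypothetical map: apply f once to the default and once per stored entry.
# Assumes f returns the same type for the default and the stored values.
function map_default(f, A::DefaultArray{T,N}) where {T,N}
    d = f(A.default)
    stored = Dict{NTuple{N,Int},typeof(d)}(k => f(v) for (k, v) in A.stored)
    return DefaultArray(d, stored, A.dims)
end

A = DefaultArray(0.0, Dict((1, 1) => 2.0, (2, 2) => 3.0), (2, 2))
B = map_default(x -> x + 1, A)     # default becomes 1.0, still 2 stored entries
C = map_default(x -> string(x), A) # non-numeric output, no zero(String) needed
```

In this sketch a function that does not preserve zero still yields a result with the same number of stored entries, and a non-numeric output never requires a `zero` method, because the default is computed rather than assumed.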
@stevenj this is a really good idea. It will also provide a path where we don't return dense arrays when we broadcast nonzero preserving functions, which could be a huge win for memory usage. |
If we want to use it for avoiding dense broadcast output, then the sparse case can't be a subtype (wouldn't be type stable). I dunno, maybe all sparse arrays should be DefaultArrays after all. @StefanKarpinski was arguing for this in another issue, and I was skeptical, but maybe it's worth the trouble. |
The really big advantage would be that all broadcasts would be efficient and produce the same type no matter what. Also, if I understood his proposal correctly, we could do even better: his was a way to cheat a default out of a sparse array, but if it were going into Base, we could further simplify the code. |
For a potential stopgap solution, please see #17623 (comment). Best!
|
Fix #19561 (sparse map/broadcast where the output eltype is not a concrete subtype of Number)
Type inference seems to have improved (:tada:) to the point that half of the tests for this issue are ineffective:

```julia
julia> intoneorfloatzero(x) = x != 0.0 ? Int(1) : Float64(x)
intoneorfloatzero (generic function with 1 method)

julia> foo = map(intoneorfloatzero, speye(4))
4×4 SparseMatrixCSC{Union{Float64, Int64},Int64} with 4 stored entries:
  [1, 1]  =  1
  [2, 2]  =  1
  [3, 3]  =  1
  [4, 4]  =  1

julia> eltype(foo)
Union{Float64, Int64}

julia> zero(eltype(foo))
0
```

(Previously |
Closed by #20862 I believe. |
As mentioned in #22945 (comment), the consequence of the fix here might cause some trouble elsewhere. @amitmurthy would you be able to share some real world examples where |
The specific requirement was for a Lines 263 to 268 in 0a37b3d
SparseMatrix{Ref,Int} continues to be supported.
Other than the above I don't have any other real-world examples for non-numeric sparse arrays. |
Above
|
Also a problem when mixing ints and floats.
If sparse arrays are not meant to be used with non-numeric data, we should at least throw a better error message.