
Check that sparse matrix is valid before constructing a CHOLMOD.Sparse #20464

Merged: 1 commit merged into master from anj/spcheck on Feb 14, 2017

Conversation

@andreasnoack (Member):

These are just some consistency checks of the inputs before constructing a CHOLMOD.Sparse. Julia's SparseMatrixCSC should always pass these checks unless the buffers have been modified directly. However, as seen in #20024, serialization might corrupt the sparse matrices, which can lead to cryptic bugs. The checks introduced in this PR would probably have made it easier to detect that issue.

Fixes #20024
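
A rough illustration (not code from the PR) of the failure mode being guarded against: if serialization or direct modification corrupts a buffer so that the CSC invariants no longer hold, checks like the ones in the diff below turn a cryptic CHOLMOD failure into an immediate ArgumentError.

```julia
# Hypothetical corrupted buffers (illustration only; the real checks and
# messages are in the diff below):
colptr = [1, 2, 3]   # claims 2 stored entries in a 2-column matrix
rowval = [1]         # ...but only one row index is present
nzval  = [1.0]

# The kind of consistency check this PR adds before calling into CHOLMOD:
if colptr[end] - 1 > length(rowval)
    throw(ArgumentError("row indices are inconsistent with colptr"))
end
```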

# check if columns are sorted
# checks
## length of input
if length(nzval) != length(rowval)

@Sacha0 (Member) commented on Feb 5, 2017:

CSC does not require that length(nzval) == length(rowval), and we do not generally enforce that condition (nor should we)?

@andreasnoack (Member, Author):

I can see why you'd like to allow length(nzval) > colptr[end] but why would you allow length(nzval) != length(rowval)? It also appears that CHOLMOD assumes that these have the same size.

if length(nzval) != length(rowval)
throw(ArgumentError("nzval and rowval must have same length"))
end
if length(colptr) != n + 1

@Sacha0 (Member):

Perhaps length(colptr) >= n + 1?

@andreasnoack (Member, Author):

Why shouldn't length(colptr) == n + 1? Again, that seems to be the assumption of CHOLMOD.

if length(colptr) != n + 1
throw(ArgumentError("length of colptr must be n + 1 = $(n + 1) but was $(length(colptr))"))
end
if colptr[end] != length(rowval)

@Sacha0 (Member) commented on Feb 5, 2017:

CSC does not require that length(rowval) == colptr[n + 1], and we do not generally enforce that condition (nor should we)? Perhaps length(rowval) + 1 >= colptr[n + 1] and likewise for nzval?

@andreasnoack (Member, Author):

I can see why you don't want this so I'll change it. However, I think I might have assumed this all over the CHOLMOD wrappers so there might be other places where the number of non-zeros is queried with length(nzval) which is then wrong.

@@ -898,6 +909,17 @@ function (::Type{Sparse}){Tv<:VTypes}(m::Integer, n::Integer,
end

function (::Type{Sparse}){Tv<:VTypes}(A::SparseMatrixCSC{Tv,SuiteSparse_long}, stype::Integer)
## Check length of input. This should never fail but see #20024
if length(A.nzval) != length(A.rowval)

if length(A.colptr) != size(A,2) + 1
throw(ArgumentError("length of colptr must be size(A,2) + 1 = $(size(A,2) + 1) but was $(length(A.colptr))"))
end
if A.colptr[end] != length(A.rowval) + 1

@Sacha0 (Member) commented on Feb 5, 2017:

Re. discussion in the review comments above, the general argument for the weaker conditions (length(colptr) >= n + 1 rather than length(colptr) == n + 1, length(rowval) >= colptr[n + 1] - 1 rather than length(rowval) == colptr[n + 1] - 1 and likewise for nzval, and no restriction on the relative lengths of rowval and nzval) is better allocation control / predictability / reuse. Examples:

Re. length(rowval) >= colptr[n + 1] - 1 rather than length(rowval) == colptr[n + 1] - 1 (and likewise for nzval), suppose you want to store sparse matrices with different patterns (and particularly stored entry counts) in the same data structure at different points during execution. Under the weaker condition you can do so by allocating enough storage for all expected patterns once at the outset and reusing that storage, whereas under the stronger condition (and concomitant resize!ing) allocation becomes unpredictable. (This issue impacts e.g. sparse broadcast[!].)

Re. length(colptr) >= n + 1 rather than length(colptr) == n + 1, suppose you want to store sparse matrices with different shapes at different points during execution. Under the weaker condition you can do so by allocating enough storage for all expected shapes once at the outset and reusing that storage (constructing only a new SparseMatrixCSC wrapper for each new sparse matrix), whereas under the stronger condition (and concomitant resize!ing) allocation becomes unpredictable. (This issue impacts e.g. the applications discussed in the thread regarding immutability of SparseMatrixCSC.)

Re. no restrictions on the relative lengths of rowval and nzval, suppose you are operating solely on a pattern rather than a complete CSC sparse matrix. Under the weaker condition you can cut your memory footprint by almost a factor of two, whereas under the stronger condition you must carry around a potentially large dead buffer of one form or another. Though your CSC sparse matrix isn't technically valid in that scenario, that scenario is common, for example in the symbolic phases of sparse factorization or in preordering. Alternatively, suppose you want to write a preordering or factorization routine that accepts some buffers and returns a CSC sparse matrix that recycles those buffers as storage; those buffers may need to have different lengths, in which case the weaker condition grants better predictability of and control over allocation downstream. (Issues like these I encountered while working on ApproxMinimumDegree.jl.)
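
To make the three scenarios above concrete, here is a minimal sketch in terms of the raw CSC vectors (the helper `stored_entries` and the sizes are illustrative, not from the PR): under the weaker conditions the stored-entry count is always read from colptr, so oversized rowval/nzval buffers, an oversized colptr, or a missing nzval all work without reallocation.

```julia
# Sketch only: allocate once, reuse the buffers for patterns of different sizes.
n      = 4                       # columns in the current matrix
maxnnz = 32                      # capacity for the largest expected pattern
colptr = ones(Int, n + 1)        # column pointers (1-based), length >= n + 1
rowval = zeros(Int, maxnnz)      # row indices, allocated once
nzval  = zeros(Float64, maxnnz)  # values, allocated once (optional for pattern-only work)

# The stored-entry count comes from colptr, never from the buffer lengths.
stored_entries(colptr, n) = colptr[n + 1] - 1

# Fill in a small pattern (here the 4x4 identity); the trailing buffer space
# simply stays unused and can hold a larger pattern later without reallocating.
for j in 1:n
    colptr[j] = j
    rowval[j] = j
    nzval[j]  = 1.0
end
colptr[n + 1] = n + 1

@assert stored_entries(colptr, n) == 4 <= length(rowval)
```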

Of course if CHOLMOD expects any of the stronger conditions and will complain or fail if they are not satisfied, I agree wholeheartedly that checking them at CHOLMOD entry points is a great idea :).

As you note above some code in Base assumes some subset of the stronger conditions. For reasons such as the above, we might want to wean that code off those assumptions over time, and e.g. remove automatic resize!ing from a variety of operations. Similarly, decoupling SparseVector storage length from stored entry count is another thing we should strongly consider (to mirror CSC's flexibility and control). (I planned to write a little Julep on the above when things settled down a bit, but it's mostly above already :).)

Best!

@andreasnoack (Member, Author):

Thanks for the explanation. Generally, I'm fine with what you suggest but we should then really take a pass through the sparse code to make sure we use the buffers accordingly. E.g. I can see that this method makes the same assumptions as I did whereas the one right above doesn't. I'll update this PR to use your CSC definition.

@Sacha0 (Member) commented on Feb 5, 2017:

> we should then really take a pass through the sparse code to make sure we use the buffers accordingly

Wholeheartedly agreed :). Best!

@kshyatt added the domain:linear algebra and domain:arrays:sparse labels on Feb 5, 2017
@andreasnoack force-pushed the anj/spcheck branch 2 times, most recently from 26e5b8d to a26f482, on February 6, 2017
@andreasnoack (Member, Author):

I've changed the tests following the conclusion above. I've also changed the CHOLMOD code and the nnz definition to follow the rules you described, and I've just confirmed that it fixes #20024.

if length(colptr) <= n
throw(ArgumentError("length of colptr must be at least n + 1 = $(n + 1) but was $(length(colptr))"))
end
if colptr[n + 1] > length(rowval)

@Sacha0 (Member) commented on Feb 6, 2017:

Should these checks be colptr[n + 1] > length(rowval) + 1 and colptr[n + 1] > length(nzval) + 1? If memory serves, colptr[n + 1] should be one greater than the number of stored entries? (Or is colptr indexed from zero here, in which case the existing checks would be correct?)

@andreasnoack (Member, Author):

This constructor is zero based. I'm wondering if it should have a more alarming name for that reason.

@Sacha0 (Member):

> This constructor is zero based. I'm wondering if it should have a more alarming name for that reason.

That sounds great! Similarly, if some instances of colptr are zero based, it might be worth naming them appropriately as well? Best!
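
As a side note for readers, a small illustration (not from the PR) of the two conventions being discussed, for a matrix with two columns and two stored entries:

```julia
# Same 2-column, 2-stored-entry pattern under both conventions:
colptr_onebased  = [1, 2, 3]  # SparseMatrixCSC: stored entries == colptr[n + 1] - 1 == 2
colptr_zerobased = [0, 1, 2]  # CHOLMOD-style:   stored entries == colptr[n + 1]     == 2
```

This is why the zero-based constructor's checks and copy counts can use colptr[n + 1] directly, without a + 1 or - 1 adjustment.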

@@ -869,12 +880,12 @@ function (::Type{Sparse}){Tv<:VTypes}(m::Integer, n::Integer,
end
end

-o = allocate_sparse(m, n, length(nzval), iss, true, stype, Tv)
+o = allocate_sparse(m, n, colptr[n + 1], iss, true, stype, Tv)

@Sacha0 (Member) commented on Feb 6, 2017:

Should the third argument to allocate_sparse be the number of stored entries or one more than the number of stored entries? (Or is colptr indexed from zero here, in which case the existing statement is correct?)

unsafe_copy!(s.i, pointer(rowval), length(rowval))
unsafe_copy!(s.x, pointer(nzval), length(nzval))
unsafe_copy!(s.p, pointer(colptr), n + 1)
unsafe_copy!(s.i, pointer(rowval), colptr[n + 1])

@Sacha0 (Member) commented on Feb 6, 2017:

Should these be colptr[n + 1] - 1, i.e. the number of stored entries? (Or is colptr indexed from zero here, in which case the existing statement is correct?)

@@ -42,7 +42,7 @@ julia> nnz(A)
3
```
"""
-nnz(S::SparseMatrixCSC) = Int(S.colptr[end]-1)
+nnz(S::SparseMatrixCSC) = Int(S.colptr[S.n + 1]-1)

@tkelman (Contributor) commented on Feb 7, 2017:

the distinction here could use a more direct test that doesn't go through cholmod-specific methods

@andreasnoack (Member, Author):

I've added some tests.
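
For context, the distinction tkelman points to only matters when the colptr buffer is longer than n + 1; a rough illustration (not the actual tests added in the PR):

```julia
# With an oversized colptr buffer, colptr[end] is not the pointer past the
# last column, so the count must be read at index n + 1:
n      = 2
colptr = [1, 2, 3, 0, 0]   # only the first n + 1 = 3 entries are meaningful
colptr[end]   - 1          # -1: wrong stored-entry count (old definition, S.colptr[end] - 1)
colptr[n + 1] - 1          #  2: correct stored-entry count (new definition, S.colptr[S.n + 1] - 1)
```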

Commit 769d37b:
Don't use length of underlying arrays to query sizes
Rename variables in zero based constructors
@andreasnoack merged commit 769d37b into master on Feb 14, 2017
@andreasnoack deleted the anj/spcheck branch on February 14, 2017
tkelman pushed a commit that referenced this pull request (#20464) on Mar 1, 2017:

Don't use length of underlying arrays to query sizes
Rename variables in zero based constructors
(cherry picked from commit 769d37b)

testsets removed for release-0.5, and Vector{<:VTypes} written out