Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CHOLMOD makeover #10117

Merged
merged 10 commits into from
Feb 11, 2015
Merged

CHOLMOD makeover #10117

merged 10 commits into from
Feb 11, 2015

Conversation

andreasnoack
Copy link
Member

This was initiated as an attempt to fix #3295 and with the changes in this pr makes it possible to write

(99340,99340)

julia> nnz(A)
954163

julia> b = ones(size(A, 1));

julia> @time A\b;
elapsed time: 1.068284149 seconds (13 MB allocated)

which is comparable to MATLAB on my computer. MATLAB is a little faster, but their versions of the sparse solvers could use a different optimization library for the symbolic part and MATLAB also uses MKL.

To make this work, I had to wrap cholmod_symmetry which is an effective function for checking symmetry/Hermitianity of a matrix. The method had a tiny bug, which Tim Davis was kind enough to fix promptly, so I'm also bumping the SuiteSparse version.

However, it ended up with quite a bit more:

  • Separate LDLt and LLt factorizations into cholfact and ldltfact.
  • Remove support for factorizing AAt when passing non-symmetric/Hermitian matrix
  • Let CHOLMOD handle memory allocation instead of Julia. Many CHOLMOD functions make changes to the allocated memory, so in the end it wasn't feasible to allocate the memory as Julia vectors.
  • Test coverage of CHOLMOD should now be almost 100 pct.
  • Renaming: Cholmod has been removed from most names as it is redundant in a module named CHOLMOD
  • Fix the method for reading to matrix market files to work for complex and all integer types. We might want to use this for an exported method to ease the import of MM files.
  • Use Symmetric{Tv,SparseMatrixCSC{Tv,Ti}} and Hermitian{Tv,SparseMatrixCSC{Tv,Ti}} for symmetric/Hermitian sparse matrices. This makes it possible to work with symmetric/Hermitian sparse matrices where only one triangle is stored as is the case for MM files.
  • Triplet type and methods have been removed as it they were unexported, untested, and the functionality is covered in native Julia
  • Add show(Factor) method that takes an io argument
  • Remove other untested and unused methods

I'd like to hear what @dmbates has to say on these changes. I've might broken some functionality used down the stream (beyond the obvious from the name changes). Also cc: @ViralBShah, @tkelman, @acroy

UPDATE: I forgot a couple of changes which are now added to the list

@andreasnoack andreasnoack force-pushed the anj/cholmod branch 2 times, most recently from cc21a72 to 994aa0a Compare February 8, 2015 00:53

### A way of examining some of the fields in chm_com
for Ti in IndexTypes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is IndexTypes defined?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cholmod_h.jl

@tkelman
Copy link
Contributor

tkelman commented Feb 8, 2015

Generally in favor so far, but a few things.

First, can you separate the library version bump from the major revamp of the Julia code? Presumably it's a backwards compatible release of SuiteSparse so shouldn't break anything otherwise. If you are relying on behavior that has only been fixed in that latest version, we'll need to get that version built into the Windows nightlies, and packaged in the Homebrew tap and the Ubuntu PPA for this to work on CI.

And this pull request is unmergeable, it apparently needs rebasing.

@andreasnoack
Copy link
Member Author

Yes. I'll separate out the version bump. I've inserted branches for the versions where I rely on the fix, so the code works with order versions of SuiteSparse, but is less efficient.

I'll rebase when SuiteSparse has been bumped. However, don't know why the tests fail on Travis, as they pass locally, but I'm looking into it.

@ViralBShah
Copy link
Member

This is fantastic. Much needed work!

I see there are modifications to chm_print as well. If you are not already doing this, could you lower the default verbosity level of printing of Cholmod objects? It shows far too much information right now, which doesn't mean much to people not in the sparse solver implementation business.

@ViralBShah
Copy link
Member

Also, the commits are nicely separated - so would love to keep this commit history.

@ViralBShah ViralBShah added the sparse Sparse arrays label Feb 8, 2015
@ViralBShah
Copy link
Member

Can't quite tell in the large patch set here - but we should also check for the correct cholmod version for the bug in cholmod_symmetry, since distros may probably distribute older versions.

@ViralBShah
Copy link
Member

Do you think we can get rid of SuiteSparse_wrapper.c?

@isok ccall((:cholmod_l_start, :libcholmod), Cint, (Ptr{UInt8},), chm_l_com)
end
chm_l_com
ccall((:jl_cholmod_version, :libsuitesparse_wrapper), Cint, (Ptr{Cint},), version_array)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if dlsym does not find the cholmod_version symbol, it is safe to assume that the version of CHOLMOD is too old for us. This lets us get rid of jl_cholmod_version from the wrapper too.

@ViralBShah
Copy link
Member

These tests are failing for me:

    JULIA test/sparse
    From worker 4:       * sparse/umfpack       in   6.22 seconds
exception on 3: ERROR: LoadError: ArgumentError: sparse matrix is not symmetric/Hermitian
 in cholfact at sparse/cholmod.jl:942
 in anonymous at no file:317
 in runtests at /Users/viral/julia/test/testdefs.jl:78
 in anonymous at multi.jl:837
 in run_work_thunk at multi.jl:588
 in anonymous at task.jl:837
while loading sparse/cholmod.jl, in expression starting on line 284
exception on 2: ERROR: LoadError: ArgumentError: sparse matrix is not symmetric/Hermitian
 in cholfact at sparse/cholmod.jl:942
 in cholfact at sparse/cholmod.jl:985
 in runtests at /Users/viral/julia/test/testdefs.jl:78
 in anonymous at multi.jl:837
 in run_work_thunk at multi.jl:588
 in anonymous at task.jl:837
while loading sparse/sparse.jl, in expression starting on line 674
ERROR: LoadError: LoadError: ArgumentError: sparse matrix is not symmetric/Hermitian
 in anonymous at task.jl:1379
while loading sparse/sparse.jl, in expression starting on line 674
while loading /Users/viral/julia/test/runtests.jl, in expression starting on line 49

    From worker 2:       * sparse/sparse        From worker 3:       * sparse/cholmod      make[1]: *** [sparse] Error 1
make: *** [test-sparse] Error 2

@acroy
Copy link
Contributor

acroy commented Feb 8, 2015

Very nice! It's a lot of changes but it looks much more consistent this way.

@ViralBShah : we can't get rid of the wrapper yet, since we still need jl_cholmod_sizeof_long (In cholmod_h.jl now).

@ViralBShah
Copy link
Member

Since we pick the sizeof_long as part of SuiteSparse config, we can write it during build time.

@acroy
Copy link
Contributor

acroy commented Feb 8, 2015

The problem is how to parse SuitSparse's config.h. The size of Suitsparse_long is picked depending on the platform (via #ifdefs) or a user defined macro. We would have to know how the library was build to be sure of the size of Suitsparse_long.

@ViralBShah
Copy link
Member

We know it because we pick it in deps/Makefile:

ifeq ($(USE_BLAS64), 1)
UMFPACK_CONFIG = -DLONGBLAS='long long'
CHOLMOD_CONFIG = -DLONGBLAS='long long'
SPQR_CONFIG = -DLONGBLAS='long long'
ifeq ($(USE_SYSTEM_BLAS), 0)
ifeq ($(OPENBLAS_SYMBOLSUFFIX), 64_)
UMFPACK_CONFIG += -DSUN64
CHOLMOD_CONFIG += -DSUN64
SPQR_CONFIG += -DSUN64
endif
endif
endif

@tkelman
Copy link
Contributor

tkelman commented Feb 8, 2015

@ViralBShah LONGBLAS is not exactly the same thing as SuiteSparse_long. The former is used inside SuiteSparse when calling blas, the latter is used for SuiteSparse long API's. cholmod_blas.h defines BLAS_INT to SuiteSparse_long when LONGBLAS is defined, but distributions might be compiling suitesparse quite differently to the way we do.

@andreasnoack
Copy link
Member Author

@ViralBShah I changed the printing level as you suggest. This is also to suppress warnings when the matrix is not positive definite. In the end I decided to write a new show method that extracts the same information as cholmod_print_factor because this makes it possible to direct the printed output whereas the CHOLMOD print functions just print to STDOUT (I presume).

I'm checking the version in issym and ishermitian, (see the end of the file) but maybe we should add a check to symmetry.

Regarding SuiteSparse_wrapper.c, a problem with removing the file is to have a way to get the offsets into the cholmod_common_struct which is a quite large struct that has structs as fields and therefore is probably difficult to write a matching type for. Maybe we could ask if Tim Davis would consider to add some get and set functions to the library.

The errors you are getting are the same as the errors on Travis, but I dont' see them locally. I'll try julia.mit.edu and see if I can get them there.

@andreasnoack
Copy link
Member Author

Actually, I think that we are only using fields of cholmod_common_struct that are already part of our Common type so I don't think we need the offsets from SuiteSparse_wrapper.c`

@andreasnoack
Copy link
Member Author

Disregard the last comment. We'll need the offsets because we need to extract fields that are far down the struct. The Common type should probably just be removed as it is not really used for anything right now.

@ViralBShah
Copy link
Member

It's probably not worth the effort to ask Tim. I guess we can just retain the file for now.

@andreasnoack
Copy link
Member Author

@ViralBShah I don't see the test failures on julia.mit.edu either. Annoying when failures are not locally reproducible. Which system are you using when you get the failures?

@ViralBShah
Copy link
Member

@tkelman I was wondering the same after I posted my comment. I vaguely recollect that one can still configure SuiteSparse_Long. It may not matter if we are retaining the wrapper. However, from SuiteSparse_config.h

 * The long integer can also be defined at compile time.  For example, this
 * could be added to SuiteSparse_config.mk:
 *
 * CFLAGS = -O -D'SuiteSparse_long=long long' \
 *  -D'SuiteSparse_long_max=9223372036854775801' -D'SuiteSparse_long_idd="lld"'
 *
 * This file defines SuiteSparse_long as either long (on all but _WIN64) or
 * __int64 on Windows 64.  The intent is that a SuiteSparse_long is always a
 * 64-bit integer in a 64-bit code.  ptrdiff_t might be a better choice than
 * long; it is always the same size as a pointer.
 *

@ViralBShah
Copy link
Member

@andreasnoack I am using OS X. Are you running on a clean copy?

@andreasnoack
Copy link
Member Author

No. I also just tried to build on the server again without cleanall right after changing from master and it works. Could you try and see if a make cleanall can fix it on your machine?

@ViralBShah
Copy link
Member

I am trying that - but are you sure you have committed everything?

@ViralBShah
Copy link
Member

Huh - after a make cleanall, the tests are passing. Perhaps the wrong suitesparse was getting picked up?

@ViralBShah
Copy link
Member

Travis probably also has the old suitesparse - if it is indeed the cholmod_symmetry bug.

@andreasnoack
Copy link
Member Author

I've added some lines to the list of changes. In particular I'd forgotten to add the separation between the LLt and LDLt factorization such you now get the former from cholfact and the latter from ldltfact. Furthermore, a runtime error is thrown if the matrix is not symmetric/Hermitian and it is this error we see in the Travis output.

Talking about Travis, I made it print the version of CHOLMOD and to my surprise it was 2.1.1 which is almost two years old. I thought that Travis used the same versions of the dependencies as those specified in Versions.make. @tkelman and @staticfloat is this expected?

r2 = int(unsafe_load(s.p, i1 + 1))
(r1 > r2) && return zero(T)
r1 = int(searchsortedfirst(pointer_to_array(s.i, (s.nzmax,), false), i0 - 1, r1, r2, Base.Order.Forward))
((r1 > r2) || (unsafe_load(s.i, r1) + 1 != i0)) ? zero(T) : unsafe_load(s.x, r1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that none of these branches are tested as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are probably right. I can probably reach all of them by testing the show method on Sparse.

@andreasnoack
Copy link
Member Author

@jakebolewski I have made an iteration more over the test coverage and I've also expanded most of the branches to separate lines to make it easier to see the coverage as you suggested. The print methods from CHOLMOD cannot be tested without printing to the screen, so I have left them out deliberately.

I've been rebasing locally to add changes to the old commits to try to keep some of the changes separated. At the same time there was a conflict with master and I think the combination made git do things I didn't expect when trying to rebase origin/master. Almost all of my changed got reverted and the only solution I could come up with was to merge master into my branch. Any git advice on this issue is appreciated.

Split cholfact and ldltfact functionality

Add cholmod_symmetry wrapper. Add issym and ishermitian methods.

SuiteSparse version bumped to fix bug in cholmod_symmetry

Name cleanup:
 - cholmod and CHOLMOD avoided within module
 - use same names as in the C library for the wrapper

Use same argument order as the C library

Argument types in ccall checked

Remove scale methods for Sparse

Reorganize tests
andreasnoack added a commit that referenced this pull request Feb 11, 2015
@andreasnoack andreasnoack merged commit a387920 into master Feb 11, 2015
@andreasnoack andreasnoack deleted the anj/cholmod branch February 11, 2015 21:42
@ViralBShah
Copy link
Member

This is a great improvement. Thanks @andreasnoack

@ViralBShah
Copy link
Member

I believe SuiteSparse needs to be bumped, now that this is merged.

@andreasnoack
Copy link
Member Author

You're welcome. I think we should wait to next bug fix release. I've added version checks around the new features so they work, although slower, with out present version. The reason we should wait is that Tim Davis forgot to bump the version number in the header file so it will appear as the old version in our wrappers. He said he'd collect a few more bug fixes before next release so we should just wait for that.

@dmbates
Copy link
Member

dmbates commented Feb 13, 2015

I just noticed now that I have been working with this version for a couple of days that there are no cholfact! methods for updating an existing decomposition. To me at least these are very important, not only for reusing the storage but also because updating a decomposition is much faster than forming the decomposition from scratch. The former only requires the numeric phase whereas the latter requires both the symbolic and numeric phase.

Was this intentional or an oversight? If it is an oversight then I will create a pull request to add methods for it. (Well, I will try to create a pull request. Most of the time things fall apart badly when I try to do so and I need to remove all traces of julia from my computer and start again from scratch.)

@dmbates
Copy link
Member

dmbates commented Feb 13, 2015

It looks as if it would be possible to simply copy the cholfact! method definitions in cholmod.jl from the release version to the development version, changing CholmodSparse to Sparse and CholmodFactor to Factor.

@andreasnoack
Copy link
Member Author

Because I had split the LDLt from the LL', I decided to introduce update! instead of cholfact! because I assumed that you wouldn't switch between the two versions when updating a factorization. However, I didn't export it which it probably should have been.

Could you check and see if update! works for you?

@dmbates
Copy link
Member

dmbates commented Feb 13, 2015

Thanks. I didn't see the update! method (although now that you mention it I believe that you documented it). The name makes more sense than cholfact! for the general case, as you suggested.

Sorry for the false alarm.

@staticfloat
Copy link
Member

Small note to those interested: SuiteSparse 4.4.3 is now the default on Travis and Ubuntu.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sparse Sparse arrays
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sparse \ polyalgorithm
7 participants