initial svds support based on eigs #9425

jaak-s · 2014-12-20T17:20:44Z

Added support for sparse truncated SVD based on eigs. Keyword inputs are the same as for eigs. Returns left singular vectors, singular values, right singular vectors (and also outputs from eigs). Added a test case.

StefanKarpinski · 2014-12-20T17:22:53Z

My sense is that truncated SVD should be an option to the svd function and the fact that the svds sparse SVD function was used for this is a weird historical design artifact in Matlab that we should not copy.

StefanKarpinski · 2014-12-20T17:33:52Z

In fact, I'm not even sure a separate svds function should exist.

jaak-s · 2014-12-20T17:57:36Z

Sure, I've no preference on the structuring/naming.

Just was currently missing the ability to run Arnoldi (ARPACK) methods to extract first singular vectors of large matrices that have sparse or special structure.

ViralBShah · 2014-12-21T03:04:24Z

We do need a different name though since svd on sparse matrices will do the full svd when we have it. We could have a named argument, but the interface adds too many options for my liking to combine it all in svd.

ViralBShah · 2014-12-21T03:05:50Z

@jaak-s could you add the documentation as well?

ViralBShah · 2014-12-21T03:06:18Z

Cc @Jutho

tkelman · 2014-12-21T06:24:12Z

Also remove the trailing whitespace from the comment in the tests

ViralBShah · 2014-12-21T06:51:06Z

Should the trailing whitespace be discussed in CONTRIBUTING.md along with configuration options for emacs/vim and in git?

For example, I now have this in my .emacs thanks to some help in one of the issues:

;; Delete trailing whitespace
(add-hook 'before-save-hook 'delete-trailing-whitespace)

tkelman · 2014-12-21T06:59:42Z

We should have this conversation in a separate issue, but yes I think we should move it to several places, and not do the check in Travis. Code style nitpicking should not prevent running the build and tests. I think we should pick an arbitrary throwaway CI competitor to Travis, like Drone or CircleCI or Shippable, and do the whitespace check there as a separate status indicator. Or put together something custom on Heroku or whatever, but using an existing service seems easier.

ViralBShah · 2014-12-21T07:07:26Z

BTW, the bikeshedding should not block this PR. Let's get the functionality in along with tests and docs. The naming issue applies to eigs also, and is a longstanding issue.

Jutho · 2014-12-21T09:16:29Z

Looks good to me. The tests are only ok for input matrices with real numbers (which is the case for the input matrix A), otherwise it should be sign(...) \ ... or conj(sign(...)) * ...

andreasnoack · 2014-12-21T13:20:45Z

base/linalg/arnoldi.jl

+## svds
+
+type SvdX <: AbstractArray{Float64, 2}
+    X


I think we'd like to parametrize this type on the input element and matrix type (and use capital letters), i.e. something like

type SVDOperator{T,S} data::S end

I addition, we'll have to support more than Float64 and the type should reflect that. The type can probably not be subtype of AbstractMatrix either, as the input might not be a subtype of AbstractMatrix.

ViralBShah · 2014-12-21T13:27:42Z

@andreasnoack Allocation-free version of eigs is discussed in #7907.

ViralBShah · 2014-12-21T13:29:02Z

Octave takes the approach of using [0 A; A' 0], and there is then some cleanup afterwards, IIRC. I am happy to look it up and describe the method here so that someone else can do a cleanroom implementation - unless you already know what needs to be done.

andreasnoack · 2014-12-21T14:07:55Z

@ViralBShah I'm not sure about how to handle the cleanup in general. The [0 A; A' 0] operator is singular when m!=n so some zeros would have to be removed when looking for the small singular values.

jaak-s · 2014-12-21T16:28:07Z

Pushed few commits: manual for svds and improved handling of ritzvec parameter.

The comments from @andreasnoack are spot on.

Regarding type information I don't have a good solution yet because the SVDOperator is only an internal type and not visible to users. It seems some additional tricks are needed. Currently, svds(x) uses duck typing, i.e. x only needs to implement functions size and *. What about adding extra parameter to svds specifying whether Float64 or Complex64 is used for eigs?

I have not checked how to do post-processing for [0 A; A' 0]. Any pointers? If it easy I can do it. If it is more complex then we can use the current version as a place holder until the other is ready.

andreasnoack · 2014-12-22T21:14:50Z

Currently, svds(x) uses duck typing, i.e. x only needs to implement functions size and *. What about adding extra parameter to svds specifying whether Float64 or Complex64 is used for eigs?

I don't think it is a limitation to require that the operator type has a parameter for the element type. The linear operators in Base (the matrices) have and I think it would feel natural to have an element type for user defined linear operators as well. If we assume that then we already have the element type in the operator type. We'd also like to have that in order to have fast dispatch for * and \ because sometimes it is necessary with many iterations.

I have not checked how to do post-processing for [0 A; A' 0]. Any pointers?

Not sure, but maybe it is discussed somewhere in the ARPACK manual, but that is not the most joyful read. Let's browse the literature and see what we can find.

ViralBShah · 2014-12-23T05:50:34Z

I don't think the ARPACK manual sheds light on that, and even if it does, it will be impossible to find it. IIRC, the only place where I see discussion of svd is in the examples, and they do A'*A there, but it is only a demonstration, not a recommended way.

ViralBShah · 2014-12-23T06:05:00Z

Here is what octave does. I suggest that @jaak-s not look at the code, since it is GPL, and I am looking at it and describing what it does. For others look for svds.m in the octave source.

Octave calls eigs on [0 A; A' 0]. The comment in the codebase says:

    ## We wish to exclude all eigenvalues that are less than zero as these
    ## are artifacts of the way the matrix passed to eigs is formed. There
    ## is also the possibility that the value of sigma chosen is exactly
    ## a singular value, and in that case we're dead!! So have to rely on
    ## the warning from eigs. We exclude the singular values which are
    ## less than or equal to zero to within some tolerance scaled by the
    ## norm since if we don't we might end up with too many singular
    ## values.

The norm that they use is the largest singular value (if the sigma input variable is a character) or normest (uses a power series analysis to estimate the norm). Basically, they return all singular values where abs(s) <= tol.

Other things they do are scaling the input by the largest value (1-norm) to make things a bit more stable. If sigma is specified, that also needs scaling. If sigma is 0, then they compute twice as many eigenvalues. Quote:

      ## Find the smallest eigenvalues
      ## The eigenvalues returns by eigs for sigma=0 are symmetric about 0.
      ## As we are only interested in the positive eigenvalues, we have to
      ## double k and then throw out the k negative eigenvalues.
      ## Separately, if sigma is non-zero, but smaller than the smallest
      ## singular value, ARPACK may not return k eigenvalues. However, as
      ## computation scales with k we'd like to avoid doubling k for all
      ## scalar values of sigma.

ViralBShah · 2014-12-23T06:17:50Z

@alanedelman What are your thoughts here on doing A'*A vs. [0 A; A' 0]?

jaak-s · 2014-12-23T12:09:51Z

I will soon add type information to SVDOperator. Users with x::AbstractArray only need to do svds(x). For custom types (duck typing) users probably have to wrap it themselves by SVDOperator.

@ViralBShah Thank you for information. It is now quite clear how to extract largest singular values from [0 A; A' 0]. For the smallest it is not so clear yet for me. Also is there information how to extract singular vectors (left or right) from that solution?

ViralBShah · 2014-12-23T15:29:32Z

@jaak-s The singular vectors computed just as @Jutho described.

m, n = size(A)
V, s = eigs(SVDOperator(A), ....)
ind = <list of indices into s, after filtering the singular values>
u  = sqrt(2) * V[1:m, ind]
v = sqrt(2) * V[m+1:end, ind]

ViralBShah · 2015-01-01T20:03:28Z

Bump @jaak-s

jaak-s · 2015-01-01T22:28:02Z

Thanks for the bump, got busy during the holidays. I'll try to put patches together in the next couple of days.

I tried out the [0 A; A' 0] approach you and @Jutho outlined and it seems to work well.

jaak-s · 2015-01-03T01:19:07Z

Now svds uses [0 A; A' 0] approach, saves type information in SVDOperator and works also with complex operators. The doc has not been updated. Plan is to finalize the docs after you all are fine with the patch.

Main changes:

using approach by @ViralBShah to extract singular vectors
changed input parameter nev to nsv (number of singular values)
svds only returns largest singular values. The option for the smallest singular values is currently removed because [0 A; A' 0] creates extra small singular values and post-processing is not so clear. If someone knows the exact steps let me know, if not we can leave it for future ToDo.
When using duck typed X in svds(X) following methods are required: size(X), eltype(X), X * vector, X' * vector.

ViralBShah · 2015-01-03T03:30:02Z

base/linalg/arnoldi.jl

+end
+
+function svds{S}(X::S; nsv::Int = 6, ritzvec::Bool = true, tol::Float64 = 0.0, maxiter::Int = 1000)
+  otype = getOperatorType(X)


I think we can just simplify this to use just eltype(X) and avoid getOperatorType() altogether. The checks for whether eltype() is BlasFloat should ideally be in eigs. It should be possible to make SVDOperator free of types then.

ViralBShah · 2015-01-03T03:31:13Z

For good performance, we should be using A_mul_B!, but that requires some work on eigs. For now, nothing much can be done here.

tkelman · 2015-01-03T03:35:00Z

base/linalg/arnoldi.jl

+  ex    = eigs(SVDOperator{otype,S}(X), I; ritzvec = ritzvec, nev = 2*nsv, tol = tol, maxiter = maxiter)
+  ind   = [1:2:nsv*2]
+  sval  = abs(ex[1][ind])
+


travis doesn't like the trailing whitespace here or on line 151

ViralBShah · 2015-01-03T03:39:35Z

We also need to filter out the singular values that are zero or close to zero.

jaak-s · 2015-01-03T12:38:50Z

The reason for introducing getOperatorType was to support (sparse) binary and integer matrices. Their eltype is bool/integer but they can be multiplied by float vectors. Basically changing the operator type to Float64 for integer matrix allows eigs to work on it. Note that svd works with integer matrices and I found it is useful to have out-of-the-box support for them in svds too (also because in my work I use svds on binary matrices).

If you wish I can remove getOperatorType.

White-space should be now fixed.

jaak-s · 2015-01-03T12:44:57Z

Regarding removing close to zero singular values, I would prefer the current behavior. I think users like to get N values when they set nsv=N, even if last of them are 0. This is also the behavior of svds in Octave and in Scipy (scipy.sparse.linalg.svds).

ViralBShah · 2015-01-03T12:56:30Z

I agree the integer matrix support is good to have. I suggest that instead of getOperatorType in svds, we should fix this for eigs. To support integer matrices, it is better to convert them to float in eigs, since the matrix-vector products are significantly faster (due to use of BLAS in the dense case). Would be great if you can add it to eigs here, or I can do it later.

jaak-s · 2015-01-03T13:07:26Z

Sounds good, I will replace getOperatorType with eltype then. Since I am not very familiar with code of eigs I'd prefer to leave it to you.

ViralBShah · 2015-01-03T13:14:46Z

About the filtering out of singular values that are 0 - octave does do it. But, we can do it once we get this basic piece of code checked in - once you remove getOperatorType.

jaak-s · 2015-01-03T13:44:50Z

getOperatorType is now removed, also added check for nsv.

I wonder what singular values does Octave remove because they return 0-valued singular values (version 3.8.1):

octave:1> svds( zeros(4,3), 2 )
ans =

   0
   0

ViralBShah · 2015-01-11T09:53:22Z

This was in the octave source that I quoted above. Of course there are legitimate zero singular values. Reading this carefully, they are doing a filter on negative values, and perhaps something in the case where sigma is specified. I will look deeper.

    ## We wish to exclude all eigenvalues that are less than zero as these
    ## are artifacts of the way the matrix passed to eigs is formed. There
    ## is also the possibility that the value of sigma chosen is exactly
    ## a singular value, and in that case we're dead!! So have to rely on
    ## the warning from eigs. We exclude the singular values which are
    ## less than or equal to zero to within some tolerance scaled by the
    ## norm since if we don't we might end up with too many singular
    ## values.

ViralBShah · 2015-01-11T09:54:11Z

Travis failure on linux appears unrelated.

initial svds support based on eigs

ViralBShah · 2015-01-11T09:54:49Z

I think this is a great start with svds and we will refine it as we go along.

ViralBShah · 2015-01-11T10:19:00Z

Added to NEWS in 47061a5

jaak-s · 2015-01-11T11:41:02Z

Thank you.

I forgot to update the docs earlier but now pushed a commit that reflects the latest version of svds. Can you merge it again or should I create a new pull request?

ViralBShah · 2015-01-11T11:54:36Z

It looks like your last commit here is still 8 days ago. Could you create a new PR?

Also, we should have squashed these commits, but it escaped my mind before merging.

jaak-s · 2015-01-11T12:02:49Z

Created PR #9723 for docs.

Ai run arnoldy test nao kthx

initial svds support based on eigs

9ea6b51

jaak-s mentioned this pull request Dec 20, 2014

Feature request: sparse truncated SVD JuliaLang/LinearAlgebra.jl#106

Closed

ViralBShah added arrays [a, r, r, a, y, s] linear algebra Linear algebra and removed arrays [a, r, r, a, y, s] labels Dec 21, 2014

removed extra spaces, added ritzvec param to svds

8b061df

andreasnoack reviewed Dec 21, 2014
View reviewed changes

jaak-s added 2 commits December 21, 2014 15:55

manual for svds

d654afa

renamed to SVDOperator

6e91217

jaak-s added 2 commits January 2, 2015 19:58

svds now uses eigs of [0 A; A' 0]

95e2d6c

support for Complex operators

b1517a6

ViralBShah reviewed Jan 3, 2015
View reviewed changes

tkelman reviewed Jan 3, 2015
View reviewed changes

removed trailing whitespace

be1f1bb

jaak-s added 2 commits January 3, 2015 14:19

replaced getOperatorType to eltype

f8edd41

added check for nsv

b9d7a30

ViralBShah added a commit that referenced this pull request Jan 11, 2015

Merge pull request #9425 from jaak-s/svds

2acaf34

initial svds support based on eigs

ViralBShah merged commit 2acaf34 into JuliaLang:master Jan 11, 2015

tkelman referenced this pull request Feb 26, 2015

orly y u noe run arnoldy test

0e70a4b

Ai run arnoldy test nao kthx

andreasnoack mentioned this pull request May 5, 2017

Use A'A instead of [0 A;A' 0] in svds. #21701

Merged

initial svds support based on eigs #9425

initial svds support based on eigs #9425

Conversation

jaak-s commented Dec 20, 2014

StefanKarpinski commented Dec 20, 2014

StefanKarpinski commented Dec 20, 2014

jaak-s commented Dec 20, 2014

ViralBShah commented Dec 21, 2014

ViralBShah commented Dec 21, 2014

ViralBShah commented Dec 21, 2014

tkelman commented Dec 21, 2014

ViralBShah commented Dec 21, 2014

tkelman commented Dec 21, 2014

ViralBShah commented Dec 21, 2014

Jutho commented Dec 21, 2014

andreasnoack Dec 21, 2014

Choose a reason for hiding this comment

andreasnoack Dec 21, 2014

Choose a reason for hiding this comment

ViralBShah commented Dec 21, 2014

ViralBShah commented Dec 21, 2014

andreasnoack commented Dec 21, 2014

jaak-s commented Dec 21, 2014

andreasnoack commented Dec 22, 2014

ViralBShah commented Dec 23, 2014

ViralBShah commented Dec 23, 2014

ViralBShah commented Dec 23, 2014

jaak-s commented Dec 23, 2014

ViralBShah commented Dec 23, 2014

ViralBShah commented Jan 1, 2015

jaak-s commented Jan 1, 2015

jaak-s commented Jan 3, 2015

ViralBShah Jan 3, 2015

Choose a reason for hiding this comment

ViralBShah commented Jan 3, 2015

tkelman Jan 3, 2015

Choose a reason for hiding this comment

ViralBShah commented Jan 3, 2015

jaak-s commented Jan 3, 2015

jaak-s commented Jan 3, 2015

ViralBShah commented Jan 3, 2015

jaak-s commented Jan 3, 2015

ViralBShah commented Jan 3, 2015

jaak-s commented Jan 3, 2015

ViralBShah commented Jan 11, 2015

ViralBShah commented Jan 11, 2015

ViralBShah commented Jan 11, 2015

ViralBShah commented Jan 11, 2015

jaak-s commented Jan 11, 2015

ViralBShah commented Jan 11, 2015

jaak-s commented Jan 11, 2015