
WIP: Linear Algebra bikeshedding #2212

Closed · wants to merge 13 commits

Conversation

@ViralBShah (Member)

Basically includes:

  1. Deprecate lud, chold, cholpd, qrd, qrpd
  2. lu, chol, cholpivot, qr, qrpivot now return Factorization objects
  3. documentation

[edit: The suitesparse stuff is now cherry-picked on the db/suitesparse branch.]

ViralBShah and others added 10 commits, February 3, 2013 20:23:

- Relevant comments are in #2062.
- Use lud and lud! methods for the decomposition.
- Allocate umf_conf and umf_info and use them throughout.
- Preserve the pointers to the symbolic and numeric factorizations in
  the UmfpackLU object; also keep around the decremented
  index/pointer vectors.

Conflicts:
	base/deprecated.jl
	test/Makefile
@StefanKarpinski (Member)

Oh, never mind. It's in a pull request. Carry on.

@nolta (Member) commented Feb 7, 2013

Here's a link to the previous chol vs chold argument: 826c42f#commitcomment-1816094

@carlobaldassi (Member)

What is the rationale for removing the matlab-compatible versions? (Alternatively: was this discussed further?) I'm quite happy with the solution found at the end of the thread linked by @nolta, i.e. the current situation.

@ViralBShah (Member, Author)

What I do not like about the current situation is that we have all these great names that Matlab uses (lu, qr, etc.), but they actually force you into computing in inefficient ways. I am proposing that we have defaults that are good and do the right thing in Julia: all the linear algebra routines return Factorization objects, which allow for efficient solving of systems and the like, and you use factors() to get the underlying factors in a Matlab-compatible way.

@ViralBShah (Member, Author)

With this pull request, what you get is this, which is often all you really want to do with these factorizations:

lufact = lu(A)
x = lufact \ b 

But one can always get the factors and do other things:

l,u,p = factors(lufact) # Matlab-style - just the extra `factors`
x = u \ l \ b[p]  # Matlab-style solve that is not efficient, but works

@nolta (Member) commented Feb 7, 2013

So we're just going to have the same argument again?

@ViralBShah (Member, Author)

In general, we have recently, as a community, been making decisions that favour sane behaviour over Matlab compatibility. I think that this is a case where being Matlab-compatible actually hurts, especially since we do not have the varargout that Matlab uses to make its APIs nicer.

@StefanKarpinski (Member)

Having the Matlab-compatible versions and the *d and *d! versions is an awful lot of multiplicity here. Inflating the number of functions in the language by some non-negligible fraction is a lot of bending over backwards to maintain the Matlab-compatible and inferior interface.

@nolta (Member) commented Feb 7, 2013

I guess inferior's in the eye of the beholder. I use chol all the time for Monte Carlo simulations, and factors(chol(A)) feels incredibly inferior.

@ViralBShah (Member, Author)

What do you do with the result of chol in your MC simulations?

@toivoh (Contributor) commented Feb 7, 2013

Since chol is a bit special (only one factor), perhaps it could deserve a shortcut like cholfactor to extract the factor?
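
For concreteness, a minimal sketch of that shortcut under this PR's API, assuming factors on a Cholesky factorization returns just the single triangular factor:

# Sketch only: pull the triangular factor out of the factorization object.
cholfactor(A) = factors(chol(A))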

@nolta (Member) commented Feb 7, 2013

Sample from a multivariate Gaussian distribution.

@ViralBShah (Member, Author)

I like the idea of having cholfactor as a shortcut in the case of chol. @nolta would that work?

@ViralBShah (Member, Author)

We can also overload a few more methods for CholeskyDense objects to allow for the kinds of things one might want to do.

@StefanKarpinski (Member)

Needing cholfactor might be a sign that this change is not ideal. Perhaps we should give this particular change some more time? Can Gaussian sampling be done as easily, and/or more efficiently, with the factorization object?

@carlobaldassi (Member)

For the record, what I use most often is sum(log(diag(chol(x)))), i.e. the log of the square root of the determinant. Passing through det of the factor is not ok, since a lot of precision is lost that way (plus it's less efficient). This is already a long chain, and putting factors in there is not the end of the world, but it is not particularly nice either. The only reasonable addition I can think of to the methods for CholeskyDense in this context would be a logdet function. But I guess I'll end up writing my own shortcut anyway.

@andreasnoack (Member)

There is already rand(MultivariateNormal(mean, var)) in the Distributions package, which could be used for multivariate normal sampling, but that is of course a long name. Two possibilities could be a randn(cov, dim) method and some *(Matrix, CholeskyDense) methods.

@carlobaldassi I think it would be a good idea to have a logdet.
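
On the sampling point: a minimal sketch of what Cholesky-based sampling needs, assuming the 0.1-era API where chol(Sigma) returns the upper triangular factor U with Sigma == U'U (mvn_sample is a hypothetical name, not an existing function):

# Sketch: draw one sample x ~ N(mu, Sigma) using the Cholesky factor.
function mvn_sample(mu::Vector{Float64}, Sigma::Matrix{Float64})
    U = chol(Sigma)              # upper factor, Sigma == U'U
    mu + U' * randn(length(mu))  # U' plays the role of L in Sigma == L*L'
end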

@mlubin (Member) commented Feb 7, 2013

I support getting rid of the Matlab compatibility. Factorization objects are a much saner default.

@dmbates (Member) commented Feb 7, 2013

@nolta As @andreasnoackjensen pointed out, sampling from a multivariate normal can be done with rand(MultivariateNormal(...)) from the Distributions package.

julia> using Distributions

julia> mvn = MultivariateNormal([1.,2,3], diagm([4.,5,6]))
MultivariateNormal distribution
Mean:
[1.0, 2.0, 3.0]
Covchol: CholeskyDense{Float64}(3x3 Float64 Array:
 2.0  0.0      0.0
 0.0  2.23607  0.0
 0.0  0.0      2.44949,'U')
Variance:
3x3 Float64 Array:
 4.0  0.0  0.0
 0.0  5.0  0.0
 0.0  0.0  6.0

julia> srand(12321)

julia> rand(mvn)
3-element Float64 Array:
 2.34813
 4.22277
 1.4347 

julia> srand(12321)

julia> rand(mvn, 3, 20)
3x20 Float64 Array:
 2.34813   0.491271    3.73474  -1.58803  2.15756   6.94249   2.16482  -3.25563  …   1.34691  -1.97108  0.192781  2.69105  1.23021  3.86129  -1.64641  1.4843   -0.778525
 4.22277   1.74714     4.80336   4.99726  2.07695  -0.790902  1.7337   -1.32994  …   6.00423   2.7134   4.6624    2.69176  1.08644  4.08717   3.14195  3.62224  -3.74896
 1.4347   -0.0141233  -1.76968   3.15905  1.85379   2.03352   1.69304   0.195228 …   3.5728    2.00291  4.46716   4.52372  2.83042  3.09832   7.28602  3.83717   6.57325

@dmbates (Member) commented Feb 7, 2013

@carlobaldassi I think that adding a logdet method for CholeskyDense (and the soon-to-be CholeskySparse) type is the way to go. It would also be worthwhile adding diag methods for those types. The same could be done for the log-determinant and diagonal of LU factorizations and even the QR, operating on R.

The approach in R is to use the signature determinant(mat::Matrix{Number}, log::Bool), where log defaults to false. When log == true, the value returned is the log of the modulus of the determinant together with a sign of +1 or -1. In the short term, the important accessor is diag of the factorization.
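
A minimal sketch of such a logdet method under this PR's accessor, assuming factors on a Cholesky factorization returns the single triangular factor and that log maps elementwise over vectors, as in 0.1-era Julia:

# Sketch: log-determinant of the original matrix from its Cholesky
# factorization; avoids the overflow and precision loss of det().
logdet(C::CholeskyDense) = 2 * sum(log(diag(factors(C))))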

@StefanKarpinski (Member)

I'm not sure I understand. Are you objecting to "shipping" the current state of master or something else?

@ViralBShah (Member, Author)

What I am saying is that the current state of Factorization in master is not, in my opinion, satisfactory to ship. For something as basic as this, it would be difficult to change at a later date, and I would prefer moving all the Factorization stuff into a package where it can be baked until it is good enough for inclusion in Base.

I would want to hear what others have to say.

@ViralBShah (Member, Author)

Just a note - I have cherry-picked all the commits by @dmbates. The blas stuff is in master, whereas the suitesparse stuff is now on the db/suitesparse branch.

@mlubin (Member) commented Feb 9, 2013

I suppose it's ok for Factorization to be moved into a package for now. Does this include the matlab-style routines? The Julia documentation should clearly point to it though.

@ViralBShah (Member, Author)

What I am proposing is to leave the Matlab-style routines as they are for now as part of Base, so chol, cholp, lu, qr, qrp, etc. can stay as they are. The *d versions (chold, cholpd, lud, qrd, qrpd) would move to, say, a LinalgFactor package, where that interface can be experimented with. Package versioning also gives a good mechanism for backwards compatibility.

@mlubin (Member) commented Feb 9, 2013

Doesn't that make it difficult to change lu, chol, etc. to return factorization objects in the future?

@StefanKarpinski (Member)

Honestly, I think this is a terrible idea. If a change this drastic was going to happen, it should have been done days ago. We now have an API stability deadline in one day and there is no way a change this disruptive is going to happen without breaking things. If the factorization stuff wasn't ready, it shouldn't have been in master – we are already shipping it. Trying to pull this out of base now is making a massive, disruptive change with no deprecation path with a single day to get it right. Not a good idea.

@mlubin (Member) commented Feb 9, 2013

Is there really that much to fix in order to get Factorization to a reasonable state? Could we agree on a list of things to fix and leave non-compatibility-breaking enhancements for later?

I propose:

  • We codify @nolta's approach: in all cases where operators are overloaded for Factorization objects, or Factorization objects are passed to Base functions, the behaviour should be the same as if the original matrix had been passed.
  • Access individual factors using ref instead of factors (see the sketch after this list). For now these can return full matrices, but in the future they could return specialized types; e.g. for QR, as I mentioned, the Q could be a special type which implements multiplication and backslash. This wouldn't break syntax compatibility if all someone does is use the Q factor as a normal matrix, although it would break things if they try to access its elements for some reason.
  • Put off until 0.2 implementing the nice tuple/iterator syntax for picking out all the factors at once.
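
A hypothetical usage sketch of the second point (the qrfact spelling and the symbol keys are assumptions, not settled API):

A = randn(4, 4); b = randn(4)
F = qrfact(A)   # factorization object
Q = F[:Q]       # a dense Q for now; later possibly a specialized operator type
R = F[:R]
x = F \ b       # per the first point: same answer as A \ b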

@StefanKarpinski (Member)

It seems to me that "getting this right" is not super crucial now that we've arrived at the realization that we're not going to get this right before 0.1. Since we know we're going to break people's code with 0.2 when we change all of this, the main goal is not to also break their code with 0.1 for no benefit. So for tomorrow, I would suggest we tread lightly, with only minimally disruptive changes that are unambiguously a good thing.

@andreasnoack (Member)

I also think it would be too drastic to remove the factorizations. Just for perspective: this discussion started from the annoyance of writing factors(chol(A)) instead of chol(A).

I think @mlubin's suggestions are reasonable, in particular @dmbates's idea of using ref instead of factors.

However, I think we should reconsider whether it is useful to treat a factorization as the same thing as the original matrix: the LinearOperator view. What is the gain? If this association is dropped, you wouldn't be surprised by the behaviour of * on the QR factorization. Why would you ever use * on a factorization object if it represented the original matrix?

@mlubin (Member) commented Feb 9, 2013

(qr(A)[:Q])*x is much more understandable than (qr(A))*x to anyone with a background in linear algebra who has never seen Julia before.

@andreasnoack (Member)

I agree that qr(A)[:Q]*x is more understandable, but that is different from qr(A)*x being dangerous. My point was partly that * is not useful if a factorization is understood as the same thing as the original matrix and partly that mathematical abstractions are not always useful in programming.

@timholy (Member) commented Feb 9, 2013

But then * should not be defined at all. \ clearly is useful, because the factorization is the best way to solve a linear equation. And having * not be the inverse of \ is simply too horrible to contemplate.

@toivoh (Contributor) commented Feb 9, 2013

About `*`, not computing unneeded factors, and extracting several factors at once: how about

Q, R = qr(A)[(:Q, :R)]

Then the entire set of requested factors would be known immediately upon extraction, and without extending the iterator protocol. Also, it is quite clear what is happening even if the type of factorization object being operated upon cannot be seen from the code, unlike factors, which can happily give the wrong answer if the type of factorization is different than expected.

Still, I'm unclear on the details of factorizations where you only want to compute some of the factors. This seems to presuppose some kind of lazy evaluation, which would require the factorization object to keep a copy of the source matrix inside, unless the set of needed factors is given in advance. The most efficient algorithm to compute a given subset of factors would probably also need to know this set in advance, not incrementally. On the other hand, a factorization object with some factors unavailable will be crippled, e.g. cannot implement \, so is probably not really a factorization object at all.

To compute a subset of factors, maybe it would be better not to (visibly) construct a factorization object, but to use something like

U, S = svd((:U, :S), A)

(and you could perhaps specify either :V or :Vt, which could provide some nice flexibility regarding #912.)
This would also allow computing the Cholesky factor using

L = chol(:L, A)

@ViralBShah (Member, Author)

@toivoh This is not lazy evaluation. The idea is that you use the LAPACK routines for factorization, which often compute a compact form. If you want the actual matrices that represent the factors, you may have to do some more work to form them. However, if you only want to use the factorization to, say, solve a system, then you may not need the actual matrices at all.
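
To make the compact-form point concrete, a sketch using the LAPACK wrappers (the wrapper names follow Base's LAPACK module; treat the exact spellings as assumptions):

A = randn(5, 3)
F, tau = LAPACK.geqrf!(copy(A))  # compact QR: R in the upper triangle,
                                 # Householder reflectors stored below it
Q = LAPACK.orgqr!(copy(F), tau)  # forming the explicit (thin) Q costs extra work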

@ViralBShah (Member, Author)

Given the need for minimally invasive changes, I am still pushing for renaming the *d routines to *fact routines. It is absolutely impossible to figure out what the d in qrd means, given that it returns a Factorization object. It may cause some intermediate churn, but at least we will have names that are more meaningful to newcomers, and we can then take our time discussing what the ideal interface should look like.

@mlubin (Member) commented Feb 10, 2013

Sounds reasonable to me.

@ViralBShah (Member, Author)

I am also not sure whether, for 0.1, we should retain the current behaviour of qrd(a)*x, which multiplies only by Q.

@mlubin (Member) commented Feb 10, 2013

If it's too much intermediate churn to remove that behavior given that we haven't decided how to expose the functionality, I would at least not present it as a good thing to do in the documentation.

ViralBShah added a commit that referenced this pull request Feb 10, 2013
Minimally invasive changes to the Factorization objects for 0.1.
All existing functionality is retained, with only a few renamings for clarity.

Replaces the *d routines with *fact routines:
         lud    -> lufact
         chold  -> cholfact
         cholpd -> cholpfact
         qrd    -> qrfact
         qrpd   -> qrpfact

Replaces * and Ac_mul_B on QRDense objects:
         qmulQR performs Q*A
         qTmulQR performs Q'*A

Updates to exports, documentation, tests, and deprecations.

This is a self-contained commit that can be reverted if necessary
close #2062
@ViralBShah (Member, Author)

I am closing this pull request, and we can resume this discussion after 0.1. I am actually fine with the current state, if we can retain the commit above. If everyone hates it, it can be reverted - but it does not change any functionality and only renames a few things for clarity and consistency.

I am pushing it in since 0.1 will most likely be tagged when I am not awake.

ViralBShah closed this Feb 10, 2013
@andreasnoack (Member)

@timholy The horrible behaviour is already there for matrices, inherited from Matlab: for a non-square A, the \ operator computes a least-squares solution, so * does not undo it:

julia> A=randn(5,2);b=randn(5);

julia> b-A*(A\b)
5-element Float64 Array:
 -0.418845
  1.15814 
 -0.986439
  0.719669
  0.531056

Horrible, but convenient.

@timholy (Member) commented Feb 10, 2013

Fair enough, but that's only because inv(A) doesn't exist. When it does, it gives you the expected result.

@alanedelman (Contributor)

May I propose we disconnect two issues:

  1. the possibility of a compressed form for Q, and perhaps L and U
  2. the entanglement of Q&R, L&U, etc.

I believe that the issue is more about 2, and yet the discussion above is mostly about 1. There is nothing wrong with having a compressed format for Q analogous to "sparse" or "symtridiagonal". If we didn't entangle, it could still be just fine to write (L,U)\A for U\(L\A). With enough overloading, the user would hardly know the difference from Matlab. So the real discussion we have to have is whether LAPACK's entangled formats should be disentangled, or rewritten so they end up disentangled, or whether both are bad ideas for some sort of performance reason. (Off the top of my head, I don't know the answer.)
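
A toy sketch of that overloading, purely illustrative and in present-day syntax; the solve order is chosen so that (L, U) behaves like the product L*U:

# Sketch: (L, U) \ b solves L*U*x == b via two triangular solves,
# with L and U kept as ordinary, disentangled matrices.
Base.:(\)(LU::Tuple{AbstractMatrix,AbstractMatrix}, b) = LU[2] \ (LU[1] \ b)

# e.g. with this PR's API: l, u, p = factors(lu(A)); x = (l, u) \ b[p]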

@ViralBShah (Member, Author)

Does anyone use svdt? Is it useful to have? svd is Matlab-compatible, and factors(svdfact(A)) now does what svdt does. I would prefer to deprecate svdt, but wanted to check here first.

ViralBShah deleted the vs/linalg2 branch February 11, 2013 16:25
@andreasnoack (Member)

Now that we have an svd factorization type I think svdt should disappear.
