
General eigenproblem algorithm #1743

Merged: josdejong merged 29 commits into josdejong:develop on Apr 18, 2021

Conversation

@cshaa (Collaborator) commented Feb 14, 2020

Fixes #1741.
Introduces #2178, #2179.

@josdejong (Owner)

Thanks Michal, I will have a look into your work soon!

@cshaa (Collaborator, Author) commented Feb 16, 2020

The previous implementation of eigs done by Arkajit had this type signature:

function eigs(m: number[][]): {
   values: number[],
   vectors: number[][]
};
function eigs(m: Matrix): {
   values: Vector,
   vectors: Matrix
};

(By Vector I mean a 1D matrix, by Matrix a 2D matrix.)

To be honest, I consider the convention of matrix functions returning arrays for array input and matrices for matrix input a little weird. But the eigs function seems really off to me. Why would you want eigenvalues in a vector? Why would you want eigenvectors as rows of a matrix?

In my opinion, a much more sensible type signature would be:

function eigs(m: number[][]): {
   values: number[],
   vectors: number[][]
};
function eigs(m: Matrix): {
   values: number[],
   vectors: Vector[]
};

Of course, if you think that the original signature is more reasonable, or that we shouldn't change the API, I'll keep it as it is. But I think it's worth at least reconsidering.

What's your opinion on this, Jos?

EDIT: Numerical Recipes in Fortran 77 doesn't have a good description of the QR algorithm for complex Hessenberg matrices, so I'll have to take more time figuring it out. Sorry for the delay.

EDIT 2: Oh, we already have a QR algorithm. If it performs well on Hessenberg matrices, my PR is almost done. I still need to add tests though.
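
For readers unfamiliar with the approach, the basic (unshifted) QR iteration referred to here can be sketched with mathjs's existing building blocks roughly as follows. This is only an illustration, not the PR's code (which additionally balances the matrix and reduces it to Hessenberg form first):

const { qr, multiply, diag } = require('mathjs')

// Repeatedly factor A = Q R and form R Q. Each step is a similarity transform,
// so the eigenvalues are unchanged, and under suitable conditions (e.g. real
// eigenvalues of distinct magnitude) the iterates approach upper-triangular
// form with the eigenvalues on the diagonal.
function qrEigenvaluesSketch (A, iterations = 100) {
  for (let i = 0; i < iterations; i++) {
    const { Q, R } = qr(A)
    A = multiply(R, Q)
  }
  return diag(A)
}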

@cshaa cshaa marked this pull request as ready for review February 20, 2020 21:55
@cshaa (Collaborator, Author) commented Feb 20, 2020

The algorithm now successfully computes the eigenvalues of general real matrices. Complex matrices don't work yet, but only because the math.qr decomposition doesn't support complex matrices. I will fix that in a separate PR.

You can test that eigs really works with this code:

const math = require('.')
const m = math.matrix([[1,2,3,4,5,6],[1,2,3,4,5,5],[1,2,3,4,4,6],[1,2,3,3,5,6],[1,2,2,4,5,6],[1,1,3,4,5,6]])
console.log(math.eigs(m))

Next I'll write the tests, then I'll add inverse iteration, a procedure to compute the eigenvectors. Then I'll fix the QR decomposition.
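
For context, inverse iteration can be sketched with generic mathjs functions like this; a rough illustration for plain numbers only, not the implementation that ended up in the PR:

const { subtract, multiply, identity, lusolve, norm, divide, flatten, ones, size } = require('mathjs')

// Given an approximate eigenvalue lambda of A, repeatedly solve
// (A - lambda*I) x = x and normalize; each solve amplifies the component of x
// along the corresponding eigenvector, so x converges to it.
function inverseIterationSketch (A, lambda, steps = 20) {
  const n = size(A).valueOf()[0]
  // perturb the shift a tiny bit so the shifted matrix stays invertible
  const shifted = subtract(A, multiply(lambda + 1e-10, identity(n)))
  let x = ones(n, 1)
  for (let i = 0; i < steps; i++) {
    x = lusolve(shifted, x)
    x = divide(x, norm(flatten(x)))
  }
  return x
}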

@josdejong (Owner)

Sounds good! This weekend I hope to review your PR.

// this isn't the only time we loop thru the matrix...
let last = false

while (!last)
@josdejong (Owner):

Is there a risk of getting into an infinite loop in this while loop?

@cshaa (Collaborator, Author):

I'm quite certain it can't. The loop iterates through every row-column pair only once; after that they are balanced for good. The only reason for using while instead of for (let i = 0; i < N; i++) is that the rows and columns get swapped, so there's no clear ordering.

@josdejong (Owner)

I've had a look at your PR @m93a, looks like a great start! Thanks for all your effort, this is quite some work.

I've placed a few comments in the code. Most important, I think, is that currently the functions are sort of a hybrid, containing many switches to handle either the number or the BigNumber case. The core idea behind mathjs, though, is that you can inject functions like add, multiply, smaller, etc. which work for numbers, BigNumbers, and other numeric types, so inside your function you don't have to worry about the actual numeric type. This way it's possible, for example, to implement a new numeric type (say BigInt) and have the eigenvalue functions automatically work with it, because they are agnostic of the actual numeric type. Would this be possible for the eigs function? Or are there specific reasons not to?
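
A minimal sketch of that injection pattern, using mathjs's factory (traceLike is a made-up example function, not part of mathjs):

const { factory, create, all } = require('mathjs')

// The implementation only touches the injected, typed operations, so the same
// code works for number, BigNumber, Fraction, Complex, ... without switches.
const createTraceLike = factory('traceLike', ['add'], ({ add }) => {
  return function traceLike (arr) {
    let sum = arr[0][0]
    for (let i = 1; i < arr.length; i++) {
      sum = add(sum, arr[i][i])
    }
    return sum
  }
})

const math = create(all)
math.import(createTraceLike)
// math.traceLike([[1, 2], [3, 4]])                  -> 5
// math.traceLike(math.bignumber([[1, 2], [3, 4]]))  -> BigNumber 5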

One other thing still missing in the PR is unit tests. I guess you're planning to add those later?

@cshaa (Collaborator, Author) commented Feb 22, 2020

Thanks for the review, Jos!
I'll fix the braces. The reason I used big ? ... : ... was to minimize overhead, especially when dealing with native numbers. Arkajit's algorithm for real symmetric matrices has two separate methods, one for native numbers and one for big numbers, but my algorithm is so large that duplicating it like that doesn't make sense. I think the big ? ... : ... condition is very easy for the JIT to optimize out, so it should produce almost no overhead. I imagine calling addScalar and multiplyScalar would be much slower; we can test the performance once the algorithm is complete.

I found out that finding the eigenvectors is much harder than I expected, but I hope I'll finish it this weekend.

@cshaa (Collaborator, Author) commented Feb 22, 2020

Now the only things missing are tests, fixing usolve (#1748), and investigating why the algorithm doesn't work with complex matrices (NaNs start appearing in a complex matrix, so I guess I used native operations somewhere that add and multiply were needed).

But I need to take a break for a while, this PR took much more time than I estimated.

if (hasBig) {
  for (let i = 0; i < N; i++) {
    for (let j = 0; j < N; j++) {
      arr[i][j] = bignumber(arr[i][j])
@josdejong (Owner):

The function bignumber also accepts array or matrix input, so you can replace the two for loops simply with
arr = bignumber(arr)?

@josdejong (Owner)

Thanks Michal, sounds good. Very nice that you even solved these limitations in usolve!

About the big ? ... : ... switches: what I'm concerned about is the complexity and maintainability of the code. It's very easy for bugs to sneak in when there are so many of these kinds of switches. I do understand the performance concern. Maybe there is a solution in between: for every operator we want to use, we define an optimized version at the top of the function, where we also determine big. So we would replace:

const big = type === 'BigNumber'

// ...
colNorm = big ? colNorm.add(c) : colNorm + c
rowNorm = big ? rowNorm.add(r) : rowNorm + r
// ...

with:

const big = type === 'BigNumber'
const add = big 
    ? (a, b) => a.add(b) 
    : (a, b) => a + b

// ...
colNorm = add(colNorm, c)
rowNorm = add(rowNorm, r)
// ...

This will make the code much more readable, I expect, which is really important to me. As a side effect it would also make it very easy to try out the performance penalty of simply using the generic add (or addScalar) function of mathjs. What do you think?

But I need to take a break for a while, this PR took much more time than I estimated.

Yeah I can imagine that! Take it easy 😃

@cshaa (Collaborator, Author) commented Aug 25, 2020

Since usolveAll is now implemented, I think the time has finally come to finish this PR. However, some merge conflicts have appeared over time. Do you know which changes conflict with this PR? Is it safe to discard them, or do they include e.g. some new feature?

@josdejong (Owner)

@m93a I see you made some new commits 👍 . Is the PR ready to review?

@cshaa (Collaborator, Author) commented Oct 5, 2020

Sadly not yet, I have the bad habit of consistently overestimating my free time and underestimating the difficulty of tasks. I'll let you know once it is ready for review 😅️✌️

@josdejong (Owner)

👍 😂 no problem at all, I'll await your message.

@cshaa (Collaborator, Author) commented Mar 15, 2021

Hey Jos, I'm finally back! I hope to finish this PR in a reasonable time. However, I wanted to ask you something: the eigs function we have now returns eigenvectors as columns of a matrix. This is practical for diagonalization, but I find it very unintuitive. I would expect two arrays, values and vectors, sorted so that vectors[i] is an eigenvector with the corresponding eigenvalue values[i].

const { eigs, matrix } = math

// The way eigs works now:
eigs( matrix([[0,1],[1,0]]) )
// {
//   values: matrix([-1, 1]),
//   vectors: matrix([ [0.7, 0.7], [0.7, -0.7] ])  // clearly vectors.data[0] has eigenvalue values.data[1]
// }

// The way I would like it:
eigs( matrix([[0,1],[1,0]]) )
// {
//   values: [-1, 1],
//   vectors: [ matrix([0.7, -0.7]), matrix([0.7, 0.7]) ]
// }

Does this kind of change look good to you?

@cshaa (Collaborator, Author) commented Mar 15, 2021

More specifically: if the change I proposed does look good to you, there are two minor inconveniences I should solve before we merge this PR. Until now, diagonalization of a matrix could be performed like this:

const { eigs, matrix, multiply, transpose } = math
let M = matrix([[1,2,3],[2,4,5],[3,5,6]])
let V = eigs(M).vectors
multiply( transpose(V), M, V ) // returns a diagonal matrix

However, this now fails, because vectors is not a matrix with eigenvectors as columns, but rather an array containing the eigenvectors. There are two problems with this:

  1. The old code doesn't throw an error, it just produces an incorrect result. This is because multiply accepts an array of vectors and interprets it as a matrix whose rows are the vectors. I don't see how this could be the intended behavior for any user. My opinion is that multiply should throw an error when it receives an array of matrices.
  2. There is no simple way to create a matrix whose columns are given vectors. The closest thing we have right now is transpose(matrix( vecs )). However, matrix(vecs) is very ambiguous; it is not immediately clear what it will return. I think it would be best to create two new functions, matrixFromColumns and matrixFromRows, that would unambiguously do what their names say (a possible shape for such a helper is sketched after the code below).

Do you like the solutions to these two problems? Should I open a PR for them? If we implement these two solutions, the new diagonalization code would look something like this:

const { eigs, matrix, matrixFromColumns, multiply, transpose } = math
let M = matrix([[1,2,3],[2,4,5],[3,5,6]])
let V = matrixFromColumns( eigs(M).vectors )
multiply( transpose(V), M, V ) // returns a diagonal matrix
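
A possible shape for such a matrixFromColumns helper, purely to illustrate the idea (not mathjs's actual implementation):

const { matrix, flatten, transpose } = require('mathjs')

// Build a matrix whose i-th column is vectors[i]; each vector may be a 1-D
// array or Matrix. The vectors are first laid out as rows, then transposed.
function matrixFromColumnsSketch (vectors) {
  const rows = vectors.map(v => flatten(v).valueOf())
  return transpose(matrix(rows))
}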

@josdejong (Owner)

Good to hear back from you 😄 👍

I would expect two arrays: values and vectors, sorted so that vectors[i] is an eigenvector with the corresponding eigenvalue values[i].

Yes, that makes sense to me too. I'm not sure though whether there were specific reasons not to do that in the first place.

I understand the issue you describe with vectors being an array of matrices. mathjs in general expects either (nested) arrays or Matrices, but not an array of Matrices. A helper function like matrixFromColumns could indeed be a solution. However, isn't the eigs function returning an array of matrices the core of the problem? Is it possible to let eigs(M).vectors return a Matrix instead?

@cshaa (Collaborator, Author) commented Mar 22, 2021

Hey Jos!

Is it possible to let eigs(M).vectors return a Matrix instead?

Yes, that's exactly the current behavior 😁️ I found it unintuitive, because to find the eigenvector of eigs(M).values[i], one has to do column(eigs(M).vectors, i). But if you think it's better that way, I can leave it like that.

@josdejong (Owner)

Ahh, ok. Yes indeed with column you have to pick the vector you're interested in.

I'm a bit afraid it will get hairy very quickly when we start having functions that return arrays containing matrices, which is not the same as a nested array representing a 2D matrix. So far the concept basically is that the returned result is a single "thing", typically either a 2D Matrix or a nested array representing a 2D matrix. I think for consistency and simplicity of the library it will be good to keep thinking from this high-level "Matrix" concept, and to see Array only as a Matrix implementation, not to use Array for other purposes like returning multiple vectors. I understand that the latter would be convenient in practice in this case, but I don't think it outweighs introducing this inconsistency in the library. What do you think?

@cshaa (Collaborator, Author) commented Mar 31, 2021

Okay, I'm keeping the old behavior: vectors are returned as a matrix whose columns are the eigenvectors. The original eigs function also had another quirk: it sometimes returned values as a vector. This seems totally wrong to me, since the eigenvalues are not a vector (they don't transform like one, and I can't imagine why you would need eigenvalues in a vector). If you don't mind, I'll change values to always be returned as an array.

@cshaa (Collaborator, Author) commented Mar 31, 2021

The current implementation is still somewhat pathetic: eigenvalues converge very slowly even for matrices as small as 4x4, and the search for eigenvectors fails maybe 50% of the time. There are two specific things that can be done to remedy this (see lines 287 and 418). But the vast majority of the algorithm is there, and I'm a little burnt out on this specific PR; I want to work on different things in math.js to get motivated again :D

I propose we do this:

  1. Merge this PR. The current code either produces a valid result or fails with a descriptive error; it never returns nonsense.
  2. File two issues about the two shortcomings.
  3. Either someone else picks them up, or I return to them after a while.

@josdejong (Owner)

Thanks for your honesty of being fed up with the PR 😄 I know the feeling.

You're right that the PR provides functional improvements. The most important feedback was about refactoring the code so that we get rid of all the switches for number vs BigNumber, see #1743 (comment).

I agree with you that we can be pragmatic here and address this refactoring in a later stage. The PR is already a step in the right direction :).

Can you add a clear comment on top of the relevant functions describing what we would like to refactor/improve in them: the number/BigNumber switches, and the slow convergence for small matrices, with the pointers you mention above? Let's merge your work after that 👍

@josdejong (Owner)

And: thanks again for all your effort 👍

@cshaa (Collaborator, Author) commented Apr 12, 2021

The most important feedback was about refactoring the code so that we get rid of all the switches for number vs BigNumber, see #1743 (comment).

This should already be fixed; I use addScalar and multiplyScalar everywhere it's possible. The only type-checking ternary operators left in the code are the ones in the definitions of const one and const zero, e.g. here, here and here.
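
For illustration only (based on the description above, not copied from the PR), those remaining switches amount to picking type-matching constants once, up front:

const { bignumber } = require('mathjs')

// choose constants of the right numeric type once; all arithmetic afterwards
// goes through the typed scalar functions such as addScalar and multiplyScalar
function constantsFor (type) {
  const big = type === 'BigNumber'
  const zero = big ? bignumber(0) : 0
  const one = big ? bignumber(1) : 1
  return { zero, one }
}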

Can you add a clear comment on top of the relevant functions describing what we would like to refactor/improve in them?

All the things that are IMO suboptimal are marked with a descriptive TODO comment. The most important things to improve are the estimation of the next eigenvalue for shifting (#2178, relevant code here), which will improve convergence significantly, and the iterative eigenvector finder (#2179, relevant code here). Both are marked with descriptive TODO comments in the code.
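
For reference, the shifting that #2178 is about amounts to choosing a good shift mu for a step like the one below; a generic sketch of a single shifted QR step, not the PR's code:

const { qr, multiply, add, subtract, identity, size } = require('mathjs')

// One shifted QR step: factor A - mu*I = Q R, then form R Q + mu*I.
// The result is similar to A (same eigenvalues), and a shift mu close to an
// eigenvalue makes that eigenvalue converge much faster than without shifting.
function shiftedQrStepSketch (A, mu) {
  const n = size(A).valueOf()[0]
  const muI = multiply(mu, identity(n))
  const { Q, R } = qr(subtract(A, muI))
  return add(multiply(R, Q), muI)
}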

@josdejong (Owner)

This should already be fixed; I use addScalar and multiplyScalar everywhere it's possible.

Oh wow, I hadn't realized you already addressed all that refactoring 😎. Thanks again, I will merge your PR now!

@josdejong josdejong merged commit 9e8deb5 into josdejong:develop Apr 18, 2021
joshhansen pushed a commit to LearnSomethingTeam/mathjs that referenced this pull request Sep 16, 2021
* split the eigs function into multiple algorithms

* moved checks and coercions to eigs.js, made them more robust

* fix little bugs, make im and re more robust

* Implemented matrix balancing algorithm

* fix typos

* a draft of reduction to Hessenberg matrix

* finished implementation of reduction to Hessenberg

* fix Hessenberg elimination for complex numbers

* implemented non-shifted explicit QR algorithm for real matrices

* implemented vector computation, won't work until usolve is fixed

* refactored to match yarn lint

* some minor changes

* solve merge conflicts

* refactored and re-fixed josdejong#1789

* some old uncommitted changes

* fix small problems introduced by merging

* done some polishing

* improved jsdoc description of eigs

* little changes in jsdoc

Co-authored-by: Jos de Jong <[email protected]>