MPI support #302
Comments
AFAIK OpenMP is only used in two pragma calls (the exp cone projection and the row-wise accumulation in y += A'*x). At these sizes you'll probably run into numerical issues in the eigendecomposition, and MPI is not going to change that. Do your problems have some additional structure (symmetry, sparsity, algebraic structure)? You'll be much better off exploiting that than trying to brute-force a large problem.
This is the code for the problem.
I can compute a reasonably good analytical solution for this problem, but I don't know how to use it to speed up the optimization.
I've never used SCS through Python, but I recall that for warm-starting you need to provide both the primal and dual variables. What is in cov.npy?
cov.npy is a positive semidefinite matrix and satisfies the constraint that cov - 1j*Omega is positive semidefinite.
Yes, this I understand. But the best speedup will be achieved by just not solving such a large optimization problem...
I don't think …
You can warm-start just some of the variables, I think; SCS will set the others to zero. But SCS is expecting the variables corresponding to the conic reformulation that CVXPY does under the hood, so you would need access to those. If you have one of the conic variables as a warm-start, it should be possible to get reasonable (better than zero) warm-starts for the others too, using the KKT equations. As for MPI: as @kalmarek says, we only have OpenMP pragmas over a couple of minor routines, and your problem will likely be dominated by the eigendecomposition of the semidefinite cone variable. That being said, we do support MKL as the BLAS/LAPACK backend, and that definitely uses multiple threads for the eigendecomposition; it may even support MPI for clusters. That's probably your best bet.
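(For readers who want to try this: below is a minimal sketch of passing a warm start to SCS's Python interface directly, assuming the SCS >= 3.0 API. The data and cone dictionaries here are tiny placeholders; for a CVXPY-modelled problem you would build them from the conic form, e.g. via problem.get_problem_data(cp.SCS), and map your analytical solution into the corresponding x/y/s vectors.)

```python
import numpy as np
import scipy.sparse as sp
import scs

# Tiny placeholder LP in SCS form (min c'x  s.t.  Ax + s = b, s >= 0);
# for the SDP discussed above, data/cone would instead come from the
# CVXPY reformulation (e.g. problem.get_problem_data(cp.SCS)).
data = dict(
    A=sp.csc_matrix(np.array([[1.0], [1.0]])),  # constraint matrix
    b=np.array([1.0, 2.0]),                     # right-hand side
    c=np.array([-1.0]),                         # objective (maximize x)
)
cone = dict(l=2)  # two nonnegative-cone constraints

solver = scs.SCS(data, cone, eps_abs=1e-6, eps_rel=1e-6)

# Cold-start solve.
sol = solver.solve()

# Warm-start solve: supply primal x, dual y and slack s guesses. Here we
# just feed back the previous solution; in practice these would be built
# from an analytical solution mapped into the conic variables.
sol = solver.solve(warm_start=True, x=sol["x"], y=sol["y"], s=sol["s"])
print(sol["info"]["status"], sol["info"]["iter"])
```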
Oh, also: the MKL Pardiso direct solver is much faster than the indirect solver; I would recommend using the direct solver if you can.
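(As an illustration of how to check which linear-system backend is actually being used: calling SCS through CVXPY with verbose=True prints SCS's header, which reports the linear solver it was built with. The problem below is a small stand-in with random data; which backends are available, sparse direct, MKL Pardiso, or indirect, depends on how SCS was installed, so treat those details as assumptions.)

```python
import numpy as np
import cvxpy as cp

# Small stand-in SDP (random data); the real problem would load cov.npy etc.
n = 50
rng = np.random.default_rng(0)
C = rng.standard_normal((n, n))
C = C + C.T  # symmetrize the cost matrix

X = cp.Variable((n, n), PSD=True)
prob = cp.Problem(cp.Minimize(cp.trace(C @ X)), [cp.trace(X) == 1])

# verbose=True prints the SCS header, including the lin-sys backend in use.
prob.solve(solver=cp.SCS, verbose=True)
print(prob.status, prob.value)
```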
Thanks for your reply. I've tried the MKL Pardiso direct solver. The time it took is similar to the indirect solver.
I am very surprised that the MKL Pardiso solver requires about the same time as the indirect solver! It might be worth playing with the MKL settings (more threads or whatever) or the SCS solver settings to see if there is anything to squeeze out.
Can you post the entire output trace (or the very end)? I would be interested in seeing the total times / iterations for each. Are you controlling the number of threads? I think you can do that with the environment variable OMP_NUM_THREADS. It might be worth tinkering with that a bit.
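(Concretely, the thread counts can be set either in the shell before launching Python, or at the top of the script before the numerical libraries are imported; the values below are arbitrary examples.)

```python
import os

# Must be set before NumPy / SCS / MKL are imported, otherwise the
# libraries may have already picked their thread counts.
os.environ["OMP_NUM_THREADS"] = "16"   # OpenMP threads (SCS pragmas, LAPACK)
os.environ["MKL_NUM_THREADS"] = "16"   # MKL-specific override, if using MKL

import numpy as np   # imported only after the environment variables are set
import cvxpy as cp
```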
Yes, it's doing a factorization; I didn't realize that was done single-threaded, which seems like a waste! That probably explains the slowdown for MKL then. In principle we could write an MKL indirect solver, which would be faster, but that would take a little bit of work.
Is it possible to support MPI?
SCS uses OpenMP for parallelism. However, for large-scale problems we need MPI support in order to run across multiple nodes of an HPC cluster.
Right now SCS can run on only one node, where it solves SDP problems with a 4000x4000 variable matrix (about 16,000,000 variables and 32,000,000 constraints) using the indirect linear-system solver. If I want to solve larger problems, with roughly a 16000x16000 variable matrix, MPI is necessary.
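(A rough back-of-the-envelope comparison of the two sizes, my own numbers rather than anything from the thread: the number of entries in the matrix variable grows quadratically with its dimension, and the per-iteration eigendecomposition cost grows roughly cubically, so 16000 vs. 4000 means about 16x the memory and about 64x the eigendecomposition work.)

```python
# Back-of-the-envelope scaling, double precision (8 bytes/entry) assumed.
for n in (4_000, 16_000):
    entries = n * n                 # dense n x n matrix entries
    dense_gb = entries * 8 / 1e9    # one dense real copy of the matrix
    print(f"n={n:,}: {entries:,} entries, ~{dense_gb:.1f} GB per dense copy")

# n=4,000:  16,000,000 entries,  ~0.1 GB per dense copy
# n=16,000: 256,000,000 entries, ~2.0 GB per dense copy
```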