./out/demo_socp_gpu fails to solve its problem #180
Comments
Thanks for posting. I am unable to reproduce this, when I run the command I get:
It might be the case that you are missing the gpu fixes I submitted here: 13e675d. I did not cut a new release / tag with those fixes. Is that the issue? By the way, you can better test the gpu using:
|
I'm on make purge
make test_gpu as advised and then test it with
no fancy options, no julia-shipped blas ;)
|
That's strange, I cannot reproduce this on the only gpu machine I have access to. Can you try disabling the AA? You can do it by changing Here's what my ldd looks like, I don't see any major differences to yours:
Can you try running
it likely won't help (and is very noisy for gpus) but just in case. |
I disabled AA but it changed just the numerical values in the log, not the behaviour; I think the problem is somewhere in problem generation (before scs), since the header looks like this:
i.e. first non positive eps_abs and then a problem with 2 variables and 4 constraints? |
That's just the output of the first test which is testing data validation and is working correctly. You will see the same if you run the non gpu tests with |
I have the same problem as @kalmarek. |
yeah, maybe I should try to compare with |
@bodono: so I set
seem to go away if I replace macro expanded
|
Ok, can you try with The macro itself has an error check when VERBOSITY>0 (see here), which is why the error goes away when you replace it (although it does suggest that only that line is broken, which is strange). |
Sorry, false alarm. |
Even with
I'm not sure how to test that my CUDA/cublas is installed properly? |
Can you try setting |
I set it to 1 but I get a similar behavior (though no errors). I also checked that |
This is so strange, I don't understand what's happening here at all and I can't reproduce this behavior on my gpu machine. If you really want to get to the bottom of this then I'm happy to get on a call and we can debug together manually on your machine. |
Thanks! I asked for access to an NVIDIA GPU at my institution; if I can reproduce it there I'll get back to you! |
Dear @bodono
runs just fine (11 out of 11 tests passed).
BLASLDFLAGS="-L${JULIA_BLAS_PATH} -lopenblas64_"
make purge
make -j4 $SCSFLAGS BLASSUFFIX="_64_" BLAS64=1 DLONG=0 BLASLDFLAGS="${BLASLDFLAGS}" test_gpu
which results in
I get a failure:
|
Hmmm, if the blas you're using is 64 bit it might be tricky to get everything to work with a GPU which (usually) expects 32 bit integers. |
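To illustrate the integer-width concern in general terms, here is a minimal host-only sketch (not SCS code; the helper name `count_width_mismatches` is made up for illustration). It reads an array of 64-bit indices through a 32-bit view, the way a consumer expecting 32-bit integers would, and counts how many entries come out different. The exact values seen depend on byte order, so only the count is reported.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: counts the positions i in [0, n) where the i-th
 * 32-bit word of the buffer differs from the i-th 64-bit index.
 * memcpy is used so the reinterpretation is well defined in C. */
static size_t count_width_mismatches(const int64_t *idx64, size_t n) {
  size_t mismatches = 0;
  for (size_t i = 0; i < n; ++i) {
    int32_t seen;
    /* Only the first n * 4 bytes of the n * 8 byte array are read. */
    memcpy(&seen, (const unsigned char *)idx64 + i * sizeof(int32_t),
           sizeof seen);
    if ((int64_t)seen != idx64[i]) {
      ++mismatches;
    }
  }
  return mismatches;
}
```

For an index array such as `{1, 2, 3, 4}` the count is nonzero on both little- and big-endian hosts, which is the kind of silent corruption a width mismatch can cause.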
hmm, precisely the same problem happens if I compile with
here is a gist from build, tests and ldd. OpenBLAS is the |
Looks like the tests are passing except for hs21, which is probably just because the numerics are slightly different on the GPU and it's producing a bad flag. |
@bodono could you have a look at this problem: It's the same problem that you obtain through this code:
let T = SCS.GpuIndirectSolver
A = [
1.0 1.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0 1.0
0.0 0.0 1.0 1.0 1.0
-1.0 0.0 0.0 0.0 0.0
0.0 -1.0 0.0 0.0 0.0
0.0 0.0 -1.0 0.0 0.0
0.0 0.0 0.0 -1.0 0.0
0.0 0.0 0.0 0.0 -1.0
]
m, n = Int32.(size(A))
args = (
m = m,
n = n,
A = A,
P = zeros(n, n),
b = [5.0, 3.0, 9.0, 0.0, 0.0, 0.0, 0.0, 0.0],
c = -[3.0, 4.0, 4.0, 9.0, 5.0],
z = 0,
l = 8,
bu = Float64[],
bl = Float64[],
q = Int32[],
s = Int32[],
ep = 0,
ed = 0,
p = Float64[],
)
solution = SCS.scs_solve(T, args..., max_iters=200, write_data_filename="simple_problem.scs")
@test isapprox(solution.x' * args.c, -99.0; rtol = 1e-4)
end
This is easily solvable by the this is what I get here:
|
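As a sanity check on the expected value in the test above, the sketch below verifies a hand-derived candidate point against the same data. It assumes the `l = 8` cone means the constraints read `A x <= b` row by row; the point `x = (2, 3, 0, 9, 0)` is worked out by hand and is not taken from the thread, and the helper names are illustrative only.

```c
#include <stddef.h>

enum { NUM_ROWS = 8, NUM_COLS = 5 };

/* Problem data copied from the Julia snippet above. */
static const double A[NUM_ROWS][NUM_COLS] = {
    {1.0, 1.0, 0.0, 0.0, 0.0},  {0.0, 1.0, 0.0, 0.0, 1.0},
    {0.0, 0.0, 1.0, 1.0, 1.0},  {-1.0, 0.0, 0.0, 0.0, 0.0},
    {0.0, -1.0, 0.0, 0.0, 0.0}, {0.0, 0.0, -1.0, 0.0, 0.0},
    {0.0, 0.0, 0.0, -1.0, 0.0}, {0.0, 0.0, 0.0, 0.0, -1.0}};
static const double b[NUM_ROWS] = {5.0, 3.0, 9.0, 0.0, 0.0, 0.0, 0.0, 0.0};
static const double c[NUM_COLS] = {-3.0, -4.0, -4.0, -9.0, -5.0};

/* Returns 1 if every row satisfies (A x)_i <= b_i + tol, else 0. */
static int is_feasible(const double x[NUM_COLS], double tol) {
  for (size_t i = 0; i < NUM_ROWS; ++i) {
    double row = 0.0;
    for (size_t j = 0; j < NUM_COLS; ++j) {
      row += A[i][j] * x[j];
    }
    if (row > b[i] + tol) {
      return 0;
    }
  }
  return 1;
}

/* Returns the objective value c' x. */
static double objective(const double x[NUM_COLS]) {
  double value = 0.0;
  for (size_t j = 0; j < NUM_COLS; ++j) {
    value += c[j] * x[j];
  }
  return value;
}
```

Calling `is_feasible` and `objective` on `(2, 3, 0, 9, 0)` shows the point satisfies all eight rows and attains the objective the test expects, so the problem data itself is consistent with the `-99.0` target.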
Did you compile with the |
this is the script I use to compile
|
|
The error message When the INDIRECT flag is set SCS does the additional computation to generate a good warm-start and a sensible tolerance for the indirect system: Line 366 in f2da64d
Otherwise the tolerance is set to -1.0, which is an invalid tolerance: Line 361 in f2da64d
And that trips a warning from the indirect system solvers (should probably error out): scs/linsys/gpu/indirect/private.c Line 474 in 8ca0377
When that flag is not set SCS skips that computation for speed. |
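A rough sketch of the mechanism described above, under stated assumptions: the function names and the tolerance formula are hypothetical and only mimic the shape of the logic, they are not the SCS source. The point is that a compile-time flag decides whether a usable tolerance is computed or the sentinel `-1.0` is passed on, and that the consumer should reject the sentinel.

```c
#include <stddef.h>

/* Stand-in for the build flag discussed above; in this sketch it would be
 * enabled with -DINDIRECT=1 on the compiler command line. */
#ifndef INDIRECT
#define INDIRECT 0
#endif

/* Hypothetical: returns the tolerance handed to the linear-system solver. */
static double linsys_tolerance(double residual_norm) {
#if INDIRECT > 0
  /* Placeholder formula, NOT the one SCS uses. */
  return 0.1 * residual_norm;
#else
  (void)residual_norm; /* computation skipped for speed */
  return -1.0;         /* sentinel: not a valid tolerance */
#endif
}

/* Hypothetical guard an indirect solver could apply before iterating. */
static int tolerance_is_valid(double tol) { return tol > 0.0; }
```

If the solver library is built as an indirect solver but the caller is compiled without the flag, the sentinel reaches the indirect solver, which matches the warning described in this comment.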
Hmmm, actually this is likely something to do with the GPU solver specifically. There is some issue in there that only trips on some GPUs that I have run into before. It's probably something to do with type sizes that I have not been able to figure out. I would probably recommend shelving the GPU solver for now, the MKL one is typically faster anyway. |
Try the following patch. I got all the tests to pass with this fix.
--- a/linsys/gpu/gpu.c
+++ b/linsys/gpu/gpu.c
@@ -19,13 +19,13 @@ void SCS(accum_by_atrans_gpu)(const ScsGpuMatrix *Ag,
if (*buffer != SCS_NULL) {
cudaFree(*buffer);
}
- cudaMalloc(buffer, *buffer_size);
+ cudaMalloc(buffer, new_buffer_size);
*buffer_size = new_buffer_size;
}
CUSPARSE_GEN(SpMV)
(cusparse_handle, CUSPARSE_OPERATION_NON_TRANSPOSE, &onef, Ag->descr, x,
- &onef, y, SCS_CUDA_FLOAT, SCS_CSRMV_ALG, buffer);
+ &onef, y, SCS_CUDA_FLOAT, SCS_CSRMV_ALG, *buffer);
}
/* this is slow, use trans routine if possible */
@@ -48,13 +48,13 @@ void SCS(accum_by_a_gpu)(const ScsGpuMatrix *Ag, const cusparseDnVecDescr_t x,
if (*buffer != SCS_NULL) {
cudaFree(*buffer);
}
- cudaMalloc(buffer, *buffer_size);
+ cudaMalloc(buffer, new_buffer_size);
*buffer_size = new_buffer_size;
}
CUSPARSE_GEN(SpMV)
(cusparse_handle, CUSPARSE_OPERATION_TRANSPOSE, &onef, Ag->descr, x, &onef, y,
- SCS_CUDA_FLOAT, SCS_CSRMV_ALG, buffer);
+ SCS_CUDA_FLOAT, SCS_CSRMV_ALG, *buffer);
}
/* This assumes that P has been made full (ie not triangular) and uses the |
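The patch fixes two things: the reallocation used the stale `*buffer_size` instead of `new_buffer_size`, and the SpMV call received `buffer` (the address of the pointer) instead of `*buffer` (the allocation itself). A host-only analogy of the corrected grow-on-demand pattern is sketched below, with `malloc`/`free` standing in for `cudaMalloc`/`cudaFree`. The function name and the assumption that the buffer is regrown only when the requested size exceeds the current one are illustrative, since the surrounding condition is not visible in the diff.

```c
#include <stdlib.h>

/* Hypothetical host-side analogue of the workspace handling in the patch.
 * Returns 0 on success, -1 if the allocation fails. */
static int ensure_capacity(void **buffer, size_t *buffer_size,
                           size_t new_buffer_size) {
  if (new_buffer_size > *buffer_size) {
    free(*buffer); /* free(NULL) is a no-op */
    /* Allocate the NEW size; using *buffer_size here would request the
     * old (too small, possibly zero) size, which was the original bug. */
    *buffer = malloc(new_buffer_size);
    if (*buffer == NULL) {
      *buffer_size = 0;
      return -1;
    }
    *buffer_size = new_buffer_size;
  }
  return 0;
}
```

A caller keeps a `void *workspace` and a `size_t workspace_size`, passes their addresses to `ensure_capacity`, and then hands `workspace` itself (not `&workspace`) to the routine that needs scratch space.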
@syockit Thanks for this! I applied the patch and it worked! Do you want to turn this into a PR? The only problem I had was an erroneous 'infeasible' certificate on |
@bodono It's a hassle for me to set up a fork right now, so please apply the commit on your side. You're right, I got the same infeasible certificate on the tests you mentioned. I missed that yesterday. And tightening |
Sure, no problem @syockit , thanks for sending in the patch! |
I presume this issue can be closed after #251 is merged |
Specifications
master at 5be0e16
Description
scs fails at solving
./out/demo_socp_gpu 1000 0.5 0.5 1
How to reproduce
linking against julia openblas:
then running it via
Additional information
similarly compiled direct and indirect solvers (cpu) work just fine
Output