./out/demo_socp_gpu fails to solve its problem #180

kalmarek · 2021-10-15T20:51:51Z

Specifications

OS: Arch Linux
SCS Version: master at 5be0e16
Compiler: gcc

Description

scs fails at solving ./out/demo_socp_gpu 1000 0.5 0.5 1

How to reproduce

linking against julia openblas:

JULIA_HOME="/opt/julias/julia-1.6"
JULIA_LD_PATH="$JULIA_HOME/lib/julia"
BLASLDFLAGS="-L$JULIA_LD_PATH -lopenblas64_"
SCSFLAGS="USE_OPENMP=1 BLAS64=1 BLASSUFFIX=_64_"
make -j4 CFLAGS="-march=native" DLONG=0 ${SCSFLAGS} BLASLDFLAGS="${BLASLDFLAGS}" gpu

then running it via

LD_LIBRARY_PATH=$JULIA_LD_PATH:$LD_LIBRARY_PATH ./out/demo_socp_gpu 1000 0.5 0.5 1

Additional information

similarly compiled direct and indirect solvers (cpu) work just fine

Output

seed : 1

A is 4000 by 1000, with 32 nonzeros per column.
A has 32000 nonzeros (0.800000% dense).
Nonzeros of A take 0.000238 GB of storage.
Row idxs of A take 0.000119 GB of storage.
Col ptrs of A take 0.000004 GB of storage.

ScsCone information:
Zero cone rows: 2000
LP cone rows: 2000
Number of second-order cones: 0, covering 0 rows, with sizes
[]
Number of rows covered is 4000 out of 4000.

true pri opt = 2022.070521
true dua opt = 2022.070521
------------------------------------------------------------------
               SCS v3.0.0 - Splitting Conic Solver
        (c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 1000, constraints m: 4000
cones:    z: primal zero / dual free vars: 2000
          l: linear vars: 2000
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
          alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
          max_iters: 100000, normalize: 1, warm_start: 0
          acceleration_lookback: 10, acceleration_interval: 10
lin-sys:  sparse-indirect GPU
          nnz(A): 32000, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 6.90e+00  9.46e+01  3.33e+04 -1.66e+04  1.00e-01  1.03e-03 
   250| 1.76e+04  4.31e+01  1.23e+04 -6.15e+03  1.00e-01  1.65e-01 
   500| 2.74e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  3.29e-01 
   750| 1.57e+04  4.26e+01  1.23e+04 -6.16e+03  1.00e-01  4.94e-01 
  1000| 1.64e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  6.85e-01 
  1250| 4.30e+21  2.67e+22  6.54e+22 -3.27e+22  1.00e-01  8.48e-01 
  1500| 1.90e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  9.48e-01 
  1750| 2.14e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  1.04e+00 
  2000| 2.48e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  1.13e+00 
  2250| 6.45e+20  2.19e+22  4.21e+22  2.11e+22  1.00e-01  1.22e+00 
  2500| 2.07e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  1.30e+00 
  2750| 2.53e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  1.39e+00 
  3000| 2.02e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  1.48e+00 
  3250| 5.72e+20  3.01e+22  3.73e+22  1.87e+22  1.00e-01  1.57e+00 
  3500| 2.09e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  1.66e+00 
  3750| 2.43e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  1.75e+00 
  4000| 2.31e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  1.84e+00 
 [ ... ]
 99500| 2.48e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  3.65e+01 
 99750| 2.48e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  3.67e+01 
100000| 2.48e+04  4.29e+01  1.23e+04 -6.16e+03  1.00e-01  3.68e+01 
------------------------------------------------------------------
status:  solved (inaccurate - reached max_iters)
timings: total: 3.68e+01s = setup: 5.47e-02s + solve: 3.68e+01s
         lin-sys: 3.16e+01s, cones: 7.88e-01s, accel: 4.77e-01s
------------------------------------------------------------------
objective = -6159.028853 (inaccurate)
------------------------------------------------------------------
true pri opt = 2022.070521
true dua opt = 2022.070521
scs pri obj= 0.000000
scs dua obj = -12318.057707

bodono · 2021-10-16T13:50:53Z

Thanks for posting. I am unable to reproduce this; when I run the command I get:

2021-10-16 14:47:37 (base) 0 bodonoghue@bodonoghue-[]-~/git/scs:
└──[ins] => out/demo_socp_gpu_indirect 1000 0.5 0.5 1
seed : 1

A is 4000 by 1000, with 32 nonzeros per column.
A has 32000 nonzeros (0.800000% dense).
Nonzeros of A take 0.000238 GB of storage.
Row idxs of A take 0.000119 GB of storage.
Col ptrs of A take 0.000004 GB of storage.

ScsCone information:
Zero cone rows: 2000
LP cone rows: 2000
Number of second-order cones: 0, covering 0 rows, with sizes
[]
Number of rows covered is 4000 out of 4000.

true pri opt = 2022.070521
true dua opt = 2022.070521
------------------------------------------------------------------
	       SCS v3.0.0 - Splitting Conic Solver
	(c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 1000, constraints m: 4000
cones: 	  z: primal zero / dual free vars: 2000
	  l: linear vars: 2000
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
	  alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
	  max_iters: 100000, normalize: 1, warm_start: 0
	  acceleration_lookback: 10, acceleration_interval: 10
lin-sys:  sparse-indirect GPU
	  nnz(A): 32000, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 6.90e+00  7.44e+00  2.65e+02  3.90e+03  1.00e-01  2.11e-02
    25| 3.80e-06  3.17e-04  3.36e-03  2.02e+03  1.00e-01  1.08e-01
------------------------------------------------------------------
status:  solved
timings: total: 6.66e-01s = setup: 5.58e-01s + solve: 1.08e-01s
	 lin-sys: 8.57e-02s, cones: 2.84e-04s, accel: 6.22e-05s
------------------------------------------------------------------
objective = 2022.072100
------------------------------------------------------------------
true pri opt = 2022.070521
true dua opt = 2022.070521
scs pri obj= 2022.070419
scs dua obj = 2022.073782

It might be the case that you are missing the gpu fixes I submitted here: 13e675d.

I did not cut a new release / tag with those fixes. Is that the issue?

By the way, you can better test the gpu using:

make purge
make test_gpu
out/run_tests_gpu_indirect

kalmarek · 2021-10-26T20:50:26Z

I'm on master as of 5be0e16
I have CUDA_PATH=/opt/cuda in my env pointing to cuda-11.4.2.
I compiled scs with

make purge
make test_gpu

as advised and then tested it with ./out/run_tests_gpu_indirect. Here is what I get:

cc -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK -DINDIRECT=1 -c src/scs.c -o src/scs_indir.o
cc -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK   -c -o src/util.o src/util.c
cc -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK   -c -o src/cones.o src/cones.c
cc -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK   -c -o src/aa.o src/aa.c
cc -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK   -c -o src/rw.o src/rw.c
cc -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK   -c -o src/linalg.o src/linalg.c
cc -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK   -c -o src/ctrlc.o src/ctrlc.c
cc -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK   -c -o src/scs_version.o src/scs_version.c
cc -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK   -c -o src/normalize.o src/normalize.c
cc  -c -o linsys/gpu/indirect/private.o linsys/gpu/indirect/private.c -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK -I/opt/cuda/include -Ilinsys/gpu -Wno-c++11-long-long  -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK
cc -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK   -c -o linsys/scs_matrix.o linsys/scs_matrix.c
cc -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK   -c -o linsys/csparse.o linsys/csparse.c
mkdir -p out
ar rv out/libscsgpuindir.a src/scs_indir.o src/util.o src/cones.o src/aa.o src/rw.o src/linalg.o src/ctrlc.o src/scs_version.o src/normalize.o linsys/gpu/indirect/private.o linsys/scs_matrix.o linsys/csparse.o linsys/gpu/gpu.o
ar: creating out/libscsgpuindir.a
a - src/scs_indir.o
a - src/util.o
a - src/cones.o
a - src/aa.o
a - src/rw.o
a - src/linalg.o
a - src/ctrlc.o
a - src/scs_version.o
a - src/normalize.o
a - linsys/gpu/indirect/private.o
a - linsys/scs_matrix.o
a - linsys/csparse.o
a - linsys/gpu/gpu.o
ranlib out/libscsgpuindir.a
cc -g -Wall -Wwrite-strings -pedantic -funroll-loops -Wstrict-prototypes -I. -Iinclude -Ilinsys -O3 -fPIC -DCTRLC=1  -DCOPYAMATRIX=1  -DGPU_TRANSPOSE_MAT=1  -DUSE_LAPACK -o out/run_tests_gpu_indirect test/run_tests.c out/libscsgpuindir.a -lm -lrt -lblas -llapack  -L/opt/cuda/lib -L/opt/cuda/lib64 -lcudart -lcublas -lcusparse -Itest
test_fails
Testing that SCS handles bad inputs correctly:eps_abs tolerance must be positive
ERROR: Validation returned failure
Failure:could not initialize work
degenerate
------------------------------------------------------------------
               SCS v3.0.0 - Splitting Conic Solver
        (c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 2, constraints m: 4
cones:    l: linear vars: 4
settings: eps_abs: 1.0e-06, eps_rel: 1.0e-06, eps_infeas: 1.0e-09
          alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
          max_iters: 100000, normalize: 1, warm_start: 0
          acceleration_lookback: 10, acceleration_interval: 10
lin-sys:  sparse-indirect GPU
          nnz(A): 4, nnz(P): 2
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 2.10e+01  2.00e+00  7.90e+00 -3.95e+00  1.00e-01  1.47e-04 
   250| 5.69e+11  2.00e+00  0.00e+00  0.00e+00  1.00e+06  2.53e-02 
   500| 5.69e+11  2.00e+00  0.00e+00  0.00e+00  1.00e+06  5.54e-02 
   750| 5.69e+11  2.00e+00  0.00e+00  0.00e+00  1.00e+06  7.65e-02 
  1000| 5.69e+11  2.00e+00  0.00e+00  0.00e+00  1.00e+06  9.70e-02 
  1250| 5.69e+11  2.00e+00  0.00e+00  0.00e+00  1.00e+06  1.18e-01 
  1500| 5.69e+11  2.00e+00  0.00e+00  0.00e+00  1.00e+06  1.39e-01 
  1750| 5.69e+11  2.00e+00  0.00e+00  0.00e+00  1.00e+06  1.60e-01 
  2000| 5.69e+11  2.00e+00  0.00e+00  0.00e+00  1.00e+06  1.81e-01 
  2250| 5.69e+11  2.00e+00  0.00e+00  0.00e+00  1.00e+06  2.02e-01
[...]
 99750| 5.69e+11  2.00e+00  0.00e+00  0.00e+00  1.00e+06  7.39e+00 
100000| 5.69e+11  2.00e+00  0.00e+00  0.00e+00  1.00e+06  7.41e+00 
------------------------------------------------------------------
status:  solved (inaccurate - reached max_iters)
timings: total: 7.45e+00s = setup: 4.52e-02s + solve: 7.41e+00s
         lin-sys: 7.25e+00s, cones: 2.01e-02s, accel: 8.37e-02s
------------------------------------------------------------------
objective = 0.000000 (inaccurate)
------------------------------------------------------------------
INVALID STATUS
Tests run: 2

no fancy options, no julia-shipped blas ;)

~/local/src/scs   master  ldd ./out/run_tests_gpu_indirect 
        linux-vdso.so.1 (0x00007ffcff3ba000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007f12b0400000)
        librt.so.1 => /usr/lib/librt.so.1 (0x00007f12b03f5000)
        libopenblas.so.3 => /usr/lib/libopenblas.so.3 (0x00007f12aefd5000)
        liblapack.so.3 => /usr/lib/liblapack.so.3 (0x00007f12ae90b000)
        libcudart.so.11.0 => /opt/cuda/lib64/libcudart.so.11.0 (0x00007f12ae669000)
        libcublas.so.11 => /opt/cuda/lib64/libcublas.so.11 (0x00007f12a52b5000)
        libcusparse.so.11 => /opt/cuda/lib64/libcusparse.so.11 (0x00007f1296ec8000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f1296cfc000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f12b0597000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f1296cdb000)
        libgomp.so.1 => /usr/lib/libgomp.so.1 (0x00007f1296c97000)
        libgfortran.so.5 => /usr/lib/libgfortran.so.5 (0x00007f12969db000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f12969c0000)
        libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f12969b7000)
        libcublasLt.so.11 => /opt/cuda/lib64/libcublasLt.so.11 (0x00007f1282fbb000)
        libquadmath.so.0 => /usr/lib/../lib/libquadmath.so.0 (0x00007f1282f70000)

bodono · 2021-10-27T16:51:48Z

That's strange, I cannot reproduce this on the only gpu machine I have access to. Can you try disabling the AA? You can do it by changing ACCELERATION_LOOKBACK to 0 in include/glbopts.h which will disable it for the tests that do not specify it manually and it should be clear if that's the issue.

Here's what my ldd looks like, I don't see any major differences to yours:

└──[ins] => ldd out/run_tests_gpu_indirect
	linux-vdso.so.1 (0x00007ffc11d05000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f7c3fcf9000)
	libblas.so.3 => /usr/lib/x86_64-linux-gnu/libblas.so.3 (0x00007f7c3fc97000)
	liblapack.so.3 => /usr/lib/x86_64-linux-gnu/liblapack.so.3 (0x00007f7c3f5fa000)
	libcudart.so.11.0 => /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudart.so.11.0 (0x00007f7c3f375000)
	libcublas.so.11 => /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcublas.so.11 (0x00007f7c37e9a000)
	libcusparse.so.11 => /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcusparse.so.11 (0x00007f7c29e1c000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7c29c55000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f7c3fe94000)
	libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0 (0x00007f7c2781e000)
	libgfortran.so.5 => /usr/lib/x86_64-linux-gnu/libgfortran.so.5 (0x00007f7c27574000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f7c2756e000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7c2754d000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f7c27542000)
	libcublasLt.so.11 => /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcublasLt.so.11 (0x00007f7c19776000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f7c1956a000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7c19550000)
	libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007f7c19507000)

Can you try running

valgrind --leak-check=full out/run_tests_gpu_indirect

it likely won't help (and is very noisy for gpus) but just in case.

kalmarek · 2021-10-27T21:31:04Z

I disabled AA but it changed just the numerical values in the log, not the behaviour;
here's valgrind log: https://gist.github.com/kalmarek/adb225c93de2bb8d9a7032caec42eea9

I think the problem is somewhere in problem generation (before scs), since the header looks like this:

test_fails
Testing that SCS handles bad inputs correctly:eps_abs tolerance must be positive
ERROR: Validation returned failure
Failure:could not initialize work
degenerate
------------------------------------------------------------------
               SCS v3.0.0 - Splitting Conic Solver
        (c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 2, constraints m: 4
cones:    l: linear vars: 4
settings: eps_abs: 1.0e-06, eps_rel: 1.0e-06, eps_infeas: 1.0e-09
          alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
          max_iters: 100000, normalize: 1, warm_start: 0
lin-sys:  sparse-indirect GPU
          nnz(A): 4, nnz(P): 2

i.e. first a non-positive eps_abs and then a problem with 2 variables and 4 constraints?

bodono · 2021-10-27T22:42:28Z

That's just the output of the first test, which is testing data validation and is working correctly. You will see the same if you run the non-GPU tests with out/run_tests_direct. The first real problem is a tiny LP with 2 vars and 4 constraints.

duyipai · 2021-10-28T08:37:29Z

I have the same problem as @kalmarek.

kalmarek · 2021-10-28T09:10:16Z

That's just the output of the first test which is testing data validation and is working correctly. You will see the same if you run the non gpu tests with out/run_tests_direct. The first real problem is a tiny lp with 2 vars and 4 constraints.

yeah, maybe I should try to compare with run_tests_direct first ;)

kalmarek · 2022-01-07T19:30:41Z

@bodono: so I set VERBOSITY=2 and it seems that cg is never run successfully. Those CUDA errors

linsys/gpu/indirect/private.c:506:scs_solve_lin_sys
 ERROR_CUDA (#): invalid argument

seem to go away if I replace the macro-expanded CUBLAS(name) with the appropriate one, but the end result is the same. I literally have no idea what I am doing ;), but if you could suggest how to diagnose it next I'd be glad!

**********************************************************
Running test: test_validation
Testing that SCS handles bad inputs correctly:
eps_abs tolerance must be positive
ERROR: Validation returned failure
size of scs_int = 4, size of scs_float = 8
Failure:could not initialize work
**********************************************************
**********************************************************
Running test: degenerate
------------------------------------------------------------------
               SCS v3.0.0 - Splitting Conic Solver
        (c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 2, constraints m: 4
cones:    l: linear vars: 4
settings: eps_abs: 1.0e-06, eps_rel: 1.0e-06, eps_infeas: 1.0e-09
          alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
          max_iters: 50, normalize: 1, warm_start: 0
          acceleration_lookback: 10, acceleration_interval: 10
lin-sys:  sparse-indirect GPU
          nnz(A): 4, nnz(P): 2
getting pre-conditioner
finished getting pre-conditioner
size of scs_int = 4, size of scs_float = 8
linsys/gpu/indirect/private.c:506:scs_solve_lin_sys
 ERROR_CUDA (#): invalid argument
tol 1.000e-12
cg_its 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 2.10e+01  2.00e+00  7.90e+00 -3.95e+00  1.00e-01  3.27e-04 
Norm u = 2.306122, Norm u_t = 1.492570, Norm v = 1.939709, Norm x = 0.000000, Norm y = 4.450789, Norm s = 22.360680, Norm |Ax + s| = 2.24e+01, tau = 1.000000, kappa = 0.000000, |u - u_t| = 1.11e+00, res_infeas = nan, res_unbdd_a = nan, res_unbdd_p = nan, ctx_tau = 0.00e+00, bty_tau = 7.90e+00
linsys/gpu/indirect/private.c:506:scs_solve_lin_sys
 ERROR_CUDA (#): invalid argument
tol 1.000e-12
cg_its 0
     1| 3.68e+01  2.00e+00  0.00e+00  0.00e+00  1.00e-01  6.66e-04 
Norm u = 17.210439, Norm u_t = 18.766100, Norm v = 29.666025, Norm x = 0.000000, Norm y = 0.000000, Norm s = 877.991704, Norm |Ax + s| = 8.78e+02, tau = 17.210439, kappa = 0.000000, |u - u_t| = 1.81e+01, res_infeas = nan, res_unbdd_a = nan, res_unbdd_p = nan, ctx_tau = 0.00e+00, bty_tau = 0.00e+00
linsys/gpu/indirect/private.c:506:scs_solve_lin_sys
 ERROR_CUDA (#): invalid argument
tol 1.000e-12
cg_its 0
     2| 9.46e+01  2.00e+00  0.00e+00  0.00e+00  1.00e-01  1.37e-03 
Norm u = 10.600861, Norm u_t = 22.294830, Norm v = 35.509350, Norm x = 0.000000, Norm y = 0.000000, Norm s = 1226.504583, Norm |Ax + s| = 1.23e+03, tau = 10.600861, kappa = 0.000000, |u - u_t| = 2.20e+01, res_infeas = nan, res_unbdd_a = nan, res_unbdd_p = nan, ctx_tau = 0.00e+00, bty_tau = 0.00e+00
linsys/gpu/indirect/private.c:506:scs_solve_lin_sys
 ERROR_CUDA (#): invalid argument
tol 1.000e-12
cg_its 0
     3| 2.28e+02  2.00e+00  0.00e+00  0.00e+00  1.00e-01  2.07e-03 
Norm u = 5.455154, Norm u_t = 25.405974, Norm v = 40.611483, Norm x = 0.000000, Norm y = 0.000000, Norm s = 1472.679019, Norm |Ax + s| = 1.47e+03, tau = 5.455154, kappa = 0.000000, |u - u_t| = 2.53e+01, res_infeas = nan, res_unbdd_a = nan, res_unbdd_p = nan, ctx_tau = 0.00e+00, bty_tau = 0.00e+00
linsys/gpu/indirect/private.c:506:scs_solve_lin_sys
 ERROR_CUDA (#): invalid argument
tol 1.000e-12
cg_its 0
     4| 5.39e+02  2.00e+00  0.00e+00  0.00e+00  1.00e-01  2.34e-03 
Norm u = 2.454521, Norm u_t = 26.247918, Norm v = 41.989207, Norm x = 0.000000, Norm y = 0.000000, Norm s = 1544.434977, Norm |Ax + s| = 1.54e+03, tau = 2.454521, kappa = 0.000000, |u - u_t| = 2.62e+01, res_infeas = nan, res_unbdd_a = nan, res_unbdd_p = nan, ctx_tau = 0.00e+00, bty_tau = 0.00e+00
linsys/gpu/indirect/private.c:506:scs_solve_lin_sys
 ERROR_CUDA (#): invalid argument
tol 1.000e-12
cg_its 0
     5| 1.26e+03  2.00e+00  0.00e+00  0.00e+00  1.00e-01  2.62e-03 
[...]
    48| 1.05e+18  2.00e+00  0.00e+00  0.00e+00  1.00e-01  1.60e-02 
Norm u = 0.000000, Norm u_t = 26.457513, Norm v = 42.332021, Norm x = 0.000000, Norm y = 0.000000, Norm s = 1569.004030, Norm |Ax + s| = 1.57e+03, tau = 0.000000, kappa = 0.000000, |u - u_t| = 2.65e+01, res_infeas = nan, res_unbdd_a = nan, res_unbdd_p = nan, ctx_tau = 0.00e+00, bty_tau = 0.00e+00
linsys/gpu/indirect/private.c:506:scs_solve_lin_sys
 ERROR_CUDA (#): invalid argument
tol 1.000e-12
cg_its 0
    49| 5.29e+17  2.00e+00  0.00e+00  0.00e+00  1.00e-01  1.63e-02 
Norm u = 0.000000, Norm u_t = 26.457513, Norm v = 42.332021, Norm x = 0.000000, Norm y = 0.000000, Norm s = 1569.004030, Norm |Ax + s| = 1.57e+03, tau = 0.000000, kappa = 0.000000, |u - u_t| = 2.65e+01, res_infeas = nan, res_unbdd_a = nan, res_unbdd_p = nan, ctx_tau = 0.00e+00, bty_tau = 0.00e+00
    50| 5.29e+17  2.00e+00  0.00e+00  0.00e+00  1.00e-01  1.63e-02 
Norm u = 0.000000, Norm u_t = 26.457513, Norm v = 42.332021, Norm x = 0.000000, Norm y = 0.000000, Norm s = 1569.004030, Norm |Ax + s| = 1.57e+03, tau = 0.000000, kappa = 0.000000, |u - u_t| = 2.65e+01, res_infeas = nan, res_unbdd_a = nan, res_unbdd_p = nan, ctx_tau = 0.00e+00, bty_tau = 0.00e+00
------------------------------------------------------------------
status:  solved (inaccurate - reached max_iters)
timings: total: 5.82e-02s = setup: 4.19e-02s + solve: 1.63e-02s
         lin-sys: 1.51e-02s, cones: 1.97e-05s, accel: 3.52e-06s
------------------------------------------------------------------
objective = 0.000000 (inaccurate)
------------------------------------------------------------------
**********************************************************
INVALID STATUS
Tests run: 2

bodono · 2022-01-07T22:31:32Z

Ok, can you try with VERBOSITY=4? That should print out some info on whether pcg is running correctly. The fact that you're seeing cg_its 0 is worrying.

The macro itself has an error check when VERBOSITY>0 (see here), which is why the error goes away when you replace it (although it does suggest that only that line is broken, which is strange).

bodono · 2022-01-07T23:00:46Z

~~I just pushed c10b3fe. Pull that down and see if it fixes it.~~

Sorry, false alarm.

kalmarek · 2022-01-08T21:38:13Z

Even with VERBOSITY=4 I don't see other output, since cg_gpu_norm(cublas_handle, r, n) < tol is satisfied in https://github.com/cvxgrp/scs/blob/77c86c89bc8d75dce0e8475c364f805fdb62cef0/linsys/gpu/indirect/private.c#L399
If I put a printf statement above it, I get the old error:

linsys/gpu/indirect/private.c:16:cg_gpu_norm
 ERROR_CUDA (#): invalid argument

I'm not sure how to test that my CUDA/cublas is installed properly?

bodono · 2022-01-09T20:05:56Z

Can you try setting USE_L2_NORM to 1?

kalmarek · 2022-01-09T21:10:11Z

I set it to 1 but I get a similar behavior (though no errors). I also checked that nrm is always 0 in cg_gpu_norm, though &r[1] prints as 1.000000...
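
For illustration, a norm that always comes back as 0 together with cg_its 0 in the log is consistent with conjugate gradient exiting on its initial residual check. The following is a minimal CPU-only sketch in Python, not the SCS implementation: `cg`, `true_norm`, `broken_norm`, and the 2x2 system are hypothetical names and data chosen for the example.

```python
import math


def cg(matvec, b, x0, tol, max_its, norm):
    """Plain conjugate gradient; returns (x, iterations_used)."""
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, matvec(x))]
    # Initial check: if the reported residual norm is already below tol,
    # return immediately with zero iterations.
    if norm(r) < tol:
        return x, 0
    p = list(r)
    rs = sum(ri * ri for ri in r)
    for it in range(1, max_its + 1):
        Ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if norm(r) < tol:
            return x, it
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x, max_its


def true_norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


def broken_norm(v):
    # Stand-in for a norm routine whose underlying call fails
    # and leaves the result at 0 (assumption for this sketch).
    return 0.0


def matvec(v):
    # Small symmetric positive definite matrix [[4, 1], [1, 3]].
    return [4.0 * v[0] + 1.0 * v[1], 1.0 * v[0] + 3.0 * v[1]]


b = [1.0, 2.0]
x_ok, its_ok = cg(matvec, b, [0.0, 0.0], 1e-12, 10, true_norm)
x_bad, its_bad = cg(matvec, b, [0.0, 0.0], 1e-12, 10, broken_norm)
print("working norm:", x_ok, its_ok)
print("broken norm: ", x_bad, its_bad)
```

With the working norm, CG iterates and reaches the solution (1/11, 7/11). With the norm stuck at 0, the first check passes, zero iterations are reported, and the initial guess is returned unchanged.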

bodono · 2022-01-14T15:19:02Z

This is so strange, I don't understand what's happening here at all and I can't reproduce this behavior on my gpu machine. If you really want to get to the bottom of this then I'm happy to get on a call and we can debug together manually on your machine.

kalmarek · 2022-01-16T09:58:46Z

Thanks! I asked for access to an NVIDIA GPU at my institution; if I can reproduce it there I'll get back to you!

kalmarek · 2022-04-19T09:25:35Z

Dear @bodono
I managed to get access to a GPU-enabled node and ran some tests there;

a simple make test_gpu which results in

~/local/scs$ ldd ./out/run_tests_gpu_indirect 
        linux-vdso.so.1 (0x00007fff935d2000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fbb17291000)
        liblapack.so.3 => /usr/lib/x86_64-linux-gnu/liblapack.so.3 (0x00007fbb16bed000)
        libblas.so.3 => /usr/lib/x86_64-linux-gnu/libblas.so.3 (0x00007fbb16b80000)
        libcudart.so.10.1 => /usr/lib/x86_64-linux-gnu/libcudart.so.10.1 (0x00007fbb16904000)
        libcublas.so.10 => /usr/lib/x86_64-linux-gnu/libcublas.so.10 (0x00007fbb12b69000)
        libcusparse.so.10 => /usr/lib/x86_64-linux-gnu/libcusparse.so.10 (0x00007fbb0b8e0000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fbb0b6ee000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fbb17459000)
        libgfortran.so.5 => /usr/lib/x86_64-linux-gnu/libgfortran.so.5 (0x00007fbb0b426000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fbb0b40b000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fbb0b405000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fbb0b3e2000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fbb0b3d6000)
        libcublasLt.so.10 => /usr/lib/x86_64-linux-gnu/libcublasLt.so.10 (0x00007fbb09532000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fbb09350000)
        libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007fbb09306000)

runs just fine (11 out of 11 tests passed).

This works just fine even when I replace the system's CUDA with the one shipped with Julia:

~/local/scs$ LD_LIBRARY_PATH="${CUDA_PATH}/lib" ldd out/run_tests_gpu_indirect
        linux-vdso.so.1 (0x00007ffd8ec76000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f472fbad000)
        liblapack.so.3 => /usr/lib/x86_64-linux-gnu/liblapack.so.3 (0x00007f472f509000)
        libblas.so.3 => /usr/lib/x86_64-linux-gnu/libblas.so.3 (0x00007f472f49c000)
        libcudart.so.10.1 => /local/data/zz1594/.julia/artifacts/f049c2824a217dc29dbf657e5cdf0f8adafca77a/lib/libcudart.so.10.1 (0x00007f472f220000)
        libcublas.so.10 => /local/data/zz1594/.julia/artifacts/f049c2824a217dc29dbf657e5cdf0f8adafca77a/lib/libcublas.so.10 (0x00007f472b47e000)
        libcusparse.so.10 => /local/data/zz1594/.julia/artifacts/f049c2824a217dc29dbf657e5cdf0f8adafca77a/lib/libcusparse.so.10 (0x00007f47241f5000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4724003000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f472fd75000)
        libgfortran.so.5 => /usr/lib/x86_64-linux-gnu/libgfortran.so.5 (0x00007f4723d3b000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f4723d20000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f4723d1a000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f4723cf7000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f4723ceb000)
        libcublasLt.so.10 => /local/data/zz1594/.julia/artifacts/f049c2824a217dc29dbf657e5cdf0f8adafca77a/lib/libcublasLt.so.10 (0x00007f4721e47000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f4721c65000)
        libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007f4721c1b000)

However, if I try to link against the Julia-provided OpenBLAS with

BLASLDFLAGS="-L${JULIA_BLAS_PATH} -lopenblas64_"

make purge
make -j4 $SCSFLAGS BLASSUFFIX="_64_" BLAS64=1 DLONG=0 BLASLDFLAGS="${BLASLDFLAGS}" test_gpu

which results in

LD_LIBRARY_PATH="${JULIA_BLAS_PATH}" ldd out/run_tests_gpu_indirect
        linux-vdso.so.1 (0x00007ffd2f1bb000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f0dd6654000)
        libopenblas64_.so => /local/data/zz1594/julia-1.7.2/lib/julia/libopenblas64_.so (0x00007f0dd48fc000)
        libcudart.so.10.1 => /usr/lib/x86_64-linux-gnu/libcudart.so.10.1 (0x00007f0dd4680000)
        libcublas.so.10 => /usr/lib/x86_64-linux-gnu/libcublas.so.10 (0x00007f0dd08e5000)
        libcusparse.so.10 => /usr/lib/x86_64-linux-gnu/libcusparse.so.10 (0x00007f0dc965e000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0dc946a000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f0dd681c000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f0dc9447000)
        libgfortran.so.5 => /local/data/zz1594/julia-1.7.2/lib/julia/libgfortran.so.5 (0x00007f0dc918c000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f0dc9186000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f0dc917c000)
        libcublasLt.so.10 => /usr/lib/x86_64-linux-gnu/libcublasLt.so.10 (0x00007f0dc72d8000)
        libstdc++.so.6 => /local/data/zz1594/julia-1.7.2/lib/julia/libstdc++.so.6 (0x00007f0dc70c2000)
        libgcc_s.so.1 => /local/data/zz1594/julia-1.7.2/lib/julia/libgcc_s.so.1 (0x00007f0dc70a7000)
        libquadmath.so.0 => /local/data/zz1594/julia-1.7.2/lib/julia/libquadmath.so.0 (0x00007f0dc705e000)

I get a failure:

*********************************************************
Running test: hs21_tiny_qp
------------------------------------------------------------------
               SCS v3.2.1 - Splitting Conic Solver
        (c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 2, constraints m: 4
cones:    b: box cone vars: 4
settings: eps_abs: 1.0e-06, eps_rel: 1.0e-06, eps_infeas: 1.0e-09
          alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
          max_iters: 100000, normalize: 1, rho_x: 1.00e-06
          acceleration_lookback: 10, acceleration_interval: 10
lin-sys:  sparse-indirect GPU
          nnz(A): 4, nnz(P): 2
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 9.61e-01  1.17e-01  1.96e-01  9.80e-02  1.00e-01  4.95e-04 
    25| 4.08e-04  4.78e-02  1.14e-01  6.94e-18  1.00e-01  4.21e-03 
------------------------------------------------------------------
status:  infeasible
timings: total: 4.22e-03s = setup: 4.24e-04s + solve: 3.79e-03s
         lin-sys: 3.70e-03s, cones: 3.82e-06s, accel: 1.08e-06s
------------------------------------------------------------------
objective = inf
------------------------------------------------------------------
primal obj error  inf
dual obj error  inf
hs21_tiny_qp: SCS failed to produce outputflag SCS_SOLVED
Tests run: 6

similarly built run_tests_[in]direct pass all tests just fine
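
The last ldd listing shows that pointing LD_LIBRARY_PATH at the Julia directory redirects not only libopenblas64_ but also libgfortran, libstdc++, libgcc_s, and libquadmath. A small sketch in Python for grouping resolved libraries by directory, so that such differences between two builds stand out; `group_by_dir` is a hypothetical helper, and the sample lines are copied from the listing above.

```python
import posixpath
from collections import defaultdict

SAMPLE = """\
        linux-vdso.so.1 (0x00007ffd2f1bb000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f0dd6654000)
        libopenblas64_.so => /local/data/zz1594/julia-1.7.2/lib/julia/libopenblas64_.so (0x00007f0dd48fc000)
        libcublas.so.10 => /usr/lib/x86_64-linux-gnu/libcublas.so.10 (0x00007f0dd08e5000)
        libgfortran.so.5 => /local/data/zz1594/julia-1.7.2/lib/julia/libgfortran.so.5 (0x00007f0dc918c000)
"""


def group_by_dir(ldd_output):
    """Map each directory to the library names that ldd resolved there."""
    groups = defaultdict(list)
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # e.g. linux-vdso or the dynamic loader itself
        name, _, rest = line.partition("=>")
        path = rest.split("(")[0].strip()
        # Anything that is not an absolute path (e.g. "not found") is unresolved.
        key = posixpath.dirname(path) if path.startswith("/") else "unresolved"
        groups[key].append(name.strip())
    return dict(groups)


for directory, libs in group_by_dir(SAMPLE).items():
    print(directory, libs)
```

Running the same grouping on the ldd output of a passing and a failing build gives two short lists that are easier to compare than the raw listings.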

bodono · 2022-04-20T09:24:21Z

Hmmm, if the blas you're using is 64 bit it might be tricky to get everything to work with a GPU which (usually) expects 32 bit integers.

kalmarek · 2022-04-22T13:36:56Z

hmm, precisely the same problem happens if I compile with

BLASLDFLAGS="-L${JULIA_BLAS_PATH} -lopenblas"
SCSFLAGS="USE_OPENMP=0 BLAS32=1 DLONG=0"

make purge
CUDA_PATH="${CUDA_PATH}" make -j4 $SCSFLAGS BLASLDFLAGS="${BLASLDFLAGS}" test_gpu

here is a gist from build, tests and ldd.
https://gist.github.com/kalmarek/0bb320b84871351bff1bb796e516c4a7

OpenBLAS is the LP64 version (integers are 32-bit ints).

bodono · 2022-04-25T11:00:31Z

Looks like the tests are passing except for hs21, which is probably just because the numerics are slightly different on the GPU and it's producing a bad flag.

kalmarek · 2022-11-03T10:23:59Z

@bodono could you have a look at this problem:
https://cloud.impan.pl/s/MX5oBX0lHb5LJl2

It's the same problem that you obtain through this code:

let T = SCS.GpuIndirectSolver
    A = [
        1.0 1.0 0.0 0.0 0.0
        0.0 1.0 0.0 0.0 1.0
        0.0 0.0 1.0 1.0 1.0
        -1.0 0.0 0.0 0.0 0.0
        0.0 -1.0 0.0 0.0 0.0
        0.0 0.0 -1.0 0.0 0.0
        0.0 0.0 0.0 -1.0 0.0
        0.0 0.0 0.0 0.0 -1.0
    ]
    m, n = Int32.(size(A))
    args = (
        m = m,
        n = n,
        A = A,
        P = zeros(n, n),
        b = [5.0, 3.0, 9.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        c = -[3.0, 4.0, 4.0, 9.0, 5.0],
        z = 0,
        l = 8,
        bu = Float64[],
        bl = Float64[],
        q = Int32[],
        s = Int32[],
        ep = 0,
        ed = 0,
        p = Float64[],
    )
    solution = SCS.scs_solve(T, args..., max_iters=200, write_data_filename="simple_problem.scs")
    @test isapprox(solution.x' * args.c, -99.0; rtol = 1e-4)
end

This is easily solvable by the (In)Direct solvers but fails with our julia bindings to the GPU solver.
Maybe by inspecting it by hand (it's a binary which I have no idea how to digest) we can learn what goes wrong?

this is what I get here:

writing data to simple_problem.scs
------------------------------------------------------------------
               SCS v3.2.0 - Splitting Conic Solver
        (c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 5, constraints m: 8
cones:    l: linear vars: 8
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
          alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
          max_iters: 200, normalize: 1, rho_x: 1.00e-06
          acceleration_lookback: 10, acceleration_interval: 10
lin-sys:  sparse-indirect GPU
          nnz(A): 12, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 1.26e+02  3.95e+00  1.22e+03 -6.94e+02  1.00e-01  7.87e-04 
Warning: tol = -1.000000 <= 0, likely compiled without setting INDIRECT flag.
[...]
Warning: tol = -1.000000 <= 0, likely compiled without setting INDIRECT flag.
   200|      nan       nan      -nan      -nan  1.00e-01  8.29e-01 
------------------------------------------------------------------
status:  unbounded (inaccurate - reached max_iters)
timings: total: 8.81e-01s = setup: 5.27e-02s + solve: 8.29e-01s
         lin-sys: 8.26e-01s, cones: 2.52e-05s, accel: 6.92e-04s
------------------------------------------------------------------
objective = -inf (inaccurate)
------------------------------------------------------------------
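As a sanity check on the expected value of -99 used in the test above, the LP is small enough to verify by hand. The sketch below is not part of SCS or SCS.jl; it copies A, b and c from the Julia snippet and checks a hand-derived primal point `x_star` and dual point `y_star` (both are assumptions of this sketch, not solver output) against the conic form min c'x subject to Ax + s = b, s >= 0, and its dual max -b'y subject to A'y + c = 0, y >= 0.

```c
#include <stddef.h>

#define M 8 /* constraints (rows of A) */
#define N 5 /* variables (columns of A) */

/* Problem data copied from the Julia snippet. */
static const double A[M][N] = {
    {1, 1, 0, 0, 0},  {0, 1, 0, 0, 1},  {0, 0, 1, 1, 1},  {-1, 0, 0, 0, 0},
    {0, -1, 0, 0, 0}, {0, 0, -1, 0, 0}, {0, 0, 0, -1, 0}, {0, 0, 0, 0, -1}};
static const double b[M] = {5, 3, 9, 0, 0, 0, 0, 0};
static const double c[N] = {-3, -4, -4, -9, -5};

/* Hand-derived candidate solutions (assumptions, not solver output). */
static const double x_star[N] = {2, 3, 0, 9, 0};
static const double y_star[M] = {3, 1, 9, 0, 0, 5, 0, 5};

/* Primal objective c'x. */
static double primal_objective(const double x[N]) {
  double obj = 0.0;
  for (size_t j = 0; j < N; ++j) obj += c[j] * x[j];
  return obj;
}

/* Dual objective -b'y. */
static double dual_objective(const double y[M]) {
  double obj = 0.0;
  for (size_t i = 0; i < M; ++i) obj -= b[i] * y[i];
  return obj;
}

/* Primal feasibility: the slack s = b - A x must be nonnegative. */
static int primal_feasible(const double x[N], double tol) {
  for (size_t i = 0; i < M; ++i) {
    double s = b[i];
    for (size_t j = 0; j < N; ++j) s -= A[i][j] * x[j];
    if (s < -tol) return 0;
  }
  return 1;
}

/* Dual feasibility: y >= 0 and A'y + c = 0 (P is zero in this problem). */
static int dual_feasible(const double y[M], double tol) {
  for (size_t i = 0; i < M; ++i)
    if (y[i] < -tol) return 0;
  for (size_t j = 0; j < N; ++j) {
    double r = c[j];
    for (size_t i = 0; i < M; ++i) r += A[i][j] * y[i];
    if (r < -tol || r > tol) return 0;
  }
  return 1;
}
```

Both points are feasible and `primal_objective(x_star)` equals `dual_objective(y_star)` at -99, so by weak duality -99 is the optimal value. The nan residuals in the GPU log therefore point at the solver build rather than at the problem data.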

bodono · 2022-11-03T10:29:24Z

Did you compile with the INDIRECT flag?

kalmarek · 2022-11-03T13:28:54Z

this is the script I use to compile scs

script = raw"""
cd $WORKSPACE/srcdir/scs*
flags="DLONG=0 BLAS32=1 USE_OPENMP=0 INDIRECT=1"
blasldflags="-L${libdir} -lopenblas"

CUDA_PATH=$prefix/cuda make BLASLDFLAGS="${blasldflags}" ${flags} out/libscsgpuindir.${dlext}

mkdir -p ${libdir}
cp out/libscs*.${dlext} ${libdir}
"""

kalmarek · 2022-11-03T13:37:28Z

DINDIRECT=1 results in the same log

bodono · 2022-11-03T15:00:46Z

The warning "Warning: tol = -1.000000 <= 0, likely compiled without setting INDIRECT flag." should only appear if the INDIRECT flag is not set during compilation.

When the INDIRECT flag is set, SCS does the additional computation to generate a good warm-start and a sensible tolerance for the indirect system:

scs/src/scs.c, line 366 in f2da64d:

#if INDIRECT > 0

Otherwise the tolerance is set to -1.0, which is an invalid tolerance:

scs/src/scs.c, line 361 in f2da64d:

scs_float tol = -1.0; /* only used for indirect methods, overridden later */

And that trips a warning from the indirect system solvers (should probably error out):

scs/linsys/gpu/indirect/private.c, line 474 in 8ca0377:

scs_printf("Warning: tol = %4f <= 0, likely compiled without setting "

When that flag is not set, SCS skips that computation for speed.
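The mechanism described above can be sketched in isolation. This is a minimal illustration, not the SCS source: `sketch_linsys_tol`, `sketch_check_tol` and the tolerance formula are hypothetical, and it assumes INDIRECT arrives as a preprocessor macro from the build (for example via `-DINDIRECT=1`). Only the `-1.0` placeholder and the `#if INDIRECT > 0` guard come from the lines quoted above.

```c
#include <stdio.h>

/* Assumption: the build defines INDIRECT; default to 0 so the sketch
 * compiles either way. */
#ifndef INDIRECT
#define INDIRECT 0
#endif

/* Hypothetical stand-in for the tolerance computation guarded by the flag. */
static double sketch_linsys_tol(double residual_norm) {
  double tol = -1.0; /* invalid placeholder, as in the quoted scs.c line */
#if INDIRECT > 0
  tol = 0.1 * residual_norm; /* illustrative formula, not SCS's actual rule */
#else
  (void)residual_norm; /* computation skipped when the flag is off */
#endif
  return tol;
}

/* Hypothetical check on the solver side: a non-positive tolerance means the
 * caller was built without the flag. Returns 0 if usable, -1 otherwise. */
static int sketch_check_tol(double tol) {
  if (tol <= 0.0) {
    fprintf(stderr, "Warning: tol = %4f <= 0, likely compiled without "
                    "setting INDIRECT flag.\n", tol);
    return -1;
  }
  return 0;
}
```

The point of the sketch is that the placeholder only survives when the translation unit that computes the tolerance was compiled without the macro, even if the linear-system solver it is linked against was built as an indirect solver.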

bodono · 2022-11-03T15:03:52Z

Hmmm, actually this is likely something to do with the GPU solver specifically. There is some issue in there that only trips on some GPUs, which I have run into before. It's probably something to do with type sizes that I have not been able to figure out. I would probably recommend shelving the GPU solver for now; the MKL one is typically faster anyway.

syockit · 2023-04-03T07:35:47Z

Try the following patch. I got all the tests to pass with this fix.

--- a/linsys/gpu/gpu.c
+++ b/linsys/gpu/gpu.c
@@ -19,13 +19,13 @@ void SCS(accum_by_atrans_gpu)(const ScsGpuMatrix *Ag,
     if (*buffer != SCS_NULL) {
       cudaFree(*buffer);
     }
-    cudaMalloc(buffer, *buffer_size);
+    cudaMalloc(buffer, new_buffer_size);
     *buffer_size = new_buffer_size;
   }

   CUSPARSE_GEN(SpMV)
   (cusparse_handle, CUSPARSE_OPERATION_NON_TRANSPOSE, &onef, Ag->descr, x,
-   &onef, y, SCS_CUDA_FLOAT, SCS_CSRMV_ALG, buffer);
+   &onef, y, SCS_CUDA_FLOAT, SCS_CSRMV_ALG, *buffer);
 }

 /* this is slow, use trans routine if possible */
@@ -48,13 +48,13 @@ void SCS(accum_by_a_gpu)(const ScsGpuMatrix *Ag, const cusparseDnVecDescr_t x,
     if (*buffer != SCS_NULL) {
       cudaFree(*buffer);
     }
-    cudaMalloc(buffer, *buffer_size);
+    cudaMalloc(buffer, new_buffer_size);
     *buffer_size = new_buffer_size;
   }

   CUSPARSE_GEN(SpMV)
   (cusparse_handle, CUSPARSE_OPERATION_TRANSPOSE, &onef, Ag->descr, x, &onef, y,
-   SCS_CUDA_FLOAT, SCS_CSRMV_ALG, buffer);
+   SCS_CUDA_FLOAT, SCS_CSRMV_ALG, *buffer);
 }

 /* This assumes that P has been made full (ie not triangular) and uses the
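Reading the diff, the patch fixes two separate mistakes: the allocation used the stale `*buffer_size` instead of `new_buffer_size`, and the SpMV call received `buffer` (a pointer to the pointer) instead of the allocation `*buffer`. The sketch below reproduces the corrected pattern with host memory (`malloc`/`free`) so it runs without CUDA. The names `ensure_capacity` and `fill_scratch` are hypothetical, and the growth condition is an assumption because the diff does not show it.

```c
#include <stdlib.h>
#include <string.h>

/* Grow-on-demand scratch buffer, a host-memory analogue of the patched code.
 * Returns 0 on success, -1 if the allocation fails. */
static int ensure_capacity(void **buffer, size_t *buffer_size, size_t needed) {
  if (needed > *buffer_size) { /* assumed growth condition */
    free(*buffer);             /* free(NULL) is a no-op */
    *buffer = NULL;
    *buffer_size = 0;
    /* Allocate the NEW size. Passing the old *buffer_size here, as the
     * unpatched code did, would request the stale (initially zero) size. */
    void *fresh = malloc(needed);
    if (fresh == NULL) return -1;
    *buffer = fresh;
    *buffer_size = needed;
  }
  return 0;
}

/* Consumer that expects the allocation itself (void *), playing the role of
 * the external-buffer argument in the SpMV call. */
static void fill_scratch(void *scratch, size_t n) {
  memset(scratch, 0xAB, n);
}
```

A caller holds `void *buf = NULL; size_t size = 0;`, calls `ensure_capacity(&buf, &size, n)`, and then passes `buf` to the consumer. Passing `&buf` instead would let the consumer overwrite the pointer variable rather than the scratch memory, which is the second bug the patch removes.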

bodono · 2023-04-03T13:46:14Z

@syockit Thanks for this! I applied the patch and it worked! Do you want to turn this into a PR?

The only problem I had was an erroneous 'infeasible' certificate on the hs21_tiny_qp and hs21_tiny_qp_rw tests. Do you get that too? I was able to get them to pass by tightening the eps_infeas tolerance in those files, so if you have that problem too we can just do that.

syockit · 2023-04-03T23:57:06Z

@bodono It's a hassle for me to set up a fork right now, so please apply the commit on your side.

You're right, I got the same infeasible certificate on the tests you mentioned. I missed that yesterday. And tightening eps_infeas did make it feasible.

bodono · 2023-04-04T07:04:54Z

Sure, no problem @syockit, thanks for sending in the patch!

kalmarek · 2023-04-12T22:50:46Z

The issue mentioned in #180 (comment) seems to be solved by #251 (Makefile: libscsgpuindir should depend on SCS_INDIR_O).
I cannot reproduce the original issue anymore (probably solved by #246, Gpu fixes).

I presume this issue can be closed after #251 is merged.

kalmarek mentioned this issue Dec 24, 2021: add support for scs-3.0.0 jump-dev/SCS.jl#221 (merged)

bodono mentioned this issue Jan 31, 2022: SCS GPU convergence in a simple problem #206 (open)

kalmarek mentioned this issue Feb 3, 2022: GPU build failing jump-dev/SCS.jl#245 (closed)

bodono closed this as completed Apr 13, 2023
