Q: How can TERagwitz.m gain from GPU? #13

Open
dwuab opened this issue Jul 22, 2015 · 8 comments

@dwuab

dwuab commented Jul 22, 2015

The documentation for TRENTOOL only mentions that the GPU can help the ensemble method.
However, from the source files, it seems that the GPU code only does nearest-neighbor searching. Is it possible to incorporate GPU code into functions such as TERagwitz?

@mwibral
Collaborator

mwibral commented Jul 22, 2015

Hi,

The problem is how to fill the GPU with computations. In the ensemble
method, the original data and a bunch of equally sized surrogate data
sets go to the GPU, and they are all crunched simultaneously. This
basically gives you the surrogate statistics at no additional time cost.

In the standard method, there is only one original data piece and one
surrogate data set for each trial (instead of ~1000). Other channel
pairs may have slightly different data sizes after embedding because of
the autocorrelation decay time (ACT) and the embedding optimization, so
they can't go to the card at the same time. We are looking into
workarounds for that at the moment.

Best,
Michael.
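
To make the equal-size constraint concrete, here is a minimal CPU-side sketch in C++. It is not TRENTOOL's CUDA code. The helper names (embeddedLength, canBatch, nearestInChunks) are hypothetical, the length formula assumes plain delay embedding (TRENTOOL may trim further samples), and the max-norm distance is chosen only for illustration. The point is that a flat, fixed-stride buffer of chunks only works when every chunk has the same number of embedded points.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Number of delay vectors left after embedding n samples with dimension
// dim and delay tau (plain delay embedding; an assumption of this sketch).
std::size_t embeddedLength(std::size_t n, std::size_t dim, std::size_t tau) {
    if (dim == 0) return 0;
    std::size_t span = (dim - 1) * tau;
    return n > span ? n - span : 0;
}

// True when all chunks hold the same number of points, i.e. they can
// share one flat buffer with a fixed stride per chunk.
bool canBatch(const std::vector<std::size_t>& lengths) {
    for (std::size_t len : lengths)
        if (len != lengths.front()) return false;
    return true;
}

// Brute-force nearest neighbour (max-norm) of every point, searched only
// inside the point's own chunk. data holds nchunks * chunkLen * dim floats,
// chunk after chunk, point after point. Returned indices are chunk-local.
std::vector<int> nearestInChunks(const std::vector<float>& data,
                                 std::size_t nchunks, std::size_t chunkLen,
                                 std::size_t dim) {
    assert(data.size() == nchunks * chunkLen * dim);
    std::vector<int> nearest(nchunks * chunkLen, -1);
    for (std::size_t c = 0; c < nchunks; ++c) {
        const float* chunk = data.data() + c * chunkLen * dim;
        for (std::size_t i = 0; i < chunkLen; ++i) {
            float best = std::numeric_limits<float>::infinity();
            for (std::size_t j = 0; j < chunkLen; ++j) {
                if (j == i) continue;  // a point is not its own neighbour
                float dist = 0.0f;
                for (std::size_t k = 0; k < dim; ++k)
                    dist = std::max(dist, std::fabs(chunk[i * dim + k] -
                                                    chunk[j * dim + k]));
                if (dist < best) {
                    best = dist;
                    nearest[c * chunkLen + i] = static_cast<int>(j);
                }
            }
        }
    }
    return nearest;
}
```

In this sketch, two channel pairs embedded with different dimensions (say 3 and 4 at the same delay) end up with different lengths, so canBatch rejects them, while an original data set and its surrogates share one length and can go into a single buffer.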

@dwuab
Author

dwuab commented Jul 24, 2015

@mwibral I'm interested in looking into this problem too. By the way, how can I view the source of int cudaFindKnn?
Update: My time series are so long that even the Ragwitz "test" takes ages to complete. I'm trying to find a way to use the GPU to speed up the k-nearest-neighbor search for the Ragwitz test.

@dwuab
Author

dwuab commented Jul 28, 2015

@mwibral I cannot find the source code for cudaFindKnn anywhere in the package. The .ptx file in libgpuKnnLibrary.a is low-level, CUDA-specific assembly code, and I cannot understand it.

@mwibral
Collaborator

mwibral commented Jul 28, 2015

Dear Samuel,

I checked, and indeed the CUDA code is missing. I am traveling at the
moment, but will upload it next week.

Best,
Michael

@pwollstadt
Collaborator

Hi Samuel,

I uploaded the source code for the CUDA functions (see here), so you can have a look.

Best,
Patricia

@dwuab
Author

dwuab commented Aug 1, 2015

@pwollstadt Thanks!
I just changed the path to Matlab and ran make on a machine that has NVIDIA cards and CUDA installed, and I got the following error messages:

[dwuab@login-0 cuda]$ make
/usr/local/cuda/bin/nvcc -m64  -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -Xcompiler -fPIC -c gpuKnnLibrary.cu
ar -r libgpuKnnLibrary.a gpuKnnLibrary.o
/usr/local/matlab-R2014a/bin/mex -L. -lgpuKnnLibrary -v fnearneigh_gpu.cpp -L/usr/local/cuda/lib64 -lcudart -lcusparse -lcublas
Verbose mode is on.
Neither -compatibleArrayDims nor -largeArrayDims is selected.
     Using -compatibleArrayDims. In the future, MATLAB will require the use of
     -largeArrayDims and remove the -compatibleArrayDims option.
     For more information:
     http://www.mathworks.com/help/matlab/matlab_external/upgrading-mex-files-to-use-64-bit-api.html.
No MEX options file identified; looking for an implicit selection.
... Looking for compiler 'g++' ...
... Executing command 'which g++' ...Yes ('/usr/bin/g++').
... Executing command 'g++ -print-file-name=libstdc++.so' ...Yes ('/usr/lib/gcc/x86_64-redhat-linux/4.4.7/libstdc++.so').
Found installed compiler 'g++'.
Options file details
-------------------------------------------------------------------
    Compiler location: $GCC_DIR
    Options file: /usr/local/matlab-R2014a/bin/glnxa64/mexopts/g++_glnxa64.xml
    CMDLINE1 : /usr/bin/g++ -c -DMX_COMPAT_32   -D_GNU_SOURCE -DMATLAB_MEX_FILE  -I"/usr/local/matlab-R2014a/extern/include" -I"/usr/local/matlab-R2014a/simulink/include" -ansi -fexceptions -fPIC -fno-omit-frame-pointer -pthread -O -DNDEBUG /d1/dwuab/TRENTOOL3/cuda/fnearneigh_gpu.cpp -o /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
    CMDLINE2 : /usr/bin/g++ -pthread -Wl,--no-undefined  -shared -O -Wl,--version-script,"/usr/local/matlab-R2014a/extern/lib/glnxa64/mexFunction.map" /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o   -lgpuKnnLibrary  -lcudart  -lcusparse  -lcublas   -L.  -L/usr/local/cuda/lib64   -Wl,-rpath-link,/usr/local/matlab-R2014a/bin/glnxa64 -L"/usr/local/matlab-R2014a/bin/glnxa64" -lmx -lmex -lmat -lm -lstdc++ -o fnearneigh_gpu.mexa64
    CMDLINE3 : rm -f /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
    CXX : /usr/bin/g++
    DEFINES : -DMX_COMPAT_32   -D_GNU_SOURCE -DMATLAB_MEX_FILE 
    MATLABMEX : -DMATLAB_MEX_FILE 
    CXXFLAGS : -ansi -fexceptions -fPIC -fno-omit-frame-pointer -pthread
    INCLUDE : -I"/usr/local/matlab-R2014a/extern/include" -I"/usr/local/matlab-R2014a/simulink/include"
    CXXOPTIMFLAGS : -O -DNDEBUG
    CXXDEBUGFLAGS : -g
    LDXX : /usr/bin/g++
    LDFLAGS : -pthread -Wl,--no-undefined 
    LDTYPE : -shared
    LINKEXPORT : -Wl,--version-script,"/usr/local/matlab-R2014a/extern/lib/glnxa64/mexFunction.map"
    LINKLIBS : -lgpuKnnLibrary  -lcudart  -lcusparse  -lcublas   -L.  -L/usr/local/cuda/lib64   -Wl,-rpath-link,/usr/local/matlab-R2014a/bin/glnxa64 -L"/usr/local/matlab-R2014a/bin/glnxa64" -lmx -lmex -lmat -lm -lstdc++
    LDOPTIMFLAGS : -O
    LDDEBUGFLAGS : -g
    OBJEXT : .o
    LDEXT : .mexa64
    GCC : /usr/bin/g++
    CPPLIB_DIR : /usr/lib/gcc/x86_64-redhat-linux/4.4.7/libstdc++.so
    MATLABROOT : /usr/local/matlab-R2014a
    ARCH : glnxa64
    SRC : /d1/dwuab/TRENTOOL3/cuda/fnearneigh_gpu.cpp
    OBJ : /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
    OBJS : /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o 
    SRCROOT : /d1/dwuab/TRENTOOL3/cuda/fnearneigh_gpu
    DEF : /tmp/mex_46418310074994570_53220/fnearneigh_gpu.def
    EXP : fnearneigh_gpu.exp
    LIB : fnearneigh_gpu.lib
    EXE : fnearneigh_gpu.mexa64
    ILK : fnearneigh_gpu.ilk
    MANIFEST : fnearneigh_gpu.mexa64.manifest
    TEMPNAME : fnearneigh_gpu
    EXEDIR : 
    EXENAME : fnearneigh_gpu
    OPTIM : -O -DNDEBUG
    LINKOPTIM : -O
-------------------------------------------------------------------
Building with 'g++'.
/usr/bin/g++ -c -DMX_COMPAT_32   -D_GNU_SOURCE -DMATLAB_MEX_FILE  -I"/usr/local/matlab-R2014a/extern/include" -I"/usr/local/matlab-R2014a/simulink/include" -ansi -fexceptions -fPIC -fno-omit-frame-pointer -pthread -O -DNDEBUG /d1/dwuab/TRENTOOL3/cuda/fnearneigh_gpu.cpp -o /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o
/usr/bin/g++ -pthread -Wl,--no-undefined  -shared -O -Wl,--version-script,"/usr/local/matlab-R2014a/extern/lib/glnxa64/mexFunction.map" /tmp/mex_46418310074994570_53220/fnearneigh_gpu.o   -lgpuKnnLibrary  -lcudart  -lcusparse  -lcublas   -L.  -L/usr/local/cuda/lib64   -Wl,-rpath-link,/usr/local/matlab-R2014a/bin/glnxa64 -L"/usr/local/matlab-R2014a/bin/glnxa64" -lmx -lmex -lmat -lm -lstdc++ -o fnearneigh_gpu.mexa64
/tmp/mex_46418310074994570_53220/fnearneigh_gpu.o: In function `mexFunction':
fnearneigh_gpu.cpp:(.text+0x268): undefined reference to `cudaFindKnn(int*, float*, float*, float*, int, int, int, int, int)'
collect2: ld returned 1 exit status

make: *** [mex] Error 255

It looks like something goes wrong at the linking stage.

@pwollstadt
Collaborator

@samuelandjw Thanks for letting us know. I forwarded this error to Mario Martínez Zarzuela, who programmed the CUDA functions. I'll let you know as soon as possible.

@dwuab
Author

dwuab commented Aug 14, 2015

@pwollstadt I found the solution to the compilation problem. On my university's GPU cluster, mex actually invokes g++ to do the linking. Removing the extern "C" {...} surrounding cudaFindKnn and cudaFindRSAll works for me.
I think we should do one of the following:

  1. specify g++ as the linker in the mex step and remove the extern "C" {...} around the two functions mentioned above.
  2. specify gcc as the linker.
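
One reading of the linker message above: the undefined reference is printed with its full argument list, which suggests that fnearneigh_gpu.o was looking for a C++-mangled symbol while the library exported a plain C one. Under that assumption, a third way to keep the two sides consistent is a shared declaration that both the library source and the MEX wrapper include. The following is a minimal C++ sketch, not the project's actual header. The parameter types and the int return type come from the messages in this thread, the parameter names are illustrative guesses, and the function body is a stand-in so that the example links without CUDA.

```cpp
#include <cassert>
#include <vector>

// Hypothetical shared declaration: included by both the library source and
// the MEX wrapper so that they agree on C linkage for the symbol.
#ifdef __cplusplus
extern "C" {
#endif
// Types follow the linker message; parameter names are guesses.
int cudaFindKnn(int* indexes, float* distances, float* pointset,
                float* queryset, int kth, int thelier, int nchunks,
                int pointdim, int signallength);
#ifdef __cplusplus
}
#endif

// Stand-in definition so this sketch links on its own. The real function
// lives in the CUDA library; this stub only clears the output buffers.
extern "C" int cudaFindKnn(int* indexes, float* distances, float*, float*,
                           int kth, int, int, int, int signallength) {
    for (int i = 0; i < kth * signallength; ++i) {
        indexes[i] = -1;
        distances[i] = 0.0f;
    }
    return 0;  // return value chosen for the stub, not taken from the library
}

// Caller side, playing the role of the MEX wrapper: because it sees the same
// extern "C" declaration, it references the unmangled symbol.
int callFromWrapper(int kth, int signallength) {
    assert(kth > 0 && signallength > 0);
    std::vector<int> indexes(kth * signallength);
    std::vector<float> distances(kth * signallength);
    std::vector<float> points(signallength), queries(signallength);
    return cudaFindKnn(indexes.data(), distances.data(), points.data(),
                       queries.data(), kth, 0, 1, 1, signallength);
}
```

If the declaration seen by the caller lacked extern "C" while the definition kept it, the caller would ask for the mangled name and the link step would fail in the way shown in the log. Removing extern "C" on both sides, as in option 1, restores the match in the other direction.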
