mx.sym.WarpCTC cuda memcpy or memset failed issue #6121
Comments
@sbodenstein Could you look into this?
Looking at this now. I think we should aim for option 2 or 3.
@piiswrong: I'm having difficulty reproducing this because of #6032 (and I'm using a Mac).
OK, then could you first disable ctc_loss when the warp-ctc plugin is in use? We don't want to break existing features. Also make sure none of the copied ctc/moderngpu code is on the include path.
Sure, I will do those things. For disabling: how would you recommend I do this?
I'm actually not sure. I think the problem here is that you have two implementations of the same function (one from the original baiductc.so, one from the absorbed code). One solution I can think of is to wrap all the absorbed ctc code in a different namespace.
I do think it shouldn't be too hard to fix (e.g. with namespaces). But it's annoying that I can't build MXNet on GPU at the moment because of the OSX bug.
You can work around it temporarily by deleting the quantize/dequantize ops under contrib.
@piiswrong Many thanks |
@xinq2016, @piiswrong: I can finally reproduce this problem. I have confirmed that this example works when you delete all the new ctc_loss files (the folder
@sbodenstein Many thanks |
I found the same problem. @sbodenstein or @xinq2016, can you provide more details on how to fix this? Edit: Found them! Removing them works. They are located in ~/mxnet/src/operator/contrib |
@sbodenstein @piiswrong I deleted ctc_include, ctc_loss.cc, ctc_loss.cu and ctc_loss-inl.h under ~/mxnet/src/operator/contrib, but I still get the error message:
@KeyKy: did you do
@KeyKy sbodenstein is right. It works now, thank you.
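The file-removal part of the workaround described in the comments above can be sketched as a small script. This is only an illustration, not an official fix: the four entry names are the ones listed in this thread, while the helper name `remove_absorbed_ctc` and the `dry_run` flag are hypothetical. It only deletes the files; MXNet still has to be rebuilt afterwards.

```python
from pathlib import Path
import shutil

# Entry names as reported in this thread; they live under src/operator/contrib.
ABSORBED_CTC_ENTRIES = ("ctc_include", "ctc_loss.cc", "ctc_loss.cu", "ctc_loss-inl.h")


def remove_absorbed_ctc(contrib_dir, dry_run=True):
    """Return the absorbed CTC paths found in contrib_dir; delete them unless dry_run.

    Hypothetical helper: the name and the dry_run flag are assumptions for illustration.
    """
    contrib = Path(contrib_dir).expanduser()
    found = [contrib / name for name in ABSORBED_CTC_ENTRIES if (contrib / name).exists()]
    if not dry_run:
        for path in found:
            if path.is_dir():
                shutil.rmtree(path)  # ctc_include is a folder
            else:
                path.unlink()        # the .cc / .cu / .h files
    return found


if __name__ == "__main__":
    # Dry run by default: only report what would be removed.
    for p in remove_absorbed_ctc("~/mxnet/src/operator/contrib", dry_run=True):
        print("would remove:", p)
```

Running it with `dry_run=True` first shows which of the four entries are actually present before anything is deleted.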
Fixed yet? I got the same error with mxnet 0.10.1 |
This doesn't appear to be fixed; it does not work for me either.
@jcftang: this is strange. It's fixed for some (like myself), and not for others. Could you give info about the GPU you are using?
@sbodenstein I have NVIDIA 1080 Tis (Founders Edition)
Environment info
Operating System: Ubuntu 14.04
GPU: GTX 1080
Compiler:
Package used (Python/R/Scala/Julia): Python
MXNet version: 0.9.5 in python
Or if installed from source: git clone https://github.com/dmlc/mxnet.git ~/mxnet --recursive
Error Message:
[16:36:11] src/operator/././cudnn_algoreg-inl.h:65: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
[ INFO][2017/05/05 16:36:25.768] ---------train---------
terminate called after throwing an instance of 'std::runtime_error'
what(): Error: compute_ctc_loss, stat = cuda memcpy or memset failed
Minimum reproducible example
cd ~
git clone https://github.com/baidu-research/warp-ctc
cd warp-ctc
mkdir build
cd build
cmake ..
make
sudo make install
git clone https://github.com/dmlc/mxnet.git ~/mxnet --recursive
cd ~/mxnet
cp make/config.mk .
Modify config.mk as follows:
USE_BLAS = openblas
USE_CUDA = 1
USE_CUDA_PATH = /usr/local/cuda
USE_CUDNN = 1
WARPCTC_PATH = /home/nd/warp-ctc (where my warp-ctc is installed)
MXNET_PLUGINS += plugin/warpctc/warpctc.mk
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
    -gencode arch=compute_35,code=sm_35 \
    -gencode arch=compute_50,code=sm_50 \
    -gencode arch=compute_60,code=sm_60 \
    -gencode arch=compute_61,code=sm_61 \
    -gencode arch=compute_61,code=compute_61
make
cd python
sudo python setup.py install
Steps to reproduce
python main.py --configfile default.cfg
The CTC layer code:
net = mx.sym.WarpCTC(data=net, label=label, label_length=num_label, input_length=seq_len)
How can I fix it?
Many thanks
Xin.q.