
ctc_loss with large alphabet size raises CUDA error #12493

Closed
hallazie opened this issue Sep 10, 2018 · 5 comments


## Environment:

Python 2.7 / Windows 7 64-bit
MXNet 1.2.0
NVIDIA driver version 397.31

## Error Message:

[15:41:28] G:\deeplearn\mxnet\dmlc-core\include\dmlc/logging.h:308: [15:41:28] g:\deeplearn\mxnet\mshadow\mshadow./stream_gpu-inl.h:62: Check failed: e == cudaSuccess CUDA: unknown error
[15:41:28] G:\deeplearn\mxnet\dmlc-core\include\dmlc/logging.h:308: [15:41:28] g:\deeplearn\mxnet\src\engine./threaded_engine.h:370: [15:41:28] g:\deeplearn\mxnet\mshadow\mshadow./stream_gpu-inl.h:62: Check failed: e == cudaSuccess CUDA: unknown error

## Minimum reproducible example

import mxnet as mx
import numpy as np

ctx = mx.gpu(0)
alphabet_size = 3000  # works with 200, crashes with 3000

# Symbolic graph: contrib CTC loss wrapped in MakeLoss so backward() is defined
in_var = mx.sym.Variable('data')
labels_var = mx.sym.Variable('label')
ctc = mx.sym.contrib.ctc_loss(in_var, labels_var)
loss = mx.symbol.MakeLoss(ctc)

# data: (sequence_length, batch_size, alphabet_size), label: (batch_size, label_length)
arg_shapes, _, _ = loss.infer_shape(data=(6, 2, alphabet_size), label=(2, 3))
arg_array = [mx.nd.normal(shape=shape, ctx=ctx) for shape in arg_shapes]

exe = loss.bind(ctx=ctx, args=arg_array)
exe.forward(is_train=True)
exe.backward()
outTest = exe.outputs[0]
print '%s' % (outTest.asnumpy())

When alphabet_size=200 the code runs fine; with alphabet_size=3000 (needed for a Chinese OCR task) it crashes with the error above.
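
For quicker testing, the same check can be run through the imperative NDArray API. This is only a minimal sketch, assuming mx.nd.contrib.ctc_loss is exposed in the build under test; the try_ctc helper name is just for illustration, and the shapes match the symbolic example above:

import mxnet as mx

def try_ctc(alphabet_size, ctx=mx.gpu(0), seq_len=6, batch=2, label_len=3):
    # Same shapes as the symbolic repro: data (seq_len, batch, alphabet_size), label (batch, label_len)
    data = mx.nd.normal(shape=(seq_len, batch, alphabet_size), ctx=ctx)
    label = mx.nd.uniform(low=1, high=alphabet_size, shape=(batch, label_len), ctx=ctx)
    out = mx.nd.contrib.ctc_loss(data, label)
    out.wait_to_read()  # force the asynchronous GPU kernel to run now
    return out.asnumpy()

print try_ctc(200)   # fine
print try_ctc(3000)  # hits the CUDA error above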

@vrakesh (Contributor) commented Sep 10, 2018

@hallazie Thank you for reporting the issue. We will look into it.
@mxnet-label-bot [CUDA, Operator]

@apeforest (Contributor) commented Sep 10, 2018

@hallazie This issue seems to have been resolved by a recent PR: #11834
I cannot reproduce the error using the latest master build on CUDA 9.0. Could you please verify and let me know if you still see the issue? Thanks!

@hallazie (Author)

> @hallazie This issue seems to have been resolved by a recent PR: #11834
> I cannot reproduce the error using the latest master build on CUDA 9.0. Could you please verify and let me know if you still see the issue? Thanks!

I'm trying to build the master branch from source but have run into some problems. I'll let you know once I've verified it.

@apeforest (Contributor) commented Sep 12, 2018

@hallazie Please let me know if you run into a specific installation issue. We'd love to help. Ideally, you should be able to install it on Windows with a single pip command: $ pip install mxnet-cu92 --pre
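
Once it is installed, a quick sanity check might look like the sketch below (just an illustration; the exact version string depends on the pre-release build you pull):

import mxnet as mx

print mx.__version__                     # should report the pre-release build just installed
print mx.nd.ones((2, 2), ctx=mx.gpu(0))  # fails immediately if the CUDA build is not working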

@hallazie (Author)

> @hallazie Please let me know if you run into a specific installation issue. We'd love to help. Ideally, you should be able to install it on Windows with a single pip command: $ pip install mxnet-cu92 --pre

The issue is solved by installing a newer release: pip install mxnet-cu91==1.2.0. Thanks for the help. :D
