
Some mxnet ctc_loss bug & feature request #10995

Closed
chinakook opened this issue May 18, 2018 · 8 comments

Comments

@chinakook
Contributor

MXNet's ctc_loss shares nearly the same source code as Baidu's warp-ctc, with only small modifications, but it has some bugs.

import mxnet as mx
import numpy as np
import numpy.random as npr

Case 1 - mxnet ctc_loss works correctly with float labels

batch_size = 1024
seq_len = 35
label_len = 10
num_classes = 60

x = mx.nd.random.uniform(shape=(seq_len, batch_size, num_classes), ctx=mx.gpu(0))
y = npr.randint(0, num_classes, size=(batch_size, label_len))
Y = mx.nd.array(y, ctx=mx.gpu(0)) # float label type

loss = mx.nd.contrib.ctc_loss(data=x, label=Y)
loss = mx.nd.make_loss(loss)
print(loss.asnumpy())

Case 2 - mxnet ctc_loss does not support integer label types

batch_size = 1024
seq_len = 35
label_len = 10
num_classes = 60

x = mx.nd.random.uniform(shape=(seq_len, batch_size, num_classes), ctx=mx.gpu(0))
y = npr.randint(0, num_classes, size=(batch_size, label_len))
Y = mx.nd.array(y, ctx=mx.gpu(0), dtype=np.int32)

loss = mx.nd.contrib.ctc_loss(data=x, label=Y)
loss = mx.nd.make_loss(loss)
print(loss.asnumpy())
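
Until integer labels are supported, one possible workaround (a sketch based on Case 1, which shows float labels are accepted) is to cast the labels back to float32 before the call:

# Hypothetical workaround: cast the int32 labels to float32, which
# mx.nd.contrib.ctc_loss accepts (see Case 1).
Y_float = Y.astype(np.float32)
loss = mx.nd.contrib.ctc_loss(data=x, label=Y_float)
loss = mx.nd.make_loss(loss)
print(loss.asnumpy())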

Case 3 - mxnet ctc_loss is slow or crashes when num_classes is large

batch_size = 1024
seq_len = 35
label_len = 10
num_classes = 6000

x = mx.nd.random.uniform(shape=(seq_len, batch_size, num_classes), ctx=mx.gpu(0))
y = npr.randint(0, num_classes, size=(batch_size, label_len))
Y = mx.nd.array(y, ctx=mx.gpu(0), dtype=np.int32)

loss = mx.nd.contrib.ctc_loss(data=x, label=Y)
loss = mx.nd.make_loss(loss)
print(loss.asnumpy())

# WarpCTC expects 2-D data of shape (seq_len * batch_size, num_classes) and a
# flat label vector; -3 merges the first two axes and -2 keeps the rest.
x = mx.nd.Reshape(x, shape=(-3, -2))
Y = mx.nd.Reshape(Y, shape=(-1,))
loss = mx.nd.WarpCTC(data=x, label=Y, label_length=label_len, input_length=seq_len)
print(loss)

Case 4 - WarpCTC works fine with large num_classes and integer label types

batch_size = 1024
seq_len = 35
label_len = 10
num_classes = 6000

x = mx.nd.random.uniform(shape=(seq_len, batch_size, num_classes), ctx=mx.gpu(0))
y = npr.randint(0, num_classes, size=(batch_size, label_len))
Y = mx.nd.array(y, ctx=mx.gpu(0), dtype=np.int32)

x = mx.nd.Reshape(x, shape=(-3, -2))
Y = mx.nd.Reshape(Y, shape=(-1,))
loss = mx.nd.WarpCTC(data=x, label=Y, label_length=label_len, input_length=seq_len)
print(loss)
@szha szha self-assigned this May 18, 2018
@szha szha added the Operator label May 18, 2018
@lanking520
Member

lanking520 commented Jun 6, 2018

@chinakook do you think it is a good idea for us to deprecate contrib.ctc_loss and use WarpCTC, since these two are similar?
@eric-haibin-lin, can you check this document?

@chinakook
Contributor Author

Yes, that's a good idea. cuDNN CTC should be added too.

@szha
Member

szha commented Jun 7, 2018

ctc_loss was added so that there's no need to install the WarpCTC plugin. I've detailed why cuDNN CTC is not usable for us in #7445.

@lanking520
Member

@szha Thank you. So I think we shouldn't abandon this one, as it is native to MXNet. Here is the JIRA ticket https://issues.apache.org/jira/browse/MXNET-526.

@szha szha removed their assignment Jul 13, 2018
@szha szha added the Bug label Jul 13, 2018
@loveltyoic

I have met the same problem.
In Case 3, on the CPU mx.nd.contrib.ctc_loss takes about 1 second, but on the GPU it takes a very long time and GPU utilization stays at 100%.
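
One way to measure this (a minimal sketch reusing the Case 3 definitions; mx.nd.waitall() is needed before stopping the clock because MXNet operators execute asynchronously):

import time

def time_ctc(ctx):
    # Build the Case 3 inputs on the given context; float32 labels are
    # used so that only the num_classes=6000 slowdown is measured.
    x = mx.nd.random.uniform(shape=(seq_len, batch_size, num_classes), ctx=ctx)
    Y = mx.nd.array(y, ctx=ctx, dtype=np.float32)
    mx.nd.waitall()                  # finish setup before timing
    start = time.time()
    loss = mx.nd.contrib.ctc_loss(data=x, label=Y)
    mx.nd.waitall()                  # block until the async op completes
    return time.time() - start

print('cpu:', time_ctc(mx.cpu()))
print('gpu:', time_ctc(mx.gpu(0)))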

@apeforest
Contributor

@nswamy The bug in Case 3 was fixed by PR #11834.

I am working on the integer type support via JIRA (https://issues.apache.org/jira/browse/MXNET-807). Please change the label from Bug to Feature.

@eric-haibin-lin eric-haibin-lin changed the title Some mxnet ctc_loss bug~ Some mxnet ctc_loss bug & feature request Sep 8, 2018
@apeforest
Contributor

@chinakook I have added support for using integer label types in CTC. Please kindly verify this change at your earliest convenience.
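
A quick way to verify, assuming a build that includes the change, is to re-run Case 2 with int32 labels and check that the loss comes back finite:

# Case 2 repro with int32 labels; with the fix this should print True
# instead of failing.
x = mx.nd.random.uniform(shape=(seq_len, batch_size, num_classes), ctx=mx.gpu(0))
Y = mx.nd.array(npr.randint(0, num_classes, size=(batch_size, label_len)),
                ctx=mx.gpu(0), dtype=np.int32)
loss = mx.nd.contrib.ctc_loss(data=x, label=Y)
print(np.isfinite(loss.asnumpy()).all())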

@chinakook
Contributor Author

Yes, it works correctly in all these cases. This issue can be closed as solved.
