contrib ctc interface changes, cudnn7 CTC, and gluon CTC #7442
Conversation
Force-pushed from 19298a9 to 245a789.
@sbodenstein the second commit in this PR is to address #7445.
src/operator/contrib/ctc_loss-inl.h
Outdated
}

template <typename DType, typename xpu>
inline bool PackLabelByLength(mshadow::Tensor<xpu, 2, DType> labels,
add comment
Force-pushed from 6820fd7 to 79a5b1f.
src/operator/contrib/ctc_loss-inl.h
Outdated
@@ -240,15 +461,22 @@ class CTCLossProp : public OperatorProperty {
  int NumOutputs() const override { return 2; }

  std::vector<std::string> ListArguments() const override {
    return {"data", "label"};
    if (param_.use_input_lengths && param_.use_label_lengths) {
      return {"data", "label", "input_lengths", "label_lengths"};
data_lengths
@@ -165,6 +165,36 @@ def test_l1_loss():
    assert mod.score(data_iter, eval_metric=mx.metric.Loss())[0][1] < 0.1


def test_ctc_loss():
    loss = gluon.loss.CTCLoss(padding_mask=0)
Test on GPU as well.
I included test_loss in test_operator_gpu and everything passes.
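The test above constructs `gluon.loss.CTCLoss(padding_mask=0)`, which implies that per-sequence label lengths are derived from a padding value rather than passed explicitly. A minimal numpy sketch of that derivation (the function name is hypothetical, not the PR's implementation):

```python
import numpy as np

def label_lengths_from_padding(labels, padding_mask=0):
    """Derive per-sequence label lengths from a padded (batch, max_len)
    label array, counting entries up to the first padding value."""
    lengths = []
    for row in labels:
        pad_positions = np.where(row == padding_mask)[0]
        lengths.append(int(pad_positions[0]) if pad_positions.size else len(row))
    return lengths

labels = np.array([[1, 2, 3, 0],
                   [4, 5, 0, 0]])
print(label_lengths_from_padding(labels))  # [3, 2]
```

Note that with `padding_mask=0`, label value 0 cannot appear inside a real label sequence, which is why the explicit `label_lengths` argument discussed below is also supported.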
src/operator/contrib/ctc_loss-inl.h
Outdated
// since the input is activation before softmax and cudnn ctc takes softmax
// apply softmax to inputs first.
Tensor<xpu, 3, real_t> prob(data.shape_);
mshadow::AllocSpace(&prob);
Don't alloc this way. Use ctx.requested.
src/operator/contrib/ctc_loss-inl.h
Outdated
  PackLabelByLength(labels, in_data[kLabelLength].get<xpu, 1, real_t>(s),
                    &packed_labels, &label_lengths);
} else {
#if defined(__CUDACC__) && MXNET_USE_CUDNN == 1 && CUDNN_MAJOR >= 7
remove these ifs
src/operator/contrib/ctc_loss-inl.h
Outdated
CUDNN_CALL(cudnnDestroyCTCLossDescriptor(ctc_desc_));
CUDNN_CALL(cudnnDestroyTensorDescriptor(prob_desc_));
CUDNN_CALL(cudnnDestroyTensorDescriptor(grad_desc_));
#endif
if/endif shouldn't cross function boundaries.
python/mxnet/gluon/loss.py
Outdated
    Layout of the output sequence activation vector.
label_layout : str, default 'NT'
    Layout of the labels.
use_input_lengths : bool, default False
No need for the flag; just check whether the argument is None.
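The suggestion is to drop the explicit boolean flags and treat the optional length arrays themselves as the signal, with None meaning "not supplied". A minimal pure-Python sketch of that pattern (the function name is hypothetical):

```python
def ctc_arguments(input_lengths=None, label_lengths=None):
    """Build the operator argument list based on which optional
    length arrays were actually supplied; None means 'not given'."""
    args = ["data", "label"]
    if input_lengths is not None:
        args.append("input_lengths")
    if label_lengths is not None:
        args.append("label_lengths")
    return args

print(ctc_arguments(input_lengths=[10, 8]))  # ['data', 'label', 'input_lengths']
```

This keeps the Python signature a single source of truth: the caller never has to keep a `use_*_lengths` flag consistent with the arrays it passes.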
python/mxnet/gluon/loss.py
Outdated
|
Parameters
----------
output_layout : str, default 'NTC'
output_layout -> layout
python/mxnet/gluon/loss.py
Outdated
    lengths of labels. Only required when `use_label_lengths` is false.
weight : float or None
    Global scalar weight for loss.
input_lengths : NDArray or None,
These are arguments to `forward`. List them separately in the Inputs/Outputs section.
src/operator/contrib/ctc_loss-inl.h
Outdated
int batch = labels.size(0);
int max_num_labels = labels.size(1);
std::vector<int> cpu_labels(max_num_labels);
IndexTensorToVector(in_label_lengths, label_lengths);
This function does a cudaMemcpy. Can you do it only once?
src/operator/contrib/ctc_loss-inl.h
Outdated
                              std::vector<int> *packed_labels,
                              std::vector<int> *label_lengths) {
  int batch = labels.size(0);
  int max_num_labels = labels.size(1);
  std::vector<index_t> cpu_labels(max_num_labels);
  std::vector<int> cpu_labels(max_num_labels);
  bool exceed_limit = false;

  for (int b = 0; b < batch; ++b) {
    IndexTensorToVector(labels[b], &cpu_labels);
Try to do the copy only once.
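The review asks to replace the per-row `IndexTensorToVector` call (one device-to-host transfer per batch element) with a single bulk copy followed by host-side packing. A numpy sketch of the copy-once-then-pack pattern (the function name is hypothetical, not the PR's code):

```python
import numpy as np

def pack_labels_by_length(labels, label_lengths):
    """Copy the whole (batch, max_num_labels) label matrix to the host
    once, then pack the valid prefix of each row into one flat vector,
    instead of issuing one copy per batch element."""
    host_labels = np.asarray(labels)  # single bulk transfer in the real operator
    packed = []
    for row, length in zip(host_labels, label_lengths):
        packed.extend(int(v) for v in row[:length])
    return packed

labels = np.array([[1, 2, 3, 0],
                   [4, 5, 0, 0]])
print(pack_labels_by_length(labels, [3, 2]))  # [1, 2, 3, 4, 5]
```

The flat `packed` vector plus `label_lengths` is the layout both warp-ctc and the cuDNN 7 CTC API expect for labels.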
Do you have a performance comparison between Baidu and cudnn?
Force-pushed from 3cfbfeb to 32884ea.
I compared the two implementations roughly on a K80 by timing only the parts where they diverge. They perform about the same, and the cudnn one seems slightly more efficient when the input size is large. The numbers are listed below. The four parts of each workload line represent: input shape before preprocessing, label shape, input length, and label length.
Force-pushed from 8630456 to c70505b.
* contrib ctc interface changes for compatibility
* cudnn ctc
* update per comments
This change makes the current contrib CTC compatible with the cudnn7 CTC interface, and adds a CTC loss layer for gluon.
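For context on what the loss layer computes: CTC marginalizes over all blank-augmented alignments of the label to the input sequence via a forward (alpha) recursion. Below is a minimal numpy sketch for a single sequence, written in probability space for clarity (real implementations, including warp-ctc and cuDNN, work in log space); it is an illustration, not the PR's implementation:

```python
import numpy as np

def ctc_neg_log_likelihood(probs, label, blank=0):
    """CTC forward (alpha) recursion for one sequence.
    probs: (T, V) softmax outputs per time step; label: ints, no blanks."""
    ext = [blank]
    for c in label:                 # interleave blanks: b, l1, b, l2, ..., b
        ext += [c, blank]
    T, S = probs.shape[0], len(ext)
    alpha = np.zeros((T, S))
    alpha[0, 0] = probs[0, ext[0]]  # start with blank or first label
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1, s]                       # stay
            if s > 0:
                a += alpha[t - 1, s - 1]              # advance one
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[t - 1, s - 2]              # skip a blank
            alpha[t, s] = a * probs[t, ext[s]]
    # valid endings: final blank or final label symbol
    return -np.log(alpha[T - 1, S - 1] + (alpha[T - 1, S - 2] if S > 1 else 0.0))

T, V = 2, 3
uniform = np.full((T, V), 1.0 / V)
print(ctc_neg_log_likelihood(uniform, [1]))  # -log(1/3): three alignments, each 1/9
```

With uniform probabilities, the three length-2 paths collapsing to `[1]` (namely `1 1`, `1 b`, `b 1`) each have probability 1/9, so the total is 1/3, which the recursion reproduces.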