change contrib CTC to zero-indexed, in log space. sequence mask for grad #7727
Conversation
example/ctc/lstm_ocr.py (Outdated)

@@ -59,7 +59,7 @@ def gen_rand():
 def get_label(buf):
     ret = np.zeros(4)
     for i in range(len(buf)):
-        ret[i] = 1 + int(buf[i])
+        ret[i] = int(buf[i])
contrib.ctcloss has been there for a while. We should maintain backward compatibility
Making a 1-indexed API into a 0-indexed API is not really backward compatible. The contrib package doesn't guarantee backward compatibility anyway.
My understanding is that zero is reserved for the CTC blank label, so we should not use zero as a label (which is what the old code ensures).
With this change, the index alphabet_size is reserved for the CTC blank label instead.
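To make the indexing convention concrete, here is a minimal sketch (my own illustration, not part of the PR) of how the OCR labels look under the old 1-indexed scheme versus the new 0-indexed one, assuming a digit vocabulary of size 10 with the blank moved to the last index:

```python
import numpy as np

DIGITS = 10          # real vocabulary: digits 0-9
BLANK = DIGITS       # new convention: blank takes the last index (10),
                     # so the network emits DIGITS + 1 = 11 scores per time step

def get_label_old(buf):
    # old 1-indexed scheme: labels are 1..10, index 0 is the blank
    ret = np.zeros(len(buf))
    for i, ch in enumerate(buf):
        ret[i] = 1 + int(ch)
    return ret

def get_label_new(buf):
    # new 0-indexed scheme: labels are 0..9, the blank is reserved at index 10
    ret = np.zeros(len(buf))
    for i, ch in enumerate(buf):
        ret[i] = int(ch)
    return ret

print(get_label_old("3721"))  # [4. 8. 3. 2.]
print(get_label_new("3721"))  # [3. 7. 2. 1.]
```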
src/operator/contrib/ctc_loss.cc (Outdated)

@@ -35,7 +35,7 @@ ctcStatus_t compute_ctc_cost(const Tensor<cpu, 3, DType> activations,
                              void *workspace, int train) {
   int minibatch = static_cast<int>(activations.size(1));
   int alphabet_size = static_cast<int>(activations.size(2));
-  int blank_label = 0;
+  int blank_label = alphabet_size-1;
Are you sure? Shouldn't it be alphabet_size?
Yes. Alphabet size may be a misnomer, because the class dimension of the data is the true alphabet size + 1, and it is this dimension that gets passed in as alphabet_size.
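A small illustration of that shape arithmetic (my own sketch, with hypothetical numbers): if the real vocabulary has 10 characters, the activations carry one extra channel for the blank, and the value the C++ code reads off the tensor already includes it:

```python
import numpy as np

seq_len, batch, vocab = 20, 32, 10
# activations as the operator sees them: last dimension = vocab + 1 (extra blank channel)
activations = np.random.rand(seq_len, batch, vocab + 1)

alphabet_size = activations.shape[2]   # 11 -- this is what activations.size(2) returns
blank_label = alphabet_size - 1        # 10 -- the blank sits on the last channel
assert blank_label == vocab
```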
On gluon/loss.py line 330, the new change specifies the following, which is inconsistent and should be corrected:

+ input has shape `(sequence_length, batch_size, alphabet_size+1)`
+ Note that the last dimension with index `alphabet_size` is reserved for the special
+ blank character.

Yes. It is better that alphabet_size includes the blank label (with the id alphabet_size-1).
Thanks, @zhiheng-huang. I will add an example, like the TensorFlow docs do, to clarify.

Discussed with @piiswrong offline. I will make the following changes:
python/mxnet/gluon/loss.py (Outdated)

@@ -383,5 +384,5 @@ def hybrid_forward(self, F, data, label,
                                use_data_lengths=data_lengths is not None,
                                use_label_lengths=label_lengths is not None,
                                data_lengths=data_lengths, label_lengths=label_lengths,
-                               padding_mask=self._padding_mask)
+                               reserve_first_label=False)
reserve_first_label is a bad name.
What's your proposal?
src/operator/contrib/ctc_loss-inl.h (Outdated)

@@ -194,7 +196,7 @@ inline bool PackLabelByLength(mshadow::Tensor<xpu, 2, DType> labels,
 struct CTCLossParam : public dmlc::Parameter<CTCLossParam> {
   bool use_data_lengths;
   bool use_label_lengths;
-  dmlc::optional<int> padding_mask;
+  dmlc::optional<int> blank_label;
This doesn't need to be optional. `optional` means you also accept None.
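For illustration only (this is my own sketch, not the code in the PR), a non-optional enum-style flag could be resolved to a concrete class index like this, assuming the two choices end up being 'first' and 'last':

```python
def resolve_blank_index(blank_label, alphabet_size):
    """Map an enum-style blank_label flag to a concrete class index.

    'first' puts the blank at index 0 (real labels are 1..alphabet_size-1);
    'last' puts it at index alphabet_size-1 (real labels are 0..alphabet_size-2).
    """
    if blank_label == 'first':
        return 0
    if blank_label == 'last':
        return alphabet_size - 1
    raise ValueError("blank_label must be 'first' or 'last', got %r" % blank_label)

print(resolve_blank_index('first', 11))  # 0
print(resolve_blank_index('last', 11))   # 10
```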
 Its shape depends on `label_layout`. For `label_layout='TN'`, this
-input has shape `(label_sequence_length, batch_size)`
-When `label_lengths` is not specified, the first occurrence of `padding_mask`
+input has shape `(label_sequence_length, batch_size)`. Padding mask of value ``-1``
this is removed right?
The parameter for padding mask is removed. The logic for padding mask value is not and should not be removed.
Please rework the documentation to reflect the changes.
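As a sketch of the padding-mask logic being discussed (my own illustration, not the operator's actual implementation): when explicit label_lengths are not given, the effective length of each label row can be taken as the position of the first padding value:

```python
import numpy as np

PAD = -1  # padding value used to right-pad variable-length label rows

def label_lengths_from_padding(labels, pad=PAD):
    """Return the effective length of each row: everything before the first pad."""
    lengths = []
    for row in labels:
        hits = np.where(row == pad)[0]
        lengths.append(hits[0] if hits.size else len(row))
    return np.array(lengths)

labels = np.array([[3, 7, 2, 1],
                   [5, 9, PAD, PAD]])
print(label_lengths_from_padding(labels))  # [4 2]
```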
…rad (apache#7727)

* change CTC to zero-indexed, in log space. sequence mask for grad
* CTC remove padding_mask, add reserve_first_label flag for blank=0/len-1
* update to blank_label enum
* remove optional
* update doc
* dummy test