
change contrib CTC to zero-indexed, in log space. sequence mask for grad #7727

Merged — 6 commits merged into apache:master from szha:ctc_patch on Sep 11, 2017

Conversation

szha (Member) commented Sep 4, 2017:

No description provided.

@szha force-pushed the ctc_patch branch 6 times, most recently from 75cdcf0 to 89cca91, on September 5, 2017 at 19:32
```diff
@@ -59,7 +59,7 @@ def gen_rand():
 def get_label(buf):
     ret = np.zeros(4)
     for i in range(len(buf)):
-        ret[i] = 1 + int(buf[i])
+        ret[i] = int(buf[i])
```
Contributor:

contrib.ctcloss has been there for a while. We should maintain backward compatibility.

Member Author (szha):

Making a 1-indexed API into a 0-indexed API is not really backward compatible, and the contrib package doesn't guarantee backward compatibility anyway.

Contributor:

My understanding is that zero is reserved for the CTC blank label, so we should not use zero as a label (which the old code ensures).

Member Author (szha):

With this change, the index alphabet_size is reserved for the CTC blank label.
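
To make the two conventions concrete, here is a minimal sketch (the 3-character vocabulary is hypothetical, not from the PR):

```python
# Hypothetical vocabulary of 3 characters.
vocab = ['a', 'b', 'c']
alphabet_size = len(vocab)  # 3; the network output dimension is alphabet_size + 1

# Old (1-indexed) convention: blank takes index 0, real labels start at 1.
old_labels = {ch: i + 1 for i, ch in enumerate(vocab)}  # {'a': 1, 'b': 2, 'c': 3}
old_blank = 0

# New (0-indexed) convention: real labels start at 0, blank takes the last index.
new_labels = {ch: i for i, ch in enumerate(vocab)}      # {'a': 0, 'b': 1, 'c': 2}
new_blank = alphabet_size                               # 3, the reserved last index
```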

```diff
@@ -35,7 +35,7 @@ ctcStatus_t compute_ctc_cost(const Tensor<cpu, 3, DType> activations,
                              void *workspace, int train) {
   int minibatch = static_cast<int>(activations.size(1));
   int alphabet_size = static_cast<int>(activations.size(2));
-  int blank_label = 0;
+  int blank_label = alphabet_size-1;
```
Contributor:

Are you sure? Shouldn't it be alphabet_size?

Member Author (szha):

Yes. "Alphabet size" may be a misnomer, because the dimension of the data is alphabet_size + 1, and it is this dimension that is passed in as alphabet_size.

zhiheng-huang commented Sep 6, 2017:

On gluon/loss.py line 330, the new change specifies the following, which is inconsistent and should be corrected. It would be better for alphabet_size to include the blank label (with the id alphabet_size-1):

```diff
+        input has shape `(sequence_length, batch_size, alphabet_size+1)`
+        Note that the last dimension with index `alphabet_size` is reserved for special
+        blank character.
```

Member Author (szha):

Thanks, @zhiheng-huang. I will add an example, like the TF doc does, to clarify.
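
For context, a hedged sketch of the shape convention being discussed (all sizes here are made up for illustration):

```python
import numpy as np

sequence_length, batch_size = 80, 32
alphabet_size = 26  # e.g. 'a'..'z', not counting the blank

# The input carries one extra channel for the blank, so the last
# dimension is alphabet_size + 1; index alphabet_size (here 26) is the blank.
inputs = np.random.uniform(size=(sequence_length, batch_size, alphabet_size + 1))
assert inputs.shape == (80, 32, 27)
blank_index = alphabet_size  # 26, the reserved last index
```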

szha (Member Author) commented Sep 6, 2017:

Discussed with @piiswrong offline. I will make the following changes:

  1. get rid of the padding_mask argument that I added in my previous PR
  2. make the default behavior of contrib CTC 1-indexed, to maintain backward compatibility
  3. add a boolean flag to specify whether labels are 0-indexed or 1-indexed; use padding mask -1 for 0-indexed and padding mask 0 for 1-indexed, for backward compatibility (see the sketch below)
  4. in gluon, remove the padding_mask argument and always use 0-indexed labels
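
A small sketch of how the same labels look under each convention (the values are made up for illustration):

```python
import numpy as np

# 0-indexed labels padded with -1 (the new behavior):
labels_0idx = np.array([[0, 1, 2, -1, -1],
                        [3, 4, -1, -1, -1]])

# The equivalent 1-indexed labels padded with 0 (the old, backward-compatible behavior):
labels_1idx = labels_0idx + 1
# -> [[1, 2, 3, 0, 0],
#     [4, 5, 0, 0, 0]]
```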

@szha force-pushed the ctc_patch branch 2 times, most recently from d2056e5 to 8906d63, on September 7, 2017 at 06:03
@szha mentioned this pull request on Sep 7, 2017
@szha force-pushed the ctc_patch branch 2 times, most recently from b36bb84 to 152b76c, on September 7, 2017 at 23:20
```diff
@@ -383,5 +384,5 @@ def hybrid_forward(self, F, data, label,
                            use_data_lengths=data_lengths is not None,
                            use_label_lengths=label_lengths is not None,
                            data_lengths=data_lengths, label_lengths=label_lengths,
-                           padding_mask=self._padding_mask)
+                           reserve_first_label=False)
```
Contributor:

reserve_first_label is a bad name.

Member Author (szha):

What's your proposal?

```diff
@@ -194,7 +196,7 @@ inline bool PackLabelByLength(mshadow::Tensor<xpu, 2, DType> labels,
 struct CTCLossParam : public dmlc::Parameter<CTCLossParam> {
   bool use_data_lengths;
   bool use_label_lengths;
-  dmlc::optional<int> padding_mask;
+  dmlc::optional<int> blank_label;
```
Contributor:

This doesn't need to be optional; `optional` means you also accept None.

```diff
     Its shape depends on `label_layout`. For `label_layout='TN'`, this
-    input has shape `(label_sequence_length, batch_size)`
-    When `label_lengths` is not specified, the first occurrence of `padding_mask`
+    input has shape `(label_sequence_length, batch_size)`. Padding mask of value ``-1``
```
Contributor:

This is removed, right?

Member Author (szha):

The padding_mask parameter is removed. The logic for the padding-mask value is not, and should not be, removed.
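
For illustration, a minimal numpy sketch of that retained logic as I read it (not the PR's actual code): each label length is inferred from the first occurrence of the padding value.

```python
import numpy as np

def label_lengths(labels, padding_value):
    """Length of each row = position of the first padding_value, or the full width."""
    is_pad = labels == padding_value
    lengths = np.argmax(is_pad, axis=1)             # first padding position per row
    lengths[~is_pad.any(axis=1)] = labels.shape[1]  # rows with no padding are full length
    return lengths

labels = np.array([[0, 1, 2, -1, -1],
                   [3, 4, -1, -1, -1]])
print(label_lengths(labels, -1))  # [3 2]
```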

piiswrong (Contributor):
Please rework the documentation to reflect the changes, and add an example that explains the blank label.

@piiswrong merged commit 123deb0 into apache:master on Sep 11, 2017
@szha deleted the ctc_patch branch on September 11, 2017 at 18:15
cjolivier01 pushed a commit to cjolivier01/mxnet that referenced this pull request Sep 11, 2017
change contrib CTC to zero-indexed, in log space. sequence mask for grad (apache#7727)

* change CTC to zero-indexed, in log space. sequence mask for grad

* CTC remove padding_mask, add reserve_first_label flag for blank=0/len-1

* update to blank_label enum

* remove optional

* update doc

* dummy test
mbaijal pushed a commit to mbaijal/incubator-mxnet that referenced this pull request on Sep 19, 2017, and again on Sep 20, 2017 (same commit message as above).
crazy-cat pushed a commit to crazy-cat/incubator-mxnet that referenced this pull request on Oct 26, 2017 (same commit message as above).