Skip to content

Conversation

@Sreyan88
Copy link
Contributor

@Sreyan88 Sreyan88 commented Jul 4, 2022

What does this PR do?

TFWav2Vec2ForCTC implementation was incorrect. The CTC loss calculation wasn't proper. The root of the problem was that the CTC target labels weren't reaching the loss calculation and it was None. So adding @unpack_inputs now unpacks the input properly and loss calculation is properly done.

Additionally, the loss needed to be reshaped for backpropagation.

Fixes #18009

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jul 4, 2022

The documentation is not available anymore as the PR was closed or merged.

@Rocketknight1 Rocketknight1 self-requested a review July 4, 2022 16:06
@Rocketknight1 Rocketknight1 self-assigned this Jul 4, 2022
@Rocketknight1
Copy link
Member

This looks good to me! We just made the same change to several other losses (reshaping the output from a scalar to a tensor with shape (1,)).

My only concern is that I have no idea how the lack of @unpack_inputs was missed by tests, but I'm happy to merge this for now and think about how to expand test coverage afterwards!

@Rocketknight1
Copy link
Member

@Sreyan88 I'm happy with this and I think it's ready to merge now - if you want to make any other changes, now's the time. If not, ping me and I'll merge it!

@Sreyan88
Copy link
Contributor Author

Sreyan88 commented Jul 4, 2022

@Sreyan88 I'm happy with this and I think it's ready to merge now - if you want to make any other changes, now's the time. If not, ping me and I'll merge it!

@Rocketknight1 I'm happy you can merge!

@Rocketknight1 Rocketknight1 merged commit e3139ad into huggingface:main Jul 4, 2022
viclzhu pushed a commit to viclzhu/transformers that referenced this pull request Jul 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Negative CTC loss while training TFWav2Vec2ForCTC model

4 participants