Fix inverted conditional in TF common test! #22540
As expected this has raised a few bugs in the cross-test that were silent before - I'll see what I can do in this PR |
gante
left a comment
The change makes sense!
Re broken tests (which probably need to be fixed/skipped before merging) -- it means that the loss calculation has issues, correct?
Most likely - I'll investigate them all soon!
df36a70 to 12cbb74
Quick summary of the fixes needed:
* ESM:
* GPT2: For model classes that take rank-3 inputs (e.g.
* HUBERT: Loss computation especially for CTC overflows a lot with the default labels, which creates lots of
* Wav2Vec2: Same as HUBERT
* XGLM: The PT XGLM model does a weird thing where it shifts labels by 1 and then adds
sgugger
left a comment
Thanks a lot for fixing the condition in the base test and all the subsequent failures.
* Fix inverted conditional in TF common test!
* Make the same change in the PT tests file
* Make sure hidden states for GPT2 have the same output shape in PT/TF
* Minor fix to PT implementation of token classification loss
* Skip loss equivalence test for TFHubert because it keeps overflowing to inf
* Compute LM loss for TF the (weird) way it's computed in PT
* Skip loss equivalence test for Wav2Vec2 for the same reason as Hubert
* Fix - don't try to access the hidden states property when output is a tuple
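For context on the two skipped loss-equivalence tests: once a loss overflows to inf, a tolerance-based comparison stops being meaningful, because inf - inf is nan and every comparison against nan is False. Below is a minimal sketch of that failure mode in plain Python. The helper name `losses_match` and the tolerance are hypothetical and are not taken from the transformers test suite.

```python
import math


def losses_match(pt_loss: float, tf_loss: float, tol: float = 1e-5):
    """Hypothetical helper: compare two scalar losses within a tolerance.

    Returns None when either loss is non-finite, since the comparison
    carries no information in that case.
    """
    if not (math.isfinite(pt_loss) and math.isfinite(tf_loss)):
        return None
    return abs(pt_loss - tf_loss) <= tol


# Finite losses compare as expected.
close = losses_match(0.5, 0.5 + 1e-7)
far = losses_match(0.5, 0.6)

# Overflowed losses: the helper declines to compare them.
overflowed = losses_match(float("inf"), float("inf"))

# A naive check fails even when both frameworks agree on inf,
# because inf - inf evaluates to nan.
naive = abs(float("inf") - float("inf")) <= 1e-5
```

This is only an illustration of why skipping can be the pragmatic choice here; the actual tests may handle non-finite values differently.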
Thank you for the fix @Rocketknight1 ❤️. And I apologize for the mistake I introduced ...
Noticed a rather alarming conditional being backwards in the test_pt_tf_model_equivalence common test. This probably resulted in a lot of tests being skipped!
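To illustrate how a backwards conditional can silently skip checks, here is a minimal sketch. It assumes the test decides whether to run a second, with-labels comparison by checking whether adding labels actually changed the input keys. The function name `select_label_inputs` and its `inverted` flag are hypothetical, and this is not the actual code from the common test file.

```python
def select_label_inputs(inputs: dict, inputs_with_labels: dict, inverted: bool = False):
    """Hypothetical helper: return the labelled inputs only if they add new keys.

    With inverted=True the condition is backwards, so the labelled inputs
    are dropped exactly in the case where labels are present.
    """
    has_extra_keys = bool(set(inputs_with_labels) ^ set(inputs))
    keep = (not has_extra_keys) if inverted else has_extra_keys
    return inputs_with_labels if keep else None


inputs = {"input_ids": [1, 2, 3]}
with_labels = {"input_ids": [1, 2, 3], "labels": [1, 2, 3]}

# Correct condition: the labelled inputs are kept, so the loss gets compared.
correct = select_label_inputs(inputs, with_labels)

# Inverted condition: None comes back, and a caller that treats None as
# "nothing to compare" skips the loss check without raising any error.
buggy = select_label_inputs(inputs, with_labels, inverted=True)
```

Because the skipped branch raises nothing, the test still passes, which is consistent with the previously silent failures surfacing only after the fix.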