Handling invalid audio generations (for DPO) #43


rfejgin
Collaborator

@rfejgin rfejgin commented Jan 27, 2025

When T5TTS generates very short output audio (on the order of fewer than 1000 samples), the ASR and speaker-similarity calculations fail. The ASR in particular needs at least two feature frames' worth of samples.

Here, we detect the ASR error and write out invalid metrics for the entire batch: a WER/CER of 100%, an SSIM of 0.0, and a predicted transcript of <INVALID>, so that subsequent processing can handle these entries as appropriate.
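To illustrate the batch-level invalidation described above, here is a minimal sketch. It is not the code from this PR: the `Metrics` container, `score_batch`, the `score_fn` callback, and the choice of `RuntimeError` as the exception raised for too-short audio are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

INVALID_TRANSCRIPT = "<INVALID>"


@dataclass
class Metrics:
    """Hypothetical per-utterance metrics record."""
    wer: float
    cer: float
    ssim: float
    pred_transcript: str


def invalid_metrics() -> Metrics:
    # Sentinel values from the PR description: WER/CER of 100%, SSIM of 0.0.
    return Metrics(wer=1.0, cer=1.0, ssim=0.0, pred_transcript=INVALID_TRANSCRIPT)


def score_batch(
    audio_paths: Sequence[str],
    score_fn: Callable[[Sequence[str]], List[Metrics]],
) -> List[Metrics]:
    """Score a batch; if scoring fails, mark every entry in the batch invalid."""
    try:
        return score_fn(audio_paths)
    except RuntimeError:
        # Assumption: the ASR/SSIM step raises RuntimeError on too-short audio.
        # The whole batch is invalidated, not only the offending file.
        return [invalid_metrics() for _ in audio_paths]
```

Invalidating the whole batch is coarser than dropping the single bad file, but it avoids having to work out which item in the batch caused the failure.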

For DPO preference pair creation we skip any record-group that has at least one entry with invalid metrics.
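The group-skipping rule can be sketched as follows. This is only an illustration under assumed field names: `group_id` and `pred_transcript` are hypothetical record keys, and the actual logic in `create_preference_pairs.py` may be organized differently.

```python
from collections import defaultdict
from typing import Dict, List

INVALID_TRANSCRIPT = "<INVALID>"


def group_valid_records(
    records: List[dict], group_key: str = "group_id"
) -> Dict[str, List[dict]]:
    """Group records and drop every group containing at least one invalid entry."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)

    # A single invalid record disqualifies its whole group from pair creation.
    return {
        gid: recs
        for gid, recs in groups.items()
        if all(r["pred_transcript"] != INVALID_TRANSCRIPT for r in recs)
    }
```

Preference pairs would then be built only from the groups this function returns.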

@shehzeen and @paarthneekhara could you review? @paarthneekhara please feel free to merge when ready. Thank you!

When the model generates an output that is very short (less than 2 ASR frames), the ASR and SSIM calculations will error out. We detect the error and invalidate the entire batch, setting WER/CER to 100% and SSIM to 0.0. The transcription is set to "<INVALID>".

Note that the metrics are still written out to the `.metrics` files; they need to be ignored by any subsequent statistics calculations.
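A downstream statistics step could exclude the sentinel entries along these lines. This is a sketch under assumptions: the record keys `cer` and `pred_transcript` and the helper `mean_cer_ignoring_invalid` are hypothetical, and the real `.metrics` file layout is not shown in this PR.

```python
from statistics import mean
from typing import List, Optional, Tuple

INVALID_TRANSCRIPT = "<INVALID>"


def mean_cer_ignoring_invalid(records: List[dict]) -> Tuple[Optional[float], int]:
    """Return (mean CER over valid records, number of skipped invalid records)."""
    valid = [r["cer"] for r in records if r["pred_transcript"] != INVALID_TRANSCRIPT]
    skipped = len(records) - len(valid)
    # With no valid records there is no meaningful mean, so return None.
    return (mean(valid) if valid else None), skipped
```

Reporting the skipped count alongside the mean keeps the number of failed generations visible instead of silently dropping them.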
1. Skip groups that have any invalid records.
2. Allow the number of records to exactly match the number of audio files (vs requiring it to be strictly smaller).
3. Add `tqdm` to indicate progress during long loops.
@github-actions github-actions bot added the TTS label Jan 27, 2025
@rfejgin rfejgin changed the title from "Experimentalt5tts finalizedtransformer" to "Handling invalid audio generations (for DPO)" Jan 27, 2025
…former' into experimentalt5tts_finalizedtransformer
Refining the handling of invalid entries in DPO preference selection.

beep boop 🤖: 🚨 The following files must be fixed before merge!


Your code was analyzed with PyLint. The following annotations have been identified:

************* Module nemo.collections.tts.models.t5tts
nemo/collections/tts/models/t5tts.py:135:0: C0301: Line too long (147/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:157:0: C0301: Line too long (120/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:167:0: C0301: Line too long (124/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:172:0: C0301: Line too long (123/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:175:0: C0301: Line too long (121/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:185:0: C0301: Line too long (123/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:253:0: C0301: Line too long (124/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:254:0: C0301: Line too long (160/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:366:0: C0301: Line too long (122/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:382:0: C0301: Line too long (140/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:393:0: C0301: Line too long (146/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:441:0: C0301: Line too long (139/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:447:0: C0301: Line too long (125/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:448:0: C0301: Line too long (145/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:461:0: C0301: Line too long (121/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:495:0: C0301: Line too long (158/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:504:0: C0301: Line too long (207/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:507:0: C0301: Line too long (204/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:510:0: C0301: Line too long (131/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:511:0: C0301: Line too long (166/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:519:0: C0301: Line too long (132/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:531:0: C0301: Line too long (131/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:601:0: C0301: Line too long (133/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:603:0: C0301: Line too long (125/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:617:0: C0301: Line too long (120/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:618:0: C0301: Line too long (121/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:620:0: C0301: Line too long (134/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:621:0: C0301: Line too long (136/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:623:0: C0301: Line too long (131/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:625:0: C0301: Line too long (149/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:654:0: C0301: Line too long (165/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:655:0: C0301: Line too long (134/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:709:0: C0301: Line too long (138/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:710:0: C0301: Line too long (151/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:712:0: C0301: Line too long (168/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:713:0: C0301: Line too long (149/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:730:0: C0301: Line too long (135/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:739:0: C0301: Line too long (138/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:751:0: C0301: Line too long (129/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:761:0: C0301: Line too long (158/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:762:0: C0301: Line too long (193/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:769:0: C0301: Line too long (135/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:770:0: C0301: Line too long (124/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:794:0: C0301: Line too long (140/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:795:0: C0301: Line too long (127/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:878:0: C0301: Line too long (122/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:885:0: C0301: Line too long (125/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:899:0: C0301: Line too long (121/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:919:0: C0301: Line too long (166/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:934:0: C0301: Line too long (120/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:950:0: C0301: Line too long (123/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:952:0: C0301: Line too long (188/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:973:0: C0301: Line too long (123/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:977:0: C0301: Line too long (130/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1081:0: C0301: Line too long (136/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1088:0: C0301: Line too long (146/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1091:0: C0301: Line too long (161/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1094:0: C0301: Line too long (191/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1096:0: C0301: Line too long (155/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1135:0: C0301: Line too long (127/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1148:0: C0301: Line too long (122/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1168:0: C0301: Line too long (154/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1169:0: C0301: Line too long (176/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1172:0: C0301: Line too long (132/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1197:0: C0301: Line too long (121/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1198:0: C0301: Line too long (123/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1199:0: C0301: Line too long (127/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1200:0: C0301: Line too long (149/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1201:0: C0301: Line too long (143/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1203:0: C0301: Line too long (163/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1208:0: C0301: Line too long (132/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1219:0: C0301: Line too long (206/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1238:0: C0301: Line too long (138/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1240:0: C0301: Line too long (122/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1275:0: C0301: Line too long (143/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1276:0: C0301: Line too long (151/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1278:0: C0301: Line too long (165/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:1279:0: C0301: Line too long (173/119) (line-too-long)
nemo/collections/tts/models/t5tts.py:57:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:87:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:101:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/tts/models/t5tts.py:199:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:203:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:214:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:227:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:239:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:264:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:278:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:291:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:301:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:326:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:339:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:358:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:376:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:393:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:428:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:444:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:454:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:556:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:587:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:676:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:692:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:724:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:822:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:865:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:987:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:1026:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:1138:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/tts/models/t5tts.py:1249:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/tts/models/t5tts.py:15:0: W0611: Unused ceil imported from math (unused-import)
nemo/collections/tts/models/t5tts.py:17:0: W0611: Unused import omegaconf (unused-import)
nemo/collections/tts/models/t5tts.py:43:0: W0611: Unused OmegaConf imported from omegaconf (unused-import)
nemo/collections/tts/models/t5tts.py:52:4: W0611: Unused import wandb (unused-import)
************* Module scripts.t5tts.dpo.create_preference_pairs
scripts/t5tts/dpo/create_preference_pairs.py:12:0: C0301: Line too long (140/119) (line-too-long)
scripts/t5tts/dpo/create_preference_pairs.py:20:0: C0301: Line too long (165/119) (line-too-long)
scripts/t5tts/dpo/create_preference_pairs.py:50:0: C0301: Line too long (124/119) (line-too-long)
scripts/t5tts/dpo/create_preference_pairs.py:54:0: C0301: Line too long (120/119) (line-too-long)
scripts/t5tts/dpo/create_preference_pairs.py:67:0: C0301: Line too long (134/119) (line-too-long)
scripts/t5tts/dpo/create_preference_pairs.py:68:0: C0301: Line too long (130/119) (line-too-long)
scripts/t5tts/dpo/create_preference_pairs.py:226:0: C0301: Line too long (166/119) (line-too-long)
scripts/t5tts/dpo/create_preference_pairs.py:227:0: C0301: Line too long (173/119) (line-too-long)
scripts/t5tts/dpo/create_preference_pairs.py:228:0: C0301: Line too long (209/119) (line-too-long)
scripts/t5tts/dpo/create_preference_pairs.py:236:0: C0301: Line too long (128/119) (line-too-long)
scripts/t5tts/dpo/create_preference_pairs.py:255:0: C0301: Line too long (152/119) (line-too-long)
scripts/t5tts/dpo/create_preference_pairs.py:9:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/t5tts/dpo/create_preference_pairs.py:74:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/t5tts/dpo/create_preference_pairs.py:82:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/t5tts/dpo/create_preference_pairs.py:88:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/t5tts/dpo/create_preference_pairs.py:170:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/t5tts/dpo/create_preference_pairs.py:185:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/t5tts/dpo/create_preference_pairs.py:243:0: C0116: Missing function or method docstring (missing-function-docstring)

-----------------------------------
Your code has been rated at 8.75/10

Thank you for improving NeMo's documentation!

@rfejgin rfejgin marked this pull request as ready for review January 30, 2025 01:55
@paarthneekhara paarthneekhara merged commit d70b903 into paarthneekhara:experimentalt5tts_finalizedtransformer Jan 30, 2025
4 of 8 checks passed
2 participants