
Fix caching bug in causal convolutions for cache-aware ASR models #7034

Merged: 8 commits into NVIDIA:r1.20.0 on Jul 20, 2023

Conversation

VahidooX (Collaborator)

What does this PR do?

Fixed the caching bug in the convolution layers for cache-aware models

Changelog

  • Fixed the caching bug in the convolution layers for cache-aware models

PR Type:

  • New Feature
  • Bugfix
  • Documentation
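The bug this PR fixes concerns the left-context cache that streaming (cache-aware) ASR models carry between chunks of a causal convolution. As a minimal framework-free sketch of the mechanism (this is illustrative NumPy, not NeMo's actual implementation; `causal_conv1d_step` and its signature are invented for this example): a causal 1-D convolution over a chunk must prepend the last `K-1` input frames from the previous chunk, and the cache handed to the next chunk must be exactly the last `K-1` raw input frames.

```python
import numpy as np

def causal_conv1d_step(chunk, weights, cache):
    """One streaming step of a causal 1-D convolution.

    chunk:   new input frames, shape (T,)
    weights: kernel taps, shape (K,); tap 0 multiplies the oldest frame
    cache:   last K-1 input frames from the previous chunk, shape (K-1,)

    Returns (output for this chunk, updated cache).
    """
    K = len(weights)
    # Prepend the cached left context so frame t only sees frames <= t.
    padded = np.concatenate([cache, chunk])
    out = np.array([padded[t:t + K] @ weights for t in range(len(chunk))])
    # The cache must hold the *last* K-1 input frames (not zeros, not
    # frames at the wrong offset), or chunked and offline outputs diverge.
    new_cache = padded[len(chunk):]
    return out, new_cache
```

Run correctly, processing an input in chunks with this cache yields bit-identical output to running the full causal convolution offline, which is the property a caching bug breaks.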

@VahidooX VahidooX requested a review from borisfom July 14, 2023 00:09
@github-actions github-actions bot added the ASR label Jul 14, 2023
@VahidooX VahidooX merged commit 9ba5277 into NVIDIA:r1.20.0 Jul 20, 2023
9 checks passed
VahidooX added a commit that referenced this pull request Jul 20, 2023
jubick1337 pushed a commit that referenced this pull request Aug 8, 2023
ekmb added a commit that referenced this pull request Aug 8, 2023
* Fix race condition when executing with multi-node where some ranks do not wait for setup (#7016)

Signed-off-by: Kim Ngo <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Added bool types to neural_types export (#7032)

Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* rnnt and char utils (#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <[email protected]>

* char level bug

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fix tab text gen (#7022) (#7031)

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <[email protected]>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <[email protected]>

* removed kwargs

Signed-off-by: jubick1337 <[email protected]>

* Updated config desc

Signed-off-by: jubick1337 <[email protected]>

* ASR Confidence update and tutorial (#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <[email protected]>

* tutorial added

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix for a little oops after rebase

Signed-off-by: Aleksandr Laptev <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* unused import removed

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix review comments

Signed-off-by: Aleksandr Laptev <[email protected]>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <[email protected]>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix comments 2

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix config tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <[email protected]>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comments 3

Signed-off-by: Aleksandr Laptev <[email protected]>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <[email protected]>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <[email protected]>

---------

Signed-off-by: Aleksandr Laptev <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* install_bs (#7019) (#7028)

Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fixes for spellmapper (#6994) (#7000)

Signed-off-by: Alexandra Antonova <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* added back the retro documents (#7033)

Signed-off-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Remove pyyaml (#7052) (#7054)

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* st standalone model (#6969)

* st standalone model

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <[email protected]>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <[email protected]>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <[email protected]>

* import ordering fix

Signed-off-by: AlexGrinch <[email protected]>

* yttm for asr removed

Signed-off-by: AlexGrinch <[email protected]>

* logging added

Signed-off-by: AlexGrinch <[email protected]>

* added inference and translate method

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* remove pos emb from state dict for old models (#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <[email protected]>

* fix nmt test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix typo in ASR-TTS tutorial (#7049)

Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed tutorial's name (#7047)

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix documentation for Numba (#7065) (#7077)

* Fix documentation for Numba



* Update force float32 flag dynamically



* Update force float32 flag dynamically



* Fix nemo version



---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Update Frame-VAD doc and fix onnx export (#7076)

* update fvad doc

Signed-off-by: stevehuang52 <[email protected]>

* fix typo

Signed-off-by: stevehuang52 <[email protected]>

* update fvad example

Signed-off-by: stevehuang52 <[email protected]>

* update

Signed-off-by: stevehuang52 <[email protected]>

* fix onnx export

Signed-off-by: stevehuang52 <[email protected]>

* update test

Signed-off-by: stevehuang52 <[email protected]>

* refactor

Signed-off-by: stevehuang52 <[email protected]>

* update doc

Signed-off-by: stevehuang52 <[email protected]>

* update

Signed-off-by: stevehuang52 <[email protected]>

---------

Signed-off-by: stevehuang52 <[email protected]>
Co-authored-by: fayejf <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* update

Signed-off-by: arendu <[email protected]>

---------

Signed-off-by: arendu <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* remove extra

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <[email protected]>

* fix typo

Signed-off-by: fayejf <[email protected]>

* fix link

Signed-off-by: fayejf <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <[email protected]>

---------

Signed-off-by: Daniel Egert <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: jubick1337 <[email protected]>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit a46e325.

Signed-off-by: jubick1337 <[email protected]>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: jubick1337 <[email protected]>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <[email protected]>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <[email protected]>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <[email protected]>
Co-authored-by: David <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fix default attention size (#7141) (#7143)

Signed-off-by: jubick1337 <[email protected]>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <[email protected]>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <[email protected]>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <[email protected]>

* [TTS] Add format validation

Signed-off-by: Ryan <[email protected]>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* freeze (#7152)

Signed-off-by: arendu <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <[email protected]>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <[email protected]>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <[email protected]>

---------

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <[email protected]>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <[email protected]>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <[email protected]>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <[email protected]>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <[email protected]>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <[email protected]>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <[email protected]>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <[email protected]>

* fixes for pr review

Signed-off-by: Alexandra Antonova <[email protected]>

---------

Signed-off-by: Alexandra Antonova <[email protected]>

* fix links for TN (#7117)

Signed-off-by: Evelina <[email protected]>

* update branch (#7135)

Signed-off-by: ericharper <[email protected]>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <[email protected]>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>

---------

Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>

* update branch

Signed-off-by: ericharper <[email protected]>

* fix version

Signed-off-by: ericharper <[email protected]>

* resolve conflict the other way

Signed-off-by: ericharper <[email protected]>

* keep both

Signed-off-by: ericharper <[email protected]>

* revert keep both

Signed-off-by: ericharper <[email protected]>

---------

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <[email protected]>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <[email protected]>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <[email protected]>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and a few occurrences of validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace all occurrences of validation_epoch_end with on_validation_epoch_end

Signed-off-by: Abhishree <[email protected]>
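The hook replacements above follow the standard PyTorch Lightning 2.0 migration pattern: `validation_epoch_end(outputs)` no longer exists, so each `validation_step` appends its result to an instance list and `on_validation_epoch_end` reads and clears that list. A minimal framework-free sketch of the bookkeeping (method names mirror the Lightning hooks, but the class is a plain mock, not a real `LightningModule`, and the loss is a stand-in):

```python
class MigratedModel:
    """Mimics the PyTorch Lightning 2.0 validation-outputs pattern."""

    def __init__(self):
        # PTL 2.0 no longer passes `outputs` into the epoch-end hook,
        # so the model accumulates them itself.
        self.validation_step_outputs = []

    def validation_step(self, batch, batch_idx):
        loss = sum(batch) / len(batch)  # stand-in for a real loss
        self.validation_step_outputs.append(loss)
        return loss

    def on_validation_epoch_end(self):
        outputs = self.validation_step_outputs
        avg = sum(outputs) / len(outputs)
        # Clearing prevents unbounded memory growth across epochs,
        # which is why the commits below add .clear() wherever missing.
        self.validation_step_outputs.clear()
        return avg
```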

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <[email protected]>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <[email protected]>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <[email protected]>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <[email protected]>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <[email protected]>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <[email protected]>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <[email protected]>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <[email protected]>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <[email protected]>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <[email protected]>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step_outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step_outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <[email protected]>
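The multi-dataloader handling described above can be sketched as follows (an illustrative mock, not NeMo's actual code; the class name and the averaging in the epoch-end hook are invented for this example): with several validation dataloaders the outputs become a list of lists indexed by `dataloader_idx`, so metrics from different validation sets are never interleaved.

```python
class MultiLoaderModel:
    """Sketch of per-dataloader output lists for PTL 2.0 hooks."""

    def __init__(self, num_val_dataloaders):
        self.num_val_dataloaders = num_val_dataloaders
        if num_val_dataloaders > 1:
            # One sub-list per dataloader; a flat list would mix
            # outputs from different validation sets.
            self.validation_step_outputs = [[] for _ in range(num_val_dataloaders)]
        else:
            self.validation_step_outputs = []

    def validation_step(self, loss, batch_idx, dataloader_idx=0):
        if self.num_val_dataloaders > 1:
            self.validation_step_outputs[dataloader_idx].append(loss)
        else:
            self.validation_step_outputs.append(loss)

    def on_validation_epoch_end(self):
        if self.num_val_dataloaders > 1:
            means = [sum(o) / len(o) for o in self.validation_step_outputs]
            for o in self.validation_step_outputs:
                o.clear()
            return means
        mean = sum(self.validation_step_outputs) / len(self.validation_step_outputs)
        self.validation_step_outputs.clear()
        return mean
```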

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <[email protected]>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <[email protected]>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <[email protected]>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <[email protected]>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <[email protected]>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <[email protected]>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <[email protected]>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <[email protected]>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <[email protected]>

* Edit Jenkinsfile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <[email protected]>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <[email protected]>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN

Signed-off-by: Abhishree <[email protected]>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <[email protected]>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Comment Jenkins tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Minor edit in jenkins file

Signed-off-by: Abhishree <[email protected]>

* Edit in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <[email protected]>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <[email protected]>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <[email protected]>

* Remove pyyaml from requirements

Signed-off-by: Abhishree <[email protected]>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <[email protected]>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <[email protected]>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <[email protected]>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <[email protected]>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <[email protected]>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <[email protected]>

* Fix overridden functions to match parent class functions

Signed-off-by: Abhishree <[email protected]>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <[email protected]>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <[email protected]>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <[email protected]>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <[email protected]>

* Minor edits

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <[email protected]>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <[email protected]>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <[email protected]>

---------

Signed-off-by: Abhishree <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Include the scripts for preprocessing OASST and unit tests for chat sft datasets (#7112)

* scripts for sft

Signed-off-by: Yi Dong <[email protected]>

* fix style

Signed-off-by: Yi Dong <[email protected]>

* added special token only for huggingface model

Signed-off-by: Yi Dong <[email protected]>

* change default name

Signed-off-by: Yi Dong <[email protected]>

* print out error datapoint content

Signed-off-by: Yi Dong <[email protected]>

* show error id

Signed-off-by: Yi Dong <[email protected]>

* annotation script working

Signed-off-by: Yi Dong <[email protected]>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <[email protected]>

* added examples

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* text to value special case

Signed-off-by: Yi Dong <[email protected]>

* configure the slider

Signed-off-by: Yi Dong <[email protected]>

* annotation handles lang

Signed-off-by: Yi Dong <[email protected]>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <[email protected]>

* used the file in the test dir

Signed-off-by: Yi Dong <[email protected]>

* fix json error

Signed-off-by: Yi Dong <[email protected]>

* load local tokenizer

Signed-off-by: Yi Dong <[email protected]>

* remove mask count check

Signed-off-by: Yi Dong <[email protected]>

* added HF dataset backend

Signed-off-by: Yi Dong <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* add paths to labeler. (#7087)

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Kim Ngo <[email protected]>
Signed-off-by: jubick1337 <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Yi Dong <[email protected]>
Signed-off-by: Aleksandr Laptev <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: smajumdar <[email protected]>
Signed-off-by: AlexGrinch <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: stevehuang52 <[email protected]>
Signed-off-by: arendu <[email protected]>
Signed-off-by: sam1373 <[email protected]>
Signed-off-by: Boris Fomitchev <[email protected]>
Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: ericharper <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: Abhishree <[email protected]>
Co-authored-by: Kim Ngo <[email protected]>
Co-authored-by: tbartley94 <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yi Dong <[email protected]>
Co-authored-by: Aleksandr Laptev <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Adi Renduchintala <[email protected]>
Co-authored-by: Vahid Noroozi <[email protected]>
Co-authored-by: Samuel Kriman <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Co-authored-by: trias702 <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Jan Beckmann <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Co-authored-by: lleaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Ryan Langman <[email protected]>
Co-authored-by: David <[email protected]>
Co-authored-by: Elena Rastorgueva <[email protected]>
Co-authored-by: anteju <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: Abhishree Thittenamane <[email protected]>
dorotat-nv pushed a commit to dorotat-nv/NeMo that referenced this pull request Aug 24, 2023
* Fix race condition when executing with multi-node where some ranks does not wait for setup (NVIDIA#7016)

Signed-off-by: Kim Ngo <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Added bool types to neural_types export (NVIDIA#7032)

Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* rnnt and char utils (NVIDIA#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <[email protected]>

* char level bug

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fix tab text gen (NVIDIA#7022) (NVIDIA#7031)

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <[email protected]>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <[email protected]>

* removed kwargs

Signed-off-by: jubick1337 <[email protected]>

* Updated config desc

Signed-off-by: jubick1337 <[email protected]>

* ASR Confidence update and tutorial (NVIDIA#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <[email protected]>

* tutorial added

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix for a little oops after rebasement

Signed-off-by: Aleksandr Laptev <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* unused import removed

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix review comments

Signed-off-by: Aleksandr Laptev <[email protected]>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <[email protected]>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix comments 2

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix config tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <[email protected]>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comments 3

Signed-off-by: Aleksandr Laptev <[email protected]>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <[email protected]>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <[email protected]>

---------

Signed-off-by: Aleksandr Laptev <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* install_bs (NVIDIA#7019) (NVIDIA#7028)

Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fixes for spellmapper (NVIDIA#6994) (NVIDIA#7000)

Signed-off-by: Alexandra Antonova <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* added back the retro documents (NVIDIA#7033)

Signed-off-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Remove pyyaml (NVIDIA#7052) (NVIDIA#7054)

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* st standalone model (NVIDIA#6969)

* st standalone model

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <[email protected]>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <[email protected]>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <[email protected]>

* import ordering fix

Signed-off-by: AlexGrinch <[email protected]>

* yttm for asr removed

Signed-off-by: AlexGrinch <[email protected]>

* logging added

Signed-off-by: AlexGrinch <[email protected]>

* added inference and translate method

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* remove pos emb from state dict for old models (NVIDIA#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <[email protected]>

* fix nmt test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix typo in ASR-TTS tutorial (NVIDIA#7049)

Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed tutorial's name (NVIDIA#7047)

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix documentation for Numba (NVIDIA#7065) (NVIDIA#7077)

* Fix documentation for Numba

* Update force float32 flag dynamically

* Update force float32 flag dynamically

* Fix nemo version

---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Update Frame-VAD doc and fix onnx export (NVIDIA#7076)

* update fvad doc

Signed-off-by: stevehuang52 <[email protected]>

* fix typo

Signed-off-by: stevehuang52 <[email protected]>

* update fvad example

Signed-off-by: stevehuang52 <[email protected]>

* update

Signed-off-by: stevehuang52 <[email protected]>

* fix onnx export

Signed-off-by: stevehuang52 <[email protected]>

* update test

Signed-off-by: stevehuang52 <[email protected]>

* refactor

Signed-off-by: stevehuang52 <[email protected]>

* update doc

Signed-off-by: stevehuang52 <[email protected]>

* update

Signed-off-by: stevehuang52 <[email protected]>

---------

Signed-off-by: stevehuang52 <[email protected]>
Co-authored-by: fayejf <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* memmap worker arg (NVIDIA#7062)

* memmap worker arg

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* update

Signed-off-by: arendu <[email protected]>

---------

Signed-off-by: arendu <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix caching bug in causal convolutions for cache-aware ASR models (NVIDIA#7034) (NVIDIA#7082)

Co-authored-by: Vahid Noroozi <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fast Conformer global token fix (NVIDIA#7085)

* old way

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* remove extra

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Refined export_config (NVIDIA#7053) (NVIDIA#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* small Bugfix (NVIDIA#7081)

* small Bugfix (NVIDIA#7079)

* fix branch

Signed-off-by: fayejf <[email protected]>

* fix typo

Signed-off-by: fayejf <[email protected]>

* fix link

Signed-off-by: fayejf <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (NVIDIA#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <[email protected]>

---------

Signed-off-by: Daniel Egert <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Adding docs and models for multiple lookahead cache-aware ASR (NVIDIA#7067) (NVIDIA#7094)

Signed-off-by: jubick1337 <[email protected]>

* update TTS readme (NVIDIA#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix absolute path in path join call (NVIDIA#7099)

Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Disable distopt contiguous param buffer by default (NVIDIA#7095)

Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* microphone demo (NVIDIA#7110)

Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [Fix] load_state_dict in nlp_model.py (NVIDIA#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix plot function in vad_utils.py (NVIDIA#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed small bug with NoisePerturbationWithNormalization (NVIDIA#7118)

Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix import guard checks (NVIDIA#7124)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Revert "Fix import guard checks (NVIDIA#7124)" (NVIDIA#7125)

This reverts commit a46e325.

Signed-off-by: jubick1337 <[email protected]>

* Fix import guard checks (NVIDIA#7126)

* Fix import guard checks

Signed-off-by: smajumdar <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Add updated fc ctc and rnnt xxl models (NVIDIA#7128) (NVIDIA#7130)

Signed-off-by: jubick1337 <[email protected]>

* [TTS] Create EnCodec training recipe (NVIDIA#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <[email protected]>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <[email protected]>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (NVIDIA#7061)

Signed-off-by: Kim Ngo <[email protected]>
Co-authored-by: David <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fix default attention size (NVIDIA#7141) (NVIDIA#7143)

Signed-off-by: jubick1337 <[email protected]>

* fix evaluator.py for various exceptions by ast (NVIDIA#7150)

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (NVIDIA#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to distinguish it from 36 finals.

Signed-off-by: Xuesong Yang <[email protected]>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <[email protected]>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS] Add output audio format to preprocessing (NVIDIA#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <[email protected]>

* [TTS] Add format validation

Signed-off-by: Ryan <[email protected]>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* freeze (NVIDIA#7152)

Signed-off-by: arendu <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* make sure any empty segments are removed (NVIDIA#7155)

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Update RIR generation scripts (NVIDIA#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* A quickstart speech enhancement tutorial (NVIDIA#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* NFA subtitle file config - specify colors and vertical alignment (NVIDIA#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <[email protected]>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <[email protected]>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <[email protected]>

---------

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Eagerly accumulate embedding grads into fp32 buffer (NVIDIA#6958) (NVIDIA#7153)

Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* TE bug fix (NVIDIA#7027) (NVIDIA#7036)

Signed-off-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS] Remove nested TTS configs (NVIDIA#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <[email protected]>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <[email protected]>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <[email protected]>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Merge release r1.20.0 to main (NVIDIA#7167)

* update package info

Signed-off-by: ericharper <[email protected]>

* Add ASR with TTS Tutorial. Fix enhancer usage. (NVIDIA#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <[email protected]>

* install_bs (NVIDIA#7019)

Signed-off-by: Nikolay Karpov <[email protected]>

* Fix typo and branch in tutorial (NVIDIA#7048)

Signed-off-by: Vladimir Bataev <[email protected]>

* fix syntax error introduced in PR-7079 (NVIDIA#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <[email protected]>

* fixes for pr review

Signed-off-by: Alexandra Antonova <[email protected]>

---------

Signed-off-by: Alexandra Antonova <[email protected]>

* fix links for TN (NVIDIA#7117)

Signed-off-by: Evelina <[email protected]>

* update branch (NVIDIA#7135)

Signed-off-by: ericharper <[email protected]>

* Fixed main and merging this to r1.20 (NVIDIA#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <[email protected]>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>

---------

Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>

* update branch

Signed-off-by: ericharper <[email protected]>

* fix version

Signed-off-by: ericharper <[email protected]>

* resolve conflict the other way

Signed-off-by: ericharper <[email protected]>

* keep both

Signed-off-by: ericharper <[email protected]>

* revert keep both

Signed-off-by: ericharper <[email protected]>

---------

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Upgrade to pytorch lightning 2.0 (NVIDIA#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <[email protected]>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <[email protected]>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <[email protected]>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and a few occurrences of validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace all occurrences of validation_epoch_end with on_validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <[email protected]>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <[email protected]>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <[email protected]>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <[email protected]>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <[email protected]>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <[email protected]>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <[email protected]>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <[email protected]>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <[email protected]>
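
The commits above repeatedly apply one PTL 2.0 migration pattern: step hooks append to an instance list, and the epoch-end hook aggregates and clears it, since `on_validation_epoch_end` no longer receives an `outputs` argument. A minimal sketch of that pattern (`ToyModel` and its loss are hypothetical stand-ins, not NeMo code):

```python
class ToyModel:
    def __init__(self):
        # instance-level accumulator replaces the removed `outputs` hook argument
        self.validation_step_outputs = []

    def validation_step(self, batch, batch_idx):
        loss = sum(batch) / len(batch)  # stand-in for a real loss computation
        self.validation_step_outputs.append(loss)
        return loss

    def on_validation_epoch_end(self):
        avg_loss = sum(self.validation_step_outputs) / len(self.validation_step_outputs)
        # clear to free memory before the next epoch
        self.validation_step_outputs.clear()
        return avg_loss
```

The explicit `.clear()` at epoch end is what several of these commits add where it was missing.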

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <[email protected]>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <[email protected]>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step_outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step_outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <[email protected]>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <[email protected]>
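
A sketch of the multi-dataloader condition described above, assuming the convention these commits adopt (one sub-list per dataloader when several exist, a flat list otherwise; the helper name is hypothetical):

```python
def append_step_output(step_outputs, value, dataloader_idx=0):
    # With several dataloaders, step_outputs holds one list per dataloader
    # and must be indexed by dataloader_idx; with a single dataloader it is
    # a flat list and is appended to directly. Checking the element type
    # leaves the single-dataloader case unchanged.
    if step_outputs and isinstance(step_outputs[0], list):
        step_outputs[dataloader_idx].append(value)
    else:
        step_outputs.append(value)
```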

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <[email protected]>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <[email protected]>
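
PTL 2.0 renames the mixed-precision flags, so the checks in these commits must accept both spellings. A sketch of the idea (function names are illustrative, not the actual NeMo helpers):

```python
def uses_fp16(precision):
    # PTL 1.x accepted 16 / "16"; PTL 2.0 uses "16-mixed". Comparing against
    # the full set of spellings avoids indexing into the string (e.g.
    # precision[:2]), which breaks when precision is an int.
    return str(precision) in ("16", "16-mixed")

def uses_bf16(precision):
    return str(precision) in ("bf16", "bf16-mixed")
```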

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <[email protected]>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove output args in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <[email protected]>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <[email protected]>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <[email protected]>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <[email protected]>

* Edit JenkinsFile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <[email protected]>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <[email protected]>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN

Signed-off-by: Abhishree <[email protected]>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <[email protected]>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Comment Jenkins tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Minor edit in jenkins file

Signed-off-by: Abhishree <[email protected]>

* Edit in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <[email protected]>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <[email protected]>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <[email protected]>
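
A sketch of the try/except guard this commit describes, under the assumption that the step receives a raw iterator and must return cleanly once it is exhausted rather than letting StopIteration escape the hook (the body is a hypothetical stand-in for the real forward pass):

```python
def validation_step(dataloader_iter):
    # dataloader_iter-based models pull their own batches; when the iterator
    # runs out, exit the step quietly instead of propagating StopIteration.
    try:
        batch = next(dataloader_iter)
    except StopIteration:
        return None
    return sum(batch)  # stand-in for the real forward/loss computation
```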

* Remove pyyaml from requirements

Signed-off-by: Abhishree <[email protected]>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <[email protected]>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <[email protected]>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <[email protected]>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <[email protected]>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <[email protected]>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <[email protected]>

* Fix overridden functions to match parent class functions

Signed-off-by: Abhishree <[email protected]>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <[email protected]>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <[email protected]>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <[email protected]>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <[email protected]>

* Minor edits

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <[email protected]>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <[email protected]>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <[email protected]>

---------

Signed-off-by: Abhishree <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (NVIDIA#7112)

* scripts for sft

Signed-off-by: Yi Dong <[email protected]>

* fix style

Signed-off-by: Yi Dong <[email protected]>

* add special token only for huggingface model

Signed-off-by: Yi Dong <[email protected]>

* change default name

Signed-off-by: Yi Dong <[email protected]>

* print out error datapoint content

Signed-off-by: Yi Dong <[email protected]>

* show error id

Signed-off-by: Yi Dong <[email protected]>

* annotation script working

Signed-off-by: Yi Dong <[email protected]>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <[email protected]>

* added examples

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* text to value special case

Signed-off-by: Yi Dong <[email protected]>

* configure the slider

Signed-off-by: Yi Dong <[email protected]>

* annotation handles lang

Signed-off-by: Yi Dong <[email protected]>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <[email protected]>

* used the file in the test dir

Signed-off-by: Yi Dong <[email protected]>

* fix json error

Signed-off-by: Yi Dong <[email protected]>

* load local tokenizer

Signed-off-by: Yi Dong <[email protected]>

* remove mask count check

Signed-off-by: Yi Dong <[email protected]>

* added HF dataset backend

Signed-off-by: Yi Dong <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* add paths to labeler. (NVIDIA#7087)

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Kim Ngo <[email protected]>
Signed-off-by: jubick1337 <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Yi Dong <[email protected]>
Signed-off-by: Aleksandr Laptev <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: smajumdar <[email protected]>
Signed-off-by: AlexGrinch <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: stevehuang52 <[email protected]>
Signed-off-by: arendu <[email protected]>
Signed-off-by: sam1373 <[email protected]>
Signed-off-by: Boris Fomitchev <[email protected]>
Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: ericharper <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: Abhishree <[email protected]>
Co-authored-by: Kim Ngo <[email protected]>
Co-authored-by: tbartley94 <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yi Dong <[email protected]>
Co-authored-by: Aleksandr Laptev <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Adi Renduchintala <[email protected]>
Co-authored-by: Vahid Noroozi <[email protected]>
Co-authored-by: Samuel Kriman <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Co-authored-by: trias702 <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Jan Beckmann <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Co-authored-by: lleaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Ryan Langman <[email protected]>
Co-authored-by: David <[email protected]>
Co-authored-by: Elena Rastorgueva <[email protected]>
Co-authored-by: anteju <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: Abhishree Thittenamane <[email protected]>
Signed-off-by: dorotat <[email protected]>
Davood-M added a commit that referenced this pull request Aug 24, 2023
* migrated class

Signed-off-by: dorotat <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: dorotat <[email protected]>

* added unit test

Signed-off-by: dorotat <[email protected]>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* update

Signed-off-by: arendu <[email protected]>

---------

Signed-off-by: arendu <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <[email protected]>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* remove extra

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <[email protected]>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Signed-off-by: dorotat <[email protected]>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <[email protected]>

* fix typo

Signed-off-by: fayejf <[email protected]>

* fix link

Signed-off-by: fayejf <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <[email protected]>

---------

Signed-off-by: Daniel Egert <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: dorotat <[email protected]>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: dorotat <[email protected]>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: dorotat <[email protected]>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <[email protected]>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit a46e3251944642f9102aa16ce2d2f9d3a804ff8a.

Signed-off-by: dorotat <[email protected]>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <[email protected]>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: dorotat <[email protected]>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <[email protected]>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <[email protected]>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <[email protected]>
Co-authored-by: David <[email protected]>
Signed-off-by: dorotat <[email protected]>

* fix default attention size (#7141) (#7143)

Signed-off-by: dorotat <[email protected]>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: dorotat <[email protected]>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to distinguish from 36 finals.

Signed-off-by: Xuesong Yang <[email protected]>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <[email protected]>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: dorotat <[email protected]>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <[email protected]>

* [TTS] Add format validation

Signed-off-by: Ryan <[email protected]>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: dorotat <[email protected]>

* freeze (#7152)

Signed-off-by: arendu <[email protected]>
Signed-off-by: dorotat <[email protected]>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: dorotat <[email protected]>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: dorotat <[email protected]>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <[email protected]>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <[email protected]>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <[email protected]>

---------

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Signed-off-by: dorotat <[email protected]>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: dorotat <[email protected]>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <[email protected]>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <[email protected]>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <[email protected]>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <[email protected]>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <[email protected]>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <[email protected]>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <[email protected]>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <[email protected]>

* fixes for pr review

Signed-off-by: Alexandra Antonova <[email protected]>

---------

Signed-off-by: Alexandra Antonova <[email protected]>

* fix links for TN (#7117)

Signed-off-by: Evelina <[email protected]>

* update branch (#7135)

Signed-off-by: ericharper <[email protected]>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <[email protected]>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>

---------

Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>

* update branch

Signed-off-by: ericharper <[email protected]>

* fix version

Signed-off-by: ericharper <[email protected]>

* resolve conflict the other way

Signed-off-by: ericharper <[email protected]>

* keep both

Signed-off-by: ericharper <[email protected]>

* revert keep both

Signed-off-by: ericharper <[email protected]>

---------

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <[email protected]>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <[email protected]>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <[email protected]>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and a few occurrences of validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace all occurrences of validation_epoch_end with on_validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <[email protected]>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <[email protected]>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <[email protected]>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <[email protected]>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <[email protected]>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <[email protected]>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <[email protected]>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <[email protected]>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <[email protected]>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <[email protected]>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step_outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step_outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <[email protected]>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <[email protected]>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <[email protected]>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <[email protected]>
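
These precision fixes stem from PTL 2.0 reporting `trainer.precision` as strings such as `'16-mixed'` or `'bf16-mixed'`, so old checks like `precision == 16` silently fail. A hedged sketch of the kind of check the commits converge on (the exact helper names here are illustrative, not NeMo's actual API):

```python
def is_fp16_precision(precision):
    """True for any fp16 setting: handles both the old numeric value (16)
    and the new PTL 2.0 string forms. Illustrative only."""
    return str(precision) in ("16", "16-mixed", "16-true")


def is_bf16_precision(precision):
    """True for any bfloat16 setting, old or new string form."""
    return str(precision) in ("bf16", "bf16-mixed", "bf16-true")
```

Casting to `str` first also avoids the TypeError mentioned below when the value arrives as an int in one code path and a string in another.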

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <[email protected]>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <[email protected]>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <[email protected]>
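
The multi-dataloader bookkeeping described in the last few commits boils down to: keep one output list per dataloader when there are several, and a flat list otherwise, with appends routed by `dataloader_idx`. A small sketch of that scheme (hypothetical helpers, not NeMo's actual functions):

```python
def make_step_outputs(num_dataloaders):
    """One output list per dataloader when there are several, else a flat
    list, mirroring the condition checks described above (a sketch)."""
    if num_dataloaders > 1:
        return [[] for _ in range(num_dataloaders)]
    return []


def append_output(step_outputs, value, dataloader_idx=0):
    """Route a step's output to the right bucket based on the list shape."""
    if step_outputs and isinstance(step_outputs[0], list):
        step_outputs[dataloader_idx].append(value)  # multi-dataloader case
    else:
        step_outputs.append(value)  # single dataloader: flat list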

* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <[email protected]>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <[email protected]>

* Edit JenkinsFile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <[email protected]>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <[email protected]>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN

Signed-off-by: Abhishree <[email protected]>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <[email protected]>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Minor edit in jenkins file

Signed-off-by: Abhishree <[email protected]>

* Edit in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <[email protected]>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <[email protected]>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <[email protected]>
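
With PTL 2.0, models driven by a `dataloader_iter` receive the iterator itself in the step, so exhaustion must be caught explicitly rather than letting `StopIteration` escape. A minimal sketch of the guarded fetch (the function and callback names are illustrative):

```python
def validation_step_with_iter(dataloader_iter, compute_loss):
    """Guarded fetch from dataloader_iter: catch StopIteration explicitly
    when the epoch is exhausted, as described above (a sketch)."""
    try:
        batch = next(dataloader_iter)
    except StopIteration:
        return None  # epoch is over; skip this step gracefully
    return compute_loss(batch)
```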

* Remove pyyaml from requirements

Signed-off-by: Abhishree <[email protected]>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <[email protected]>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <[email protected]>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <[email protected]>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <[email protected]>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <[email protected]>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <[email protected]>

* Fix overridden functions to match parent class functions

Signed-off-by: Abhishree <[email protected]>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <[email protected]>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <[email protected]>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <[email protected]>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <[email protected]>

* Minor edits

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <[email protected]>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <[email protected]>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <[email protected]>

---------

Signed-off-by: Abhishree <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <[email protected]>

* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (#7112)

* scripts for sft

Signed-off-by: Yi Dong <[email protected]>

* fix style

Signed-off-by: Yi Dong <[email protected]>

* added special token only for huggingface model

Signed-off-by: Yi Dong <[email protected]>

* change default name

Signed-off-by: Yi Dong <[email protected]>

* print out error datapoint content

Signed-off-by: Yi Dong <[email protected]>

* show error id

Signed-off-by: Yi Dong <[email protected]>

* annotation script working

Signed-off-by: Yi Dong <[email protected]>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <[email protected]>

* added examples

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* text to value special case

Signed-off-by: Yi Dong <[email protected]>

* configure the slider

Signed-off-by: Yi Dong <[email protected]>

* annotation handles lang

Signed-off-by: Yi Dong <[email protected]>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <[email protected]>

* used the file in the test dir

Signed-off-by: Yi Dong <[email protected]>

* fix json error

Signed-off-by: Yi Dong <[email protected]>

* load local tokenizer

Signed-off-by: Yi Dong <[email protected]>

* remove mask count check

Signed-off-by: Yi Dong <[email protected]>

* added HF dataset backend

Signed-off-by: Yi Dong <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <[email protected]>

* add paths to labeler. (#7087)

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: dorotat <[email protected]>

* T5 metrics fix (#7037)

* Fix race condition when executing with multi-node where some ranks does not wait for setup (#7016)

Signed-off-by: Kim Ngo <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Added bool types to neural_types export (#7032)

Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* rnnt and char utils (#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <[email protected]>

* char level bug

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fix tab text gen (#7022) (#7031)

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <[email protected]>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <[email protected]>

* removed kwagrs

Signed-off-by: jubick1337 <[email protected]>

* Updated config desc

Signed-off-by: jubick1337 <[email protected]>

* ASR Confidence update and tutorial (#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <[email protected]>

* tutorial added

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix for a little oops after rebase

Signed-off-by: Aleksandr Laptev <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* unused import removed

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix review comments

Signed-off-by: Aleksandr Laptev <[email protected]>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <[email protected]>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix comments 2

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix config tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <[email protected]>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comments 3

Signed-off-by: Aleksandr Laptev <[email protected]>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <[email protected]>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <[email protected]>

---------

Signed-off-by: Aleksandr Laptev <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* install_bs (#7019) (#7028)

Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fixes for spellmapper (#6994) (#7000)

Signed-off-by: Alexandra Antonova <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* added back the retro documents (#7033)

Signed-off-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Remove pyyaml (#7052) (#7054)

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* st standalone model (#6969)

* st standalone model

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <[email protected]>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <[email protected]>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <[email protected]>

* import ordering fix

Signed-off-by: AlexGrinch <[email protected]>

* yttm for asr removed

Signed-off-by: AlexGrinch <[email protected]>

* logging added

Signed-off-by: AlexGrinch <[email protected]>

* added inference and translate method

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* remove pos emb from state dict for old models (#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <[email protected]>

* fix nmt test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix typo in ASR-TTS tutorial (#7049)

Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed tutorial's name (#7047)

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix documentation for Numba (#7065) (#7077)

* Fix documentation for Numba

* Update force float32 flag dynamically

* Update force float32 flag dynamically

* Fix nemo version

---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Update Frame-VAD doc and fix onnx export (#7076)

* update fvad doc

Signed-off-by: stevehuang52 <[email protected]>

* fix typo

Signed-off-by: stevehuang52 <[email protected]>

* update fvad example

Signed-off-by: stevehuang52 <[email protected]>

* update

Signed-off-by: stevehuang52 <[email protected]>

* fix onnx export

Signed-off-by: stevehuang52 <[email protected]>

* update test

Signed-off-by: stevehuang52 <[email protected]>

* refactor

Signed-off-by: stevehuang52 <[email protected]>

* update doc

Signed-off-by: stevehuang52 <[email protected]>

* update

Signed-off-by: stevehuang52 <[email protected]>

---------

Signed-off-by: stevehuang52 <[email protected]>
Co-authored-by: fayejf <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* update

Signed-off-by: arendu <[email protected]>

---------

Signed-off-by: arendu <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* remove extra

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <[email protected]>

* fix typo

Signed-off-by: fayejf <[email protected]>

* fix link

Signed-off-by: fayejf <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <[email protected]>

---------

Signed-off-by: Daniel Egert <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: jubick1337 <[email protected]>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit a46e3251944642f9102aa16ce2d2f9d3a804ff8a.

Signed-off-by: jubick1337 <[email protected]>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: jubick1337 <[email protected]>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <[email protected]>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <[email protected]>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <[email protected]>
Co-authored-by: David <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fix default attention size (#7141) (#7143)

Signed-off-by: jubick1337 <[email protected]>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <[email protected]>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <[email protected]>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <[email protected]>

* [TTS] Add format validation

Signed-off-by: Ryan <[email protected]>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* freeze (#7152)

Signed-off-by: arendu <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <[email protected]>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <[email protected]>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <[email protected]>

---------

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <[email protected]>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <[email protected]>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <[email protected]>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <[email protected]>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <[email protected]>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <[email protected]>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <[email protected]>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <[email protected]>

* fixes for pr review

Signed-off-by: Alexandra Antonova <[email protected]>

---------

Signed-off-by: Alexandra Antonova <[email protected]>

* fix links for TN (#7117)

Signed-off-by: Evelina <[email protected]>

* update branch (#7135)

Signed-off-by: ericharper <[email protected]>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <[email protected]>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>

---------

Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>

* update branch

Signed-off-by: ericharper <[email protected]>

* fix version

Signed-off-by: ericharper <[email protected]>

* resolve conflict the other way

Signed-off-by: ericharper <[email protected]>

* keep both

Signed-off-by: ericharper <[email protected]>

* revert keep both

Signed-off-by: ericharper <[email protected]>

---------

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <[email protected]>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <[email protected]>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <[email protected]>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and a few occurrences of validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace all occurrences of validation_epoch_end with on_validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <[email protected]>
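
The epoch-end hook migration in the commits above follows the standard PTL 2.0 pattern: the `on_*_epoch_end` hooks no longer receive an `outputs` argument, so each step appends to an instance list that the hook reads and then clears. A minimal sketch of that pattern, using a plain class instead of a real `LightningModule` so it stands alone:

```python
# Sketch of the PTL 2.0 epoch-end pattern: step outputs are accumulated on
# the module and cleared in on_validation_epoch_end. Plain class for
# illustration; a real model would subclass pytorch_lightning.LightningModule.
class ValidationOutputsMixin:
    def __init__(self):
        self.validation_step_outputs = []

    def validation_step(self, batch, batch_idx):
        loss = sum(batch) / len(batch)  # stand-in for a real loss
        self.validation_step_outputs.append(loss)
        return loss

    def on_validation_epoch_end(self):
        outputs = self.validation_step_outputs
        avg_loss = sum(outputs) / len(outputs)
        self.validation_step_outputs.clear()  # free memory for the next epoch
        return avg_loss


model = ValidationOutputsMixin()
for i, batch in enumerate([[1.0, 3.0], [2.0, 4.0]]):
    model.validation_step(batch, i)
avg = model.on_validation_epoch_end()
```

The explicit `clear()` is what several later commits in this log add "wherever missing" — without it the list grows across epochs.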

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <[email protected]>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <[email protected]>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <[email protected]>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <[email protected]>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <[email protected]>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <[email protected]>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <[email protected]>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <[email protected]>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <[email protected]>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as it's removed in PTL 2.0

Signed-off-by: Abhishree <[email protected]>
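
The `dataloader_idx` indexing described in the commit above can be sketched as follows — with multiple validation dataloaders the outputs become a list of lists. Names here are illustrative, not NeMo's actual attributes:

```python
# Per-dataloader output accumulation: one sub-list per validation dataloader,
# indexed by dataloader_idx; a single dataloader keeps a flat list.
num_dataloaders = 2
validation_step_outputs = (
    [[] for _ in range(num_dataloaders)] if num_dataloaders > 1 else []
)

def store_validation_output(output, dataloader_idx=0):
    if num_dataloaders > 1:
        validation_step_outputs[dataloader_idx].append(output)
    else:
        validation_step_outputs.append(output)

store_validation_output(0.5, dataloader_idx=0)
store_validation_output(0.7, dataloader_idx=1)
```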

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <[email protected]>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <[email protected]>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <[email protected]>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <[email protected]>
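
The precision handling changed twice in these commits: a cast to `str` (to avoid a TypeError when precision is an int) and acceptance of the new PTL 2.0 spellings `16-mixed`/`bf16-mixed`. A hedged sketch of such a check — an illustrative helper, not NeMo's actual code:

```python
def is_fp16(precision) -> bool:
    """Treat the PTL 1.x spellings ('16', 16) and PTL 2.0 ('16-mixed') alike.

    Casting to str first avoids a TypeError when precision arrives as an int.
    """
    return str(precision) in ("16", "16-mixed")

def is_bf16(precision) -> bool:
    return str(precision) in ("bf16", "bf16-mixed")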

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <[email protected]>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <[email protected]>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <[email protected]>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <[email protected]>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <[email protected]>

* Edit JenkinsFile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <[email protected]>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <[email protected]>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN

Signed-off-by: Abhishree <[email protected]>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <[email protected]>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Minor edit in jenkins file

Signed-off-by: Abhishree <[email protected]>

* Edit in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <[email protected]>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <[email protected]>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <[email protected]>
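
Guarding a `validation_step` that pulls from `dataloader_iter`, as the commit above describes, can look like this sketch (illustrative, not the exact NeMo code):

```python
def validation_step(dataloader_iter):
    # Models that consume an iterator directly can be handed an exhausted
    # iterator under PTL 2.0; returning None instead of letting StopIteration
    # escape keeps the validation loop from crashing.
    try:
        batch = next(dataloader_iter)
    except StopIteration:
        return None
    return sum(batch)  # stand-in for the real forward/loss computation

it = iter([[1, 2], [3, 4]])
first = validation_step(it)
second = validation_step(it)
exhausted = validation_step(it)
```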

* Remove pyyaml from requirements

Signed-off-by: Abhishree <[email protected]>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <[email protected]>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <[email protected]>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <[email protected]>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <[email protected]>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmai…
zhehuaichen added a commit to zhehuaichen/NeMo that referenced this pull request Sep 22, 2023
* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <[email protected]>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <[email protected]>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit a46e3251944642f9102aa16ce2d2f9d3a804ff8a.

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <[email protected]>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <[email protected]>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <[email protected]>
Co-authored-by: David <[email protected]>

* fix default attention size (#7141) (#7143)

* fix evaluator.py to handle various exceptions raised by ast (#7150)

Signed-off-by: He Huang (Steve) <[email protected]>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <[email protected]>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <[email protected]>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <[email protected]>

* [TTS] Add format validation

Signed-off-by: Ryan <[email protected]>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>

* freeze (#7152)

Signed-off-by: arendu <[email protected]>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <[email protected]>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <[email protected]>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <[email protected]>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <[email protected]>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <[email protected]>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <[email protected]>

---------

Signed-off-by: Elena Rastorgueva <[email protected]>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: Tim Moon <[email protected]>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <[email protected]>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <[email protected]>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <[email protected]>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <[email protected]>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <[email protected]>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <[email protected]>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <[email protected]>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <[email protected]>

* fixes for pr review

Signed-off-by: Alexandra Antonova <[email protected]>

---------

Signed-off-by: Alexandra Antonova <[email protected]>

* fix links for TN (#7117)

Signed-off-by: Evelina <[email protected]>

* update branch (#7135)

Signed-off-by: ericharper <[email protected]>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <[email protected]>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>

---------

Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>

* update branch

Signed-off-by: ericharper <[email protected]>

* fix version

Signed-off-by: ericharper <[email protected]>

* resolve conflict the other way

Signed-off-by: ericharper <[email protected]>

* keep both

Signed-off-by: ericharper <[email protected]>

* revert keep both

Signed-off-by: ericharper <[email protected]>

---------

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <[email protected]>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <[email protected]>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <[email protected]>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and a few occurrences of validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace all occurrences of validation_epoch_end with on_validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <[email protected]>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <[email protected]>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <[email protected]>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <[email protected]>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <[email protected]>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <[email protected]>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <[email protected]>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <[email protected]>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <[email protected]>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <[email protected]>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as it's removed in PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <[email protected]>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <[email protected]>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <[email protected]>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <[email protected]>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <[email protected]>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <[email protected]>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <[email protected]>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <[email protected]>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <[email protected]>

* Edit JenkinsFile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <[email protected]>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <[email protected]>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN

Signed-off-by: Abhishree <[email protected]>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <[email protected]>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Minor edit in jenkins file

Signed-off-by: Abhishree <[email protected]>

* Edit in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <[email protected]>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <[email protected]>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <[email protected]>

* Remove pyyaml from requirements

Signed-off-by: Abhishree <[email protected]>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <[email protected]>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <[email protected]>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <[email protected]>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <[email protected]>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <[email protected]>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <[email protected]>

* Fix overridden functions to match parent class functions

Signed-off-by: Abhishree <[email protected]>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <[email protected]>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <[email protected]>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <[email protected]>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <[email protected]>

* Minor edits

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <[email protected]>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <[email protected]>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <[email protected]>

---------

Signed-off-by: Abhishree <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Include the scripts for preprocessing OASST and unit tests for chat sft datasets (#7112)

* scripts for sft

Signed-off-by: Yi Dong <[email protected]>

* fix style

Signed-off-by: Yi Dong <[email protected]>

* added special token only for huggingface model

Signed-off-by: Yi Dong <[email protected]>

* change default name

Signed-off-by: Yi Dong <[email protected]>

* print out error datapoint content

Signed-off-by: Yi Dong <[email protected]>

* show error id

Signed-off-by: Yi Dong <[email protected]>

* annotation script working

Signed-off-by: Yi Dong <[email protected]>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <[email protected]>

* added examples

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* text to value special case

Signed-off-by: Yi Dong <[email protected]>

* configure the slider

Signed-off-by: Yi Dong <[email protected]>

* annotation handles lang

Signed-off-by: Yi Dong <[email protected]>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <[email protected]>

* used the file in the test dir

Signed-off-by: Yi Dong <[email protected]>

* fix json error

Signed-off-by: Yi Dong <[email protected]>

* load local tokenizer

Signed-off-by: Yi Dong <[email protected]>

* remove mask count check

Signed-off-by: Yi Dong <[email protected]>

* added HF dataset backend

Signed-off-by: Yi Dong <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* add paths to labeler. (#7087)

Signed-off-by: Xuesong Yang <[email protected]>

* T5 metrics fix (#7037)

* Fix race condition when executing with multi-node where some ranks does not wait for setup (#7016)

Signed-off-by: Kim Ngo <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Added bool types to neural_types export (#7032)

Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* rnnt and char utils (#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <[email protected]>

* char level bug

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fix tab text gen (#7022) (#7031)

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <[email protected]>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <[email protected]>

* removed kwargs

Signed-off-by: jubick1337 <[email protected]>

* Updated config desc

Signed-off-by: jubick1337 <[email protected]>

* ASR Confidence update and tutorial (#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <[email protected]>

* tutorial added

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix for a little oops after rebase

Signed-off-by: Aleksandr Laptev <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* unused import removed

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix review comments

Signed-off-by: Aleksandr Laptev <[email protected]>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <[email protected]>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix comments 2

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix config tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <[email protected]>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comments 3

Signed-off-by: Aleksandr Laptev <[email protected]>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <[email protected]>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <[email protected]>

---------

Signed-off-by: Aleksandr Laptev <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* install_bs (#7019) (#7028)

Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fixes for spellmapper (#6994) (#7000)

Signed-off-by: Alexandra Antonova <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* added back the retro documents (#7033)

Signed-off-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Remove pyyaml (#7052) (#7054)

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* st standalone model (#6969)

* st standalone model

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <[email protected]>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <[email protected]>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <[email protected]>

* import ordering fix

Signed-off-by: AlexGrinch <[email protected]>

* yttm for asr removed

Signed-off-by: AlexGrinch <[email protected]>

* logging added

Signed-off-by: AlexGrinch <[email protected]>

* added inference and translate method

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* remove pos emb from state dict for old models (#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <[email protected]>

* fix nmt test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix typo in ASR-TTS tutorial (#7049)

Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed tutorial's name (#7047)

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix documentation for Numba (#7065) (#7077)

* Fix documentation for Numba



* Update force float32 flag dynamically



* Update force float32 flag dynamically



* Fix nemo version



---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Update Frame-VAD doc and fix onnx export (#7076)

* update fvad doc

Signed-off-by: stevehuang52 <[email protected]>

* fix typo

Signed-off-by: stevehuang52 <[email protected]>

* update fvad example

Signed-off-by: stevehuang52 <[email protected]>

* update

Signed-off-by: stevehuang52 <[email protected]>

* fix onnx export

Signed-off-by: stevehuang52 <[email protected]>

* update test

Signed-off-by: stevehuang52 <[email protected]>

* refactor

Signed-off-by: stevehuang52 <[email protected]>

* update doc

Signed-off-by: stevehuang52 <[email protected]>

* update

Signed-off-by: stevehuang52 <[email protected]>

---------

Signed-off-by: stevehuang52 <[email protected]>
Co-authored-by: fayejf <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* update

Signed-off-by: arendu <[email protected]>

---------

Signed-off-by: arendu <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* remove extra

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <[email protected]>

* fix typo

Signed-off-by: fayejf <[email protected]>

* fix link

Signed-off-by: fayejf <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <[email protected]>

---------

Signed-off-by: Daniel Egert <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: jubick1337 <[email protected]>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit a46e3251944642f9102aa16ce2d2f9d3a804ff8a.

Signed-off-by: jubick1337 <[email protected]>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: jubick1337 <[email protected]>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <[email protected]>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <[email protected]>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <[email protected]>
Co-authored-by: David <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fix default attention size (#7141) (#7143)

Signed-off-by: jubick1337 <[email protected]>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to distinguish from 36 finals.

Signed-off-by: Xuesong Yang <[email protected]>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <[email protected]>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <[email protected]>

* [TTS] Add format validation

Signed-off-by: Ryan <[email protected]>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* freeze (#7152)

Signed-off-by: arendu <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <[email protected]>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <[email protected]>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <[email protected]>

---------

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <[email protected]>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <[email protected]>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <[email protected]>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <[email protected]>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <[email protected]>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <[email protected]>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <[email protected]>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <[email protected]>

* fixes for pr review

Signed-off-by: Alexandra Antonova <[email protected]>

---------

Signed-off-by: Alexandra Antonova <[email protected]>

* fix links for TN (#7117)

Signed-off-by: Evelina <[email protected]>

* update branch (#7135)

Signed-off-by: ericharper <[email protected]>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <[email protected]>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>

---------

Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>

* update branch

Signed-off-by: ericharper <[email protected]>

* fix version

Signed-off-by: ericharper <[email protected]>

* resolve conflict the other way

Signed-off-by: ericharper <[email protected]>

* keep both

Signed-off-by: ericharper <[email protected]>

* revert keep both

Signed-off-by: ericharper <[email protected]>

---------

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <[email protected]>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <[email protected]>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <[email protected]>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace all occurances of validation_epoch_end to on_validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <[email protected]>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <[email protected]>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <[email protected]>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <[email protected]>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <[email protected]>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <[email protected]>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <[email protected]>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <[email protected]>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <[email protected]>
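The pattern these commits describe — accumulating per-step results in an instance list and clearing it in `on_validation_epoch_end`, since PTL 2.0 hooks no longer receive an `outputs` argument — can be sketched roughly as below. This is a minimal stand-in class for illustration only (hook names mirror Lightning 2.0, but no Lightning import is used and this is not actual NeMo code):

```python
class TinyModel:
    """Minimal sketch of the Lightning 2.0 epoch-end pattern:
    validation_step appends to self.validation_step_outputs, and
    on_validation_epoch_end aggregates and then clears the list."""

    def __init__(self):
        # In PTL 2.0 the *_epoch_end hooks no longer receive `outputs`,
        # so each model keeps its own accumulator.
        self.validation_step_outputs = []
        self.last_val_loss = None

    def validation_step(self, batch, batch_idx):
        loss = sum(batch) / len(batch)  # stand-in for a real loss
        self.validation_step_outputs.append(loss)
        return loss

    def on_validation_epoch_end(self):
        outs = self.validation_step_outputs
        self.last_val_loss = sum(outs) / len(outs)
        # Clearing prevents the list from growing across epochs.
        self.validation_step_outputs.clear()


model = TinyModel()
for i, batch in enumerate([[1.0, 3.0], [2.0, 4.0]]):
    model.validation_step(batch, i)
model.on_validation_epoch_end()
print(model.last_val_loss)  # 2.5
print(len(model.validation_step_outputs))  # 0
```

The clear-on-epoch-end step is what several of the commits above add "wherever missing": without it the accumulator keeps growing across validation epochs.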

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <[email protected]>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <[email protected]>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step_outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step_outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as it's removed in PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <[email protected]>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <[email protected]>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <[email protected]>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <[email protected]>

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <[email protected]>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <[email protected]>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <[email protected]>
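The multi-dataloader handling these commits keep adjusting — one output list per dataloader, indexed by `dataloader_idx`, falling back to a single flat list otherwise — has roughly this shape. This is a hypothetical minimal sketch, not NeMo code; the class and parameter names are illustrative:

```python
class MultiLoaderModel:
    """Sketch: with several validation dataloaders, keep one accumulator
    list per dataloader and append via dataloader_idx; with a single
    dataloader, use one flat list."""

    def __init__(self, num_val_dataloaders):
        self.multi = num_val_dataloaders > 1
        if self.multi:
            # One list per dataloader; Lightning passes dataloader_idx
            # to validation_step when there is more than one.
            self.validation_step_outputs = [[] for _ in range(num_val_dataloaders)]
        else:
            self.validation_step_outputs = []

    def validation_step(self, loss, batch_idx, dataloader_idx=0):
        if self.multi:
            self.validation_step_outputs[dataloader_idx].append(loss)
        else:
            self.validation_step_outputs.append(loss)

    def on_validation_epoch_end(self):
        if self.multi:
            means = [sum(o) / len(o) for o in self.validation_step_outputs]
            for o in self.validation_step_outputs:
                o.clear()  # clear each per-dataloader list
            return means
        mean = sum(self.validation_step_outputs) / len(self.validation_step_outputs)
        self.validation_step_outputs.clear()
        return mean


m = MultiLoaderModel(2)
m.validation_step(1.0, 0, dataloader_idx=0)
m.validation_step(3.0, 0, dataloader_idx=1)
print(m.on_validation_epoch_end())  # [1.0, 3.0]
```

The branch on the number of dataloaders mirrors the length/type checks the commits describe for `trainer.val_dataloaders` versus a single `_validation_dl`.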

* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <[email protected]>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <[email protected]>

* Edit JenkinsFile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <[email protected]>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <[email protected]>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN

Signed-off-by: Abhishree <[email protected]>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <[email protected]>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Minor edit in jenkins file

Signed-off-by: Abhishree <[email protected]>

* Edit in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <[email protected]>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <[email protected]>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <[email protected]>
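
The guard described above can be sketched as follows (a hedged sketch under the assumption that the model consumes a raw `dataloader_iter`; `step_fn` is a hypothetical stand-in for the real per-batch logic):

```python
def validation_step_from_iter(dataloader_iter, step_fn):
    """Run one validation step from an iterator, tolerating exhaustion.

    Models driven by dataloader_iter can hit StopIteration mid-epoch
    (e.g. uneven batch counts across ranks); returning None signals
    "no output" instead of crashing the validation loop.
    """
    try:
        batch = next(dataloader_iter)
    except StopIteration:
        return None
    return step_fn(batch)
```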

* Remove pyyaml from requirements

Signed-off-by: Abhishree <[email protected]>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <[email protected]>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <[email protected]>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <[email protected]>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <[email protected]>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <[email protected]>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <[email protected]>

* Fix overridden functions to match parent class functions

Signed-off-by: Abhishree <[email protected]>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <[email protected]>
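
The prefetch idea above can be sketched in plain Python (illustrative only; the real fix touches NeMo's pipeline-parallel training loop, not a generic helper like this):

```python
import itertools


def prefetch_one(dataloader_iter):
    """Pull one batch eagerly, then re-attach it to the stream.

    Touching the iterator before the loop starts ensures every pipeline
    rank advances its dataloader in lockstep, avoiding the PP>1 hang.
    """
    first = next(dataloader_iter)
    # chain() puts the consumed batch back in front of the remaining stream.
    return itertools.chain([first], dataloader_iter)
```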

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <[email protected]>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <[email protected]>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <[email protected]>

* Minor edits

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <[email protected]>
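
A check of this kind can be sketched as a membership test instead of string indexing (a minimal sketch; the exact set of flag values NeMo accepts is an assumption here):

```python
def is_half_precision(precision):
    """Classify a Lightning precision flag without slicing the string.

    PTL 2.0 may report precision as 16, "16", "16-mixed", or "bf16-mixed",
    so comparing against the full set of values is safer than precision[:2].
    """
    return str(precision) in {"16", "16-mixed", "bf16", "bf16-mixed"}
```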

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <[email protected]>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <[email protected]>

---------

Signed-off-by: Abhishree <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (#7112)

* scripts for sft

Signed-off-by: Yi Dong <[email protected]>

* fix style

Signed-off-by: Yi Dong <[email protected]>

* added special token only for huggingface model

Signed-off-by: Yi Dong <[email protected]>

* change default name

Signed-off-by: Yi Dong <[email protected]>

* print out error datapoint content

Signed-off-by: Yi Dong <[email protected]>

* show error id

Signed-off-by: Yi Dong <[email protected]>

* annotation script working

Signed-off-by: Yi Dong <[email protected]>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <[email protected]>

* added examples

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* text to value special case

Signed-off-by: Yi Dong <[email protected]>

* configure the slider

Signed-off-by: Yi Dong <[email protected]>

* annotation handles lang

Signed-off-by: Yi Dong <[email protected]>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <[email protected]>

* used the file in the test dir

Signed-off-by: Yi Dong <[email protected]>

* fix json error

Signed-off-by: Yi Dong <[email protected]>

* load local tokenizer

Signed-off-by: Yi Dong <[email protected]>

* remove mask count check

Signed-off-by: Yi Dong <[email protected]>

* added HF dataset backend

Signed-off-by: Yi Dong <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* add paths to labeler. (#7087)

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Kim Ngo <[email protected]>
Signed-off-by: jubick1337 <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Yi Dong <[email protected]>
Signed-off-by: Aleksandr Laptev <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: smajumdar <[email protected]>
Signed-off-by: AlexGrinch <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: stevehuang52 <[email protected]>
Signed-off-by: arendu <[email protected]>
Signed-off-by: sam1373 <[email protected]>
Signed-off-by: Boris Fomitchev <[email protected]>
Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: ericharper <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: Abhishree <[email protected]>
Co-authored-by: Kim Ngo <[email protected]>
Co-authored-by: tbartley94 <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yi Dong <[email protected]>
Co-authored-by: Aleksandr Laptev <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Adi Renduchintala <adithyar…
zhehuaichen added a commit to zhehuaichen/NeMo that referenced this pull request Sep 22, 2023
* Fixed small bug with NoisePerturbationWithNormalization (NVIDIA#7118)



* Fix import guard checks (NVIDIA#7124)



* Revert "Fix import guard checks (NVIDIA#7124)" (NVIDIA#7125)

This reverts commit a46e325.

* Fix import guard checks (NVIDIA#7126)

* Fix import guard checks



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------




* Add updated fc ctc and rnnt xxl models (NVIDIA#7128) (NVIDIA#7130)

* [TTS] Create EnCodec training recipe (NVIDIA#6852)

* [TTS] Create EnCodec training recipe



* [TTS] Update encodec recipe



* [TTS] Rename EnCodec to AudioCodec



* [TTS] Add EnCodec unit tests



* [TTS] Add copyright header to distributed.py



---------



* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (NVIDIA#7061)




* fix default attention size (NVIDIA#7141) (NVIDIA#7143)

* fix evaluator.py for various exceptions by ast (NVIDIA#7150)



* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (NVIDIA#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.



* unify configs into a single one and add detailed comments providing supported candidates.



* choose 36-final IPA as default phoneme dict



---------



* [TTS] Add output audio format to preprocessing (NVIDIA#6889)

* [TTS] Add output audio format to preprocessing



* [TTS] Add format validation



* [TTS] Fix data tutorial



---------



* freeze (NVIDIA#7152)



* make sure any empty segments are removed (NVIDIA#7155)



* Update RIR generation scripts (NVIDIA#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio



* A quickstart speech enhancement tutorial (NVIDIA#6492)

A simple example of training a model for speech enhancement task



* NFA subtitle file config - specify colors and vertical alignment (NVIDIA#7160)

* allow specifying colors of text in ASS subtitle file



* specify vertical_alignment instead of marginv in ass_file_config



* add documentation of CTMFileConfig and ASSFileConfig to NFA README



---------



* Eagerly accumulate embedding grads into fp32 buffer (NVIDIA#6958) (NVIDIA#7153)




* TE bug fix (NVIDIA#7027) (NVIDIA#7036)




* [TTS] Remove nested TTS configs (NVIDIA#7154)

* [TTS] Remove nested TTS configs



* [TTS] Modify tutorial to support multiple sampling rates



* [TTS] Clarify min_duration unit



* [TTS] Default 22.05kHz highfreq to null



---------



* Merge release r1.20.0 to main (NVIDIA#7167)

* update package info



* Add ASR with TTS Tutorial. Fix enhancer usage. (NVIDIA#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage



* install_bs (NVIDIA#7019)



* Fix typo and branch in tutorial (NVIDIA#7048)



* fix syntax error introduced in PR-7079 (NVIDIA#7102)

* fix syntax error introduced in PR-7079



* fixes for pr review



---------



* fix links for TN (NVIDIA#7117)



* update branch (NVIDIA#7135)



* Fixed main and merging this to r1.20 (NVIDIA#7127)

* Fixed main and merging this to r1.20



* Update vad_utils.py



---------





* update branch



* fix version



* resolve conflict the other way



* keep both



* revert keep both



---------















* Upgrade to pytorch lightning 2.0 (NVIDIA#6433)

* Upgrade pytorch lightning version in requirements



* Initial fixes for PTL2.0



* Add further fixes to support lightning 2.0



* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and a few occurrences of validation_epoch_end



* Replace all occurrences of validation_epoch_end with on_validation_epoch_end



* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively



* Change logger=None to logger=False in Trainer object



* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass



* Modify trainer.precision check and other small edits



* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer



* Add default values for args to fix Attribute Error



* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU



* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end



* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings



* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel



* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py



* Revert an extra space that was mistakenly added



* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity



* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity



* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing



* Remove outputs arg from on_train_epoch_end



* Remove outputs from on_validation_epoch_end in multi_binary_acc.py



* Remove output args from on_validation_epoch_end in the docstrings of some ASR files



* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs



* Add on_validation_epoch_end and remove outputs args for nlp models



* Append output of validation_step to validation_step_outputs in EncDecClassificationModel
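
The accumulate-and-clear pattern that these commits apply across models can be sketched as follows (a minimal sketch; `LightningModuleStub` is a hypothetical stand-in so the sketch runs without pytorch_lightning, and the loss is a placeholder):

```python
class LightningModuleStub:
    """Stand-in for pytorch_lightning.LightningModule (assumption)."""


class MyModel(LightningModuleStub):
    def __init__(self):
        # PTL 2.0 removed the `outputs` argument of the *_epoch_end hooks,
        # so each model accumulates its own step outputs as an instance var.
        self.validation_step_outputs = []

    def validation_step(self, batch, batch_idx):
        loss = sum(batch) / len(batch)  # placeholder for the real loss
        self.validation_step_outputs.append(loss)
        return loss

    def on_validation_epoch_end(self):
        # Aggregate, then clear to free memory before the next epoch.
        avg = sum(self.validation_step_outputs) / len(self.validation_step_outputs)
        self.validation_step_outputs.clear()
        return avg
```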



* Add the following changes

1) Index self.validation_step_outputs and self.test_step.outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step.outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0



* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py



* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError



* Add if condition check for multiple dataloaders when appending to validation outputs



* Separate validation pass to be used with both validation_step and test_step



* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py



* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len



* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0



* Modify precision checks to account for 16-mixed and bf16-mixed



* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel



* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py



* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel



* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml



* Add split arg self.test_step_outputs to TextClassificationModel



* Add test_step_outputs to dialogue and text classification models



* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py



* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step



* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg



* Add val/test_step_outputs to S2SQAModel and GPTQAModel



* Edit JenkinsFile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error



* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py



* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed



* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed



* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py



* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN



* Precision fix and skip few failing tests



* Add missing comment lines in JenkinsFile



* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py



* Minor edit JenkinsFile



* Minor edit in jenkins file



* Edit in Jenkins file



* Comment missed lines in Jenkins file



* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file



* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py



* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files



* Fix all CI TTS tests and comment few Jenkins tests



* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py



* Add a missing comment in JenkinsFile



* Add try except StopIteration in validation_step for models with dataloader_iter



* Remove pyyaml from requirements



* Add try except for inference_step in megatron_finetune_model.py



* Remove limit_val_batches for mockGPTDataset test



* Add new self.validation_step_outputs for MegatronGPTSFTModel



* Minor edit Jenkinsfile



* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.



* Remove resume_from_checkpoint if trainer arg in conf yaml files



* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs



* Remove resume_from_checkpoint in duplex_tn_config.yaml



* Fix typos, unused imports and refactor code to remove redundant funcs



* Remove commented code in megatron_nmt_model.py



* Fix overridden functions to match parent class functions



* Prefetch dataloader_iter to prevent hang for PP>1



* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1



* Uncomment tests in JenkinsFile



* Add '16' to precision checks and other minor fixes



* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders



* Minor edits



* Modify precision checks to avoid indexing



* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs



* Reference checkpoint with trainer.ckpt_path



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT



---------




* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (NVIDIA#7112)

* scripts for sft



* fix style



* added special token only for huggingface model



* change default name



* print out error datapoint content



* show error id



* annotation script working



* try to be compatible with huggingface tokenizer



* added examples



* added lang



* added lang



* text to value special case



* configure the slider



* annotation handles lang



* added the unit test for chat sft dataset



* used the file in the test dir



* fix json error



* load local tokenizer



* remove mask count check



* added HF dataset backend



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------




* add paths to labeler. (NVIDIA#7087)



* T5 metrics fix (NVIDIA#7037)

* Fix race condition when executing with multi-node where some ranks does not wait for setup (NVIDIA#7016)




* Added bool types to neural_types export (NVIDIA#7032)




* rnnt and char utils (NVIDIA#6971)

* rnnt_ngram_merge



* char level bug



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------






* fix tab text gen (NVIDIA#7022) (NVIDIA#7031)





* Fixed kwargs for metric instance init



* Fixed kwargs for metric instance init



* removed kwagrs



* Updated config desc



* ASR Confidence update and tutorial (NVIDIA#6810)

* small fixes and tests



* various fixes for the tutorial



* tutorial added



* fix for a little oops after rebasement



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests



* unused import removed



* fix review comments



* deprecated parameters for greedy configs



* move re-assigning to configs



* fix comments 2



* fix config tests



* fix ece test (my env was bugged apparently)



* renamings for confidence ensemble



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comments 3



* return dropped tutorial



* CI flips back and forth, increasing tolerance



---------





* install_bs (NVIDIA#7019) (NVIDIA#7028)





* fixes for spellmapper (NVIDIA#6994) (NVIDIA#7000)






* added back the retro documents (NVIDIA#7033)




* Remove pyyaml (NVIDIA#7052) (NVIDIA#7054)





* st standalone model (NVIDIA#6969)

* st standalone model



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix



* sacrebleu import fix, unused imports removed



* import guard for nlp inside asr transformer bpe model



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered



* import ordering fix



* yttm for asr removed



* logging added



* added inference and translate method



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------





* remove pos emb from state dict for old models (NVIDIA#7068)

* remove pos emb from state dict



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment



* fix nmt test



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test



---------





* Fix typo in ASR-TTS tutorial (NVIDIA#7049)




* Fixed tutorial's name (NVIDIA#7047)





* Fix documentation for Numba (NVIDIA#7065) (NVIDIA#7077)

* Fix documentation for Numba



* Update force float32 flag dynamically



* Update force float32 flag dynamically



* Fix nemo version



---------






* Update Frame-VAD doc and fix onnx export (NVIDIA#7076)

* update fvad doc



* fix typo



* update fvad example



* update



* fix onnx export



* update test



* refactor



* update doc



* update



---------





* memmap worker arg (NVIDIA#7062)

* memmap worker arg



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update



* update



---------





* Fix caching bug in causal convolutions for cache-aware ASR models (NVIDIA#7034) (NVIDIA#7082)




* Fast Conformer global token fix (NVIDIA#7085)

* old way



* fix



* fix



* fix



* remove extra



* clean



* clean



* clean



* fix



* fix



* fix



* fix



* fix



* fix



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------





* Refined export_config (NVIDIA#7053) (NVIDIA#7066)

* Refined export_config
* Rolling back hierarchy change
---------





* small Bugfix (NVIDIA#7081)

* small Bugfix (NVIDIA#7079)

* fix branch



* fix typo



* fix link



---------



* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb



* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb



---------







* Added script to extract ASR CTC and RNNT models from ASR hybrid models (NVIDIA#7092)

* Added script to extract ctc and rnnt models from hybrid models



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag



---------






* Adding docs and models for multiple lookahead cache-aware ASR (NVIDIA#7067) (NVIDIA#7094)



* update TTS readme (NVIDIA#7088)

* update TTS readme



---------




* Fix absolute path in path join call (NVIDIA#7099)




* Disable distopt contiguous param buffer by default (NVIDIA#7095)




* microphone demo (NVIDIA#7110)





* [Fix] load_state_dict in nlp_model.py (NVIDIA#7086)

* Fix load_state_dict in nlp_model.py



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------





* Fix plot function in vad_utils.py (NVIDIA#7113)

Fix plot function in vad_utils.py




* Fixed small bug with NoisePerturbationWithNormalization (NVIDIA#7118)




* Fix import guard checks (NVIDIA#7124)




* Revert "Fix import guard checks (NVIDIA#7124)" (NVIDIA#7125)

This reverts commit a46e325.



* Fix import guard checks (NVIDIA#7126)

* Fix import guard checks



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------





* Add updated fc ctc and rnnt xxl models (NVIDIA#7128) (NVIDIA#7130)



* [TTS] Create EnCodec training recipe (NVIDIA#6852)

* [TTS] Create EnCodec training recipe



* [TTS] Update encodec recipe



* [TTS] Rename EnCodec to AudioCodec



* [TTS] Add EnCodec unit tests



* [TTS] Add copyright header to distributed.py



---------




* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (NVIDIA#7061)





* fix default attention size (NVIDIA#7141) (NVIDIA#7143)



* fix evaluator.py for various exceptions by ast (NVIDIA#7150)




* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (NVIDIA#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.



* unify configs into a single one and add detailed comments providing supported candidates.



* choose 36-final IPA as default phoneme dict



---------




* [TTS] Add output audio format to preprocessing (NVIDIA#6889)

* [TTS] Add output audio format to preprocessing



* [TTS] Add format validation



* [TTS] Fix data tutorial



---------




* freeze (NVIDIA#7152)




* make sure any empty segments are removed (NVIDIA#7155)




* Update RIR generation scripts (NVIDIA#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio




* A quickstart speech enhancement tutorial (NVIDIA#6492)

A simple example of training a model for speech enhancement task




* NFA subtitle file config - specify colors and vertical alignment (NVIDIA#7160)

* allow specifying colors of text in ASS subtitle file



* specify vertical_alignment instead of marginv in ass_file_config



* add documentation of CTMFileConfig and ASSFileConfig to NFA README



---------




* Eagerly accumulate embedding grads into fp32 buffer (NVIDIA#6958) (NVIDIA#7153)





* TE bug fix (NVIDIA#7027) (NVIDIA#7036)





* [TTS] Remove nested TTS configs (NVIDIA#7154)

* [TTS] Remove nested TTS configs



* [TTS] Modify tutorial to support multiple sampling rates



* [TTS] Clarify min_duration unit



* [TTS] Default 22.05kHz highfreq to null



---------




* Merge release r1.20.0 to main (NVIDIA#7167)

* update package info



* Add ASR with TTS Tutorial. Fix enhancer usage. (NVIDIA#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage



* install_bs (NVIDIA#7019)



* Fix typo and branch in tutorial (NVIDIA#7048)



* fix syntax error introduced in PR-7079 (NVIDIA#7102)

* fix syntax error introduced in PR-7079



* fixes for pr review



---------



* fix links for TN (NVIDIA#7117)



* update branch (NVIDIA#7135)



* Fixed main and merging this to r1.20 (NVIDIA#7127)

* Fixed main and merging this to r1.20



* Update vad_utils.py



---------





* update branch



* fix version



* resolve conflict the other way



* keep both



* revert keep both



---------
















* Upgrade to pytorch lightning 2.0 (NVIDIA#6433)

* Upgrade pytorch lightning version in requirements



* Initial fixes for PTL2.0



* Add further fixes to support lightning 2.0



* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and a few occurrences of validation_epoch_end



* Replace all occurrences of validation_epoch_end with on_validation_epoch_end



* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively



* Change logger=None to logger=False in Trainer object



* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass



* Modify trainer.precision check and other small edits



* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer



* Add default values for args to fix Attribute Error



* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU



* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end



* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings



* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel



* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py



* Revert an extra space that was mistakenly added



* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity



* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity



* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing



* Remove outputs arg from on_train_epoch_end



* Remove outputs from on_validation_epoch_end in multi_binary_acc.py



* Remove output args from on_validation_epoch_end in the docstrings of some ASR files



* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs



* Add on_validation_epoch_end and remove outputs args for nlp models



* Append output of validation_step to validation_step_outputs in EncDecClassificationModel



* Add the following changes

1) Index self.validation_step_outputs and self.test_step_outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step_outputs as empty lists and add support for multiple dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as it was removed in PTL 2.0
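The accumulation pattern these commits describe can be sketched in plain Python (the class and the placeholder loss computation are illustrative stand-ins, not NeMo's actual code; only the attribute and hook names follow the commit messages):

```python
class MyModel:  # stands in for a pl.LightningModule subclass
    def __init__(self):
        # PTL 2.0 removed the `outputs` argument from the *_epoch_end hooks,
        # so each model accumulates its own step outputs in an instance list.
        self.validation_step_outputs = []

    def validation_step(self, batch, batch_idx, dataloader_idx=0):
        loss = sum(batch) / len(batch)  # placeholder computation
        self.validation_step_outputs.append(loss)
        return loss

    def on_validation_epoch_end(self):
        # Replaces the old validation_epoch_end(self, outputs) hook.
        avg = sum(self.validation_step_outputs) / len(self.validation_step_outputs)
        # Clear between epochs so the list does not grow unbounded.
        self.validation_step_outputs.clear()
        return avg
```

With multiple dataloaders, `self.validation_step_outputs` becomes a list of lists indexed by `dataloader_idx`, as the commits above note.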



* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py



* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError



* Add if condition check for multiple dataloaders when appending to validation outputs



* Separate validation pass to be used with both validation_step and test_step



* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py



* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len



* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0



* Modify precision checks to account for 16-mixed and bf16-mixed
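A minimal sketch of the kind of check these commits adjust (the helper name is hypothetical; PTL 2.0 reports precision values such as "16-mixed" and "bf16-mixed", while older configs may pass the integer 16):

```python
def is_fp16(precision):
    # Cast to str before comparing: trainer.precision may be an int (16)
    # or a string ("16", "16-mixed"), and comparing an int against strings
    # without the cast is what raised the TypeError being fixed here.
    return str(precision) in ("16", "16-mixed")
```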



* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel



* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py



* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel



* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml



* Add split arg self.test_step_outputs to TextClassificationModel



* Add test_step_outputs to dialogue and text classification models



* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py



* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step
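The combined condition can be illustrated as follows (the helper name is hypothetical; the check mirrors the commit description):

```python
def uses_multiple_dataloaders(dataloaders):
    # Multiple dataloaders only when the attribute is a list with more than
    # one entry; a bare dataloader object, or a one-element list, is treated
    # as the single-dataloader case.
    return isinstance(dataloaders, list) and len(dataloaders) > 1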



* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg



* Add val/test_step_outputs to S2SQAModel and GPTQAModel



* Edit JenkinsFile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error



* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py



* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed



* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed



* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py



* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN



* Precision fix and skip few failing tests



* Add missing comment lines in JenkinsFile



* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py



* Minor edit JenkinsFile



* Minor edit in jenkins file



* Edit in Jenkins file



* Comment missed lines in Jenkins file



* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file



* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py



* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files



* Fix all CI TTS tests and comment few Jenkins tests



* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py



* Add a missing comment in JenkinsFile



* Add try except StopIteration in validation_step for models with dataloader_iter
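A sketch of the guard this commit describes (the function name is illustrative; with PTL 2.0 some Megatron models receive a dataloader iterator rather than a batch, and the iterator can be exhausted mid-epoch):

```python
def validation_step_safe(dataloader_iter, step_fn):
    # Pull the next batch from the iterator; if it is exhausted, skip the
    # step instead of letting StopIteration propagate out of validation_step.
    try:
        batch = next(dataloader_iter)
    except StopIteration:
        return None
    return step_fn(batch)
```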



* Remove pyyaml from requirements



* Add try except for inference_step in megatron_finetune_model.py



* Remove limit_val_batches for mockGPTDataset test



* Add new self.validation_step_outputs for MegatronGPTSFTModel



* Minor edit Jenkinsfile



* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.



* Remove resume_from_checkpoint if trainer arg in conf yaml files



* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs



* Remove resume_from_checkpoint in duplex_tn_config.yaml



* Fix typos, unused imports and refactor code to remove redundant funcs



* Remove commented code in megatron_nmt_model.py



* Fix overridden functions to match parent class functions



* Prefetch dataloader_iter to prevent hang for PP>1



* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1



* Uncomment tests in JenkinsFile



* Add '16' to precision checks and other minor fixes



* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders



* Minor edits



* Modify precision checks to avoid indexing



* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs



* Reference checkpoint with trainer.ckpt_path



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT



---------





* Include the scripts for preprocessing OAST and unit tests for chat sft datasets (NVIDIA#7112)

* scripts for sft



* fix style



* added special token only for huggingface model



* change default name



* print out error datapoint content



* show error id



* annotation script working



* try to be compatible with huggingface tokenizer



* added examples



* added lang



* added lang



* text to value special case



* configure the slider



* annotation handles lang



* added the unit test for chat sft dataset



* used the file in the test dir



* fix json error



* load local tokenizer



* remove mask count check



* added HF dataset backend



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------





* add paths to labeler. (NVIDIA#7087)




* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------




Co-authored-by: Adi Renduchintala <adithyar…

Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Kim Ngo <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: arendu <[email protected]>
Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: ericharper <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: Abhishree <[email protected]>
Signed-off-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: Aleksandr Laptev <[email protected]>
Signed-off-by: AlexGrinch <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: stevehuang52 <[email protected]>
Signed-off-by: sam1373 <[email protected]>
Signed-off-by: Boris Fomitchev <[email protected]>
Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Xin Yao <[email protected]>
Signed-off-by: fayejf <[email protected]>
Signed-off-by: Cheng-Ping Hsieh <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>
Signed-off-by: Cheng-Ping Hsieh <[email protected]>
Signed-off-by: Oleksii Kuchaiev <[email protected]>
Signed-off-by: Jocelyn Huang <[email protected]>
Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: Virginia Adams <[email protected]>
Signed-off-by: Vahid <[email protected]>
Signed-off-by: David Mosallanezhad <[email protected]>
Signed-off-by: MaximumEntropy <[email protected]>
Signed-off-by: ekmb <[email protected]>
Signed-off-by: Yang Zhang <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Abhinav Khattar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Dima Rekesh <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Mostafa Ghorbandoost <[email protected]>
Signed-off-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: Kunal Dhawan <[email protected]>
Signed-off-by: andrusenkoau <[email protected]>
Signed-off-by: Andrei Andrusenko <[email protected]>
Signed-off-by: KunalDhawan <[email protected]>
Signed-off-by: Greg Clark <[email protected]>
Signed-off-by: Eric Harper <[email protected]>
Signed-off-by: Jan Baczek <[email protected]>
Signed-off-by: yaoyu-33 <[email protected]>
Signed-off-by: Olivier Delalleau <[email protected]>
Signed-off-by: eharper <[email protected]>
Signed-off-by: jasonwan <[email protected]>
Signed-off-by: Maanu Grover <[email protected]>
Signed-off-by: Guyue Huang <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Igor Gitman <[email protected]>
Signed-off-by: Siddharth Tyagi <[email protected]>
Signed-off-by: Abhishree Thittenamane <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
Signed-off-by: arendu <[email protected]>
Signed-off-by: Alireza Morsali <[email protected]>
Signed-off-by: Siddharth Tyagi <[email protected]>
Signed-off-by: dorotat <[email protected]>
Signed-off-by: mburchi <[email protected]>
Signed-off-by: Maxime Burchi <[email protected]>
Signed-off-by: Adi Renduchintala <[email protected]>
Signed-off-by: Nithin Rao Koluguri <nithinraok>
Signed-off-by: Xin Yao <[email protected]>
Signed-off-by: Hongbin Liu <[email protected]>
Signed-off-by: Alexander Jipa <[email protected]>
Signed-off-by: omahs <[email protected]>
Signed-off-by: lhb8125 <[email protected]>
Signed-off-by: Robin Dong <[email protected]>
Signed-off-by: Jimmy Zhang <[email protected]>
Signed-off-by: Sangkug Lym <[email protected]>
Signed-off-by: George Zelenfroynd <[email protected]>
Signed-off-by: Anton Peganov <[email protected]>
Signed-off-by: Samuele Cornell <[email protected]>
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jan Lasek <[email protected]>
Signed-off-by: Tamerlan Tabolov <[email protected]>
Signed-off-by: zhehuaichen <[email protected]>
Co-authored-by: trias702 <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ryan Langman <[email protected]>
Co-authored-by: Kim Ngo <[email protected]>
Co-authored-by: David <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Adi Renduchintala <[email protected]>
Co-authored-by: Elena Rastorgueva <[email protected]>
Co-authored-by: anteju <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: Abhishree Thittenamane <[email protected]>
Co-authored-by: Yi Dong <[email protected]>
Co-authored-by: Matvei Novikov <[email protected]>
Co-authored-by: tbartley94 <[email protected]>
Co-authored-by: Aleksandr Laptev <[email protected]>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Vahid Noroozi <[email protected]>
Co-authored-by: Samuel Kriman <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Co-authored-by: Jan Beckmann <[email protected]>
Co-authored-by: lleaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Xin Yao <[email protected]>
Co-authored-by: anmolgupt <[email protected]>
Co-authored-by: ANMOL GUPTA <[email protected]>
Co-authored-by: Cheng-Ping Hsieh <[email protected]>
Co-authored-by: Micha Livne <[email protected]>
Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Jocelyn <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Alexandra Antonova <[email protected]>
Co-authored-by: Virginia Adams <[email protected]>
Co-authored-by: Zhilin Wang <[email protected]>
Co-authored-by: Nithin Rao <[email protected]>
Co-authored-by: Ante Jukić <[email protected]>
Co-authored-by: David Mosallanezhad <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
Co-authored-by: Sean Naren <[email protected]>
Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: Sean Naren <[email protected]>
Co-authored-by: Neha Tadimeti <[email protected]>
Co-authored-by: Abhinav Khattar <[email protected]>
Co-authored-by: Dima Rekesh <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Mostafa Ghorbandoost <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Kunal Dhawan <[email protected]>
Co-authored-by: Andrei Andrusenko <[email protected]>
Co-authored-by: Greg Clark <[email protected]>
Co-authored-by: jbaczek <[email protected]>
Co-authored-by: yaoyu-33 <[email protected]>
Co-authored-by: Olivier Delalleau <[email protected]>
Co-authored-by: Jason Wang <[email protected]>
Co-authored-by: Maanu Grover <[email protected]>
Co-authored-by: guyueh1 <[email protected]>
Co-authored-by: Mariana <[email protected]>
Co-authored-by: Igor Gitman <[email protected]>
Co-authored-by: styagi130 <[email protected]>
Co-authored-by: Siddharth Tyagi <[email protected]>
Co-authored-by: Cheng-Ping Hsieh <[email protected]>
Co-authored-by: Alireza Morsali <[email protected]>
Co-authored-by: styagi130 <[email protected]>
Co-authored-by: dorotat-nv <[email protected]>
Co-authored-by: Maxime Burchi <[email protected]>
Co-authored-by: mikolajblaz <[email protected]>
Co-authored-by: eharper <[email protected]>
Co-authored-by: Hongbin Liu <[email protected]>
Co-authored-by: Kelvin Liu <[email protected]>
Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Alexander Jipa <[email protected]>
Co-authored-by: Alexander Jipa <[email protected]>
Co-authored-by: omahs <[email protected]>
Co-authored-by: Robin Dong <[email protected]>
Co-authored-by: JimmyZhang12 <[email protected]>
Co-authored-by: Jimmy Zhang <[email protected]>
Co-authored-by: Sangkug Lym <[email protected]>
Co-authored-by: George <[email protected]>
Co-authored-by: PeganovAnton <[email protected]>
Co-authored-by: Samuele Cornell <[email protected]>
Co-authored-by: Jason <[email protected]>
Co-authored-by: Igor Gitman <[email protected]>
Co-authored-by: Jan Lasek <[email protected]>
Co-authored-by: Tamerlan Tabolov <[email protected]>
zhehuaichen pushed a commit to zhehuaichen/NeMo that referenced this pull request Oct 4, 2023
zhehuaichen pushed a commit to zhehuaichen/NeMo that referenced this pull request Oct 4, 2023
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* Fix race condition when executing with multi-node where some ranks does not wait for setup (NVIDIA#7016)

Signed-off-by: Kim Ngo <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Added bool types to neural_types export (NVIDIA#7032)

Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* rnnt and char utils (NVIDIA#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <[email protected]>

* char level bug

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fix tab text gen (NVIDIA#7022) (NVIDIA#7031)

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <[email protected]>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <[email protected]>

* removed kwargs

Signed-off-by: jubick1337 <[email protected]>

* Updated config desc

Signed-off-by: jubick1337 <[email protected]>

* ASR Confidence update and tutorial (NVIDIA#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <[email protected]>

* tutorial added

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix for a little oops after rebasement

Signed-off-by: Aleksandr Laptev <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* unused import removed

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix review comments

Signed-off-by: Aleksandr Laptev <[email protected]>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <[email protected]>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix comments 2

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix config tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <[email protected]>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comments 3

Signed-off-by: Aleksandr Laptev <[email protected]>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <[email protected]>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <[email protected]>

---------

Signed-off-by: Aleksandr Laptev <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* install_bs (NVIDIA#7019) (NVIDIA#7028)

Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fixes for spellmapper (NVIDIA#6994) (NVIDIA#7000)

Signed-off-by: Alexandra Antonova <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* added back the retro documents (NVIDIA#7033)

Signed-off-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Remove pyyaml (NVIDIA#7052) (NVIDIA#7054)

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* st standalone model (NVIDIA#6969)

* st standalone model

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <[email protected]>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <[email protected]>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <[email protected]>

* import ordering fix

Signed-off-by: AlexGrinch <[email protected]>

* yttm for asr removed

Signed-off-by: AlexGrinch <[email protected]>

* logging added

Signed-off-by: AlexGrinch <[email protected]>

* added inference and translate method

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* remove pos emb from state dict for old models (NVIDIA#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <[email protected]>

* fix nmt test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix typo in ASR-TTS tutorial (NVIDIA#7049)

Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed tutorial's name (NVIDIA#7047)

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix documentation for Numba (NVIDIA#7065) (NVIDIA#7077)

* Fix documentation for Numba



* Update force float32 flag dynamically



* Update force float32 flag dynamically



* Fix nemo version



---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Update Frame-VAD doc and fix onnx export (NVIDIA#7076)

* update fvad doc

Signed-off-by: stevehuang52 <[email protected]>

* fix typo

Signed-off-by: stevehuang52 <[email protected]>

* update fvad example

Signed-off-by: stevehuang52 <[email protected]>

* update

Signed-off-by: stevehuang52 <[email protected]>

* fix onnx export

Signed-off-by: stevehuang52 <[email protected]>

* update test

Signed-off-by: stevehuang52 <[email protected]>

* refactor

Signed-off-by: stevehuang52 <[email protected]>

* update doc

Signed-off-by: stevehuang52 <[email protected]>

* update

Signed-off-by: stevehuang52 <[email protected]>

---------

Signed-off-by: stevehuang52 <[email protected]>
Co-authored-by: fayejf <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* memmap worker arg (NVIDIA#7062)

* memmap worker arg

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* update

Signed-off-by: arendu <[email protected]>

---------

Signed-off-by: arendu <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix caching bug in causal convolutions for cache-aware ASR models (NVIDIA#7034) (NVIDIA#7082)

Co-authored-by: Vahid Noroozi <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fast Conformer global token fix (NVIDIA#7085)

* old way

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* remove extra

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Refined export_config (NVIDIA#7053) (NVIDIA#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* small Bugfix (NVIDIA#7081)

* small Bugfix (NVIDIA#7079)

* fix branch

Signed-off-by: fayejf <[email protected]>

* fix typo

Signed-off-by: fayejf <[email protected]>

* fix link

Signed-off-by: fayejf <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (NVIDIA#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <[email protected]>

---------

Signed-off-by: Daniel Egert <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Adding docs and models for multiple lookahead cache-aware ASR (NVIDIA#7067) (NVIDIA#7094)

Signed-off-by: jubick1337 <[email protected]>

* update TTS readme (NVIDIA#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix absolute path in path join call (NVIDIA#7099)

Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: jubick1337 <[email protected]>
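
The `os.path.join` pitfall behind this fix: when a later component is absolute, every earlier component is silently discarded. A minimal sketch of one possible guard (hypothetical helper name; the actual NeMo change may differ):

```python
import os

# os.path.join drops everything before an absolute component, so joining a
# base directory with an absolute path silently ignores the base.
def safe_join(base, path):
    # Strip the leading separator so the path is treated as relative to base.
    return os.path.join(base, path.lstrip(os.sep))
```

On POSIX, `os.path.join("/base", "/abs/file")` returns `"/abs/file"`, while `safe_join("/base", "/abs/file")` returns `"/base/abs/file"`.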

* Disable distopt contiguous param buffer by default (NVIDIA#7095)

Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* microphone demo (NVIDIA#7110)

Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [Fix] load_state_dict in nlp_model.py (NVIDIA#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix plot function in vad_utils.py (NVIDIA#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed small bug with NoisePerturbationWithNormalization (NVIDIA#7118)

Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix import guard checks (NVIDIA#7124)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Revert "Fix import guard checks (NVIDIA#7124)" (NVIDIA#7125)

This reverts commit ae7624d.

Signed-off-by: jubick1337 <[email protected]>

* Fix import guard checks (NVIDIA#7126)

* Fix import guard checks

Signed-off-by: smajumdar <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Add updated fc ctc and rnnt xxl models (NVIDIA#7128) (NVIDIA#7130)

Signed-off-by: jubick1337 <[email protected]>

* [TTS] Create EnCodec training recipe (NVIDIA#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <[email protected]>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <[email protected]>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (NVIDIA#7061)

Signed-off-by: Kim Ngo <[email protected]>
Co-authored-by: David <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fix default attention size (NVIDIA#7141) (NVIDIA#7143)

Signed-off-by: jubick1337 <[email protected]>

* fix evaluator.py for various exceptions by ast (NVIDIA#7150)

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (NVIDIA#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <[email protected]>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <[email protected]>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS] Add output audio format to preprocessing (NVIDIA#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <[email protected]>

* [TTS] Add format validation

Signed-off-by: Ryan <[email protected]>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* freeze (NVIDIA#7152)

Signed-off-by: arendu <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* make sure any empty segments are removed (NVIDIA#7155)

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Update RIR generation scripts (NVIDIA#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* A quickstart speech enhancement tutorial (NVIDIA#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* NFA subtitle file config - specify colors and vertical alignment (NVIDIA#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <[email protected]>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <[email protected]>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <[email protected]>

---------

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Eagerly accumulate embedding grads into fp32 buffer (NVIDIA#6958) (NVIDIA#7153)

Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* TE bug fix (NVIDIA#7027) (NVIDIA#7036)

Signed-off-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS] Remove nested TTS configs (NVIDIA#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <[email protected]>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <[email protected]>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <[email protected]>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Merge release r1.20.0 to main (NVIDIA#7167)

* update package info

Signed-off-by: ericharper <[email protected]>

* Add ASR with TTS Tutorial. Fix enhancer usage. (NVIDIA#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <[email protected]>

* install_bs (NVIDIA#7019)

Signed-off-by: Nikolay Karpov <[email protected]>

* Fix typo and branch in tutorial (NVIDIA#7048)

Signed-off-by: Vladimir Bataev <[email protected]>

* fix syntax error introduced in PR-7079 (NVIDIA#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <[email protected]>

* fixes for pr review

Signed-off-by: Alexandra Antonova <[email protected]>

---------

Signed-off-by: Alexandra Antonova <[email protected]>

* fix links for TN (NVIDIA#7117)

Signed-off-by: Evelina <[email protected]>

* update branch (NVIDIA#7135)

Signed-off-by: ericharper <[email protected]>

* Fixed main and merging this to r1.20 (NVIDIA#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <[email protected]>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>

---------

Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>

* update branch

Signed-off-by: ericharper <[email protected]>

* fix version

Signed-off-by: ericharper <[email protected]>

* resolve conflict the other way

Signed-off-by: ericharper <[email protected]>

* keep both

Signed-off-by: ericharper <[email protected]>

* revert keep both

Signed-off-by: ericharper <[email protected]>

---------

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Upgrade to pytorch lightning 2.0 (NVIDIA#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <[email protected]>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <[email protected]>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <[email protected]>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and few occurances of validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace all occurances of validation_epoch_end to on_validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <[email protected]>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <[email protected]>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <[email protected]>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <[email protected]>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <[email protected]>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <[email protected]>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an attribute of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <[email protected]>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <[email protected]>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <[email protected]>
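
The PTL 2.0 pattern these commits apply — accumulate per-step outputs on the module, consume them in `on_*_epoch_end`, then clear — can be sketched without Lightning itself (a hedged illustration of the bookkeeping only, not NeMo's actual code):

```python
# Sketch of the PyTorch Lightning 2.0 hook migration: the epoch-end hook no
# longer receives `outputs`; the module accumulates them itself.
class SketchModel:
    def __init__(self):
        self.validation_step_outputs = []  # PTL 2.0: module-owned output list

    def validation_step(self, batch, batch_idx):
        loss = sum(batch) / len(batch)  # stand-in for a real loss
        self.validation_step_outputs.append(loss)
        return loss

    def on_validation_epoch_end(self):
        # PTL 1.x passed these as the `outputs` argument; PTL 2.0 does not.
        avg = sum(self.validation_step_outputs) / len(self.validation_step_outputs)
        self.validation_step_outputs.clear()  # free memory for the next epoch
        return avg
```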

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <[email protected]>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <[email protected]>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step_outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step_outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as it's removed in PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <[email protected]>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <[email protected]>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <[email protected]>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <[email protected]>
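
A sketch of the kind of precision check these commits adjust (hypothetical helper; PTL 2.0 reports precision as strings such as "16-mixed" and "bf16-mixed", while older configs may carry bare values like 16):

```python
HALF_PRECISIONS = ("16", "16-mixed", "bf16", "bf16-mixed")

def uses_half_precision(precision):
    # Cast to str so numeric values from older configs (e.g. 16) also match,
    # mirroring the "TypeCast precision to str" fix earlier in this history.
    return str(precision) in HALF_PRECISIONS
```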

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <[email protected]>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <[email protected]>

* Change condition check for multiple dataloaders:

1) Make ds_item a list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <[email protected]>
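
The multi-dataloader bookkeeping described above, sketched as hypothetical helpers (the real implementation lives in ModelPT and the individual models, and may differ in detail):

```python
def init_step_outputs(dataloaders):
    # With more than one dataloader, keep one output list per dataloader_idx;
    # otherwise a single flat list — the condition these commits describe.
    if isinstance(dataloaders, list) and len(dataloaders) > 1:
        return [[] for _ in dataloaders]
    return []

def append_step_output(outputs, value, dataloader_idx=0):
    # Nested lists mean multiple dataloaders; index by dataloader_idx.
    # (Sketch assumes the step outputs themselves are not lists.)
    if outputs and isinstance(outputs[0], list):
        outputs[dataloader_idx].append(value)
    else:
        outputs.append(value)
```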

* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <[email protected]>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <[email protected]>

* Edit JenkinsFile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <[email protected]>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <[email protected]>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN

Signed-off-by: Abhishree <[email protected]>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <[email protected]>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Minor edit in jenkins file

Signed-off-by: Abhishree <[email protected]>

* Edit in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <[email protected]>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <[email protected]>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <[email protected]>
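
A sketch of the guard this commit adds for models whose `validation_step` consumes a raw `dataloader_iter` (hypothetical shape; the real NeMo code wraps Megatron-style iterators):

```python
def guarded_validation_step(dataloader_iter, step_fn):
    # PTL 2.0 hands some Megatron models a raw iterator; exhausting it raises
    # StopIteration, which must be caught rather than crashing the epoch.
    try:
        batch = next(dataloader_iter)
    except StopIteration:
        return None  # no more batches for this rank
    return step_fn(batch)
```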

* Remove pyyaml from requirements

Signed-off-by: Abhishree <[email protected]>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <[email protected]>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <[email protected]>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <[email protected]>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <[email protected]>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <[email protected]>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <[email protected]>

* Fix overridden functions to match parent class functions

Signed-off-by: Abhishree <[email protected]>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <[email protected]>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <[email protected]>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <[email protected]>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <[email protected]>

* Minor edits

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <[email protected]>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <[email protected]>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <[email protected]>

---------

Signed-off-by: Abhishree <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Include the scripts for preprocessing OASST and unit tests for chat sft datasets (NVIDIA#7112)

* scripts for sft

Signed-off-by: Yi Dong <[email protected]>

* fix style

Signed-off-by: Yi Dong <[email protected]>

* added special token only for huggingface model

Signed-off-by: Yi Dong <[email protected]>

* change default name

Signed-off-by: Yi Dong <[email protected]>

* print out error datapoint content

Signed-off-by: Yi Dong <[email protected]>

* show error id

Signed-off-by: Yi Dong <[email protected]>

* annotation script working

Signed-off-by: Yi Dong <[email protected]>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <[email protected]>

* added examples

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* text to value special case

Signed-off-by: Yi Dong <[email protected]>

* configure the slider

Signed-off-by: Yi Dong <[email protected]>

* annotation handles lang

Signed-off-by: Yi Dong <[email protected]>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <[email protected]>

* used the file in the test dir

Signed-off-by: Yi Dong <[email protected]>

* fix json error

Signed-off-by: Yi Dong <[email protected]>

* load local tokenizer

Signed-off-by: Yi Dong <[email protected]>

* remove mask count check

Signed-off-by: Yi Dong <[email protected]>

* added HF dataset backend

Signed-off-by: Yi Dong <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* add paths to labeler. (NVIDIA#7087)

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Kim Ngo <[email protected]>
Signed-off-by: jubick1337 <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Yi Dong <[email protected]>
Signed-off-by: Aleksandr Laptev <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: smajumdar <[email protected]>
Signed-off-by: AlexGrinch <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: stevehuang52 <[email protected]>
Signed-off-by: arendu <[email protected]>
Signed-off-by: sam1373 <[email protected]>
Signed-off-by: Boris Fomitchev <[email protected]>
Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: ericharper <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: Abhishree <[email protected]>
Co-authored-by: Kim Ngo <[email protected]>
Co-authored-by: tbartley94 <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yi Dong <[email protected]>
Co-authored-by: Aleksandr Laptev <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Adi Renduchintala <[email protected]>
Co-authored-by: Vahid Noroozi <[email protected]>
Co-authored-by: Samuel Kriman <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Co-authored-by: trias702 <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Jan Beckmann <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Co-authored-by: lleaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Ryan Langman <[email protected]>
Co-authored-by: David <[email protected]>
Co-authored-by: Elena Rastorgueva <[email protected]>
Co-authored-by: anteju <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: Abhishree Thittenamane <[email protected]>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* migrated class

Signed-off-by: dorotat <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: dorotat <[email protected]>

* added unit test

Signed-off-by: dorotat <[email protected]>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* update

Signed-off-by: arendu <[email protected]>

---------

Signed-off-by: arendu <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <[email protected]>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* remove extra

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <[email protected]>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Signed-off-by: dorotat <[email protected]>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <[email protected]>

* fix typo

Signed-off-by: fayejf <[email protected]>

* fix link

Signed-off-by: fayejf <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <[email protected]>

---------

Signed-off-by: Daniel Egert <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: dorotat <[email protected]>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: dorotat <[email protected]>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: dorotat <[email protected]>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <[email protected]>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit ae7624da7d773a6b9436ff61903dc4b99c7c27cb.

Signed-off-by: dorotat <[email protected]>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <[email protected]>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: dorotat <[email protected]>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <[email protected]>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <[email protected]>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <[email protected]>
Co-authored-by: David <[email protected]>
Signed-off-by: dorotat <[email protected]>

* fix default attention size (#7141) (#7143)

Signed-off-by: dorotat <[email protected]>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: dorotat <[email protected]>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to discriminate from 36 finals.

Signed-off-by: Xuesong Yang <[email protected]>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <[email protected]>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: dorotat <[email protected]>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <[email protected]>

* [TTS] Add format validation

Signed-off-by: Ryan <[email protected]>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: dorotat <[email protected]>

* freeze (#7152)

Signed-off-by: arendu <[email protected]>
Signed-off-by: dorotat <[email protected]>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: dorotat <[email protected]>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for speech enhancement task

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: dorotat <[email protected]>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <[email protected]>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <[email protected]>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <[email protected]>

---------

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Signed-off-by: dorotat <[email protected]>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: dorotat <[email protected]>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <[email protected]>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <[email protected]>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <[email protected]>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <[email protected]>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <[email protected]>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <[email protected]>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <[email protected]>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <[email protected]>

* fixes for pr review

Signed-off-by: Alexandra Antonova <[email protected]>

---------

Signed-off-by: Alexandra Antonova <[email protected]>

* fix links for TN (#7117)

Signed-off-by: Evelina <[email protected]>

* update branch (#7135)

Signed-off-by: ericharper <[email protected]>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <[email protected]>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>

---------

Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>

* update branch

Signed-off-by: ericharper <[email protected]>

* fix version

Signed-off-by: ericharper <[email protected]>

* resolve conflict the other way

Signed-off-by: ericharper <[email protected]>

* keep both

Signed-off-by: ericharper <[email protected]>

* revert keep both

Signed-off-by: ericharper <[email protected]>

---------

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Signed-off-by: dorotat <[email protected]>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <[email protected]>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <[email protected]>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <[email protected]>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and a few occurrences of validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace all occurrences of validation_epoch_end with on_validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <[email protected]>

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <[email protected]>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <[email protected]>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <[email protected]>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <[email protected]>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <[email protected]>

* Add the following modifications

1) Remove outputs arg from on_validation_epoch_end, on_test_epoch_end and make it an arg of the class
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <[email protected]>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <[email protected]>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <[email protected]>
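
The recurring migration pattern in these commits — `validation_step` appends its result to an instance-level `self.validation_step_outputs` list, and `on_validation_epoch_end` (which in PTL 2.0 no longer receives an `outputs` argument) aggregates and then clears it — can be sketched in plain Python. `ToyModel` and the toy loss below are illustrative assumptions, not NeMo code:

```python
# Illustrative sketch of the PTL 2.0 hook pattern described in these commits:
# per-step results are collected on the instance, and the epoch-end hook
# (which takes no `outputs` argument in PTL 2.0) aggregates and clears them.
class ToyModel:
    def __init__(self):
        self.validation_step_outputs = []  # replaces the old `outputs` argument

    def validation_step(self, batch, batch_idx):
        loss = sum(batch) / len(batch)  # stand-in for a real loss computation
        self.validation_step_outputs.append(loss)
        return loss

    def on_validation_epoch_end(self):  # PTL 2.0: no `outputs` parameter
        avg = sum(self.validation_step_outputs) / len(self.validation_step_outputs)
        self.validation_step_outputs.clear()  # free memory between epochs
        return avg

model = ToyModel()
for i, batch in enumerate([[1.0, 3.0], [2.0, 4.0]]):
    model.validation_step(batch, i)
val_loss = model.on_validation_epoch_end()  # (2.0 + 3.0) / 2 == 2.5
```

The `clear()` call is what the "clear memory from validation_step_outputs" commits refer to: without it, the list grows across epochs.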

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <[email protected]>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <[email protected]>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step_outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step_outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as it's removed in PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <[email protected]>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <[email protected]>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <[email protected]>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <[email protected]>
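
Several commits here and below extend precision checks to cover PTL 2.0's "16-mixed" and "bf16-mixed" strings alongside the older `16` / `"16"` / `"bf16"` values, and cast precision to `str` to avoid a TypeError. A hedged sketch of such a check (helper names are assumptions, not the NeMo implementation):

```python
# Illustrative precision helpers: cast to str first (avoids TypeError when
# precision is the int 16) and accept both legacy and PTL 2.0 "-mixed" values.
def uses_fp16(precision):
    return str(precision) in ("16", "16-mixed")

def uses_bf16(precision):
    return str(precision) in ("bf16", "bf16-mixed")
```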

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <[email protected]>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <[email protected]>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <[email protected]>
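
The multi-dataloader bookkeeping described in the last few commits — a nested list indexed by dataloader_idx only when the trainer actually has more than one dataloader, guarded by both a type check and a length check — can be sketched as follows (function names are illustrative, not the NeMo API):

```python
# Illustrative sketch: create one output list per dataloader only when there
# are multiple dataloaders (type check AND length check, as the commits note),
# and append either flat or by dataloader_idx accordingly.
def init_step_outputs(dataloaders):
    if isinstance(dataloaders, list) and len(dataloaders) > 1:
        return [[] for _ in dataloaders]  # one sub-list per dataloader
    return []  # single dataloader: flat list

def append_step_output(step_outputs, value, dataloader_idx=0):
    if step_outputs and isinstance(step_outputs[0], list):
        step_outputs[dataloader_idx].append(value)  # multi-dataloader case
    else:
        step_outputs.append(value)  # single-dataloader case

multi = init_step_outputs(["dl_a", "dl_b"])
append_step_output(multi, 0.1, dataloader_idx=0)
append_step_output(multi, 0.2, dataloader_idx=1)
single = init_step_outputs(["dl_a"])
append_step_output(single, 0.5)
```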

* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <[email protected]>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <[email protected]>

* Edit JenkinsFile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for trainer.val_dataloader None error

Signed-off-by: Abhishree <[email protected]>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <[email protected]>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN

Signed-off-by: Abhishree <[email protected]>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <[email protected]>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Minor edit in jenkins file

Signed-off-by: Abhishree <[email protected]>

* Edit in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <[email protected]>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <[email protected]>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <[email protected]>
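
The dataloader_iter guard mentioned above — wrapping `next()` in try/except StopIteration so an exhausted iterator is skipped instead of crashing the step — might look like this (a generic sketch; the real NeMo method signature differs):

```python
# Illustrative sketch: a validation step that pulls its batch from an iterator
# and handles exhaustion gracefully, as the commit above describes.
def validation_step(dataloader_iter):
    try:
        batch = next(dataloader_iter)
    except StopIteration:
        return None  # iterator exhausted: skip this step instead of raising
    return sum(batch)  # stand-in for the real forward/loss computation

it = iter([[1, 2], [3, 4]])
results = [validation_step(it), validation_step(it), validation_step(it)]
# results == [3, 7, None]
```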

* Remove pyyaml from requirements

Signed-off-by: Abhishree <[email protected]>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <[email protected]>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <[email protected]>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <[email protected]>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <[email protected]>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint in duplex_tn_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Fix typos, unused imports and refactor code to remove redundant funcs

Signed-off-by: Abhishree <[email protected]>

* Remove commented code in megatron_nmt_model.py

Signed-off-by: Abhishree <[email protected]>

* Fix overridden functions to match parent class functions

Signed-off-by: Abhishree <[email protected]>

* Prefetch dataloader_iter to prevent hang for PP>1

Signed-off-by: Abhishree <[email protected]>

* Override setup() in NLPDDPStrategy to avoid hang during predict with PP>1

Signed-off-by: Abhishree <[email protected]>

* Uncomment tests in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add '16' to precision checks and other minor fixes

Signed-off-by: Abhishree <[email protected]>

* Clear validation/test_step_outputs with dataloader_idx for multi dataloaders

Signed-off-by: Abhishree <[email protected]>

* Minor edits

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to avoid indexing

Signed-off-by: Abhishree <[email protected]>

* Remove self.validation_step_outputs_sft and add dataloader_idx to clear outputs

Signed-off-by: Abhishree <[email protected]>

* Reference checkpoint with trainer.ckpt_path

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add _prefetch to NLPModel and minor fixes

Signed-off-by: Abhishree <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add limit_val_batches in JenkinsFile for NMT

1) Add trainer.limit_val_batches in Megatron NMT Training TP=2
2) Remove unused import in ModelPT

Signed-off-by: Abhishree <[email protected]>

---------

Signed-off-by: Abhishree <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <[email protected]>

* Include the scripts for preprocessing OASST and unit tests for chat sft datasets (#7112)

* scripts for sft

Signed-off-by: Yi Dong <[email protected]>

* fix style

Signed-off-by: Yi Dong <[email protected]>

* added special token only for huggingface model

Signed-off-by: Yi Dong <[email protected]>

* change default name

Signed-off-by: Yi Dong <[email protected]>

* print out error datapoint content

Signed-off-by: Yi Dong <[email protected]>

* show error id

Signed-off-by: Yi Dong <[email protected]>

* annotation script working

Signed-off-by: Yi Dong <[email protected]>

* try to be compatible with huggingface tokenizer

Signed-off-by: Yi Dong <[email protected]>

* added examples

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* added lang

Signed-off-by: Yi Dong <[email protected]>

* text to value special case

Signed-off-by: Yi Dong <[email protected]>

* configure the slider

Signed-off-by: Yi Dong <[email protected]>

* annotation handles lang

Signed-off-by: Yi Dong <[email protected]>

* added the unit test for chat sft dataset

Signed-off-by: Yi Dong <[email protected]>

* used the file in the test dir

Signed-off-by: Yi Dong <[email protected]>

* fix json error

Signed-off-by: Yi Dong <[email protected]>

* load local tokenizer

Signed-off-by: Yi Dong <[email protected]>

* remove mask count check

Signed-off-by: Yi Dong <[email protected]>

* added HF dataset backend

Signed-off-by: Yi Dong <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: dorotat <[email protected]>

* add paths to labeler. (#7087)

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: dorotat <[email protected]>

* T5 metrics fix (#7037)

* Fix race condition when executing with multi-node where some ranks do not wait for setup (#7016)

Signed-off-by: Kim Ngo <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Added bool types to neural_types export (#7032)

Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* rnnt and char utils (#6971)

* rnnt_ngram_merge

Signed-off-by: Nikolay Karpov <[email protected]>

* char level bug

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fix tab text gen (#7022) (#7031)

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <[email protected]>

* Fixed kwargs for metric instance init

Signed-off-by: jubick1337 <[email protected]>

* removed kwagrs

Signed-off-by: jubick1337 <[email protected]>

* Updated config desc

Signed-off-by: jubick1337 <[email protected]>

* ASR Confidence update and tutorial (#6810)

* small fixes and tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* various fixes for the tutorial

Signed-off-by: Aleksandr Laptev <[email protected]>

* tutorial added

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix for a little oops after rebasing

Signed-off-by: Aleksandr Laptev <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* unused import removed

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix review comments

Signed-off-by: Aleksandr Laptev <[email protected]>

* deprecated parameters for greedy configs

Signed-off-by: Aleksandr Laptev <[email protected]>

* move re-assigning to configs

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix comments 2

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix config tests

Signed-off-by: Aleksandr Laptev <[email protected]>

* fix ece test (my env was bugged apparently)

Signed-off-by: Aleksandr Laptev <[email protected]>

* renamings for confidence ensemble

Signed-off-by: Aleksandr Laptev <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comments 3

Signed-off-by: Aleksandr Laptev <[email protected]>

* return dropped tutorial

Signed-off-by: Aleksandr Laptev <[email protected]>

* CI flips back and forth, increasing tolerance

Signed-off-by: Aleksandr Laptev <[email protected]>

---------

Signed-off-by: Aleksandr Laptev <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* install_bs (#7019) (#7028)

Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fixes for spellmapper (#6994) (#7000)

Signed-off-by: Alexandra Antonova <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* added back the retro documents (#7033)

Signed-off-by: Yi Dong <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Remove pyyaml (#7052) (#7054)

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* st standalone model (#6969)

* st standalone model

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style fix

Signed-off-by: AlexGrinch <[email protected]>

* sacrebleu import fix, unused imports removed

Signed-off-by: AlexGrinch <[email protected]>

* import guard for nlp inside asr transformer bpe model

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeql fixes

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comments answered

Signed-off-by: AlexGrinch <[email protected]>

* import ordering fix

Signed-off-by: AlexGrinch <[email protected]>

* yttm for asr removed

Signed-off-by: AlexGrinch <[email protected]>

* logging added

Signed-off-by: AlexGrinch <[email protected]>

* added inference and translate method

Signed-off-by: AlexGrinch <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: AlexGrinch <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* remove pos emb from state dict for old models (#7068)

* remove pos emb from state dict

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to nlp_model

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment

Signed-off-by: Evelina <[email protected]>

* fix nmt test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix nmt test

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix typo in ASR-TTS tutorial (#7049)

Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed tutorial's name (#7047)

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix documentation for Numba (#7065) (#7077)

* Fix documentation for Numba

* Update force float32 flag dynamically

* Update force float32 flag dynamically

* Fix nemo version

---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Update Frame-VAD doc and fix onnx export (#7076)

* update fvad doc

Signed-off-by: stevehuang52 <[email protected]>

* fix typo

Signed-off-by: stevehuang52 <[email protected]>

* update fvad example

Signed-off-by: stevehuang52 <[email protected]>

* update

Signed-off-by: stevehuang52 <[email protected]>

* fix onnx export

Signed-off-by: stevehuang52 <[email protected]>

* update test

Signed-off-by: stevehuang52 <[email protected]>

* refactor

Signed-off-by: stevehuang52 <[email protected]>

* update doc

Signed-off-by: stevehuang52 <[email protected]>

* update

Signed-off-by: stevehuang52 <[email protected]>

---------

Signed-off-by: stevehuang52 <[email protected]>
Co-authored-by: fayejf <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* memmap worker arg (#7062)

* memmap worker arg

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

Signed-off-by: arendu <[email protected]>

* update

Signed-off-by: arendu <[email protected]>

---------

Signed-off-by: arendu <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix caching bug in causal convolutions for cache-aware ASR models (#7034) (#7082)

Co-authored-by: Vahid Noroozi <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fast Conformer global token fix (#7085)

* old way

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* remove extra

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* clean

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* fix

Signed-off-by: sam1373 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sam1373 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Refined export_config (#7053) (#7066)

* Refined export_config
* Rolling back hierarchy change
---------

Signed-off-by: Boris Fomitchev <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* small Bugfix (#7081)

* small Bugfix (#7079)

* fix branch

Signed-off-by: fayejf <[email protected]>

* fix typo

Signed-off-by: fayejf <[email protected]>

* fix link

Signed-off-by: fayejf <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

* Update tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb

Signed-off-by: Somshubra Majumdar <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>
Signed-off-by: Somshubra Majumdar <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Added script to extract ASR CTC and RNNT models from ASR hybrid models (#7092)

* Added script to extract ctc and rnnt models from hybrid models

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid extraction script for review request 1

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated hybrid convert script to remove --cuda flag

Signed-off-by: Daniel Egert <[email protected]>

---------

Signed-off-by: Daniel Egert <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Adding docs and models for multiple lookahead cache-aware ASR (#7067) (#7094)

Signed-off-by: jubick1337 <[email protected]>

* update TTS readme (#7088)

* update TTS readme

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix absolute path in path join call (#7099)

Signed-off-by: Jan Beckmann <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Disable distopt contiguous param buffer by default (#7095)

Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* microphone demo (#7110)

Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [Fix] load_state_dict in nlp_model.py (#7086)

* Fix load_state_dict in nlp_model.py

Signed-off-by: He Huang (Steve) <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Fix plot function in vad_utils.py (#7113)

Fix plot function in vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fixed small bug with NoisePerturbationWithNormalization (#7118)

Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix import guard checks (#7124)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Revert "Fix import guard checks (#7124)" (#7125)

This reverts commit ae7624da7d773a6b9436ff61903dc4b99c7c27cb.

Signed-off-by: jubick1337 <[email protected]>

* Fix import guard checks (#7126)

* Fix import guard checks

Signed-off-by: smajumdar <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jubick1337 <[email protected]>

* Add updated fc ctc and rnnt xxl models (#7128) (#7130)

Signed-off-by: jubick1337 <[email protected]>

* [TTS] Create EnCodec training recipe (#6852)

* [TTS] Create EnCodec training recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Update encodec recipe

Signed-off-by: Ryan <[email protected]>

* [TTS] Rename EnCodec to AudioCodec

Signed-off-by: Ryan <[email protected]>

* [TTS] Add EnCodec unit tests

Signed-off-by: Ryan <[email protected]>

* [TTS] Add copyright header to distributed.py

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Fix rank where torch.distributed may not be initialized yet and would not wait for tokenizer file caching (#7061)

Signed-off-by: Kim Ngo <[email protected]>
Co-authored-by: David <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* fix default attention size (#7141) (#7143)

Signed-off-by: jubick1337 <[email protected]>

* fix evaluator.py for various exceptions by ast (#7150)

Signed-off-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. (#6893)

* [TTS] add Chinese TTS recipe based on IPA.
* add new pinyin and ipa dictionaries with 36 finals.
* add yaml configs for 24-final pinyin and ipa.
* add copyright header
* add a directory level 24finals to distinguish it from the 36 finals.

Signed-off-by: Xuesong Yang <[email protected]>

* unify configs into a single one and add detailed comments providing supported candidates.

Signed-off-by: Xuesong Yang <[email protected]>

* choose 36-final IPA as default phoneme dict

Signed-off-by: Xuesong Yang <[email protected]>

---------

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS] Add output audio format to preprocessing (#6889)

* [TTS] Add output audio format to preprocessing

Signed-off-by: Ryan <[email protected]>

* [TTS] Add format validation

Signed-off-by: Ryan <[email protected]>

* [TTS] Fix data tutorial

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* freeze (#7152)

Signed-off-by: arendu <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* make sure any empty segments are removed (#7155)

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Update RIR generation scripts (#6547)

- fix: reduce room size if evaluation of params fails
- added randomized mic placement
- added diffuse noise generation
- added an option to specify the format and subtype for saved audio

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* A quickstart speech enhancement tutorial (#6492)

A simple example of training a model for a speech enhancement task

Signed-off-by: Ante Jukić <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* NFA subtitle file config - specify colors and vertical alignment (#7160)

* allow specifying colors of text in ASS subtitle file

Signed-off-by: Elena Rastorgueva <[email protected]>

* specify vertical_alignment instead of marginv in ass_file_config

Signed-off-by: Elena Rastorgueva <[email protected]>

* add documentation of CTMFileConfig and ASSFileConfig to NFA README

Signed-off-by: Elena Rastorgueva <[email protected]>

---------

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Eagerly accumulate embedding grads into fp32 buffer (#6958) (#7153)

Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* TE bug fix (#7027) (#7036)

Signed-off-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* [TTS] Remove nested TTS configs (#7154)

* [TTS] Remove nested TTS configs

Signed-off-by: Ryan <[email protected]>

* [TTS] Modify tutorial to support multiple sampling rates

Signed-off-by: Ryan <[email protected]>

* [TTS] Clarify min_duration unit

Signed-off-by: Ryan <[email protected]>

* [TTS] Default 22.05kHz highfreq to null

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Merge release r1.20.0 to main (#7167)

* update package info

Signed-off-by: ericharper <[email protected]>

* Add ASR with TTS Tutorial. Fix enhancer usage. (#6955)

* Add ASR with TTS Tutorial
* Fix enhancer usage

Signed-off-by: Vladimir Bataev <[email protected]>

* install_bs (#7019)

Signed-off-by: Nikolay Karpov <[email protected]>

* Fix typo and branch in tutorial (#7048)

Signed-off-by: Vladimir Bataev <[email protected]>

* fix syntax error introduced in PR-7079 (#7102)

* fix syntax error introduced in PR-7079

Signed-off-by: Alexandra Antonova <[email protected]>

* fixes for pr review

Signed-off-by: Alexandra Antonova <[email protected]>

---------

Signed-off-by: Alexandra Antonova <[email protected]>

* fix links for TN (#7117)

Signed-off-by: Evelina <[email protected]>

* update branch (#7135)

Signed-off-by: ericharper <[email protected]>

* Fixed main and merging this to r1.20 (#7127)

* Fixed main and merging this to r1.20

Signed-off-by: Taejin Park <[email protected]>

* Update vad_utils.py

Signed-off-by: He Huang (Steve) <[email protected]>

---------

Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>

* update branch

Signed-off-by: ericharper <[email protected]>

* fix version

Signed-off-by: ericharper <[email protected]>

* resolve conflict the other way

Signed-off-by: ericharper <[email protected]>

* keep both

Signed-off-by: ericharper <[email protected]>

* revert keep both

Signed-off-by: ericharper <[email protected]>

---------

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Taejin Park <[email protected]>
Signed-off-by: He Huang (Steve) <[email protected]>
Co-authored-by: Vladimir Bataev <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Taejin Park <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Signed-off-by: jubick1337 <[email protected]>

* Upgrade to pytorch lightning 2.0 (#6433)

* Upgrade pytorch lightning version in requirements

Signed-off-by: Abhishree <[email protected]>

* Initial fixes for PTL2.0

Signed-off-by: Abhishree <[email protected]>

* Add further fixes to support lightning 2.0

Signed-off-by: Abhishree <[email protected]>

* Add replacements for replace_sampler_ddp, resume_from_checkpoint_fit_path and a few occurrences of validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace all occurrences of validation_epoch_end with on_validation_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Replace training_epoch_end, test_epoch_end with on_train_epoch_end and on_test_epoch_end respectively

Signed-off-by: Abhishree <[email protected]>
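
The hook renames in the two commits above follow a fixed mapping. A minimal plain-Python sketch (the mapping is taken from the commit messages themselves; the helper name is illustrative, not NeMo's API):

```python
# PTL 2.0 renamed the epoch-end hooks; mapping as stated in the commits above.
HOOK_RENAMES = {
    "validation_epoch_end": "on_validation_epoch_end",
    "training_epoch_end": "on_train_epoch_end",
    "test_epoch_end": "on_test_epoch_end",
}

def migrate_hook_name(name: str) -> str:
    """Return the PTL 2.0 name for a PTL 1.x epoch-end hook, if renamed."""
    return HOOK_RENAMES.get(name, name)

print(migrate_hook_name("validation_epoch_end"))  # on_validation_epoch_end
```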

* Change logger=None to logger=False in Trainer object

Signed-off-by: Abhishree <[email protected]>

* Remove PTL2.0 deprecated Trainer args from TrainerConfig dataclass

Signed-off-by: Abhishree <[email protected]>

* Modify trainer.precision check and other small edits

Signed-off-by: Abhishree <[email protected]>

* Replace logger=None with logger=False in test_ptl_stateless_timer.py Trainer

Signed-off-by: Abhishree <[email protected]>

* Add default values for args to fix Attribute Error

Signed-off-by: Abhishree <[email protected]>

* Add the following modifications

1) Remove the outputs arg from on_validation_epoch_end and on_test_epoch_end, and store the outputs as an instance attribute of the class instead
2) Replace resume_from_checkpoint with ckpt_path as needed
3) Explicitly add accelerator as 'CPU' in UTs being run on CPU

Signed-off-by: Abhishree <[email protected]>
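
The modifications above amount to the standard PTL 2.0 pattern: since on_validation_epoch_end no longer receives an `outputs` argument, each model accumulates step results in an instance list and clears it at epoch end. A framework-free sketch (class name and loss computation are illustrative, not NeMo's actual model):

```python
class SketchModel:
    """Illustrates the PTL 2.0 epoch-end pattern without pytorch_lightning."""

    def __init__(self):
        # Replaces the `outputs` argument that PTL 2.0 removed from the hook.
        self.validation_step_outputs = []

    def validation_step(self, batch, batch_idx):
        loss = sum(batch) / len(batch)  # stand-in for a real loss computation
        self.validation_step_outputs.append(loss)
        return loss

    def on_validation_epoch_end(self):  # note: no `outputs` parameter anymore
        avg = sum(self.validation_step_outputs) / len(self.validation_step_outputs)
        self.validation_step_outputs.clear()  # free memory for the next epoch
        return avg

model = SketchModel()
for i, batch in enumerate([[1.0, 2.0], [3.0, 5.0]]):
    model.validation_step(batch, i)
print(model.on_validation_epoch_end())  # 2.75 (mean of 1.5 and 4.0)
```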

* Remove outputs arg from on_validation_epoch_end, on_test_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg in on_validation_epoch_end in MultiBinaryAccuracy docstrings

Signed-off-by: Abhishree <[email protected]>

* Add val, test outputs as instance vars in PunctuationCapitalizationModel and TokenClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Replace trainer.fit_loop.max_steps with trainer.fit_loop.epoch_loop.max_steps in test_optimizers_schedulers.py

Signed-off-by: Abhishree <[email protected]>

* Revert an extra space that was mistakenly added

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ema.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Use self.validation_step_outputs and self.test_step_outputs in test_ptl_stateless_timer.py and check_for_ranks.py for uniformity

Signed-off-by: Abhishree <[email protected]>

* Add self.validation_step_outputs.clear() and self.test_step_outputs.clear() wherever missing

Signed-off-by: Abhishree <[email protected]>

* Remove outputs arg from on_train_epoch_end

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_validation_epoch_end in multi_binary_acc.py

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end in the docstrings of some ASR files

Signed-off-by: Abhishree <[email protected]>

* Remove output args from on_validation_epoch_end and clear memory from validation_step_outputs

Signed-off-by: Abhishree <[email protected]>

* Add on_validation_epoch_end and remove outputs args for nlp models

Signed-off-by: Abhishree <[email protected]>

* Append output of validation_step to validation_step_outputs in EncDecClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add the following changes

1) Index self.validation_step_outputs and self.test_step_outputs with dataloader_idx wherever needed
2) Initialize self.validation_step_outputs and self.test_step_outputs as empty lists and add support for multi dataloaders if they exist
3) Remove self.pre_configure_ddp from NLPDDPStrategy class as its removed in PTL 2.0

Signed-off-by: Abhishree <[email protected]>
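
A hedged sketch of changes 1) and 2) above: with more than one validation dataloader, the outputs are kept as a list of sub-lists indexed by dataloader_idx (class name and arguments are illustrative, not NeMo's actual model):

```python
class MultiLoaderSketch:
    """Per-dataloader output accumulation, as described in the commit above."""

    def __init__(self, num_val_dataloaders: int = 1):
        self.multi = num_val_dataloaders > 1
        # One sub-list per dataloader when there are several, a flat list otherwise.
        self.validation_step_outputs = (
            [[] for _ in range(num_val_dataloaders)] if self.multi else []
        )

    def validation_step(self, loss: float, batch_idx: int, dataloader_idx: int = 0):
        if self.multi:
            self.validation_step_outputs[dataloader_idx].append(loss)
        else:
            self.validation_step_outputs.append(loss)

m = MultiLoaderSketch(num_val_dataloaders=2)
m.validation_step(0.1, 0, dataloader_idx=0)
m.validation_step(0.2, 0, dataloader_idx=1)
print(m.validation_step_outputs)  # [[0.1], [0.2]]
```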

* Add default value dataloader_idx=0 for on_validation_batch_end() in megatron_base_model.py

Signed-off-by: Abhishree <[email protected]>

* TypeCast precision to str in attention.py and utils_funcs.py to avoid TypeError

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloaders when appending to validation outputs

Signed-off-by: Abhishree <[email protected]>

* Separate validation pass to be used with both validation_step and test_step

Signed-off-by: Abhishree <[email protected]>

* Add if condition check for multiple dataloader while appending to test_step_outputs in punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add condition check for multiple dataloaders based on type of trainer.val/test_dataloaders or self._validation/test_dl instead of len

Signed-off-by: Abhishree <[email protected]>

* Comment Megatron T5 IA3 PP=2 in CI pipeline due to dataloader_iter issue with PTL 2.0

Signed-off-by: Abhishree <[email protected]>

* Modify precision checks to account for 16-mixed and bf16-mixed

Signed-off-by: Abhishree <[email protected]>
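
This precision fix recurs in many of the later commits: PTL 2.0 spells mixed precision as "16-mixed"/"bf16-mixed", so equality checks against the old int/str forms break. A hedged helper sketch (function names are assumptions, not NeMo's API; casting to str matches the typecast commit further down):

```python
def is_fp16_precision(precision) -> bool:
    # Cast to str so the PTL 1.x form (16) and the 2.0 form ("16-mixed") both match.
    return str(precision) in ("16", "16-mixed")

def is_bf16_precision(precision) -> bool:
    return str(precision) in ("bf16", "bf16-mixed")

print(is_fp16_precision(16), is_fp16_precision("16-mixed"))  # True True
```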

* Append output of validation/test_step to self.validation/test_step_outputs in CTCG2PModel

Signed-off-by: Abhishree <[email protected]>

* Modify find_unused_parameters=True in g2p_heteronym model

1) Add find_unused_parameters=True for DDP strategy in g2p_heteronym_classification_train_and_evaluate.py
2) Remove args output in validation/test_step and add instance variables instead for heteronym_classification.py

Signed-off-by: Abhishree <[email protected]>

* Remove outputs from on_test_epoch_end in DialogueGPTClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add validation/test outputs in sgdqa_model and modify dialogue_config.yaml

Signed-off-by: Abhishree <[email protected]>

* Add split arg self.test_step_outputs to TextClassificationModel

Signed-off-by: Abhishree <[email protected]>

* Add test_step_outputs to dialogue and text classification models

Signed-off-by: Abhishree <[email protected]>

* Change condition check for multiple dataloaders:

1) Replace ds_item as list in dialogue_config.yaml
2) Check for len of val/test_dataloaders or validation/test_dl along with type check of list in sgdqa_model.py while appending outputs of validation/test_step
3) Check for len of _validation/test_dl for creating self.validation/test_step_outputs in ModelPT and punctuation_capitalization_model.py

Signed-off-by: Abhishree <[email protected]>

* Add additional condition for multi dataloaders

Check len(self.trainer.val/test_dataloaders) > 1 along with type(self.trainer.val/test_dataloaders) == list for multi dataloaders in validation/test_step

Signed-off-by: Abhishree <[email protected]>

* Add val step outputs and default val for dataloader_idx

1) Append validation_step output to self.validation_step_outputs in MultiLabelIntentSlotClassificationModel
2) Add default val for dataloader_idx for on_test_batch_start/end in TimingCallback
3) Add self.validation/test_step_outputs in BERTQAModel and remove outputs arg

Signed-off-by: Abhishree <[email protected]>

* Add val/test_step_outputs to S2SQAModel and GPTQAModel

Signed-off-by: Abhishree <[email protected]>

* Edit JenkinsFile for bert_pretraining.py

Edit Jenkinsfile for this test to disable validation as a workaround for the trainer.val_dataloader None error

Signed-off-by: Abhishree <[email protected]>

* Modify precision to support 16-mixed, bf16-mixed in megatron_gpt_pretraining.py

Signed-off-by: Abhishree <[email protected]>

* Add ddp_find_unused_parameters_true and remove output args

1) Add ddp_find_unused_parameters_true for trainer.strategy in self_alignment_pretraining.py as it has unused parameters
2) Remove output args and add self.validation/test_step_outputs to validation/test_step in mt_enc_dec_model.py
3) Comment tests in JenkinsFile that need to be fixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix in megatron_nmt_training.py for 16-mixed, bf16-mixed

Signed-off-by: Abhishree <[email protected]>

* Precision fix for megatron_bert_pretraining.py and megatron_bert_model.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and validation/test_step_outputs

1) Add fix to account for 16-mixed and bf16-mixed in megatron_retro_mutransfer_pretrain.py, megatron_retro_pretraining.py
2) Reset ckpt_path for test in enc_dec_nmt.py
3) Remove outputs args and add validation/test_step_outputs in megatron_retrieval_model.py
4) Comment Megatron Bert Pretraining and Resume Training with Pipeline Parallelism and add back NMT Training Post-LN

Signed-off-by: Abhishree <[email protected]>

* Precision fix and skip few failing tests

Signed-off-by: Abhishree <[email protected]>

* Add missing comment lines in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Comment jenkin tests and super().on_validation_epoch_end() in megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Minor edit JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Minor edit in jenkins file

Signed-off-by: Abhishree <[email protected]>

* Edit in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Comment missed lines in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test outputs

1) Add precision fix to account for 16-mixed and bf16-mixed in megatron_t5_pretraining.py
2) Remove outputs args and add append loss to self.validation/test_step_outputs in megatron_lm_encoder_decoder_model.py
3) Add back resume_from_checkpoint in the megatron_t5_config.yaml
4) Comment out certain tests in Jenkins file

Signed-off-by: Abhishree <[email protected]>

* Fix precision and validation/test/predict errors in megatron_t5_prompt_learning.py

Signed-off-by: Abhishree <[email protected]>

* Precision fix and edit precision typo in all files

1) Account for 16-mixed and bf16-mixed in megatron_bart_pretraining.py and megatron_t5_seq2seq_finetune.py
2) Fix precision typo in all files

Signed-off-by: Abhishree <[email protected]>

* Fix all CI TTS tests and comment few Jenkins tests

Signed-off-by: Abhishree <[email protected]>

* Combine xx_epoch_end and on_xx_epoch_end

Add on_inference_epoch_end to inference_epoch_end function and have a single on_validation/test_epoch_end in megatron_finetune_model.py and megatron_gpt_sft_model.py

Signed-off-by: Abhishree <[email protected]>

* Add a missing comment in JenkinsFile

Signed-off-by: Abhishree <[email protected]>

* Add try except StopIteration in validation_step for models with dataloader_iter

Signed-off-by: Abhishree <[email protected]>
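
The guard above can be sketched as follows for a step that consumes a dataloader_iter directly (signature and loss computation simplified, not NeMo's actual method):

```python
def validation_step(dataloader_iter):
    """Pull the next batch; return None instead of crashing when exhausted."""
    try:
        batch = next(dataloader_iter)
    except StopIteration:
        return None  # the iterator ran out mid-epoch; skip this step
    return sum(batch)  # stand-in for a real loss computation

it = iter([[1.0, 2.0]])
print(validation_step(it))  # 3.0
print(validation_step(it))  # None
```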

* Remove pyyaml from requirements

Signed-off-by: Abhishree <[email protected]>

* Add try except for inference_step in megatron_finetune_model.py

Signed-off-by: Abhishree <[email protected]>

* Remove limit_val_batches for mockGPTDataset test

Signed-off-by: Abhishree <[email protected]>

* Add new self.validation_step_outputs for MegatronGPTSFTModel

Signed-off-by: Abhishree <[email protected]>

* Minor edit Jenkinsfile

Signed-off-by: Abhishree <[email protected]>

* Initialize self.validation/test_step_outputs in megatron_gpt_sft_model.py

Initialize self.validation/test_step_outputs in setup of MegatronGPTSFTModel to take care of cases when dataloaders are not set up in ModelPT, for example while restoring the model.

Signed-off-by: Abhishree <[email protected]>

* Remove resume_from_checkpoint if trainer arg in conf yaml files

Signed-off-by: Abhishree <[email protected]>
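
For context on the removal above: in PTL 2.0 resume_from_checkpoint is no longer a Trainer argument; the checkpoint path is passed to trainer.fit(..., ckpt_path=...) instead. A hedged sketch of splitting an old-style trainer config (helper name and dict keys are illustrative):

```python
def split_resume_path(trainer_cfg: dict):
    """Pop the removed PTL 1.x key and return (cfg, ckpt_path) for PTL 2.0."""
    cfg = dict(trainer_cfg)  # copy so the caller's config is untouched
    ckpt_path = cfg.pop("resume_from_checkpoint", None)
    return cfg, ckpt_path

cfg, ckpt = split_resume_path({"max_epochs": 1, "resume_from_checkpoint": "last.ckpt"})
print(cfg, ckpt)  # {'max_epochs': 1} last.ckpt
```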

* Remove resume_from_checkpoint as trainer arg in GPT, T5 configs

Signed-off-by: Abhishree <abhishreetm@gmai…