
Commit 91833b8

titu1994 authored and zhehuaichen committed
Add support for Numba FP16 RNNT Loss (NVIDIA#6991) (NVIDIA#7038)
* Force working space memory to always be in fp32 Signed-off-by: smajumdar <[email protected]>
* Add support for fp16 testing in Numba Signed-off-by: smajumdar <[email protected]>
* Add support for fp16 testing in Numba Signed-off-by: smajumdar <[email protected]>
* Add support for fp16 testing in Numba Signed-off-by: smajumdar <[email protected]>
* Fix cost calculation by upcasting to fp32 Signed-off-by: smajumdar <[email protected]>
* Fix cost calculation by upcasting to fp32 Signed-off-by: smajumdar <[email protected]>
* Add support to check if numba fp16 is available Signed-off-by: smajumdar <[email protected]>
* add RNN-T loss implemented by PyTorch and test code (#5312)
* Fix the bugs in cache-aware streaming Conformer (#5032) Signed-off-by: Vahid <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* IA3 support for GPT and T5 (#4909)
* init commit for ia3 adapter training in GPT Signed-off-by: arendu <[email protected]>
* ia3 adapter training in GPT, models and adapter classes Signed-off-by: arendu <[email protected]>
* reshape to operate even on non-contiguous tensors Signed-off-by: arendu <[email protected]>
* configs Signed-off-by: arendu <[email protected]>
* fixed none init Signed-off-by: arendu <[email protected]>
* adding adapter and ia3 support for T5 based models Signed-off-by: arendu <[email protected]>
* style fix Signed-off-by: arendu <[email protected]>
* config update and t5 model adapter and ia3 Signed-off-by: arendu <[email protected]>
* removed unused imports Signed-off-by: arendu <[email protected]>
* predict step for inference Signed-off-by: arendu <[email protected]>
* style fix Signed-off-by: arendu <[email protected]>
* style fix Signed-off-by: arendu <[email protected]>
* adapter inference for t5 Signed-off-by: arendu <[email protected]>
* style fix Signed-off-by: arendu <[email protected]>
* fixed bug micro and global batch size in eval Signed-off-by: arendu <[email protected]>
* minor edit Signed-off-by: arendu <[email protected]>
* aggressive truncation in test examples if no truncation field is given Signed-off-by: arendu <[email protected]>
* corrected for language_model_path name changes in main Signed-off-by: arendu <[email protected]>
* removed unused import Signed-off-by: arendu <[email protected]>
* name change for language_model_path Signed-off-by: arendu <[email protected]>
* include inter_attention to IA3 Signed-off-by: arendu <[email protected]>
* minor fix in config Signed-off-by: arendu <[email protected]>
* minor fixes Signed-off-by: arendu <[email protected]>
* removed unused flag Signed-off-by: arendu <[email protected]>
* addressing PR comments Signed-off-by: arendu <[email protected]>
* address PR comments Signed-off-by: arendu <[email protected]>
* minor fix Signed-off-by: arendu <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* style fix Signed-off-by: arendu <[email protected]>
* CI test Signed-off-by: arendu <[email protected]>
* minor fix in jenkinsfile Signed-off-by: arendu <[email protected]> Signed-off-by: arendu <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]>
* Bug fix - Limit val batches set to 1.0 (#5023)
* Bug fix Signed-off-by: shanmugamr1992 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Addressed Sandeep's comments
* Fixing limit val batches support in bert
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fixing limit val batches support in bert
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: shanmugamr1992 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* [bug_fix] kv_channels is used when available (#5066)
* fix bug s.t. kv_channels is used when available Signed-off-by: arendu <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: arendu <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]>
* P&C Docs (#5068) (#5069) Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Co-authored-by: Matvei Novikov <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Add spe_split_by_unicode_script arg (#5072)
* Add spe_split_by_unicode_script arg Signed-off-by: Anas <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Anas <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]>
* probabilites -> probabilities (#5078) (#5079) Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* increase PR and Issue sweep quantity and actively close PRs. (#5073)
* increase PR and Issue sweep quantity and actively close PRs. Signed-off-by: Xuesong Yang <[email protected]>
* update with stricter rules, 30 days to be stale and 7 days to be closed for both Issues and PRs. Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* [TTS] added missing German phoneme tokenizer.
(#5070) (#5074) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* rename to match prompt learning (#5076) Signed-off-by: arendu <[email protected]> Signed-off-by: arendu <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Missing fixes from r1.11.0 to T5 finetuning eval (#5054) (#5061)
* Fixes to seq2seq eval Signed-off-by: MaximumEntropy <[email protected]>
* Style Signed-off-by: MaximumEntropy <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]>
* Notebook bug fixes (#5084) (#5085)
* Notebook bug fixes Signed-off-by: Virginia Adams <[email protected]>
* Turned nemo install back on Signed-off-by: Virginia Adams <[email protected]>
* reverted notebook Signed-off-by: Virginia Adams <[email protected]>
* Updated one line in entity linking nb Signed-off-by: Virginia Adams <[email protected]> Signed-off-by: Virginia Adams <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Virginia Adams <[email protected]> Co-authored-by: Virginia Adams <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* update strategy in notebook from ddp_fork to dp (#5088) (#5089) Co-authored-by: Zhilin Wang <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Fix bug in Squeezeformer Conv block (#5011) (#5024)
* Fix bug in Squeezeformer Conv block Signed-off-by: smajumdar <[email protected]>
* Fix kernel context Signed-off-by: smajumdar <[email protected]>
* Fix access mixin Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* fixed megatron lm conversion bug (PTL related) (#5038) (#5063) Signed-off-by: David Mosallanezhad <[email protected]> Signed-off-by: David Mosallanezhad <[email protected]> Co-authored-by: David Mosallanezhad <[email protected]> Signed-off-by: David Mosallanezhad <[email protected]> Co-authored-by: David <[email protected]> Co-authored-by: David Mosallanezhad <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Fix Unhashable type list for Numba Cuda spec augment kernel (#5093) (#5094) Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Fix numba (#5098) Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Make it possible to specify output_filename in normalize_with_audio.py (#5092) Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Greedy decoding confidence for CTC and RNNT (#4931)
* rnnt confidence draft Signed-off-by: Aleksandr Laptev <[email protected]>
* word confidence Signed-off-by: Aleksandr Laptev <[email protected]>
* advanced entropies added Signed-off-by: Aleksandr Laptev <[email protected]>
* refactoring Signed-off-by: Aleksandr Laptev <[email protected]>
* oops forgot a file Signed-off-by: Aleksandr Laptev <[email protected]>
* metrics and benchmarking script added Signed-off-by: Aleksandr Laptev <[email protected]>
* style fix Signed-off-by: Aleksandr Laptev <[email protected]>
* texterrors installation added Signed-off-by: Aleksandr Laptev <[email protected]>
* lgtm and bug fix Signed-off-by: Aleksandr Laptev <[email protected]>
* fix comments Signed-off-by: Aleksandr Laptev <[email protected]>
* fix typos Signed-off-by: Aleksandr Laptev <[email protected]>
* add missing import after rebase Signed-off-by: Aleksandr Laptev <[email protected]> Signed-off-by: Aleksandr Laptev <[email protected]> Co-authored-by: Aleksandr Laptev <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* [Add] SLURP models and examples (#4668)
* add model, util and loss Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* refactor Signed-off-by: stevehuang52 <[email protected]>
* refactor and update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update and refactor Signed-off-by: stevehuang52 <[email protected]>
* update and refactor Signed-off-by: stevehuang52 <[email protected]>
* update and refactor Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* update docs Signed-off-by: stevehuang52 <[email protected]>
* update available models Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]>
* refactor data processing Signed-off-by: stevehuang52 <[email protected]>
* fix typo Signed-off-by: stevehuang52 <[email protected]>
* update docs Signed-off-by: stevehuang52 <[email protected]>
* refactor and update Signed-off-by: stevehuang52 <[email protected]>
* update doc Signed-off-by: stevehuang52 <[email protected]>
* move transformer to asr.modules Signed-off-by: stevehuang52 <[email protected]>
* move transformer to asr.modules Signed-off-by: stevehuang52 <[email protected]>
* get rid of jsonlines Signed-off-by: stevehuang52 <[email protected]>
* refactor Signed-off-by: stevehuang52 <[email protected]>
* revert changes to nlp Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: He Huang (Steve) <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* only optimize params that are part of the adapter modules (#5086) Signed-off-by: arendu <[email protected]> Signed-off-by: arendu <[email protected]> Co-authored-by: Virginia Adams <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Pipeline Parallel T5 Prompt Learning (#4956)
* Added pre process flag checks and pipeline parallel in fwd Signed-off-by: Virginia Adams <[email protected]>
* Added rank check for pipeline parallel Signed-off-by: Virginia Adams <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* T5 prompt learning works!
Signed-off-by: Virginia Adams <[email protected]>
* IA3 passing CI Signed-off-by: Virginia Adams <[email protected]>
* Fixed typo Signed-off-by: Virginia Adams <[email protected]>
* removed optimizer setup so Adi's change will not conflict Signed-off-by: Virginia Adams <[email protected]> Signed-off-by: Virginia Adams <[email protected]> Signed-off-by: Adi Renduchintala <[email protected]> Co-authored-by: Adi Renduchintala <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]>
* [TTS] remove phonemizer.py (#5090) remove phonemizer.py and convert code block to markdown in the tutorial. Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* T5 Decoding with PP > 2 fix (#5091) (#5103)
* set sequence lengths in the pipeline properly Signed-off-by: MaximumEntropy <[email protected]>
* Fix Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* [TTS] fixed wrong val loss for epoch 0 and inconsistent metrics names (#5087) (#5102)
* fixed hifigan configs as well
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Xuesong Yang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]>
* Fix and refactor consumed samples save/restore for Megatron models. (#5077)
* Fixes and refactor Signed-off-by: MaximumEntropy <[email protected]>
* Fix Signed-off-by: MaximumEntropy <[email protected]>
* Remove unused imports Signed-off-by: MaximumEntropy <[email protected]>
* Empty Signed-off-by: MaximumEntropy <[email protected]>
* Fix Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* RIR corpus generator tool (#4927) Signed-off-by: Ante Jukić <[email protected]> Signed-off-by: Ante Jukić <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Multiprocessing fix (#5106) (#5107) Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Co-authored-by: Matvei Novikov <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* [Bug fix] PC lexical + audio (#5109) (#5110)
* training running Signed-off-by: ekmb <[email protected]>
* revert Signed-off-by: ekmb <[email protected]>
* revert Signed-off-by: ekmb <[email protected]> Signed-off-by: ekmb <[email protected]> Signed-off-by: ekmb <[email protected]> Co-authored-by: Evelina <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* [Fix] schedulers with no max_steps param (#4564)
* fix schedulers Signed-off-by: stevehuang52 <[email protected]>
* update to use python inspect module Signed-off-by: stevehuang52 <[email protected]>
* update Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* T5 prompt learning fixes missing from r1.11.0 merge (#5075) (#5101)
* Fix special tokens Signed-off-by: MaximumEntropy <[email protected]>
* Fix Signed-off-by: MaximumEntropy <[email protected]>
* Empty Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: David <[email protected]> Signed-off-by: MaximumEntropy <[email protected]>
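The "schedulers with no max_steps param" fix above mentions switching to the Python inspect module. A hedged sketch of that idea: only pass `max_steps` to schedulers whose constructor actually accepts it. The two scheduler classes here are hypothetical stand-ins, not NeMo classes.

```python
import inspect


def accepts_max_steps(sched_cls) -> bool:
    """Check whether a scheduler's __init__ takes a max_steps parameter,
    using inspect.signature as the commit describes."""
    return "max_steps" in inspect.signature(sched_cls.__init__).parameters


class WarmupScheduler:  # hypothetical: needs max_steps
    def __init__(self, optimizer, max_steps):
        self.max_steps = max_steps


class ExponentialScheduler:  # hypothetical: no max_steps
    def __init__(self, optimizer, gamma=0.9):
        self.gamma = gamma
```

The caller can then build the scheduler kwargs conditionally instead of crashing on schedulers that lack the parameter.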
Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: David <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* [TTS] Add NeMo TTS Primer Tutorial (#4933)
* [TTS] Add NeMo TTS Primer Tutorial Signed-off-by: Ryan <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Add Squeezeformer CTC model checkpoints on Librispeech (#5121) Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* adding loss normalization options to rnnt joint (#4829)
* adding normalization options to rnnt joint loss
* moving the param to joint
* moving loss normalization to rnnt loss config
* style
* cleaning up
* fixing sum reduction in joint Signed-off-by: Dima Rekesh <[email protected]>
* moving reduction into RNNT loss class
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* refactoring
* typos Signed-off-by: Dima Rekesh <[email protected]> Signed-off-by: Dima Rekesh <[email protected]> Co-authored-by: Dima Rekesh <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]>
* Asr concat dataloader (#5108)
* forced precision
* typo
* initial commit Signed-off-by: Dima Rekesh <[email protected]>
* typos and bugs Signed-off-by: Dima Rekesh <[email protected]>
* reverting conformer encoder Signed-off-by: Dima Rekesh <[email protected]>
* additional checks Signed-off-by: Dima Rekesh <[email protected]>
* adding support to CTC models as well
* reverting conformer_encoder Signed-off-by: Dima Rekesh <[email protected]>
* typo Signed-off-by: Dima Rekesh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* refactoring Signed-off-by: Dima Rekesh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* refactoring Signed-off-by: Dima Rekesh <[email protected]>
* merging Signed-off-by: Dima Rekesh <[email protected]> Signed-off-by: Dima Rekesh <[email protected]> Signed-off-by: Dima Rekesh <[email protected]> Co-authored-by: Dima Rekesh <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Somshubra Majumdar <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* fix blossom ci unittests Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* bugfix: pybtex.database.InvalidNameString: Too many commas in author field. (#5112) (#5115) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Update container version to 22.09 (#5105)
* update container version Signed-off-by: ericharper <[email protected]>
* pin click Signed-off-by: ericharper <[email protected]>
* pin click 8.0.2 Signed-off-by: ericharper <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Remove unsupported arguments from MegatronNMT (#5065)
* Fixes Signed-off-by: MaximumEntropy <[email protected]>
* Fixes Signed-off-by: MaximumEntropy <[email protected]>
* Style Signed-off-by: MaximumEntropy <[email protected]>
* Fix Signed-off-by: MaximumEntropy <[email protected]>
* More fixes Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* pp2 support for T5 IA3 learning and T5 Adapters learning (#5116)
* enabling pp2 Signed-off-by: arendu <[email protected]>
* optimizer update Signed-off-by: arendu <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* T5 pp>1 support for adapters and ia3 Signed-off-by: arendu <[email protected]>
* fix bug with missing adapter_tuning Signed-off-by: arendu <[email protected]>
* inference error fixed, pp=2 Signed-off-by: arendu <[email protected]> Signed-off-by: arendu <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* T5 Prompt Learning Fixes for Pipeline Parallel (#5120)
* Initial fixes Signed-off-by: MaximumEntropy <[email protected]>
* Added back validation acc Signed-off-by: Virginia Adams <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Put num workers back Signed-off-by: Virginia Adams <[email protected]>
* added relative encoding if statement Signed-off-by: Virginia Adams <[email protected]>
* Added back val loss only validation Signed-off-by: Virginia Adams <[email protected]>
* Revert "Added back val loss only validation" This reverts commit 86d8f4806fe30335c40c3716ce18259939df500f.
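The "loss normalization options to rnnt joint" item above describes making the RNNT loss reduction configurable. A minimal sketch of that idea, assuming illustrative option names ("sum", "mean_batch", "mean") rather than NeMo's exact config values:

```python
def reduce_rnnt_loss(losses, target_lengths, reduction="mean_batch"):
    """Hedged sketch of configurable RNNT loss reduction.

    'sum' adds the per-utterance losses; 'mean_batch' averages over the
    batch; 'mean' first normalizes each utterance's loss by its target
    length, then averages. Option names are assumptions for illustration.
    """
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean_batch":
        return sum(losses) / len(losses)
    if reduction == "mean":
        per_token = [l / t for l, t in zip(losses, target_lengths)]
        return sum(per_token) / len(per_token)
    raise ValueError(f"unknown reduction: {reduction}")
```

Normalizing by target length keeps long utterances from dominating the gradient, which is the usual motivation for exposing such options.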
* Removed val acc for PP > 1 Signed-off-by: Virginia Adams <[email protected]>
* Removed enc_seq_len if statement Signed-off-by: Virginia Adams <[email protected]>
* Added back validation acc calc Signed-off-by: Virginia Adams <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: Virginia Adams <[email protected]> Signed-off-by: Virginia Adams <[email protected]> Co-authored-by: Virginia Adams <[email protected]> Co-authored-by: Virginia Adams <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Virginia Adams <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* add doc info (#4721) Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* [TTS] Add SpanishCharsTokenizer (#5135)
* [TTS] Add SpanishCharsTokenizer Signed-off-by: Ryan <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
* Update megatron interface to dialogue (#4936)
* fix style formatting Signed-off-by: Zhilin Wang <[email protected]>
* update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]>
* update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]>
* changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]>
* add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]>
* update Jenkins Signed-off-by: Zhilin Wang <[email protected]>
* remove dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]>
* update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]>
* rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang <[email protected]>
* style fix Signed-off-by: Zhilin Wang <[email protected]>
* fix style Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add docstrings for assistant data processsor Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins for SGDGEN local checkpoint Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: Zhilin Wang <[email protected]> * use local vocab file for Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * patch for Jenkins CI using local file Signed-off-by: Zhilin Wang <[email protected]> * add slot filling prediction and metrics Signed-off-by: Zhilin Wang <[email protected]> * remove unused code Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * refactor metrics code out of Dialogue GPT Model Signed-off-by: Zhilin Wang <[email protected]> * integrate backward compatible support for IntentSlotClassificationModel (bert model) Signed-off-by: Zhilin Wang <[email protected]> * save prediction file for IntentSlotClassification Signed-off-by: Zhilin Wang <[email protected]> * update dialogue gpt model training for megatron gpt Signed-off-by: Zhilin Wang <[email protected]> * remove batch generate for HF GPT2, which causes lower performance Signed-off-by: Zhilin Wang <[email protected]> * add few shot capability to dialogue gpt model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile and remove unused import Signed-off-by: Zhilin Wang <[email 
protected]> * update code description and clarity Signed-off-by: Zhilin Wang <[email protected]> * address PR comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate compatibility with ZeroShotIntentModel Signed-off-by: Zhilin Wang <[email protected]> * rename folder to dialogue due to increased scope and further refactor for clarity Signed-off-by: Zhilin Wang <[email protected]> * added dialogue GPT for sequence generation task (e.g. answer extender) Signed-off-by: Zhilin Wang <[email protected]> * add CI test for DialogueGPTGenerationModel Signed-off-by: Zhilin Wang <[email protected]> * integrate DialogueS2SGenerationModel for generation task (e.g. answer extender) Signed-off-by: Zhilin Wang <[email protected]> * modify huggingface utils to support HF t5/BART models Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update bleu metric Signed-off-by: Zhilin Wang <[email protected]> * fix bleu metric style Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * update based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 2 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 3 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * integrate sgd generation based on user user utterance and system slot-values to generate system utterance Signed-off-by: Zhilin Wang <[email protected]> * add validation model saving capabilities Signed-off-by: 
Zhilin Wang <[email protected]> * cleaned up code for SGD Based Answer extender Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue Generation CI Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * fix Jenkins CI issue" Signed-off-by: Zhilin Wang <[email protected]> * add support for design dataset Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary imports Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support megatron for dialogue_s2s_generation_model Signed-off-by: Zhilin Wang <[email protected]> * reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update CI Signed-off-by: Zhilin Wang <[email protected]> * update checkpoint and predictions filename to include epoch number Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate HF BART MNLI into zero shot intent model Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Nearest Neighbour Model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * refactor Dialogue SGD Data Processor to make interface for models cleaner Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue S2S Generation model for DialogueSGDDataProcessor interface Signed-off-by: Zhilin Wang <[email protected]> * update 
jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support sgd and drive thru datasets by zero shot model and nearest neighbour model Signed-off-by: Zhilin Wang <[email protected]> * add prediction saving code to nearest neighbour and zero shot intent models Signed-off-by: Zhilin Wang <[email protected]> * fix typo in sgd data processor Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Mellon QA Data Processor Signed-off-by: Zhilin Wang <[email protected]> * update mellon qa Signed-off-by: Zhilin Wang <[email protected]> * update dialogue.py to remove outdated info Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * add dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * address review comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix for cfg Signed-off-by: Zhilin Wang <[email protected]> * make dependency on apex optional Signed-off-by: Zhilin Wang <[email protected]> * change NLPDDPluggin calling logic to make it possible to run without apex Signed-off-by: Zhilin Wang <[email protected]> * add first draft of tutorial Signed-off-by: Zhilin Wang <[email protected]> * reduce ms marco size by removing lines without wellFormedAnswers Signed-off-by: Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update colab tutorial link in dialogue docs 
Signed-off-by: Zhilin Wang <[email protected]> * include unit test and some refactor to facilitate unit test Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * address pr issues Signed-off-by: Zhilin Wang <[email protected]> * remove typos in dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * support larger files for question answering Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary artifacts to reduce memory use Signed-off-by: Zhilin Wang <[email protected]> * put 0 tensor to device Signed-off-by: Zhilin Wang <[email protected]> * update link within dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * restore previously delete files Signed-off-by: Zhilin Wang <[email protected]> * update error handling when loss = nan Signed-off-by: Zhilin Wang <[email protected]> * update nan handling Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss func Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss Signed-off-by: Zhilin Wang <[email protected]> * fix type error raised in qa_dataset.py Signed-off-by: Zhilin Wang <[email protected]> * add error checking message Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs 
Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update exp logging Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * limit number of negative samples Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * remove unused methods and style fix Signed-off-by: Zhilin Wang <[email protected]> * add more documentation Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * changes based on PR review Signed-off-by: Zhilin Wang <[email protected]> * set wandb logger false by default Signed-off-by: Zhilin Wang <[email protected]> * update interface with megatron gpt prompt learning Signed-off-by: Zhilin Wang <[email protected]> * update inline documentation Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update prompt_ids Signed-off-by: Zhilin Wang <[email protected]> * update error msg Signed-off-by: Zhilin Wang <[email protected]> * update config Signed-off-by: Zhilin Wang <[email protected]> * update config Signed-off-by: Zhilin Wang <[email protected]> * set inference = False for dialogue prompt learning during training Signed-off-by: Zhilin Wang <[email protected]> * set inference = False for dialogue prompt learning during training Signed-off-by: Zhilin Wang <[email protected]> * remove unused code Signed-off-by: Zhilin Wang <[email protected]> * update config yaml 
Signed-off-by: Zhilin Wang <[email protected]> * fix bug for megatron gpt prompt learning Signed-off-by: Zhilin Wang <[email protected]> * remove unused import Signed-off-by: Zhilin Wang <[email protected]> * address comments in PR Signed-off-by: Zhilin Wang <[email protected]> * address comments in PR Signed-off-by: Zhilin Wang <[email protected]> * address typo Signed-off-by: Zhilin Wang <[email protected]> * add megatron t5 inference Signed-off-by: Zhilin Wang <[email protected]> * fix bug due to bert tokenizer not being space-aware Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: Zhilin Wang <[email protected]> * update IntentSlotModel onnx export test Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: Zhilin Wang <[email protected]> * update exportable Signed-off-by: Zhilin Wang <[email protected]> * address PR comments Signed-off-by: Zhilin Wang <[email protected]> * replace functools.cache_property with functools.lru_cache to maintain python 3.7 compatibility Signed-off-by: Zhilin Wang <[email protected]> * improve speed of rank_candidates and support for p tuning Signed-off-by: Zhilin Wang <[email protected]> * update dialogue.py Signed-off-by: Zhilin Wang <[email protected]> * fix megatron prompt learning saving bug Signed-off-by: Zhilin Wang <[email protected]> * update generate_candidate method Signed-off-by: Zhilin Wang <[email protected]> * remove repeated init text ids and invert attention masks Signed-off-by: Zhilin Wang <[email protected]> * update typo Signed-off-by: Zhilin Wang <[email protected]> * custom collate fn to remove excess padding in batch Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update complete method to mitigate issue when max seq len is low 
Signed-off-by: Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * update generation interface Signed-off-by: Zhilin Wang <[email protected]> Signed-off-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Added save inference ready .nemo file with every checkpoint (#5055) * Added save inference ready .nemo file with every checkpoint Signed-off-by: Virginia Adams <[email protected]> * Python style fix Signed-off-by: Virginia Adams <[email protected]> * addressed Adi's comment Signed-off-by: Virginia Adams <[email protected]> * Added ptuning check in model checkpoint saving Signed-off-by: Virginia Adams <[email protected]> * Changed save_nemo_on_valdaition default to False Signed-off-by: Virginia Adams <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Changes global batch size of adapter CI Signed-off-by: Virginia Adams <[email protected]> * Changed num workers to 0 Signed-off-by: Virginia Adams <[email protected]> * added first stage of pipeline check Signed-off-by: Virginia Adams <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Virginia Adams <[email protected]> Signed-off-by: Virginia Adams <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]> * Fixes for docs/typos + remove max_utts parameter from tarred datasets as it causes hang in training (#5118) * Remove ; from jupyter notebook cells Signed-off-by: Igor Gitman <[email protected]> * Fix typos in 
documentation/code Signed-off-by: Igor Gitman <[email protected]> * Fix output message to have 'or equal' Signed-off-by: Igor Gitman <[email protected]> * Link formatting fixes Signed-off-by: Igor Gitman <[email protected]> * Add error if max_utts is used in tarred datasets Signed-off-by: Igor Gitman <[email protected]> * Remove max_utts parameter from tarred datasets Signed-off-by: Igor Gitman <[email protected]> * Fix max_utts removal in tests Signed-off-by: Igor Gitman <[email protected]> * Fix typo if -> is Signed-off-by: Igor Gitman <[email protected]> Signed-off-by: Igor Gitman <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Merge r1.12.0 main (#5139) * update branch Signed-off-by: ericharper <[email protected]> * Add cherry-pick action (#4958) * add cherry-pick action Signed-off-by: ericharper <[email protected]> * Pin Transformers version to fix CI (#4955) * Pin transformers version in CI to prevent offline tokenizer loading error Signed-off-by: SeanNaren <[email protected]> * Drop version Signed-off-by: SeanNaren <[email protected]> * Disable offline temporarily Signed-off-by: SeanNaren <[email protected]> * Disable offline temporarily Signed-off-by: SeanNaren <[email protected]> * Enable offline Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: SeanNaren <[email protected]> Co-authored-by: Sean Naren <[email protected]> * upper bound transformers Signed-off-by: ericharper <[email protected]> * remove duplicate transformers requirement Signed-off-by: ericharper <[email protected]> * Release SOTA Lang ID model (#5080) * add pretrained lang id model ambernet Signed-off-by: fayejf <[email protected]> * update doc and style fix Signed-off-by: fayejf <[email protected]> Signed-off-by: fayejf <[email protected]> * update branch and package info Signed-off-by: ericharper <[email protected]> * remove upper bounds on lightning and transformers 
Signed-off-by: ericharper <[email protected]> * remove transformers offline from ci Signed-off-by: ericharper <[email protected]> * upper bound transformers Signed-off-by: ericharper <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: fayejf <[email protected]> Co-authored-by: Sean Naren <[email protected]> Co-authored-by: fayejf <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Added ASR model comparison to SDE (#5043) SDE: Added ASR model comparison tool to SDE transcribe speech: Added support for many predictions in one file, as well as custom field names Signed-off-by: George Zelenfroynd <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * fix nmt eval sampler (#5154) Signed-off-by: Abhinav Khattar <[email protected]> Signed-off-by: Abhinav Khattar <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Fix Global init steps (#5143) * move global step to base Signed-off-by: Yi Dong <[email protected]> * fix fused softmax Signed-off-by: Yi Dong <[email protected]> * add the missing file Signed-off-by: Yi Dong <[email protected]> * update the fused kernel Signed-off-by: Yi Dong <[email protected]> * fix import error Signed-off-by: Yi Dong <[email protected]> * fix import again Signed-off-by: Yi Dong <[email protected]> Signed-off-by: Yi Dong <[email protected]> Signed-off-by: Yi Dong <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * [TTS] bug fix - sample rate was being ignored in vocoder dataset (#4518) * bug fix - sample rate was being ignored in vocoder dataset when not loading mel * handled n segments for a different sampling rate than original sampling rate * Added case for n_segments 0, warning for n_segments greater than file length Signed-off-by: Paarth Neekhara <[email protected]> Co-authored-by: Xuesong Yang <[email 
protected]> Co-authored-by: Jocelyn <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Add EMA support to NeMo (#4764) * Added Base files Signed-off-by: SeanNaren <[email protected]> * Some refactors, swap to using MNIST Lnet Signed-off-by: SeanNaren <[email protected]> * Add a few more tests, allow the callback to be set via the exp manager Signed-off-by: SeanNaren <[email protected]> * Actually run validation for testing Signed-off-by: SeanNaren <[email protected]> * Run isort Signed-off-by: SeanNaren <[email protected]> * Add test for saving state/fix saving state Signed-off-by: SeanNaren <[email protected]> * Use dummy model Signed-off-by: SeanNaren <[email protected]> * Fix test Signed-off-by: SeanNaren <[email protected]> * Add copyright Signed-off-by: SeanNaren <[email protected]> * Support saving separate EMA weight module Signed-off-by: SeanNaren <[email protected]> * Add standalone functionality/logging Signed-off-by: SeanNaren <[email protected]> * Expose more parameters Signed-off-by: SeanNaren <[email protected]> * Modify to allow option to replace validation Signed-off-by: SeanNaren <[email protected]> * Add jenkins test, formatting Signed-off-by: SeanNaren <[email protected]> * Pin Transformers version to fix CI (#4955) * Pin transformers version in CI to prevent offline tokenizer loading error Signed-off-by: SeanNaren <[email protected]> * Drop version Signed-off-by: SeanNaren <[email protected]> * Disable offline temporarily Signed-off-by: SeanNaren <[email protected]> * Disable offline temporarily Signed-off-by: SeanNaren <[email protected]> * Enable offline Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Add cherry-pick action (#4958) (#4961) * add cherry-pick action Signed-off-by: ericharper <[email protected]> * Pin Transformers version to fix CI (#4955) * Pin transformers version in CI to prevent offline tokenizer loading error Signed-off-by: SeanNaren <[email protected]> * Drop 
version Signed-off-by: SeanNaren <[email protected]> * Disable offline temporarily Signed-off-by: SeanNaren <[email protected]> * Disable offline temporarily Signed-off-by: SeanNaren <[email protected]> * Enable offline Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: SeanNaren <[email protected]> Co-authored-by: Sean Naren <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: SeanNaren <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sean Naren <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Fix changelog builder (#4962) (#4963) Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: SeanNaren <[email protected]> * fix cherry pick workflow (#4964) (#4965) Signed-off-by: ericharper <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: ericharper <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: SeanNaren <[email protected]> * reorder model check (#4959) (#4967) Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: SeanNaren <[email protected]> * check for active conda environment (#4970) (#4971) Signed-off-by: SeanNaren <[email protected]> * [TTS] fix broken tutorial for MixerTTS. (#4949) (#4976) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Checkpoint averaging class fix (#4946) * 1. Added args.class_path to provide it externally. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. 
Signed-off-by: Micha Livne <[email protected]> Signed-off-by: Micha Livne <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Add ability to give separate datasets for test, train and validation (#4798) * Add ability to give separate datasets for test, train and validation * Addressed Sandeep's comments * Addressed Sandeep's comments * Add ability to give separate datasets for test, train and validation * Add ability to give separate datasets for test, train and validation * Addressed review comments * Bug fix for common dataset utils * Add CI tests Signed-off-by: shanmugamr1992 <[email protected]> * Reformat code Signed-off-by: shanmugamr1992 <[email protected]> * Bug fix Signed-off-by: shanmugamr1992 <[email protected]> * Bug fix * Bug Fix * Bug Fix * Update Jenkinsfile * Addressed comments * Addressed Erik's comments. * Addressed Sandeep * Update Jenkinsfile * Update Jenkinsfile * Update dataset_utils.py * Update Jenkinsfile * Update Jenkinsfile * Use GPT CI config Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: shanmugamr1992 <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: MaximumEntropy <[email protected]> Signed-off-by: SeanNaren <[email protected]> * fix label models restoring issue from weighted cross entropy (#4968) (#4975) Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Add simple pre-commit file (#4983) * Add simple pre-commit file Signed-off-by: SeanNaren <[email protected]> * Exclude docs folder Signed-off-by: SeanNaren <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: SeanNaren <[email protected]> * Revert "[pre-commit.ci] auto fixes from pre-commit.com hooks" This reverts commit 
053bd5ba579537a5f311b431871c21f3381b43eb. Signed-off-by: SeanNaren <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: SeanNaren <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: SeanNaren <[email protected]> * Import pycuda.autoprimaryctx or pycuda.autoinit to init pycuda execution environment (#4951) Signed-off-by: Jin Li <[email protected]> Signed-off-by: Jin Li <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Adding speaker embedding conditioning in fastpitch (#4986) Signed-off-by: subhankar-ghosh <[email protected]> Signed-off-by: subhankar-ghosh <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Fix ASR issues (#4984) (#4991) * Fix ASR issues Signed-off-by: smajumdar <[email protected]> * Revert fix Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Fix current tests Signed-off-by: SeanNaren <[email protected]> * More test coverage Signed-off-by: SeanNaren <[email protected]> * Address reviews Signed-off-by: SeanNaren <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Address review Signed-off-by: SeanNaren <[email protected]> * Drop bf16 test Signed-off-by: SeanNaren <[email protected]> * Address review Signed-off-by: SeanNaren <[email protected]> * remove print Signed-off-by: SeanNaren <[email protected]> * Add bf16 Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: nithinraok <[email 
protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Micha Livne <[email protected]> Signed-off-by: shanmugamr1992 <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: Jin Li <[email protected]> Signed-off-by: subhankar-ghosh <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: shanmugamr1992 <[email protected]> Co-authored-by: MaximumEntropy <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: liji-nv <[email protected]> Co-authored-by: Subhankar Ghosh <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Fix BF16 test (#5162) Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Fix errors in speaker diarization nemo docs (#5153) * fix docs and docstrings for MSDD Signed-off-by: Taejin Park <[email protected]> * fix nemo docs errors Signed-off-by: Taejin Park <[email protected]> * reflected review comments Signed-off-by: Taejin Park <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Add interleaved pipeline schedule to GPT (#5025) * add virtual pipeline size to config Signed-off-by: ericharper <[email protected]> * convert model to list of modules Signed-off-by: ericharper <[email protected]> * convert model to list of modules Signed-off-by: ericharper <[email protected]> * convert model to list of modules Signed-off-by: ericharper <[email protected]> * update for list of modules Signed-off-by: ericharper <[email 
protected]> * add virtual to init Signed-off-by: ericharper <[email protected]> * update first last stage embedding all reduce Signed-off-by: ericharper <[email protected]> * update sequence parallel all reduce for virtual models Signed-off-by: ericharper <[email protected]> * runs but we get an error Signed-off-by: ericharper <[email protected]> * set virtual rank 0 after looping Signed-off-by: ericharper <[email protected]> * account for virtual when determinining first and last pipeline stages Signed-off-by: ericharper <[email protected]> * checkpointing for virtual models in progress Signed-off-by: ericharper <[email protected]> * add checkpoint hooks Signed-off-by: ericharper <[email protected]> * working on validation when resuming Signed-off-by: ericharper <[email protected]> * skip sanity val steps by default in config Signed-off-by: ericharper <[email protected]> * remove comment Signed-off-by: ericharper <[email protected]> * log number of params Signed-off-by: ericharper <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * style Signed-off-by: ericharper <[email protected]> * check if self.model is a list Signed-off-by: ericharper <[email protected]> * make virtual pipeline default size None on init Signed-off-by: ericharper <[email protected]> * make virtual pipeline default to None in config Signed-off-by: ericharper <[email protected]> * remove ensure_divisibility call Signed-off-by: ericharper <[email protected]> * fix lgtm alerts Signed-off-by: ericharper <[email protected]> * remove num_sanity_val_steps from config Signed-off-by: ericharper <[email protected]> * default virtual pipeline size to none Signed-off-by: ericharper <[email protected]> * check for list Signed-off-by: ericharper <[email protected]> * update assert to make sure we are only doing virtual for gpt Signed-off-by: ericharper <[email protected]> * revert change to get_params_for_weight_decay Signed-off-by: 
ericharper <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * init var Signed-off-by: ericharper <[email protected]> * add import guard for set virtual model parallel world size Signed-off-by: ericharper <[email protected]> * use import guard Signed-off-by: ericharper <[email protected]> * update calls to fake init in eval scripts Signed-off-by: ericharper <[email protected]> * add _get_fwd_bwd_function Signed-off-by: ericharper <[email protected]> * log all total model parameters Signed-off-by: ericharper <[email protected]> * remove unused import Signed-off-by: ericharper <[email protected]> Signed-off-by: ericharper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]> * reduced to 14 inactive days to be stale for PRs. (#5165) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * refactor TTS documentation organization and add new contents. (#5137) * refactor TTS documentation organization and add new contents. * fix asr api bug. * fix broken links. * fix unexpected indentation errors. * fixed unexpected indentation. * fixed broken paper reference. * fixed cross-reference and typos. * fixed toctree errors. * revert to 'Augmentors' * reordered TTS tutorial list in starthere. * ordered api classes alphabetically for each Section. * fixed underscore typo for fastpitch checkpoint. Signed-off-by: Xuesong Yang <[email protected]> * upcase 'Tuning' Signed-off-by: Xuesong Yang <[email protected]> * fixed typo for RAD-TTS Aligner Signed-off-by: Xuesong Yang <[email protected]> * reorder aligner section after mel-gen and vocoders in models.rst. Signed-off-by: Xuesong Yang <[email protected]> * clarify Mixer-TTS-X and reorder model descriptions alphabetically. 
Signed-off-by: Xuesong Yang <[email protected]> * fixed some typos and formats. Signed-off-by: Xuesong Yang <[email protected]> * removed old megatron.rst. Signed-off-by: Xuesong Yang <[email protected]> * fixed block quote ends without a blank line warnings. Signed-off-by: Xuesong Yang <[email protected]> * remove duplicate reference; fixed missing key nlp-megatron-shoeybi2019megatron Signed-off-by: Xuesong Yang <[email protected]> * Revert "removed old megatron.rst." This reverts commit c5ea1dc3f23272eecfe8040e3abfa54fa122cf73. Signed-off-by: Xuesong Yang <[email protected]> * removed Russian, a hyphen, and add a note about G2P in tts/…
1 parent dddfa39 commit 91833b8

File tree

12 files changed: +263 -112 lines changed


nemo/collections/asr/losses/rnnt.py (+21 -5)

@@ -38,9 +38,10 @@
 from nemo.collections.asr.losses.rnnt_pytorch import MultiblankRNNTLossPytorch, RNNTLossPytorch, TDTLossPytorch
 from nemo.core.classes import Loss, typecheck
 from nemo.core.neural_types import LabelsType, LengthsType, LogprobsType, LossType, NeuralType
+from nemo.core.utils import numba_utils
 from nemo.core.utils.k2_utils import K2_INSTALLATION_MESSAGE
 from nemo.core.utils.numba_utils import NUMBA_INSTALLATION_MESSAGE
-from nemo.utils import logging, model_utils
+from nemo.utils import logging, logging_mode, model_utils

 try:
     import warprnnt_pytorch as warprnnt

@@ -98,7 +99,7 @@ class RNNTLossConfig:
         min_version='0.53.0',
         is_available=NUMBA_RNNT_AVAILABLE,
         installation_msg=NUMBA_INSTALLATION_MESSAGE,
-        force_float32=True,
+        force_float32=not numba_utils.NUMBA_FP16_SUPPORTED,
     ),
     "pytorch": RNNTLossConfig(
         loss_name="pytorch",

@@ -387,7 +388,7 @@ def __init__(self, num_classes, reduction: str = 'mean_batch', loss_name: str =
         for the standard "blank" symbol. In particular, say V is the number of non-blank tokens in
         the vocabulary, then in the case of,
         standard RNNT: num_classes = V
-        multiblank RNNT: num_classes = V + number-big-blanks (since we store big-blanks before
+        multiblank RNNT: num_classes = V + number-big-blanks (since we store big-blanks before
         standard blank, and the standard blank is the last symbol in the vocab)
         TDT: num_classes = V. Note, V here does not include any of the "duration outputs".

@@ -413,6 +414,7 @@ def __init__(self, num_classes, reduction: str = 'mean_batch', loss_name: str =
         self.reduction = reduction
         self._loss = resolve_rnnt_loss(loss_name, blank_idx=self._blank, loss_kwargs=loss_kwargs)
         self._force_float32 = RNNT_LOSS_RESOLVER[loss_name].force_float32
+        self._fp16_compat_checked = False

     def reduce(self, losses, target_lengths):

@@ -442,8 +444,22 @@ def forward(self, log_probs, targets, input_lengths, target_lengths):
         max_targets_len = target_lengths.max()

         # Force cast joint to float32
-        # TODO: Remove once Numba supports FP16
-        if self._force_float32 and log_probs.dtype != torch.float32:
+        if not self._force_float32 and numba_utils.NUMBA_FP16_SUPPORTED:
+            # Execute the kernel in fp16
+            pass
+        elif self._force_float32 and log_probs.dtype != torch.float32:
+            # Log just once if fp16 tensor was passed and fp16 Numba CUDA loss could not be used.
+            if log_probs.dtype == torch.float16 and not self._fp16_compat_checked:
+                _, reason = numba_utils.is_numba_cuda_fp16_supported(return_reason=True)
+                logging.warning(
+                    f"Provided RNNT Joint tensor is of dtype {log_probs.dtype}, but RNNT loss could not be calculated "
+                    f"in fp16 due to following reason stated below. Loss will be calculated in fp32. \n\n"
+                    f"{reason}",
+                    mode=logging_mode.ONCE,
+                )
+                self._fp16_compat_checked = True
+
+            # Upcast the activation tensor and compute loss and grads in fp32
             logits_orig = log_probs
             log_probs = log_probs.float()
             del logits_orig  # save memory *before* computing the loss
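The forward-path change above follows a common mixed-precision pattern: probe a capability, fall back to fp32 when it is missing, and warn only on the first occurrence so training logs are not flooded. A minimal standalone sketch of that pattern, where `FP16FallbackGuard`, `fp16_supported`, and `reason` are hypothetical stand-ins for NeMo's `numba_utils.is_numba_cuda_fp16_supported(return_reason=True)` probe:

```python
import logging

logger = logging.getLogger("rnnt_fp16_fallback")


class FP16FallbackGuard:
    """Warn-once fp32 fallback, mirroring the `_fp16_compat_checked` flag above.

    `fp16_supported` and `reason` are illustrative stand-ins for the result of
    the Numba capability probe; they are not NeMo APIs.
    """

    def __init__(self, fp16_supported: bool, reason: str):
        self.fp16_supported = fp16_supported
        self.reason = reason
        self._warned = False  # set after the first (and only) warning

    def resolve_dtype(self, dtype: str) -> str:
        # Keep fp16 only when the kernel supports it; otherwise compute in fp32.
        if dtype == "float16" and self.fp16_supported:
            return "float16"
        if dtype == "float16" and not self._warned:
            logger.warning("fp16 loss unavailable, computing in fp32: %s", self.reason)
            self._warned = True  # later fp16 batches fall back silently
        return "float32"
```

Calling `resolve_dtype("float16")` repeatedly on an unsupported setup warns exactly once, which is what the combination of `mode=logging_mode.ONCE` and the per-instance flag achieves in the patch.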

nemo/collections/asr/losses/rnnt_pytorch.py (+5 -0)

@@ -47,7 +47,12 @@ def __init__(self, blank, reduction):
         self.reduction = reduction

     def forward(self, acts, labels, act_lens, label_lens):
+        # CPU patch for FP16
+        if not acts.is_cuda and acts.dtype == torch.float16:
+            acts = acts.float()
+
         acts = torch.log_softmax(acts, -1)
+
         forward_logprob = self.compute_forward_prob(acts, labels, act_lens, label_lens)
         losses = -forward_logprob
         if self.reduction == 'mean_batch':
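The CPU patch above upcasts half-precision activations before the log-softmax. A self-contained sketch of the same dtype guard, with NumPy standing in for PyTorch (the function name and the stable log-softmax implementation are illustrative, not the NeMo code path):

```python
import numpy as np


def log_softmax_cpu_fp16_patch(acts: np.ndarray) -> np.ndarray:
    """Upcast fp16 activations to fp32 before log-softmax.

    Mirrors the `acts = acts.float()` guard in the patch; NumPy stands in
    for PyTorch here, so this is a sketch rather than the NeMo code.
    """
    if acts.dtype == np.float16:
        acts = acts.astype(np.float32)  # the CPU patch: compute in fp32
    # numerically stable log-softmax over the last axis
    shifted = acts - acts.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
```

The upcast matters on CPU because half-precision kernels for ops like `log_softmax` have historically been unavailable or much less accurate there than on GPU.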

nemo/collections/asr/parts/numba/rnnt_loss/rnnt.py (+1 -1)

@@ -186,7 +186,7 @@ def rnnt_loss_gpu(

     # Select GPU index
     cuda.select_device(acts.device.index)
-    gpu_workspace = torch.zeros(gpu_size, device=acts.device, dtype=acts.dtype, requires_grad=False)
+    gpu_workspace = torch.zeros(gpu_size, device=acts.device, dtype=torch.float32, requires_grad=False)

     ### VIEW TENSORS AS VECTORS FOR POINTER INDEXING ###
     acts, acts_shape = rnnt_helper.flatten_tensor(acts)
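The one-line change above pins the Numba working buffer to fp32 even when the joint tensor is fp16. The motivation is accumulation error: long sums in half precision stall once the running value's spacing exceeds the addend. A small NumPy illustration of that effect (the addend 2⁻¹⁰ is chosen to be exactly representable in fp16, so all error comes from accumulation, not from representing the inputs):

```python
import numpy as np

# Sum 20,000 copies of 2**-10 (~0.00098). The true total is 19.53125.
addend = np.float16(2.0 ** -10)

fp16_acc = np.float16(0.0)
for _ in range(20000):  # fp16 accumulator, as dtype=acts.dtype implied under AMP
    fp16_acc = np.float16(fp16_acc + addend)

fp32_acc = np.float32(0.0)
for _ in range(20000):  # fp32 workspace, as in the patched line
    fp32_acc = np.float32(fp32_acc + addend)

# The fp16 accumulator stalls far below the true value once its spacing
# exceeds twice the addend; the fp32 accumulation is exact for these inputs.
```

The forward/backward (alpha/beta) recursions of the RNNT loss are exactly this kind of long accumulation, which is why the workspace stays fp32 even when activations are fp16.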

nemo/collections/asr/parts/numba/rnnt_loss/rnnt_numpy.py (+5 -0)

@@ -344,10 +344,15 @@ def forward(self, acts, labels, act_lens, label_lens):
        _assert_no_grad(label_lens)
        certify_inputs(acts, labels, act_lens, label_lens)

+       # CPU Patch for fp16 - force cast to fp32
+       if not acts.is_cuda and acts.dtype == torch.float16:
+           acts = acts.float()
+
        if self.clamp > 0.0:
            acts = LogSoftmaxGradModification.apply(acts, self.clamp)

        acts = torch.nn.functional.log_softmax(acts, -1)
+
        return self.rnnt(acts, labels, act_lens, label_lens, self.blank, self.fastemit_lambda)
nemo/collections/asr/parts/numba/rnnt_loss/rnnt_pytorch.py (+5 -2)

@@ -57,7 +57,7 @@ def forward(ctx, acts, labels, act_lens, label_lens, blank, reduction, fastemit_
         loss_func = rnnt.rnnt_loss_gpu if is_cuda else rnnt.rnnt_loss_cpu
         grads = torch.zeros_like(acts) if acts.requires_grad else None
         minibatch_size = acts.size(0)
-        costs = torch.zeros(minibatch_size, device=acts.device, dtype=acts.dtype)
+        costs = torch.zeros(minibatch_size, device=acts.device, dtype=torch.float32)

         loss_func(
             acts,

@@ -119,7 +119,6 @@ def forward(
             label_lens: Tensor of (batch) containing label length of each example
             fastemit_lambda: Float scaling factor for FastEmit regularization. Refer to
             FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization.
-
             durations: list of durations for TDT model, must include 0 and 1, e.g.
                 [0, 1, 2, 3, 4].
             sigma: hyper-parameter for logit under-normalization method for training

@@ -417,6 +416,10 @@ def forward(self, acts, labels, act_lens, label_lens):
            label_lens: Tensor of (batch) containing label length of each example
        """
        if not acts.is_cuda:
+           # Force FP32 until log_softmax() is implemented for fp16 on CPU
+           if acts.dtype == torch.float16:
+               acts = acts.float()
+
           # Since CPU requires log_softmax to be computed explicitly, we need to perform grad clipping
           # *after* we have obtained the gradients of loss(logsoftmax()).
           # This is highly wasteful since it requires a copy of the entire joint tensor which is expensive.

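Keeping `costs` in fp32 also sidesteps fp16's narrow dynamic range (largest finite value 65504): a long utterance's negative log-likelihood can exceed it outright. The stdlib half-precision codec makes the range limit visible (the cost value is illustrative):

```python
import struct

# fp16's largest finite value, 65504, encodes exactly; anything beyond it
# cannot be packed at all (CPython's half-precision packer raises
# OverflowError), while fp32 ranges up to roughly 3.4e38.
assert struct.unpack('<e', struct.pack('<e', 65504.0))[0] == 65504.0

cost = 70000.0  # illustrative -log p for a long sequence
try:
    struct.pack('<e', cost)
    fits_in_fp16 = True
except OverflowError:
    fits_in_fp16 = False
assert fits_in_fp16 is False
```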
nemo/collections/asr/parts/numba/rnnt_loss/utils/cpu_utils/cpu_rnnt.py (+6 −2)

```diff
@@ -231,8 +231,8 @@ def cost_and_grad_kernel(
         )
 
         # Scale llForward by FastEmit lambda
-        llForward *= 1.0 + self.fastemit_lambda_
-        llBackward *= 1.0 + self.fastemit_lambda_
+        llForward += llForward * self.fastemit_lambda_
+        llBackward += llBackward * self.fastemit_lambda_
 
         diff = (llForward - llBackward).abs()
         if diff > 0.1:
@@ -300,6 +300,10 @@ def compute_betas_and_grads(
         Returns:
             Loglikelihood of the forward variable and inplace updates the grad tensor.
         """
+        # Patch for CPU + fp16
+        if log_probs.dtype == torch.float16 and not log_probs.is_cuda:
+            log_probs = log_probs.float()
+
         idx = CpuRNNT_index(U, self.maxU_, self.minibatch_, self.alphabet_size_, self.batch_first)
         betas[idx(T - 1, U - 1)] = log_probs[idx(T - 1, U - 1) * 2]
```
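The FastEmit change replaces `x *= 1.0 + λ` with `x += x * λ`. The two forms are algebraically identical; the rewrite avoids materializing the intermediate constant `1.0 + λ` as a Python float, which matters once the tensor operands are half precision. A quick equivalence check with illustrative values:

```python
lam = 0.001         # illustrative FastEmit lambda
ll_forward = -42.5  # illustrative log-likelihood

scaled_mul = ll_forward * (1.0 + lam)       # old form
scaled_add = ll_forward + ll_forward * lam  # new form
# Identical up to float64 rounding noise.
assert abs(scaled_mul - scaled_add) < 1e-12
```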
nemo/collections/asr/parts/numba/rnnt_loss/utils/rnnt_helper.py (+2 −1)

```diff
@@ -30,6 +30,7 @@
 import math
 from typing import Optional, Tuple
 
+import numba
 import torch
 from numba import cuda
 
@@ -112,7 +113,7 @@ def compute_costs_data(source: torch.Tensor, dest: torch.Tensor, fastemit_lambda
     if idx < length:
         copy_data_1d(source, dest, idx)
         dest[idx] *= -1.0
-        dest[idx] *= 1.0 + fastemit_lambda
+        dest[idx] *= numba.float32(1.0 + fastemit_lambda)
 
 
 def get_workspace_size(
```
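Wrapping the constant in `numba.float32(...)` pins the multiplier to single precision before it enters the compiled kernel; a bare `1.0 + fastemit_lambda` is a Python `float` (binary64) and can promote the kernel arithmetic. The effect of that narrowing can be shown with `ctypes.c_float`, which performs the same double-to-single rounding (values illustrative):

```python
import ctypes

def to_fp32(x: float) -> float:
    # Round a Python double to single precision.
    return ctypes.c_float(x).value

lam = 0.001
const64 = 1.0 + lam         # what the kernel sees without the cast
const32 = to_fp32(const64)  # what numba.float32(1.0 + lam) pins down
assert const32 != const64             # 1.001 is inexact in binary32
assert abs(const32 - const64) < 1e-7  # but the narrowing error is tiny
```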
nemo/core/utils/numba_utils.py (+36)

```diff
@@ -17,6 +17,8 @@
 import operator
 import os
 
+from typing import Tuple, Union
+
 from nemo.utils import model_utils
 
 # Prevent Numba CUDA logs from showing at info level
@@ -26,6 +28,11 @@
 __NUMBA_DEFAULT_MINIMUM_VERSION__ = "0.53.0"
 __NUMBA_MINIMUM_VERSION__ = os.environ.get("NEMO_NUMBA_MINVER", __NUMBA_DEFAULT_MINIMUM_VERSION__)
 
+__NUMBA_MINIMUM_VERSION_FP16_SUPPORTED__ = "0.57.0"
+NUMBA_FP16_SUPPORTED = model_utils.check_lib_version(
+    'numba', __NUMBA_MINIMUM_VERSION_FP16_SUPPORTED__, operator=operator.ge
+)[0]
+
 
 NUMBA_INSTALLATION_MESSAGE = (
     "Could not import `numba`.\n"
@@ -148,6 +155,35 @@ def numba_cuda_is_supported(min_version: str) -> bool:
     return False
 
 
+def is_numba_cuda_fp16_supported(return_reason: bool = False) -> Union[bool, Tuple[bool, str]]:
+    """
+    Utility method that returns a bool, stating if FP16 is supported for numba cuda kernels or not.
+
+    Returns:
+        bool, whether Numba CUDA will support fp16 or not.
+    """
+    reason = ""
+    use_nvidia_binding = os.environ.get('NUMBA_CUDA_USE_NVIDIA_BINDING', None)
+    if use_nvidia_binding is not None:
+        use_nvidia_binding = use_nvidia_binding.lower() == "1"
+        reason += "Env variable `NUMBA_CUDA_USE_NVIDIA_BINDING` is available and set to `1`. "
+    else:
+        use_nvidia_binding = False
+        reason += "Env variable `NUMBA_CUDA_USE_NVIDIA_BINDING` is not available or has not set to `1`."
+
+    if NUMBA_FP16_SUPPORTED:
+        reason += f"Numba CUDA FP16 is supported in installed numba version."
+    else:
+        reason += f"Numba CUDA FP16 is not supported in installed numba version."
+
+    result = use_nvidia_binding and NUMBA_FP16_SUPPORTED
+
+    if return_reason:
+        return result, reason
+    else:
+        return result
+
+
 def skip_numba_cuda_test_if_unsupported(min_version: str):
     """
     Helper method to skip pytest test case if numba cuda is not supported.
```