Megatron perceiver with tensor parallelism only #4318
Signed-off-by: MaximumEntropy <[email protected]>
* Add draft of race condition fixes
  Signed-off-by: PeganovAnton <[email protected]>
* Minor improvements
  Signed-off-by: PeganovAnton <[email protected]>
* More race condition fixes
  Signed-off-by: PeganovAnton <[email protected]>
* Improve error message
  Signed-off-by: PeganovAnton <[email protected]>
* Improve error message
  Signed-off-by: PeganovAnton <[email protected]>
* Improve error message
  Signed-off-by: PeganovAnton <[email protected]>
This pull request introduces 1 alert and fixes 1 when merging b25bfb5 into 94a464f (view on LGTM.com).
This pull request fixes 1 alert when merging dd836b5 into 94a464f (view on LGTM.com).
This pull request fixes 1 alert when merging 7608f3b into 09be885 (view on LGTM.com).
This pull request fixes 1 alert when merging b9e3ca5 into cd9b8af (view on LGTM.com).
Fantastic PR!
Thanks for a great PR!
* Temp Signed-off-by: MaximumEntropy <[email protected]> * Add megatron dataset Signed-off-by: MaximumEntropy <[email protected]> * Update config and fix global batch fetcher Signed-off-by: MaximumEntropy <[email protected]> * Add dataset class Signed-off-by: MaximumEntropy <[email protected]> * Update comments Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Update yaml Signed-off-by: MaximumEntropy <[email protected]> * Fix duplicate yaml key Signed-off-by: MaximumEntropy <[email protected]> * Translate method and preprocess script for raw text Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Remove pdb Signed-off-by: MaximumEntropy <[email protected]> * Fix arg name Signed-off-by: MaximumEntropy <[email protected]> * Fix other arg Signed-off-by: MaximumEntropy <[email protected]> * Change sampler back Signed-off-by: MaximumEntropy <[email protected]> * Move back to global batch fetcher to use distributed sampler Signed-off-by: MaximumEntropy <[email protected]> * Add text memmap data Signed-off-by: MaximumEntropy <[email protected]> * Update monitor Signed-off-by: MaximumEntropy <[email protected]> * Fixes for PP Signed-off-by: MaximumEntropy <[email protected]> * Remove unused import Signed-off-by: MaximumEntropy <[email protected]> * Truncate examples in text memmap Signed-off-by: MaximumEntropy <[email protected]> * NMT training batch interpolation key Signed-off-by: MaximumEntropy <[email protected]> * tarred data fix Signed-off-by: MaximumEntropy <[email protected]> * Change dataset type check Signed-off-by: MaximumEntropy <[email protected]> * Fix sampler Signed-off-by: MaximumEntropy <[email protected]> * Pass dataset cfg to determine type Signed-off-by: MaximumEntropy <[email protected]> * Log global step on validation step as well Signed-off-by: MaximumEntropy <[email protected]> * Fix NMT model saving with artifacts Signed-off-by: 
MaximumEntropy <[email protected]> * Initialize DDP in decode if not initialized. Needed for inference only mode Signed-off-by: MaximumEntropy <[email protected]> * Megatron NMT inference script Signed-off-by: MaximumEntropy <[email protected]> * Inference config file Signed-off-by: MaximumEntropy <[email protected]> * hardcode max delta temporarily Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * detokenizer if processor is not none Signed-off-by: MaximumEntropy <[email protected]> * Sampler config Signed-off-by: MaximumEntropy <[email protected]> * Compat with configs without sampler arg Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Comment for validation dataset type Signed-off-by: MaximumEntropy <[email protected]> * Fix tokenizer building Signed-off-by: MaximumEntropy <[email protected]> * CI test for megatron nmt Signed-off-by: MaximumEntropy <[email protected]> * Fix tokenizer in restore Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * O2 restore from fix Signed-off-by: MaximumEntropy <[email protected]> * Remove print Signed-off-by: MaximumEntropy <[email protected]> * Change tokenizer model name in config Signed-off-by: MaximumEntropy <[email protected]> * Logging Signed-off-by: MaximumEntropy <[email protected]> * Set seed for distributed sampler Signed-off-by: MaximumEntropy <[email protected]> * Cluster debugging messages Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix max generation delta Signed-off-by: MaximumEntropy <[email protected]> * No LM Init Signed-off-by: MaximumEntropy <[email protected]> * Use nlp save restore connector Signed-off-by: MaximumEntropy <[email protected]> * Remove useless infer args Signed-off-by: MaximumEntropy <[email protected]> * Typo Signed-off-by: MaximumEntropy <[email 
protected]> * UTF8 safe print of translation result Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Add save restore connector back with comment Signed-off-by: MaximumEntropy <[email protected]> * Refactor Signed-off-by: MaximumEntropy <[email protected]> * Fix CI test Signed-off-by: MaximumEntropy <[email protected]> * Add missing args Signed-off-by: MaximumEntropy <[email protected]> * Address comments Signed-off-by: MaximumEntropy <[email protected]> * Empty to restart * Fix CI test Signed-off-by: MaximumEntropy <[email protected]> * Check for test ds Signed-off-by: MaximumEntropy <[email protected]> * set fusion to false Signed-off-by: MaximumEntropy <[email protected]> * Initial perceiver encoder Signed-off-by: MaximumEntropy <[email protected]> * Perceiver with PP=1 Signed-off-by: MaximumEntropy <[email protected]> * Remove init cross attn Signed-off-by: MaximumEntropy <[email protected]> * CI test and remove init cross attn arg Signed-off-by: MaximumEntropy <[email protected]> * Remove init cross attn layers from file Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Clean up Signed-off-by: MaximumEntropy <[email protected]> * update branch Signed-off-by: ericharper <[email protected]> * Set headscale false (#4364) Signed-off-by: MaximumEntropy <[email protected]> * Add wandb as dependency (#4365) Signed-off-by: smajumdar <[email protected]> * Raise trainer error (#4356) Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Set headscale false (#4364) (#4366) Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: smajumdar <[email protected]> * Finetuning changes for BART (#4003) * Temp Signed-off-by: MaximumEntropy <[email protected]> * Checkpoint converter to nemo for bart Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> 
Co-authored-by: Micha Livne <[email protected]> * Make position embedding expansion specific to a batch to avoid checkpoint size mismatches (#4357) * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix logging warning Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Refactor bias act fusion Signed-off-by: MaximumEntropy <[email protected]> * Update NMT config Signed-off-by: MaximumEntropy <[email protected]> * Fix electronic bug, new time ITN rule (#4355) * fix electronic bug Signed-off-by: ekmb <[email protected]> * add new itn time rule Signed-off-by: ekmb <[email protected]> * revert domain changes Signed-off-by: ekmb <[email protected]> * remove repetition Signed-off-by: ekmb <[email protected]> * Update ci tests Signed-off-by: MaximumEntropy <[email protected]> * Correct support for dataclasses in default module dim (#4372) * Correct support for dataclasses in default module dim Signed-off-by: smajumdar <[email protected]> * Fix path for save of results Signed-off-by: smajumdar <[email protected]> * fix pad id bug (#4377) Signed-off-by: Yi Dong <[email protected]> * Question answering bug fix (#4381) * refactor dialogue state tracking for modelling/dataset interoperability Signed-off-by: Zhilin Wang <[email protected]> * fix style changes Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * fix style raised by lgtm Signed-off-by: Zhilin Wang <[email protected]> * fix style formatting Signed-off-by: Zhilin Wang <[email protected]> * update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]> * add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * remove 
dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]> * update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]> * rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * fix style Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add docstrings for assistant data processsor Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins for SGDGEN local checkpoint Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: Zhilin Wang <[email protected]> * use local vocab file for Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * patch for Jenkins CI using local file Signed-off-by: Zhilin Wang <[email protected]> * add slot filling prediction and metrics Signed-off-by: Zhilin Wang <[email protected]> * remove unused code Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * refactor metrics code out of Dialogue GPT Model Signed-off-by: Zhilin Wang <[email protected]> * integrate backward compatible support for IntentSlotClassificationModel (bert model) Signed-off-by: Zhilin Wang <[email protected]> * save prediction file for IntentSlotClassification Signed-off-by: Zhilin Wang <[email protected]> * update dialogue gpt model training for megatron gpt 
Signed-off-by: Zhilin Wang <[email protected]> * remove batch generate for HF GPT2, which causes lower performance Signed-off-by: Zhilin Wang <[email protected]> * add few shot capability to dialogue gpt model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile and remove unused import Signed-off-by: Zhilin Wang <[email protected]> * update code description and clarity Signed-off-by: Zhilin Wang <[email protected]> * address PR comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate compatibility with ZeroShotIntentModel Signed-off-by: Zhilin Wang <[email protected]> * rename folder to dialogue due to increased scope and further refactor for clarity Signed-off-by: Zhilin Wang <[email protected]> * added dialogue GPT for sequence generation task (e.g. answer extender) Signed-off-by: Zhilin Wang <[email protected]> * add CI test for DialogueGPTGenerationModel Signed-off-by: Zhilin Wang <[email protected]> * integrate DialogueS2SGenerationModel for generation task (e.g. 
answer extender) Signed-off-by: Zhilin Wang <[email protected]> * modify huggingface utils to support HF t5/BART models Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update bleu metric Signed-off-by: Zhilin Wang <[email protected]> * fix bleu metric style Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * update based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 2 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 3 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * integrate sgd generation based on user user utterance and system slot-values to generate system utterance Signed-off-by: Zhilin Wang <[email protected]> * add validation model saving capabilities Signed-off-by: Zhilin Wang <[email protected]> * cleaned up code for SGD Based Answer extender Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue Generation CI Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * fix Jenkins CI issue" Signed-off-by: Zhilin Wang <[email protected]> * add support for design dataset Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary imports Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin 
Wang <[email protected]> * support megatron for dialogue_s2s_generation_model Signed-off-by: Zhilin Wang <[email protected]> * reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update CI Signed-off-by: Zhilin Wang <[email protected]> * update checkpoint and predictions filename to include epoch number Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate HF BART MNLI into zero shot intent model Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Nearest Neighbour Model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * refactor Dialogue SGD Data Processor to make interface for models cleaner Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue S2S Generation model for DialogueSGDDataProcessor interface Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support sgd and drive thru datasets by zero shot model and nearest neighbour model Signed-off-by: Zhilin Wang <[email protected]> * add prediction saving code to nearest neighbour and zero shot intent models Signed-off-by: Zhilin Wang <[email protected]> * fix typo in sgd data processor Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Mellon QA Data Processor Signed-off-by: Zhilin Wang <[email protected]> * update mellon qa Signed-off-by: Zhilin Wang <[email protected]> * update dialogue.py to remove outdated info Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin 
Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * add dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * address review comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix for cfg Signed-off-by: Zhilin Wang <[email protected]> * make dependency on apex optional Signed-off-by: Zhilin Wang <[email protected]> * change NLPDDPluggin calling logic to make it possible to run without apex Signed-off-by: Zhilin Wang <[email protected]> * add first draft of tutorial Signed-off-by: Zhilin Wang <[email protected]> * reduce ms marco size by removing lines without wellFormedAnswers Signed-off-by: Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update colab tutorial link in dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * include unit test and some refactor to facilitate unit test Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * address pr issues Signed-off-by: Zhilin Wang <[email protected]> * remove typos in dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * support larger files for question answering Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary artifacts to reduce memory use Signed-off-by: Zhilin Wang <[email protected]> * put 0 tensor 
to device Signed-off-by: Zhilin Wang <[email protected]> * update link within dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * restore previously delete files Signed-off-by: Zhilin Wang <[email protected]> * update error handling when loss = nan Signed-off-by: Zhilin Wang <[email protected]> * update nan handling Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss func Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss Signed-off-by: Zhilin Wang <[email protected]> * fix type error raised in qa_dataset.py Signed-off-by: Zhilin Wang <[email protected]> * add error checking message Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update exp logging Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * limit number of negative samples Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * remove 
unused methods and style fix Signed-off-by: Zhilin Wang <[email protected]> * add more documentation Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * changes base on PR review Signed-off-by: Zhilin Wang <[email protected]> * set wandb logger falseby default Signed-off-by: Zhilin Wang <[email protected]> * style fix * style fix * correct typo * style fix * style fix Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> * Fix ASR Typos in tutorials (#4384) * Fix typos Signed-off-by: smajumdar <[email protected]> * Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. (#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> (cherry picked from commit 0322b15) Co-authored-by: Travis Bartley <[email protected]> * Add Docs for NeMo Adapters (#4369) Signed-off-by: smajumdar <[email protected]> * Update NeMo docs (#4397) Signed-off-by: smajumdar <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Punctuation and capitalization tests race condition (#4399) * Add draft of race condition fixes Signed-off-by: PeganovAnton <[email protected]> * Minor improvements Signed-off-by: PeganovAnton <[email protected]> * More race condition fixes Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * bias act fusion changes Signed-off-by: MaximumEntropy <[email protected]> * Address comments Signed-off-by: MaximumEntropy <[email protected]> * Fix geglu 
without fusion Signed-off-by: MaximumEntropy <[email protected]> * Reset files to main Signed-off-by: MaximumEntropy <[email protected]> * Remove hidden blocks Signed-off-by: MaximumEntropy <[email protected]> * Fix style Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Abhinav Khattar <[email protected]> Co-authored-by: ericharper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Co-authored-by: PeganovAnton <[email protected]> Signed-off-by: arendu <[email protected]>
* Megatron BART BOS / EOS bug fix (#4495) * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. BART dataset fixes missing <EOS> for deocder output. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Removed extra padding from BARTDataset. Signed-off-by: Micha Livne <[email protected]> * GPT Prompt Learning Improvements (#4496) * Updated pipeline parallel code to speed up training Signed-off-by: Virginia Adams <[email protected]> * Load global batch size not local mini batch size Signed-off-by: Virginia Adams <[email protected]> * Python reformatting Signed-off-by: Virginia Adams <[email protected]> * Megatron perceiver with tensor parallelism only (#4318) * Temp Signed-off-by: MaximumEntropy <[email protected]> * Add megatron dataset Signed-off-by: MaximumEntropy <[email protected]> * Update config and fix global batch fetcher Signed-off-by: MaximumEntropy <[email protected]> * Add dataset class Signed-off-by: MaximumEntropy <[email protected]> * Update comments Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Update yaml Signed-off-by: MaximumEntropy <[email protected]> * Fix duplicate yaml key Signed-off-by: MaximumEntropy <[email protected]> * Translate method and preprocess script for raw text Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Remove pdb Signed-off-by: MaximumEntropy <[email protected]> * Fix arg name Signed-off-by: MaximumEntropy <[email protected]> * Fix other arg Signed-off-by: MaximumEntropy <[email protected]> * Change sampler back Signed-off-by: MaximumEntropy <[email protected]> * Move back to global batch fetcher to use distributed sampler Signed-off-by: MaximumEntropy <[email protected]> * Add text memmap data Signed-off-by: MaximumEntropy <[email protected]> * Update monitor 
Signed-off-by: MaximumEntropy <[email protected]> * Fixes for PP Signed-off-by: MaximumEntropy <[email protected]> * Remove unused import Signed-off-by: MaximumEntropy <[email protected]> * Truncate examples in text memmap Signed-off-by: MaximumEntropy <[email protected]> * NMT training batch interpolation key Signed-off-by: MaximumEntropy <[email protected]> * tarred data fix Signed-off-by: MaximumEntropy <[email protected]> * Change dataset type check Signed-off-by: MaximumEntropy <[email protected]> * Fix sampler Signed-off-by: MaximumEntropy <[email protected]> * Pass dataset cfg to determine type Signed-off-by: MaximumEntropy <[email protected]> * Log global step on validation step as well Signed-off-by: MaximumEntropy <[email protected]> * Fix NMT model saving with artifacts Signed-off-by: MaximumEntropy <[email protected]> * Initialize DDP in decode if not initialized. Needed for inference only mode Signed-off-by: MaximumEntropy <[email protected]> * Megatron NMT inference script Signed-off-by: MaximumEntropy <[email protected]> * Inference config file Signed-off-by: MaximumEntropy <[email protected]> * hardcode max delta temporarily Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * detokenizer if processor is not none Signed-off-by: MaximumEntropy <[email protected]> * Sampler config Signed-off-by: MaximumEntropy <[email protected]> * Compat with configs without sampler arg Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Comment for validation dataset type Signed-off-by: MaximumEntropy <[email protected]> * Fix tokenizer building Signed-off-by: MaximumEntropy <[email protected]> * CI test for megatron nmt Signed-off-by: MaximumEntropy <[email protected]> * Fix tokenizer in restore Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * O2 restore from fix Signed-off-by: MaximumEntropy 
<[email protected]> * Remove print Signed-off-by: MaximumEntropy <[email protected]> * Change tokenizer model name in config Signed-off-by: MaximumEntropy <[email protected]> * Logging Signed-off-by: MaximumEntropy <[email protected]> * Set seed for distributed sampler Signed-off-by: MaximumEntropy <[email protected]> * Cluster debugging messages Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix max generation delta Signed-off-by: MaximumEntropy <[email protected]> * No LM Init Signed-off-by: MaximumEntropy <[email protected]> * Use nlp save restore connector Signed-off-by: MaximumEntropy <[email protected]> * Remove useless infer args Signed-off-by: MaximumEntropy <[email protected]> * Typo Signed-off-by: MaximumEntropy <[email protected]> * UTF8 safe print of translation result Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Add save restore connector back with comment Signed-off-by: MaximumEntropy <[email protected]> * Refactor Signed-off-by: MaximumEntropy <[email protected]> * Fix CI test Signed-off-by: MaximumEntropy <[email protected]> * Add missing args Signed-off-by: MaximumEntropy <[email protected]> * Address comments Signed-off-by: MaximumEntropy <[email protected]> * Empty to restart * Fix CI test Signed-off-by: MaximumEntropy <[email protected]> * Check for test ds Signed-off-by: MaximumEntropy <[email protected]> * set fusion to false Signed-off-by: MaximumEntropy <[email protected]> * Initial perceiver encoder Signed-off-by: MaximumEntropy <[email protected]> * Perceiver with PP=1 Signed-off-by: MaximumEntropy <[email protected]> * Remove init cross attn Signed-off-by: MaximumEntropy <[email protected]> * CI test and remove init cross attn arg Signed-off-by: MaximumEntropy <[email protected]> * Remove init cross attn layers from file Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy 
<[email protected]> * Clean up Signed-off-by: MaximumEntropy <[email protected]> * update branch Signed-off-by: ericharper <[email protected]> * Set headscale false (#4364) Signed-off-by: MaximumEntropy <[email protected]> * Add wandb as dependency (#4365) Signed-off-by: smajumdar <[email protected]> * Raise trainer error (#4356) Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Set headscale false (#4364) (#4366) Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: smajumdar <[email protected]> * Finetuning changes for BART (#4003) * Temp Signed-off-by: MaximumEntropy <[email protected]> * Checkpoint converter to nemo for bart Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Make position embedding expansion specific to a batch to avoid checkpoint size mismatches (#4357) * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix logging warning Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Refactor bias act fusion Signed-off-by: MaximumEntropy <[email protected]> * Update NMT config Signed-off-by: MaximumEntropy <[email protected]> * Fix electronic bug, new time ITN rule (#4355) * fix electronic bug Signed-off-by: ekmb <[email protected]> * add new itn time rule Signed-off-by: ekmb <[email protected]> * revert domain changes Signed-off-by: ekmb <[email protected]> * remove repetition Signed-off-by: ekmb <[email protected]> * Update ci tests Signed-off-by: MaximumEntropy <[email protected]> * Correct support for dataclasses in default module dim (#4372) * Correct support for dataclasses in default module dim Signed-off-by: smajumdar <[email protected]> * Fix path for save of results Signed-off-by: smajumdar <[email protected]> * fix pad id bug (#4377) Signed-off-by: Yi Dong <[email protected]> * Question answering bug fix (#4381) * 
refactor dialogue state tracking for modelling/dataset interoperability Signed-off-by: Zhilin Wang <[email protected]> * fix style changes Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * fix style raised by lgtm Signed-off-by: Zhilin Wang <[email protected]> * fix style formatting Signed-off-by: Zhilin Wang <[email protected]> * update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]> * add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * remove dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]> * update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]> * rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * fix style Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add docstrings for assistant data processsor Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins for SGDGEN local checkpoint Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: 
Zhilin Wang <[email protected]> * use local vocab file for Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * patch for Jenkins CI using local file Signed-off-by: Zhilin Wang <[email protected]> * add slot filling prediction and metrics Signed-off-by: Zhilin Wang <[email protected]> * remove unused code Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * refactor metrics code out of Dialogue GPT Model Signed-off-by: Zhilin Wang <[email protected]> * integrate backward compatible support for IntentSlotClassificationModel (bert model) Signed-off-by: Zhilin Wang <[email protected]> * save prediction file for IntentSlotClassification Signed-off-by: Zhilin Wang <[email protected]> * update dialogue gpt model training for megatron gpt Signed-off-by: Zhilin Wang <[email protected]> * remove batch generate for HF GPT2, which causes lower performance Signed-off-by: Zhilin Wang <[email protected]> * add few shot capability to dialogue gpt model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile and remove unused import Signed-off-by: Zhilin Wang <[email protected]> * update code description and clarity Signed-off-by: Zhilin Wang <[email protected]> * address PR comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate compatibility with ZeroShotIntentModel Signed-off-by: Zhilin Wang <[email protected]> * rename folder to dialogue due to increased scope and further refactor for clarity Signed-off-by: Zhilin Wang <[email protected]> * added dialogue GPT for sequence generation task (e.g. answer extender) Signed-off-by: Zhilin Wang <[email protected]> * add CI test for DialogueGPTGenerationModel Signed-off-by: Zhilin Wang <[email protected]> * integrate DialogueS2SGenerationModel for generation task (e.g. 
answer extender) Signed-off-by: Zhilin Wang <[email protected]> * modify huggingface utils to support HF t5/BART models Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update bleu metric Signed-off-by: Zhilin Wang <[email protected]> * fix bleu metric style Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * update based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 2 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 3 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * integrate sgd generation based on user user utterance and system slot-values to generate system utterance Signed-off-by: Zhilin Wang <[email protected]> * add validation model saving capabilities Signed-off-by: Zhilin Wang <[email protected]> * cleaned up code for SGD Based Answer extender Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue Generation CI Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * fix Jenkins CI issue" Signed-off-by: Zhilin Wang <[email protected]> * add support for design dataset Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary imports Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin 
Wang <[email protected]> * support megatron for dialogue_s2s_generation_model Signed-off-by: Zhilin Wang <[email protected]> * reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update CI Signed-off-by: Zhilin Wang <[email protected]> * update checkpoint and predictions filename to include epoch number Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate HF BART MNLI into zero shot intent model Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Nearest Neighbour Model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * refactor Dialogue SGD Data Processor to make interface for models cleaner Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue S2S Generation model for DialogueSGDDataProcessor interface Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support sgd and drive thru datasets by zero shot model and nearest neighbour model Signed-off-by: Zhilin Wang <[email protected]> * add prediction saving code to nearest neighbour and zero shot intent models Signed-off-by: Zhilin Wang <[email protected]> * fix typo in sgd data processor Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Mellon QA Data Processor Signed-off-by: Zhilin Wang <[email protected]> * update mellon qa Signed-off-by: Zhilin Wang <[email protected]> * update dialogue.py to remove outdated info Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin 
Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * add dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * address review comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix for cfg Signed-off-by: Zhilin Wang <[email protected]> * make dependency on apex optional Signed-off-by: Zhilin Wang <[email protected]> * change NLPDDPluggin calling logic to make it possible to run without apex Signed-off-by: Zhilin Wang <[email protected]> * add first draft of tutorial Signed-off-by: Zhilin Wang <[email protected]> * reduce ms marco size by removing lines without wellFormedAnswers Signed-off-by: Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update colab tutorial link in dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * include unit test and some refactor to facilitate unit test Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * address pr issues Signed-off-by: Zhilin Wang <[email protected]> * remove typos in dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * support larger files for question answering Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary artifacts to reduce memory use Signed-off-by: Zhilin Wang <[email protected]> * put 0 tensor 
to device Signed-off-by: Zhilin Wang <[email protected]> * update link within dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * restore previously delete files Signed-off-by: Zhilin Wang <[email protected]> * update error handling when loss = nan Signed-off-by: Zhilin Wang <[email protected]> * update nan handling Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss func Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss Signed-off-by: Zhilin Wang <[email protected]> * fix type error raised in qa_dataset.py Signed-off-by: Zhilin Wang <[email protected]> * add error checking message Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update exp logging Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * limit number of negative samples Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * remove 
unused methods and style fix Signed-off-by: Zhilin Wang <[email protected]> * add more documentation Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * changes base on PR review Signed-off-by: Zhilin Wang <[email protected]> * set wandb logger falseby default Signed-off-by: Zhilin Wang <[email protected]> * style fix * style fix * correct typo * style fix * style fix Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> * Fix ASR Typos in tutorials (#4384) * Fix typos Signed-off-by: smajumdar <[email protected]> * Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. (#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> (cherry picked from commit 0322b158f26a0b690edca7a84714e33752283923) Co-authored-by: Travis Bartley <[email protected]> * Add Docs for NeMo Adapters (#4369) Signed-off-by: smajumdar <[email protected]> * Update NeMo docs (#4397) Signed-off-by: smajumdar <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Punctuation and capitalization tests race condition (#4399) * Add draft of race condition fixes Signed-off-by: PeganovAnton <[email protected]> * Minor improvements Signed-off-by: PeganovAnton <[email protected]> * More race condition fixes Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * bias act fusion changes Signed-off-by: MaximumEntropy <[email protected]> * Address comments Signed-off-by: MaximumEntropy 
<[email protected]> * Fix geglu without fusion Signed-off-by: MaximumEntropy <[email protected]> * Reset files to main Signed-off-by: MaximumEntropy <[email protected]> * Remove hidden blocks Signed-off-by: MaximumEntropy <[email protected]> * Fix style Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Abhinav Khattar <[email protected]> Co-authored-by: ericharper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Co-authored-by: PeganovAnton <[email protected]> * NMESC speaker counting algorithm update (#4500) * initial commit Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * Default maj_vote = False, max_rp=0.25 Signed-off-by: Taejin Park <[email protected]> * doc strings and style fix Signed-off-by: Taejin Park <[email protected]> * Docstring minor edit Signed-off-by: Taejin Park <[email protected]> * Default False in the functions Signed-off-by: Taejin Park <[email protected]> * fixed repeated variable Signed-off-by: Taejin Park <[email protected]> * Default as maj_vote=False Signed-off-by: Taejin Park <[email protected]> * removed redundant part in wrtie_rttm func Signed-off-by: Taejin Park <[email protected]> * Removed unused function Signed-off-by: Taejin Park <[email protected]> * Updated and tested silence and very short samples Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * Style fix and removing unnecessary parts Signed-off-by: Taejin Park <[email protected]> * unused variables are removed Signed-off-by: Taejin Park 
<[email protected]> * Fixed commented torch.jit.script Signed-off-by: Taejin Park <[email protected]> * majority voting update Signed-off-by: Taejin Park <[email protected]> * cancelling the update on speaker_utils and clus_diarizer Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * bug fix Signed-off-by: Taejin Park <[email protected]> * Added fp32 converting for torch.mm Signed-off-by: Taejin Park <[email protected]> Co-authored-by: Nithin Rao <[email protected]> * Fix dataset parameter typo on tacotron2 example yaml (#4471) Signed-off-by: saarus72 <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> * Noam lr sched: do not force min_lr after max_steps (#4472) Signed-off-by: Adrian Lancucki <[email protected]> Co-authored-by: Adrian Lancucki <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> * Refactor for punctuation model (#4367) * Dataloader, collector, loss and metric for multiscale diarization decoder (#4187) * First commit Signed-off-by: Taejin Park <[email protected]> * Checked funtionality and imports Signed-off-by: Taejin Park <[email protected]> * fixed import issues Signed-off-by: Taejin Park <[email protected]> * Removed the changed made by mistake Signed-off-by: Taejin Park <[email protected]> * Style fix Signed-off-by: Taejin Park <[email protected]> * Fixed LGTM errors 001 Signed-off-by: Taejin Park <[email protected]> * Fixed LGTM and style fix Signed-off-by: Taejin Park <[email protected]> * Changed docstrings Signed-off-by: Taejin Park <[email protected]> * LGTM again Signed-off-by: Taejin Park <[email protected]> * Removed unnecessary torch setting lines Signed-off-by: Taejin Park <[email protected]> * Style fix and isort Signed-off-by: Taejin Park <[email protected]> * jbalam-nv comments reflected Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * Reflected comments and created _diar_label.py 
Signed-off-by: Taejin Park <[email protected]> * Typo fix and style fix Signed-off-by: Taejin Park <[email protected]> * Fixed target_spks[0] index error Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * LGTM unused import IterDataset Signed-off-by: Taejin Park <[email protected]> * revert collection doc year Signed-off-by: Taejin Park <[email protected]> * Code format error in collections.py Signed-off-by: Taejin Park <[email protected]> * fix collections space format error Signed-off-by: Taejin Park <[email protected]> * merged main correctly Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * Reflected all comments and tested Signed-off-by: Taejin Park <[email protected]> * style fix and LGTM Signed-off-by: Taejin Park <[email protected]> * rttm_filepath to rttm_file and removed self included funcs, tested Signed-off-by: Taejin Park <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * removed references to data_dir Signed-off-by: Matvei Novikov <[email protected]> * added missing parameters to data preparation script Signed-off-by: Matvei Novikov <[email protected]> * removed unnecessary file extension check Signed-off-by: Matvei Novikov <[email protected]> * Add ASR CTC Decoding module (#4342) * Initial commit Signed-off-by: smajumdar <[email protected]> * Full support for decoding strategy Signed-off-by: smajumdar <[email protected]> * Temp Signed-off-by: smajumdar <[email protected]> * Fix labels of y_sequence Signed-off-by: smajumdar <[email protected]> * Set support for sentencepiece subword merging Signed-off-by: smajumdar <[email protected]> * Fix char and word based token merge alignment Signed-off-by: smajumdar <[email protected]> * Revert incorrect change Signed-off-by: smajumdar <[email protected]> * Update docstring Signed-off-by: smajumdar <[email protected]> * Improve 
compatibility with greedy tokens and log probs Signed-off-by: smajumdar <[email protected]> * Update scripts to use decoding strategy Signed-off-by: smajumdar <[email protected]> * Add tests and docs Signed-off-by: smajumdar <[email protected]> * Add tests and docs Signed-off-by: smajumdar <[email protected]> * Fix speaker decoder timestamps Signed-off-by: smajumdar <[email protected]> * Fix speaker decoder timestamps Signed-off-by: smajumdar <[email protected]> * Fix decoding of ctc models Signed-off-by: smajumdar <[email protected]> * Address reviewer comments Signed-off-by: smajumdar <[email protected]> * Address reviewer comments Signed-off-by: smajumdar <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Option to disable mp in VAD via num_workers=1 (#4317) * Option to disable mp in VAD via num_workers=1 In certain environments python multiprocessing can deadlock. This adds a convenient version to disable by setting num_workers to 1. Signed-off-by: Georg Kucsko <[email protected]> * add none handling Signed-off-by: Georg Kucsko <[email protected]> * additional none handling Signed-off-by: Georg Kucsko <[email protected]> Co-authored-by: fayejf <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * remove redundant bias expand (#4382) * remove redundant bias expand Signed-off-by: Xiaowei Ren <[email protected]> * delete redundant code Signed-off-by: Xiaowei Ren <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * fixed style Signed-off-by: Matvei Novikov <[email protected]> * Add option for specifying wandb save_dir from config (#4379) * give option to user to specify wandb save dir via config Signed-off-by: Shantanu Acharya <[email protected]> * create save_dir directory for wandb logger if not exists Signed-off-by: Shantanu Acharya <[email protected]> * update save_dir get method with a default value Signed-off-by: Shantanu Acharya <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * 
Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. (#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * [Bugfix][TTS] wrong order of returned tuple for general_collate_fn. (#4388) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Merge r1.10.0 main (#4398) * update branch Signed-off-by: ericharper <[email protected]> * Set headscale false (#4364) Signed-off-by: MaximumEntropy <[email protected]> * Add wandb as dependency (#4365) Signed-off-by: smajumdar <[email protected]> * Raise trainer error (#4356) Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Set headscale false (#4364) (#4366) Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: smajumdar <[email protected]> * Finetuning changes for BART (#4003) * Temp Signed-off-by: MaximumEntropy <[email protected]> * Checkpoint converter to nemo for bart Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Make position embedding expansion specific to a batch to avoid checkpoint size mismatches (#4357) * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix logging warning Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Fix electronic bug, new time ITN rule (#4355) * fix electronic bug Signed-off-by: ekmb <[email protected]> * add new itn time rule Signed-off-by: ekmb <[email protected]> * revert domain changes Signed-off-by: ekmb <[email protected]> * remove repetition Signed-off-by: ekmb <[email protected]> * Correct support for dataclasses in default module dim (#4372) * Correct support for dataclasses in default module dim Signed-off-by: 
smajumdar <[email protected]> * Fix path for save of results Signed-off-by: smajumdar <[email protected]> * fix pad id bug (#4377) Signed-off-by: Yi Dong <[email protected]> * Question answering bug fix (#4381) * refactor dialogue state tracking for modelling/dataset interoperability Signed-off-by: Zhilin Wang <[email protected]> * fix style changes Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * fix style raised by lgtm Signed-off-by: Zhilin Wang <[email protected]> * fix style formatting Signed-off-by: Zhilin Wang <[email protected]> * update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]> * add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * remove dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]> * update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]> * rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * fix style Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add 
docstrings for assistant data processsor Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins for SGDGEN local checkpoint Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: Zhilin Wang <[email protected]> * use local vocab file for Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * patch for Jenkins CI using local file Signed-off-by: Zhilin Wang <[email protected]> * add slot filling prediction and metrics Signed-off-by: Zhilin Wang <[email protected]> * remove unused code Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * refactor metrics code out of Dialogue GPT Model Signed-off-by: Zhilin Wang <[email protected]> * integrate backward compatible support for IntentSlotClassificationModel (bert model) Signed-off-by: Zhilin Wang <[email protected]> * save prediction file for IntentSlotClassification Signed-off-by: Zhilin Wang <[email protected]> * update dialogue gpt model training for megatron gpt Signed-off-by: Zhilin Wang <[email protected]> * remove batch generate for HF GPT2, which causes lower performance Signed-off-by: Zhilin Wang <[email protected]> * add few shot capability to dialogue gpt model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile and remove unused import Signed-off-by: Zhilin Wang <[email protected]> * update code description and clarity Signed-off-by: Zhilin Wang <[email protected]> * address PR comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate compatibility with ZeroShotIntentModel Signed-off-by: Zhilin Wang <[email protected]> * rename folder to dialogue due to increased scope and further refactor for clarity Signed-off-by: Zhilin Wang <[email protected]> * added dialogue GPT for sequence generation task (e.g. 
answer extender) Signed-off-by: Zhilin Wang <[email protected]> * add CI test for DialogueGPTGenerationModel Signed-off-by: Zhilin Wang <[email protected]> * integrate DialogueS2SGenerationModel for generation task (e.g. answer extender) Signed-off-by: Zhilin Wang <[email protected]> * modify huggingface utils to support HF t5/BART models Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update bleu metric Signed-off-by: Zhilin Wang <[email protected]> * fix bleu metric style Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * update based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 2 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 3 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * integrate sgd generation based on user user utterance and system slot-values to generate system utterance Signed-off-by: Zhilin Wang <[email protected]> * add validation model saving capabilities Signed-off-by: Zhilin Wang <[email protected]> * cleaned up code for SGD Based Answer extender Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue Generation CI Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * fix Jenkins CI issue" Signed-off-by: Zhilin Wang <[email protected]> * add support for design dataset Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary 
imports Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support megatron for dialogue_s2s_generation_model Signed-off-by: Zhilin Wang <[email protected]> * reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update CI Signed-off-by: Zhilin Wang <[email protected]> * update checkpoint and predictions filename to include epoch number Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate HF BART MNLI into zero shot intent model Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Nearest Neighbour Model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * refactor Dialogue SGD Data Processor to make interface for models cleaner Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue S2S Generation model for DialogueSGDDataProcessor interface Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support sgd and drive thru datasets by zero shot model and nearest neighbour model Signed-off-by: Zhilin Wang <[email protected]> * add prediction saving code to nearest neighbour and zero shot intent models Signed-off-by: Zhilin Wang <[email protected]> * fix typo in sgd data processor Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Mellon QA Data Processor Signed-off-by: 
Zhilin Wang <[email protected]> * update mellon qa Signed-off-by: Zhilin Wang <[email protected]> * update dialogue.py to remove outdated info Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * add dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * address review comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix for cfg Signed-off-by: Zhilin Wang <[email protected]> * make dependency on apex optional Signed-off-by: Zhilin Wang <[email protected]> * change NLPDDPluggin calling logic to make it possible to run without apex Signed-off-by: Zhilin Wang <[email protected]> * add first draft of tutorial Signed-off-by: Zhilin Wang <[email protected]> * reduce ms marco size by removing lines without wellFormedAnswers Signed-off-by: Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update colab tutorial link in dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * include unit test and some refactor to facilitate unit test Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * address pr issues Signed-off-by: Zhilin Wang <[email protected]> * remove typos in dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * support larger files for question answering Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style 
fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary artifacts to reduce memory use Signed-off-by: Zhilin Wang <[email protected]> * put 0 tensor to device Signed-off-by: Zhilin Wang <[email protected]> * update link within dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * restore previously deleted files Signed-off-by: Zhilin Wang <[email protected]> * update error handling when loss = nan Signed-off-by: Zhilin Wang <[email protected]> * update nan handling Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss func Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss Signed-off-by: Zhilin Wang <[email protected]> * fix type error raised in qa_dataset.py Signed-off-by: Zhilin Wang <[email protected]> * add error checking message Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update exp logging Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * limit number 
of negative samples Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * remove unused methods and style fix Signed-off-by: Zhilin Wang <[email protected]> * add more documentation Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * changes based on PR review Signed-off-by: Zhilin Wang <[email protected]> * set wandb logger false by default Signed-off-by: Zhilin Wang <[email protected]> * style fix * style fix * correct typo * style fix * style fix Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> * Fix ASR Typos in tutorials (#4384) * Fix typos Signed-off-by: smajumdar <[email protected]> * Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. 
(#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> (cherry picked from commit 0322b158f26a0b690edca7a84714e33752283923) Co-authored-by: Travis Bartley <[email protected]> * Add Docs for NeMo Adapters (#4369) Signed-off-by: smajumdar <[email protected]> * Update NeMo docs (#4397) Signed-off-by: smajumdar <[email protected]> Co-authored-by: Eric Harper <[email protected]> * update branch Signed-off-by: ericharper <[email protected]> * remove Copy of Signed-off-by: ericharper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * [bugfix][TTS] pitch, voiced_mask, prob_voiced have the same values. (#4392) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Fixing import error in some cases (#4401) Signed-off-by: Boris Fomitchev <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Fixing bugs in calling method ctc_decoder_predictions_tensor. (#4414) * updated ctc decoding calls. Signed-off-by: Vahid <[email protected]> * fixed the ones for timestamp_utils.py Signed-off-by: Vahid <[email protected]> * fixed the ones for timestamp_utils.py Signed-off-by: Vahid <[email protected]> * fixed the ones for timestamp_utils.py Signed-off-by: Vahid <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Update with new conformer checkpoints. (#4417) Signed-off-by: Matvei Novikov <[email protected]> * [TTS] add static method decorator. 
(#4443) * [TTS] add static method decorator. Signed-off-by: Xuesong Yang <[email protected]> * remove protect prefix Signed-off-by: Xuesong Yang <[email protected]> * fixed style error Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Georg Kucsko <[email protected]> Co-authored-by: fayejf <[email protected]> Co-authored-by: Xiaowei Ren <[email protected]> Co-authored-by: Shantanu Acharya <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Vahid Noroozi <[email protected]> * bug fix - sample rate was being ignored in vocoder dataset when not loading mel Signed-off-by: Paarth Neekhara <[email protected]> * Add ITN pt (#4516) * Add ITN pt Signed-off-by: Guilherme Steinmann <[email protected]> * Fix style Signed-off-by: Guilherme Steinmann <[email protected]> * Fix style Signed-off-by: Guilherme Steinmann <[email protected]> * Update copyright year to 2022 on ITN pt rules and tests Signed-off-by: Guilherme Steinmann <[email protected]> * Fixed WER initialization in ASR_with_Nemo notebook (#4523) Signed-off-by: Ante Jukić <[email protected]> Co-authored-by: Ante Jukić <[email protected]> * Update cmudict (#4510) phoneme IY1 -> IH1 in NVIDIA Added 
phonemes for CUSTOMIZABLE Update cmudict file revision and its reference. Signed-off-by: Jason Roche <[email protected]> Co-authored-by: Jason Roche <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> * [Add] Support for Different LRs with Param Groups (#4508) * add support for param groups Signed-off-by: stevehuang52 <[email protected]> * make config more general Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Weighted bucketing (#4474) * Add silence handling for speaker diarization pipeline (#4512) * initial commit Signed-off-by: nithinraok <[email protected]> * fixed silence wav file issue causing clustering to evaluate on null embeddings Signed-off-by: nithinraok <[email protected]> * fixed zero duration issue Signed-off-by: nithinraok <[email protected]> * updated with comments Signed-off-by: nithinraok <[email protected]> * minor doc change Signed-off-by: nithinraok <[email protected]> * update log Signed-off-by: nithinraok <[email protected]> * Fix runtime check (#4501) * Runtime check refinements Signed-off-by: Boris Fomitchev <[email protected]> * Added fp32 casting for ASR nets export Signed-off-by: Boris Fomitchev <[email protected]> * style Signed-off-by: Boris Fomitchev <[email protected]> * Used torch.float32 for clarity Signed-off-by: Boris Fomitchev <[email protected]> * Fixing parameters passing Signed-off-by: Boris Fomitchev <[email protected]> * Update finetune label models (#4504) * initial_script Signed-off-by: nithinraok <[email protected]> * move old script Signed-off-by: nithinraok <[email protected]> * remove finetune func from label models Signed-off-by: nithinraok <[email protected]> * style clean Signed-off-by: nithinraok <[email protected]> * updated config Signed-off-by: nithinraok <[email protected]> * update tutorial Signed-off-by: nithinraok <[email protected]> * lgtm fixes Signed-off-by: nithinraok <[email protected]> * updated based on comments Signed-off-by: 
nithinraok <[email protected]> * update doc Signed-off-by: nithinraok <[email protected]> * [ASR][Breaking Change] Update signature of Hypothesis alignments (#4511) * Preserve logprobs when preserving alignments Signed-off-by: smajumdar <[email protected]> * Update tests for rnnt greedy and beam search Signed-off-by: smajumdar <[email protected]> * Update all dependents of alignments Signed-off-by: smajumdar <[email protected]> * Update docs Signed-off-by: smajumdar <[email protected]> * Weighted bucketing (#4530) * Additional sentencepiece args - Byte fallback, split digits, split_on_whitespace (#4525) * Fix geglu without fusion Signed-off-by: MaximumEntropy <[email protected]> * Add extra args Signed-off-by: MaximumEntropy <[email protected]> * Reset transformer Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix spm arg Signed-off-by: MaximumEntropy <[email protected]> * Fix help string Signed-off-by: MaximumEntropy <[email protected]> * Add support for ASR Adapter Auxiliary Losses (#4480) * Add support for access mixin registry of custom losses Signed-off-by: smajumdar <[email protected]> * add support for asr custom losses Signed-off-by: smajumdar <[email protected]> * Update for l2 loss Signed-off-by: smajumdar <[email protected]> * Add unittests Signed-off-by: smajumdar <[email protected]> * Add unittests Signed-off-by: smajumdar <[email protected]> * Add unittests Signed-off-by: smajumdar <[email protected]> * Update registration of tensors to reset after finishing step Signed-off-by: smajumdar <[email protected]> * Remove comment Signed-off-by: smajumdar <[email protected]> * Remove comment Signed-off-by: smajumdar <[email protected]> * Update SSL models Signed-off-by: smajumdar <[email protected]> * Add support for validation step properly registering tensors Signed-off-by: smajumdar <[email protected]> * Move reset of registry outside Signed-off-by: smajumdar <[email protected]> * update 
(#4520) Signed-off-by: stevehuang52 <[email protected]> * fix duplex inference with grammars (#4517) * fix duplex inference with grammars Signed-off-by: ekmb <[email protected]> * add ci test for duplex, fix electronic last sym bug Signed-off-by: ekmb <[email protected]> * test fix Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * update jenkins grammars Signed-off-by: ekmb <[email protected]> * add pt to the docs Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * disable test Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * jenkins refactor Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * test Signed-off-by: ekmb <[email protected]> * test Signed-off-by: ekmb <[email protected]> * test Signed-off-by: ekmb <[email protected]> * test Signed-off-by: ekmb <[email protected]> Co-authored-by: Yang Zhang <[email protected]> * Add Bucketing support to TarredAudioToClassificationLabelDataset (#4465) * Add Bucketing support to TarredAudioToClassificationLabelDataset Signed-off-by: Ewald Enzinger <[email protected]> * Add MTEncDec Finetune support (#4540) * add FT support Signed-off-by: Abhinav Khattar <[email protected]> * rm preproc Signed-off-by: Abhinav Khattar <[email protected]> * review changes Signed-off-by: Abhinav Khattar <[email protected]> * add CI Signed-off-by: Abhinav Khattar <[email protected]> * newline fix Signed-off-by: Abhinav Khattar <[email protected]> * CI fix 
Signed-off-by: Abhinav Khattar <[email protected]> * clean up Signed-off-by: Abhinav Khattar <[email protected]> * post training cleanup Signed-off-by: Abhinav Khattar <[email protected]> * test Signed-off-by: Abhinav Khattar <[email protected]> * revert Signed-off-by: Abhinav Khattar <[email protected]> * CI test Signed-off-by: Abhinav Khattar <[email protected]> * revert CI changes Signed-off-by: Abhinav Khattar <[email protected]> * original CI Signed-off-by: Abhinav Khattar <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> * Add nsys profiling (#4539) * add nsys profiling Signed-off-by: ericharper <[email protected]> * only access omegaconf in setup Signed-off-by: ericharper <[email protected]> * use robust get_rank function Signed-off-by: ericharper <[email protected]> * simplify Signed-off-by: ericharper <[email protected]> * Update megatron prompt learning interface to dialogue (#4545) * refactor dialogue state tracking for modelling/dataset interoperability Signed-off-by: Zhilin Wang <[email protected]> * fix style changes Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * fix style raised by lgtm Signed-off-by: Zhilin Wang <[email protected]> * fix style formatting Signed-off-by: Zhilin Wang <[email protected]> * update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]> * add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * remove dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]> * update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]> * rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang 
<[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * fix style Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add docstrings for assistant data …
* Megatron BART BOS / EOS bug fix (#4495) * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. BART dataset fixes missing <EOS> for decoder output. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Removed extra padding from BARTDataset. Signed-off-by: Micha Livne <[email protected]> * GPT Prompt Learning Improvements (#4496) * Updated pipeline parallel code to speed up training Signed-off-by: Virginia Adams <[email protected]> * Load global batch size not local mini batch size Signed-off-by: Virginia Adams <[email protected]> * Python reformatting Signed-off-by: Virginia Adams <[email protected]> * Megatron perceiver with tensor parallelism only (#4318) * Temp Signed-off-by: MaximumEntropy <[email protected]> * Add megatron dataset Signed-off-by: MaximumEntropy <[email protected]> * Update config and fix global batch fetcher Signed-off-by: MaximumEntropy <[email protected]> * Add dataset class Signed-off-by: MaximumEntropy <[email protected]> * Update comments Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Update yaml Signed-off-by: MaximumEntropy <[email protected]> * Fix duplicate yaml key Signed-off-by: MaximumEntropy <[email protected]> * Translate method and preprocess script for raw text Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Remove pdb Signed-off-by: MaximumEntropy <[email protected]> * Fix arg name Signed-off-by: MaximumEntropy <[email protected]> * Fix other arg Signed-off-by: MaximumEntropy <[email protected]> * Change sampler back Signed-off-by: MaximumEntropy <[email protected]> * Move back to global batch fetcher to use distributed sampler Signed-off-by: MaximumEntropy <[email protected]> * Add text memmap data Signed-off-by: MaximumEntropy <[email protected]> * Update monitor 
Signed-off-by: MaximumEntropy <[email protected]> * Fixes for PP Signed-off-by: MaximumEntropy <[email protected]> * Remove unused import Signed-off-by: MaximumEntropy <[email protected]> * Truncate examples in text memmap Signed-off-by: MaximumEntropy <[email protected]> * NMT training batch interpolation key Signed-off-by: MaximumEntropy <[email protected]> * tarred data fix Signed-off-by: MaximumEntropy <[email protected]> * Change dataset type check Signed-off-by: MaximumEntropy <[email protected]> * Fix sampler Signed-off-by: MaximumEntropy <[email protected]> * Pass dataset cfg to determine type Signed-off-by: MaximumEntropy <[email protected]> * Log global step on validation step as well Signed-off-by: MaximumEntropy <[email protected]> * Fix NMT model saving with artifacts Signed-off-by: MaximumEntropy <[email protected]> * Initialize DDP in decode if not initialized. Needed for inference only mode Signed-off-by: MaximumEntropy <[email protected]> * Megatron NMT inference script Signed-off-by: MaximumEntropy <[email protected]> * Inference config file Signed-off-by: MaximumEntropy <[email protected]> * hardcode max delta temporarily Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * detokenizer if processor is not none Signed-off-by: MaximumEntropy <[email protected]> * Sampler config Signed-off-by: MaximumEntropy <[email protected]> * Compat with configs without sampler arg Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Comment for validation dataset type Signed-off-by: MaximumEntropy <[email protected]> * Fix tokenizer building Signed-off-by: MaximumEntropy <[email protected]> * CI test for megatron nmt Signed-off-by: MaximumEntropy <[email protected]> * Fix tokenizer in restore Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * O2 restore from fix Signed-off-by: MaximumEntropy 
<[email protected]> * Remove print Signed-off-by: MaximumEntropy <[email protected]> * Change tokenizer model name in config Signed-off-by: MaximumEntropy <[email protected]> * Logging Signed-off-by: MaximumEntropy <[email protected]> * Set seed for distributed sampler Signed-off-by: MaximumEntropy <[email protected]> * Cluster debugging messages Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix max generation delta Signed-off-by: MaximumEntropy <[email protected]> * No LM Init Signed-off-by: MaximumEntropy <[email protected]> * Use nlp save restore connector Signed-off-by: MaximumEntropy <[email protected]> * Remove useless infer args Signed-off-by: MaximumEntropy <[email protected]> * Typo Signed-off-by: MaximumEntropy <[email protected]> * UTF8 safe print of translation result Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Add save restore connector back with comment Signed-off-by: MaximumEntropy <[email protected]> * Refactor Signed-off-by: MaximumEntropy <[email protected]> * Fix CI test Signed-off-by: MaximumEntropy <[email protected]> * Add missing args Signed-off-by: MaximumEntropy <[email protected]> * Address comments Signed-off-by: MaximumEntropy <[email protected]> * Empty to restart * Fix CI test Signed-off-by: MaximumEntropy <[email protected]> * Check for test ds Signed-off-by: MaximumEntropy <[email protected]> * set fusion to false Signed-off-by: MaximumEntropy <[email protected]> * Initial perceiver encoder Signed-off-by: MaximumEntropy <[email protected]> * Perceiver with PP=1 Signed-off-by: MaximumEntropy <[email protected]> * Remove init cross attn Signed-off-by: MaximumEntropy <[email protected]> * CI test and remove init cross attn arg Signed-off-by: MaximumEntropy <[email protected]> * Remove init cross attn layers from file Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy 
<[email protected]> * Clean up Signed-off-by: MaximumEntropy <[email protected]> * update branch Signed-off-by: ericharper <[email protected]> * Set headscale false (#4364) Signed-off-by: MaximumEntropy <[email protected]> * Add wandb as dependency (#4365) Signed-off-by: smajumdar <[email protected]> * Raise trainer error (#4356) Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Set headscale false (#4364) (#4366) Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: smajumdar <[email protected]> * Finetuning changes for BART (#4003) * Temp Signed-off-by: MaximumEntropy <[email protected]> * Checkpoint converter to nemo for bart Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Make position embedding expansion specific to a batch to avoid checkpoint size mismatches (#4357) * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix logging warning Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Refactor bias act fusion Signed-off-by: MaximumEntropy <[email protected]> * Update NMT config Signed-off-by: MaximumEntropy <[email protected]> * Fix electronic bug, new time ITN rule (#4355) * fix electronic bug Signed-off-by: ekmb <[email protected]> * add new itn time rule Signed-off-by: ekmb <[email protected]> * revert domain changes Signed-off-by: ekmb <[email protected]> * remove repetition Signed-off-by: ekmb <[email protected]> * Update ci tests Signed-off-by: MaximumEntropy <[email protected]> * Correct support for dataclasses in default module dim (#4372) * Correct support for dataclasses in default module dim Signed-off-by: smajumdar <[email protected]> * Fix path for save of results Signed-off-by: smajumdar <[email protected]> * fix pad id bug (#4377) Signed-off-by: Yi Dong <[email protected]> * Question answering bug fix (#4381) * 
refactor dialogue state tracking for modelling/dataset interoperability Signed-off-by: Zhilin Wang <[email protected]> * fix style changes Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * fix style raised by lgtm Signed-off-by: Zhilin Wang <[email protected]> * fix style formatting Signed-off-by: Zhilin Wang <[email protected]> * update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]> * add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * remove dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]> * update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]> * rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * fix style Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add docstrings for assistant data processor Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins for SGDGEN local checkpoint Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: 
Zhilin Wang <[email protected]> * use local vocab file for Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * patch for Jenkins CI using local file Signed-off-by: Zhilin Wang <[email protected]> * add slot filling prediction and metrics Signed-off-by: Zhilin Wang <[email protected]> * remove unused code Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * refactor metrics code out of Dialogue GPT Model Signed-off-by: Zhilin Wang <[email protected]> * integrate backward compatible support for IntentSlotClassificationModel (bert model) Signed-off-by: Zhilin Wang <[email protected]> * save prediction file for IntentSlotClassification Signed-off-by: Zhilin Wang <[email protected]> * update dialogue gpt model training for megatron gpt Signed-off-by: Zhilin Wang <[email protected]> * remove batch generate for HF GPT2, which causes lower performance Signed-off-by: Zhilin Wang <[email protected]> * add few shot capability to dialogue gpt model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile and remove unused import Signed-off-by: Zhilin Wang <[email protected]> * update code description and clarity Signed-off-by: Zhilin Wang <[email protected]> * address PR comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate compatibility with ZeroShotIntentModel Signed-off-by: Zhilin Wang <[email protected]> * rename folder to dialogue due to increased scope and further refactor for clarity Signed-off-by: Zhilin Wang <[email protected]> * added dialogue GPT for sequence generation task (e.g. answer extender) Signed-off-by: Zhilin Wang <[email protected]> * add CI test for DialogueGPTGenerationModel Signed-off-by: Zhilin Wang <[email protected]> * integrate DialogueS2SGenerationModel for generation task (e.g. 
answer extender) Signed-off-by: Zhilin Wang <[email protected]> * modify huggingface utils to support HF t5/BART models Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update bleu metric Signed-off-by: Zhilin Wang <[email protected]> * fix bleu metric style Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * update based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 2 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 3 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * integrate sgd generation based on user utterance and system slot-values to generate system utterance Signed-off-by: Zhilin Wang <[email protected]> * add validation model saving capabilities Signed-off-by: Zhilin Wang <[email protected]> * cleaned up code for SGD Based Answer extender Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue Generation CI Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * fix Jenkins CI issue Signed-off-by: Zhilin Wang <[email protected]> * add support for design dataset Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary imports Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin 
Wang <[email protected]> * support megatron for dialogue_s2s_generation_model Signed-off-by: Zhilin Wang <[email protected]> * reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update CI Signed-off-by: Zhilin Wang <[email protected]> * update checkpoint and predictions filename to include epoch number Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate HF BART MNLI into zero shot intent model Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Nearest Neighbour Model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * refactor Dialogue SGD Data Processor to make interface for models cleaner Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue S2S Generation model for DialogueSGDDataProcessor interface Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support sgd and drive thru datasets by zero shot model and nearest neighbour model Signed-off-by: Zhilin Wang <[email protected]> * add prediction saving code to nearest neighbour and zero shot intent models Signed-off-by: Zhilin Wang <[email protected]> * fix typo in sgd data processor Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Mellon QA Data Processor Signed-off-by: Zhilin Wang <[email protected]> * update mellon qa Signed-off-by: Zhilin Wang <[email protected]> * update dialogue.py to remove outdated info Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin 
Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * add dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * address review comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix for cfg Signed-off-by: Zhilin Wang <[email protected]> * make dependency on apex optional Signed-off-by: Zhilin Wang <[email protected]> * change NLPDDPluggin calling logic to make it possible to run without apex Signed-off-by: Zhilin Wang <[email protected]> * add first draft of tutorial Signed-off-by: Zhilin Wang <[email protected]> * reduce ms marco size by removing lines without wellFormedAnswers Signed-off-by: Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update colab tutorial link in dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * include unit test and some refactor to facilitate unit test Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * address pr issues Signed-off-by: Zhilin Wang <[email protected]> * remove typos in dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * support larger files for question answering Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary artifacts to reduce memory use Signed-off-by: Zhilin Wang <[email protected]> * put 0 tensor 
to device Signed-off-by: Zhilin Wang <[email protected]> * update link within dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * restore previously delete files Signed-off-by: Zhilin Wang <[email protected]> * update error handling when loss = nan Signed-off-by: Zhilin Wang <[email protected]> * update nan handling Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss func Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss Signed-off-by: Zhilin Wang <[email protected]> * fix type error raised in qa_dataset.py Signed-off-by: Zhilin Wang <[email protected]> * add error checking message Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update exp logging Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * limit number of negative samples Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * remove 
unused methods and style fix Signed-off-by: Zhilin Wang <[email protected]> * add more documentation Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * changes base on PR review Signed-off-by: Zhilin Wang <[email protected]> * set wandb logger falseby default Signed-off-by: Zhilin Wang <[email protected]> * style fix * style fix * correct typo * style fix * style fix Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> * Fix ASR Typos in tutorials (#4384) * Fix typos Signed-off-by: smajumdar <[email protected]> * Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. (#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> (cherry picked from commit 0322b158f26a0b690edca7a84714e33752283923) Co-authored-by: Travis Bartley <[email protected]> * Add Docs for NeMo Adapters (#4369) Signed-off-by: smajumdar <[email protected]> * Update NeMo docs (#4397) Signed-off-by: smajumdar <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Punctuation and capitalization tests race condition (#4399) * Add draft of race condition fixes Signed-off-by: PeganovAnton <[email protected]> * Minor improvements Signed-off-by: PeganovAnton <[email protected]> * More race condition fixes Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * bias act fusion changes Signed-off-by: MaximumEntropy <[email protected]> * Address comments Signed-off-by: MaximumEntropy 
<[email protected]> * Fix geglu without fusion Signed-off-by: MaximumEntropy <[email protected]> * Reset files to main Signed-off-by: MaximumEntropy <[email protected]> * Remove hidden blocks Signed-off-by: MaximumEntropy <[email protected]> * Fix style Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Abhinav Khattar <[email protected]> Co-authored-by: ericharper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Co-authored-by: PeganovAnton <[email protected]> * NMESC speaker counting algorithm update (#4500) * initial commit Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * Default maj_vote = False, max_rp=0.25 Signed-off-by: Taejin Park <[email protected]> * doc strings and style fix Signed-off-by: Taejin Park <[email protected]> * Docstring minor edit Signed-off-by: Taejin Park <[email protected]> * Default False in the functions Signed-off-by: Taejin Park <[email protected]> * fixed repeated variable Signed-off-by: Taejin Park <[email protected]> * Default as maj_vote=False Signed-off-by: Taejin Park <[email protected]> * removed redundant part in wrtie_rttm func Signed-off-by: Taejin Park <[email protected]> * Removed unused function Signed-off-by: Taejin Park <[email protected]> * Updated and tested silence and very short samples Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * Style fix and removing unnecessary parts Signed-off-by: Taejin Park <[email protected]> * unused variables are removed Signed-off-by: Taejin Park 
<[email protected]> * Fixed commented torch.jit.script Signed-off-by: Taejin Park <[email protected]> * majority voting update Signed-off-by: Taejin Park <[email protected]> * cancelling the update on speaker_utils and clus_diarizer Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * bug fix Signed-off-by: Taejin Park <[email protected]> * Added fp32 converting for torch.mm Signed-off-by: Taejin Park <[email protected]> Co-authored-by: Nithin Rao <[email protected]> * Fix dataset parameter typo on tacotron2 example yaml (#4471) Signed-off-by: saarus72 <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> * Noam lr sched: do not force min_lr after max_steps (#4472) Signed-off-by: Adrian Lancucki <[email protected]> Co-authored-by: Adrian Lancucki <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> * Refactor for punctuation model (#4367) * Dataloader, collector, loss and metric for multiscale diarization decoder (#4187) * First commit Signed-off-by: Taejin Park <[email protected]> * Checked functionality and imports Signed-off-by: Taejin Park <[email protected]> * fixed import issues Signed-off-by: Taejin Park <[email protected]> * Removed the changes made by mistake Signed-off-by: Taejin Park <[email protected]> * Style fix Signed-off-by: Taejin Park <[email protected]> * Fixed LGTM errors 001 Signed-off-by: Taejin Park <[email protected]> * Fixed LGTM and style fix Signed-off-by: Taejin Park <[email protected]> * Changed docstrings Signed-off-by: Taejin Park <[email protected]> * LGTM again Signed-off-by: Taejin Park <[email protected]> * Removed unnecessary torch setting lines Signed-off-by: Taejin Park <[email protected]> * Style fix and isort Signed-off-by: Taejin Park <[email protected]> * jbalam-nv comments reflected Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * Reflected comments and created _diar_label.py 
Signed-off-by: Taejin Park <[email protected]> * Typo fix and style fix Signed-off-by: Taejin Park <[email protected]> * Fixed target_spks[0] index error Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * LGTM unused import IterDataset Signed-off-by: Taejin Park <[email protected]> * revert collection doc year Signed-off-by: Taejin Park <[email protected]> * Code format error in collections.py Signed-off-by: Taejin Park <[email protected]> * fix collections space format error Signed-off-by: Taejin Park <[email protected]> * merged main correctly Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * Reflected all comments and tested Signed-off-by: Taejin Park <[email protected]> * style fix and LGTM Signed-off-by: Taejin Park <[email protected]> * rttm_filepath to rttm_file and removed self included funcs, tested Signed-off-by: Taejin Park <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * removed references to data_dir Signed-off-by: Matvei Novikov <[email protected]> * added missing parameters to data preparation script Signed-off-by: Matvei Novikov <[email protected]> * removed unnecessary file extension check Signed-off-by: Matvei Novikov <[email protected]> * Add ASR CTC Decoding module (#4342) * Initial commit Signed-off-by: smajumdar <[email protected]> * Full support for decoding strategy Signed-off-by: smajumdar <[email protected]> * Temp Signed-off-by: smajumdar <[email protected]> * Fix labels of y_sequence Signed-off-by: smajumdar <[email protected]> * Set support for sentencepiece subword merging Signed-off-by: smajumdar <[email protected]> * Fix char and word based token merge alignment Signed-off-by: smajumdar <[email protected]> * Revert incorrect change Signed-off-by: smajumdar <[email protected]> * Update docstring Signed-off-by: smajumdar <[email protected]> * Improve 
compatibility with greedy tokens and log probs Signed-off-by: smajumdar <[email protected]> * Update scripts to use decoding strategy Signed-off-by: smajumdar <[email protected]> * Add tests and docs Signed-off-by: smajumdar <[email protected]> * Add tests and docs Signed-off-by: smajumdar <[email protected]> * Fix speaker decoder timestamps Signed-off-by: smajumdar <[email protected]> * Fix speaker decoder timestamps Signed-off-by: smajumdar <[email protected]> * Fix decoding of ctc models Signed-off-by: smajumdar <[email protected]> * Address reviewer comments Signed-off-by: smajumdar <[email protected]> * Address reviewer comments Signed-off-by: smajumdar <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Option to disable mp in VAD via num_workers=1 (#4317) * Option to disable mp in VAD via num_workers=1 In certain environments python multiprocessing can deadlock. This adds a convenient version to disable by setting num_workers to 1. Signed-off-by: Georg Kucsko <[email protected]> * add none handling Signed-off-by: Georg Kucsko <[email protected]> * additional none handling Signed-off-by: Georg Kucsko <[email protected]> Co-authored-by: fayejf <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * remove redundant bias expand (#4382) * remove redundant bias expand Signed-off-by: Xiaowei Ren <[email protected]> * delete redundant code Signed-off-by: Xiaowei Ren <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * fixed style Signed-off-by: Matvei Novikov <[email protected]> * Add option for specifying wandb save_dir from config (#4379) * give option to user to specify wandb save dir via config Signed-off-by: Shantanu Acharya <[email protected]> * create save_dir directory for wandb logger if not exists Signed-off-by: Shantanu Acharya <[email protected]> * update save_dir get method with a default value Signed-off-by: Shantanu Acharya <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * 
Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. (#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * [Bugfix][TTS] wrong order of returned tuple for general_collate_fn. (#4388) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Merge r1.10.0 main (#4398) * update branch Signed-off-by: ericharper <[email protected]> * Set headscale false (#4364) Signed-off-by: MaximumEntropy <[email protected]> * Add wandb as dependency (#4365) Signed-off-by: smajumdar <[email protected]> * Raise trainer error (#4356) Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Set headscale false (#4364) (#4366) Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: smajumdar <[email protected]> * Finetuning changes for BART (#4003) * Temp Signed-off-by: MaximumEntropy <[email protected]> * Checkpoint converter to nemo for bart Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Make position embedding expansion specific to a batch to avoid checkpoint size mismatches (#4357) * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix logging warning Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Fix electronic bug, new time ITN rule (#4355) * fix electronic bug Signed-off-by: ekmb <[email protected]> * add new itn time rule Signed-off-by: ekmb <[email protected]> * revert domain changes Signed-off-by: ekmb <[email protected]> * remove repetition Signed-off-by: ekmb <[email protected]> * Correct support for dataclasses in default module dim (#4372) * Correct support for dataclasses in default module dim Signed-off-by: 
smajumdar <[email protected]> * Fix path for save of results Signed-off-by: smajumdar <[email protected]> * fix pad id bug (#4377) Signed-off-by: Yi Dong <[email protected]> * Question answering bug fix (#4381) * refactor dialogue state tracking for modelling/dataset interoperability Signed-off-by: Zhilin Wang <[email protected]> * fix style changes Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * fix style raised by lgtm Signed-off-by: Zhilin Wang <[email protected]> * fix style formatting Signed-off-by: Zhilin Wang <[email protected]> * update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]> * add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * remove dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]> * update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]> * rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * fix style Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add 
docstrings for assistant data processsor Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins for SGDGEN local checkpoint Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: Zhilin Wang <[email protected]> * use local vocab file for Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * patch for Jenkins CI using local file Signed-off-by: Zhilin Wang <[email protected]> * add slot filling prediction and metrics Signed-off-by: Zhilin Wang <[email protected]> * remove unused code Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * refactor metrics code out of Dialogue GPT Model Signed-off-by: Zhilin Wang <[email protected]> * integrate backward compatible support for IntentSlotClassificationModel (bert model) Signed-off-by: Zhilin Wang <[email protected]> * save prediction file for IntentSlotClassification Signed-off-by: Zhilin Wang <[email protected]> * update dialogue gpt model training for megatron gpt Signed-off-by: Zhilin Wang <[email protected]> * remove batch generate for HF GPT2, which causes lower performance Signed-off-by: Zhilin Wang <[email protected]> * add few shot capability to dialogue gpt model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile and remove unused import Signed-off-by: Zhilin Wang <[email protected]> * update code description and clarity Signed-off-by: Zhilin Wang <[email protected]> * address PR comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate compatibility with ZeroShotIntentModel Signed-off-by: Zhilin Wang <[email protected]> * rename folder to dialogue due to increased scope and further refactor for clarity Signed-off-by: Zhilin Wang <[email protected]> * added dialogue GPT for sequence generation task (e.g. 
answer extender) Signed-off-by: Zhilin Wang <[email protected]> * add CI test for DialogueGPTGenerationModel Signed-off-by: Zhilin Wang <[email protected]> * integrate DialogueS2SGenerationModel for generation task (e.g. answer extender) Signed-off-by: Zhilin Wang <[email protected]> * modify huggingface utils to support HF t5/BART models Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update bleu metric Signed-off-by: Zhilin Wang <[email protected]> * fix bleu metric style Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * update based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 2 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 3 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * integrate sgd generation based on user user utterance and system slot-values to generate system utterance Signed-off-by: Zhilin Wang <[email protected]> * add validation model saving capabilities Signed-off-by: Zhilin Wang <[email protected]> * cleaned up code for SGD Based Answer extender Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue Generation CI Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * fix Jenkins CI issue" Signed-off-by: Zhilin Wang <[email protected]> * add support for design dataset Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary 
imports Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support megatron for dialogue_s2s_generation_model Signed-off-by: Zhilin Wang <[email protected]> * reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update CI Signed-off-by: Zhilin Wang <[email protected]> * update checkpoint and predictions filename to include epoch number Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate HF BART MNLI into zero shot intent model Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Nearest Neighbour Model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * refactor Dialogue SGD Data Processor to make interface for models cleaner Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue S2S Generation model for DialogueSGDDataProcessor interface Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support sgd and drive thru datasets by zero shot model and nearest neighbour model Signed-off-by: Zhilin Wang <[email protected]> * add prediction saving code to nearest neighbour and zero shot intent models Signed-off-by: Zhilin Wang <[email protected]> * fix typo in sgd data processor Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Mellon QA Data Processor Signed-off-by: 
Zhilin Wang <[email protected]> * update mellon qa Signed-off-by: Zhilin Wang <[email protected]> * update dialogue.py to remove outdated info Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * add dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * address review comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix for cfg Signed-off-by: Zhilin Wang <[email protected]> * make dependency on apex optional Signed-off-by: Zhilin Wang <[email protected]> * change NLPDDPluggin calling logic to make it possible to run without apex Signed-off-by: Zhilin Wang <[email protected]> * add first draft of tutorial Signed-off-by: Zhilin Wang <[email protected]> * reduce ms marco size by removing lines without wellFormedAnswers Signed-off-by: Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update colab tutorial link in dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * include unit test and some refactor to facilitate unit test Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * address pr issues Signed-off-by: Zhilin Wang <[email protected]> * remove typos in dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * support larger files for question answering Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style 
fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary artifacts to reduce memory use Signed-off-by: Zhilin Wang <[email protected]> * put 0 tensor to device Signed-off-by: Zhilin Wang <[email protected]> * update link within dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * restore previously delete files Signed-off-by: Zhilin Wang <[email protected]> * update error handling when loss = nan Signed-off-by: Zhilin Wang <[email protected]> * update nan handling Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss func Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss Signed-off-by: Zhilin Wang <[email protected]> * fix type error raised in qa_dataset.py Signed-off-by: Zhilin Wang <[email protected]> * add error checking message Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update exp logging Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * limit number 
of negative samples Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * remove unused methods and style fix Signed-off-by: Zhilin Wang <[email protected]> * add more documentation Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * changes base on PR review Signed-off-by: Zhilin Wang <[email protected]> * set wandb logger falseby default Signed-off-by: Zhilin Wang <[email protected]> * style fix * style fix * correct typo * style fix * style fix Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> * Fix ASR Typos in tutorials (#4384) * Fix typos Signed-off-by: smajumdar <[email protected]> * Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. 
(#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> (cherry picked from commit 0322b158f26a0b690edca7a84714e33752283923) Co-authored-by: Travis Bartley <[email protected]> * Add Docs for NeMo Adapters (#4369) Signed-off-by: smajumdar <[email protected]> * Update NeMo docs (#4397) Signed-off-by: smajumdar <[email protected]> Co-authored-by: Eric Harper <[email protected]> * update branch Signed-off-by: ericharper <[email protected]> * remove Copy of Signed-off-by: ericharper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * [bugfix][TTS] pitch, voiced_mask, prob_voiced have the same values. (#4392) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Fixing import error in some cases (#4401) Signed-off-by: Boris Fomitchev <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Fixing bugs in calling method ctc_decoder_predictions_tensor. (#4414) * updated ctc decoding calls. Signed-off-by: Vahid <[email protected]> * fixed the ones for timestamp_utils.py Signed-off-by: Vahid <[email protected]> * fixed the ones for timestamp_utils.py Signed-off-by: Vahid <[email protected]> * fixed the ones for timestamp_utils.py Signed-off-by: Vahid <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Update with new conformer checkpoints. (#4417) Signed-off-by: Matvei Novikov <[email protected]> * [TTS] add static method decorator. 
(#4443) * [TTS] add static method decorator. Signed-off-by: Xuesong Yang <[email protected]> * remove protect prefix Signed-off-by: Xuesong Yang <[email protected]> * fixed style error Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Georg Kucsko <[email protected]> Co-authored-by: fayejf <[email protected]> Co-authored-by: Xiaowei Ren <[email protected]> Co-authored-by: Shantanu Acharya <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Vahid Noroozi <[email protected]> * Add ITN pt (#4516) * Add ITN pt Signed-off-by: Guilherme Steinmann <[email protected]> * Fix style Signed-off-by: Guilherme Steinmann <[email protected]> * Fix style Signed-off-by: Guilherme Steinmann <[email protected]> * Update copyright year to 2022 on ITN pt rules and tests Signed-off-by: Guilherme Steinmann <[email protected]> * Fixed WER initialization in ASR_with_Nemo notebook (#4523) Signed-off-by: Ante Jukić <[email protected]> Co-authored-by: Ante Jukić <[email protected]> * Update cmudict (#4510) phoneme IY1 -> IH1 in NVIDIA Added phonemes for CUSTOMIZABLE Update cmudict file revision and its reference. 
Signed-off-by: Jason Roche <[email protected]> Co-authored-by: Jason Roche <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> * [Add] Support for Different LRs with Param Groups (#4508) * add support for param groups Signed-off-by: stevehuang52 <[email protected]> * make config more general Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Weighted bucketing (#4474) * Add silence handling for speaker diarization pipeline (#4512) * initial commit Signed-off-by: nithinraok <[email protected]> * fixed silence wav file issue causing clustering to evaluate on null embeddings Signed-off-by: nithinraok <[email protected]> * fixed zero duration issue Signed-off-by: nithinraok <[email protected]> * updated with comments Signed-off-by: nithinraok <[email protected]> * minor doc change Signed-off-by: nithinraok <[email protected]> * update log Signed-off-by: nithinraok <[email protected]> * Fix runtime check (#4501) * Runtime check refinements Signed-off-by: Boris Fomitchev <[email protected]> * Added fp32 casting for ASR nets export Signed-off-by: Boris Fomitchev <[email protected]> * style Signed-off-by: Boris Fomitchev <[email protected]> * Used torch.float32 for clarity Signed-off-by: Boris Fomitchev <[email protected]> * Fixing parameters passing Signed-off-by: Boris Fomitchev <[email protected]> * Update finetune label models (#4504) * initial_script Signed-off-by: nithinraok <[email protected]> * move old script Signed-off-by: nithinraok <[email protected]> * remove finetune func from label models Signed-off-by: nithinraok <[email protected]> * style clean Signed-off-by: nithinraok <[email protected]> * updated config Signed-off-by: nithinraok <[email protected]> * update tutorial Signed-off-by: nithinraok <[email protected]> * lgtm fixes Signed-off-by: nithinraok <[email protected]> * updated based on comments Signed-off-by: nithinraok <[email protected]> * update doc Signed-off-by: nithinraok <[email 
protected]> * [ASR][Breaking Change] Update signature of Hypothesis alignments (#4511) * Preserve logprobs when preserving alignments Signed-off-by: smajumdar <[email protected]> * Update tests for rnnt greedy and beam search Signed-off-by: smajumdar <[email protected]> * Update all dependents of alignments Signed-off-by: smajumdar <[email protected]> * Update docs Signed-off-by: smajumdar <[email protected]> * Weighted bucketing (#4530) * Additional sentencepiece args - Byte fallback, split digits, split_on_whitespace (#4525) * Fix geglu without fusion Signed-off-by: MaximumEntropy <[email protected]> * Add extra args Signed-off-by: MaximumEntropy <[email protected]> * Reset transformer Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix spm arg Signed-off-by: MaximumEntropy <[email protected]> * Fix help string Signed-off-by: MaximumEntropy <[email protected]> * Add support for ASR Adapter Auxiliary Losses (#4480) * Add support for access mixin registry of custom losses Signed-off-by: smajumdar <[email protected]> * add support for asr custom losses Signed-off-by: smajumdar <[email protected]> * Update for l2 loss Signed-off-by: smajumdar <[email protected]> * Add unittests Signed-off-by: smajumdar <[email protected]> * Add unittests Signed-off-by: smajumdar <[email protected]> * Add unittests Signed-off-by: smajumdar <[email protected]> * Update registration of tensors to reset after finishing step Signed-off-by: smajumdar <[email protected]> * Remove comment Signed-off-by: smajumdar <[email protected]> * Remove comment Signed-off-by: smajumdar <[email protected]> * Update SSL models Signed-off-by: smajumdar <[email protected]> * Add support for validation step properly registering tensors Signed-off-by: smajumdar <[email protected]> * Move reset of registry outside Signed-off-by: smajumdar <[email protected]> * update (#4520) Signed-off-by: stevehuang52 <[email protected]> * fix duplex inference with 
grammars (#4517) * fix duplex inference with grammars Signed-off-by: ekmb <[email protected]> * add ci test for duplex, fix electronic last sym bug Signed-off-by: ekmb <[email protected]> * test fix Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * update jenkins grammars Signed-off-by: ekmb <[email protected]> * add pt to the docs Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * disable test Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * jenkins refactor Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * test Signed-off-by: ekmb <[email protected]> * test Signed-off-by: ekmb <[email protected]> * test Signed-off-by: ekmb <[email protected]> * test Signed-off-by: ekmb <[email protected]> Co-authored-by: Yang Zhang <[email protected]> * Add Bucketing support to TarredAudioToClassificationLabelDataset (#4465) * Add Bucketing support to TarredAudioToClassificationLabelDataset Signed-off-by: Ewald Enzinger <[email protected]> * Add MTEncDec Finetune support (#4540) * add FT support Signed-off-by: Abhinav Khattar <[email protected]> * rm preproc Signed-off-by: Abhinav Khattar <[email protected]> * review changes Signed-off-by: Abhinav Khattar <[email protected]> * add CI Signed-off-by: Abhinav Khattar <[email protected]> * newline fix Signed-off-by: Abhinav Khattar <[email protected]> * CI fix Signed-off-by: Abhinav Khattar <[email protected]> * clean up Signed-off-by: Abhinav Khattar 
<[email protected]> * post training cleanup Signed-off-by: Abhinav Khattar <[email protected]> * test Signed-off-by: Abhinav Khattar <[email protected]> * revert Signed-off-by: Abhinav Khattar <[email protected]> * CI test Signed-off-by: Abhinav Khattar <[email protected]> * revert CI changes Signed-off-by: Abhinav Khattar <[email protected]> * original CI Signed-off-by: Abhinav Khattar <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> * Add nsys profiling (#4539) * add nsys profiling Signed-off-by: ericharper <[email protected]> * only access omegaconf in setup Signed-off-by: ericharper <[email protected]> * use robust get_rank function Signed-off-by: ericharper <[email protected]> * simplify Signed-off-by: ericharper <[email protected]> * Update megatron prompt learning interface to dialogue (#4545) * refactor dialogue state tracking for modelling/dataset interoperability Signed-off-by: Zhilin Wang <[email protected]> * fix style changes Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * fix style raised by lgtm Signed-off-by: Zhilin Wang <[email protected]> * fix style formatting Signed-off-by: Zhilin Wang <[email protected]> * update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]> * add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * remove dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]> * update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]> * rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * fix style 
Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add docstrings for assistant data processor Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins for SGDGEN local checkpoint Signed-off-by: Zhilin Wang …
* Megatron BART BOS / EOS bug fix (#4495) * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. BART dataset fixes missing <EOS> for decoder output. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Removed extra padding from BARTDataset. Signed-off-by: Micha Livne <[email protected]> * GPT Prompt Learning Improvements (#4496) * Updated pipeline parallel code to speed up training Signed-off-by: Virginia Adams <[email protected]> * Load global batch size not local mini batch size Signed-off-by: Virginia Adams <[email protected]> * Python reformatting Signed-off-by: Virginia Adams <[email protected]> * Megatron perceiver with tensor parallelism only (#4318) * Temp Signed-off-by: MaximumEntropy <[email protected]> * Add megatron dataset Signed-off-by: MaximumEntropy <[email protected]> * Update config and fix global batch fetcher Signed-off-by: MaximumEntropy <[email protected]> * Add dataset class Signed-off-by: MaximumEntropy <[email protected]> * Update comments Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Update yaml Signed-off-by: MaximumEntropy <[email protected]> * Fix duplicate yaml key Signed-off-by: MaximumEntropy <[email protected]> * Translate method and preprocess script for raw text Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Remove pdb Signed-off-by: MaximumEntropy <[email protected]> * Fix arg name Signed-off-by: MaximumEntropy <[email protected]> * Fix other arg Signed-off-by: MaximumEntropy <[email protected]> * Change sampler back Signed-off-by: MaximumEntropy <[email protected]> * Move back to global batch fetcher to use distributed sampler Signed-off-by: MaximumEntropy <[email protected]> * Add text memmap data Signed-off-by: MaximumEntropy <[email protected]> * Update monitor 
Signed-off-by: MaximumEntropy <[email protected]> * Fixes for PP Signed-off-by: MaximumEntropy <[email protected]> * Remove unused import Signed-off-by: MaximumEntropy <[email protected]> * Truncate examples in text memmap Signed-off-by: MaximumEntropy <[email protected]> * NMT training batch interpolation key Signed-off-by: MaximumEntropy <[email protected]> * tarred data fix Signed-off-by: MaximumEntropy <[email protected]> * Change dataset type check Signed-off-by: MaximumEntropy <[email protected]> * Fix sampler Signed-off-by: MaximumEntropy <[email protected]> * Pass dataset cfg to determine type Signed-off-by: MaximumEntropy <[email protected]> * Log global step on validation step as well Signed-off-by: MaximumEntropy <[email protected]> * Fix NMT model saving with artifacts Signed-off-by: MaximumEntropy <[email protected]> * Initialize DDP in decode if not initialized. Needed for inference only mode Signed-off-by: MaximumEntropy <[email protected]> * Megatron NMT inference script Signed-off-by: MaximumEntropy <[email protected]> * Inference config file Signed-off-by: MaximumEntropy <[email protected]> * hardcode max delta temporarily Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * detokenizer if processor is not none Signed-off-by: MaximumEntropy <[email protected]> * Sampler config Signed-off-by: MaximumEntropy <[email protected]> * Compat with configs without sampler arg Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Comment for validation dataset type Signed-off-by: MaximumEntropy <[email protected]> * Fix tokenizer building Signed-off-by: MaximumEntropy <[email protected]> * CI test for megatron nmt Signed-off-by: MaximumEntropy <[email protected]> * Fix tokenizer in restore Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * O2 restore from fix Signed-off-by: MaximumEntropy 
<[email protected]> * Remove print Signed-off-by: MaximumEntropy <[email protected]> * Change tokenizer model name in config Signed-off-by: MaximumEntropy <[email protected]> * Logging Signed-off-by: MaximumEntropy <[email protected]> * Set seed for distributed sampler Signed-off-by: MaximumEntropy <[email protected]> * Cluster debugging messages Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix max generation delta Signed-off-by: MaximumEntropy <[email protected]> * No LM Init Signed-off-by: MaximumEntropy <[email protected]> * Use nlp save restore connector Signed-off-by: MaximumEntropy <[email protected]> * Remove useless infer args Signed-off-by: MaximumEntropy <[email protected]> * Typo Signed-off-by: MaximumEntropy <[email protected]> * UTF8 safe print of translation result Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Add save restore connector back with comment Signed-off-by: MaximumEntropy <[email protected]> * Refactor Signed-off-by: MaximumEntropy <[email protected]> * Fix CI test Signed-off-by: MaximumEntropy <[email protected]> * Add missing args Signed-off-by: MaximumEntropy <[email protected]> * Address comments Signed-off-by: MaximumEntropy <[email protected]> * Empty to restart * Fix CI test Signed-off-by: MaximumEntropy <[email protected]> * Check for test ds Signed-off-by: MaximumEntropy <[email protected]> * set fusion to false Signed-off-by: MaximumEntropy <[email protected]> * Initial perceiver encoder Signed-off-by: MaximumEntropy <[email protected]> * Perceiver with PP=1 Signed-off-by: MaximumEntropy <[email protected]> * Remove init cross attn Signed-off-by: MaximumEntropy <[email protected]> * CI test and remove init cross attn arg Signed-off-by: MaximumEntropy <[email protected]> * Remove init cross attn layers from file Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy 
<[email protected]> * Clean up Signed-off-by: MaximumEntropy <[email protected]> * update branch Signed-off-by: ericharper <[email protected]> * Set headscale false (#4364) Signed-off-by: MaximumEntropy <[email protected]> * Add wandb as dependency (#4365) Signed-off-by: smajumdar <[email protected]> * Raise trainer error (#4356) Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Set headscale false (#4364) (#4366) Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: smajumdar <[email protected]> * Finetuning changes for BART (#4003) * Temp Signed-off-by: MaximumEntropy <[email protected]> * Checkpoint converter to nemo for bart Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Make position embedding expansion specific to a batch to avoid checkpoint size mismatches (#4357) * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix logging warning Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Refactor bias act fusion Signed-off-by: MaximumEntropy <[email protected]> * Update NMT config Signed-off-by: MaximumEntropy <[email protected]> * Fix electronic bug, new time ITN rule (#4355) * fix electronic bug Signed-off-by: ekmb <[email protected]> * add new itn time rule Signed-off-by: ekmb <[email protected]> * revert domain changes Signed-off-by: ekmb <[email protected]> * remove repetition Signed-off-by: ekmb <[email protected]> * Update ci tests Signed-off-by: MaximumEntropy <[email protected]> * Correct support for dataclasses in default module dim (#4372) * Correct support for dataclasses in default module dim Signed-off-by: smajumdar <[email protected]> * Fix path for save of results Signed-off-by: smajumdar <[email protected]> * fix pad id bug (#4377) Signed-off-by: Yi Dong <[email protected]> * Question answering bug fix (#4381) * 
refactor dialogue state tracking for modelling/dataset interoperability Signed-off-by: Zhilin Wang <[email protected]> * fix style changes Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * fix style raised by lgtm Signed-off-by: Zhilin Wang <[email protected]> * fix style formatting Signed-off-by: Zhilin Wang <[email protected]> * update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]> * add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * remove dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]> * update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]> * rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * fix style Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add docstrings for assistant data processsor Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins for SGDGEN local checkpoint Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: 
Zhilin Wang <[email protected]> * use local vocab file for Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * patch for Jenkins CI using local file Signed-off-by: Zhilin Wang <[email protected]> * add slot filling prediction and metrics Signed-off-by: Zhilin Wang <[email protected]> * remove unused code Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * refactor metrics code out of Dialogue GPT Model Signed-off-by: Zhilin Wang <[email protected]> * integrate backward compatible support for IntentSlotClassificationModel (bert model) Signed-off-by: Zhilin Wang <[email protected]> * save prediction file for IntentSlotClassification Signed-off-by: Zhilin Wang <[email protected]> * update dialogue gpt model training for megatron gpt Signed-off-by: Zhilin Wang <[email protected]> * remove batch generate for HF GPT2, which causes lower performance Signed-off-by: Zhilin Wang <[email protected]> * add few shot capability to dialogue gpt model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile and remove unused import Signed-off-by: Zhilin Wang <[email protected]> * update code description and clarity Signed-off-by: Zhilin Wang <[email protected]> * address PR comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate compatibility with ZeroShotIntentModel Signed-off-by: Zhilin Wang <[email protected]> * rename folder to dialogue due to increased scope and further refactor for clarity Signed-off-by: Zhilin Wang <[email protected]> * added dialogue GPT for sequence generation task (e.g. answer extender) Signed-off-by: Zhilin Wang <[email protected]> * add CI test for DialogueGPTGenerationModel Signed-off-by: Zhilin Wang <[email protected]> * integrate DialogueS2SGenerationModel for generation task (e.g. 
answer extender) Signed-off-by: Zhilin Wang <[email protected]> * modify huggingface utils to support HF t5/BART models Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update bleu metric Signed-off-by: Zhilin Wang <[email protected]> * fix bleu metric style Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * update based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 2 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 3 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * integrate sgd generation based on user user utterance and system slot-values to generate system utterance Signed-off-by: Zhilin Wang <[email protected]> * add validation model saving capabilities Signed-off-by: Zhilin Wang <[email protected]> * cleaned up code for SGD Based Answer extender Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue Generation CI Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * fix Jenkins CI issue" Signed-off-by: Zhilin Wang <[email protected]> * add support for design dataset Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary imports Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin 
Wang <[email protected]> * support megatron for dialogue_s2s_generation_model Signed-off-by: Zhilin Wang <[email protected]> * reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update CI Signed-off-by: Zhilin Wang <[email protected]> * update checkpoint and predictions filename to include epoch number Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate HF BART MNLI into zero shot intent model Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Nearest Neighbour Model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * refactor Dialogue SGD Data Processor to make interface for models cleaner Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue S2S Generation model for DialogueSGDDataProcessor interface Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support sgd and drive thru datasets by zero shot model and nearest neighbour model Signed-off-by: Zhilin Wang <[email protected]> * add prediction saving code to nearest neighbour and zero shot intent models Signed-off-by: Zhilin Wang <[email protected]> * fix typo in sgd data processor Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Mellon QA Data Processor Signed-off-by: Zhilin Wang <[email protected]> * update mellon qa Signed-off-by: Zhilin Wang <[email protected]> * update dialogue.py to remove outdated info Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin 
Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * add dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * address review comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix for cfg Signed-off-by: Zhilin Wang <[email protected]> * make dependency on apex optional Signed-off-by: Zhilin Wang <[email protected]> * change NLPDDPluggin calling logic to make it possible to run without apex Signed-off-by: Zhilin Wang <[email protected]> * add first draft of tutorial Signed-off-by: Zhilin Wang <[email protected]> * reduce ms marco size by removing lines without wellFormedAnswers Signed-off-by: Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update colab tutorial link in dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * include unit test and some refactor to facilitate unit test Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * address pr issues Signed-off-by: Zhilin Wang <[email protected]> * remove typos in dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * support larger files for question answering Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary artifacts to reduce memory use Signed-off-by: Zhilin Wang <[email protected]> * put 0 tensor 
to device Signed-off-by: Zhilin Wang <[email protected]> * update link within dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * restore previously delete files Signed-off-by: Zhilin Wang <[email protected]> * update error handling when loss = nan Signed-off-by: Zhilin Wang <[email protected]> * update nan handling Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss func Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss Signed-off-by: Zhilin Wang <[email protected]> * fix type error raised in qa_dataset.py Signed-off-by: Zhilin Wang <[email protected]> * add error checking message Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update exp logging Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * limit number of negative samples Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * remove 
unused methods and style fix Signed-off-by: Zhilin Wang <[email protected]> * add more documentation Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * changes base on PR review Signed-off-by: Zhilin Wang <[email protected]> * set wandb logger falseby default Signed-off-by: Zhilin Wang <[email protected]> * style fix * style fix * correct typo * style fix * style fix Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> * Fix ASR Typos in tutorials (#4384) * Fix typos Signed-off-by: smajumdar <[email protected]> * Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. (#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> (cherry picked from commit 0322b158f26a0b690edca7a84714e33752283923) Co-authored-by: Travis Bartley <[email protected]> * Add Docs for NeMo Adapters (#4369) Signed-off-by: smajumdar <[email protected]> * Update NeMo docs (#4397) Signed-off-by: smajumdar <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Punctuation and capitalization tests race condition (#4399) * Add draft of race condition fixes Signed-off-by: PeganovAnton <[email protected]> * Minor improvements Signed-off-by: PeganovAnton <[email protected]> * More race condition fixes Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * bias act fusion changes Signed-off-by: MaximumEntropy <[email protected]> * Address comments Signed-off-by: MaximumEntropy 
<[email protected]> * Fix geglu without fusion Signed-off-by: MaximumEntropy <[email protected]> * Reset files to main Signed-off-by: MaximumEntropy <[email protected]> * Remove hidden blocks Signed-off-by: MaximumEntropy <[email protected]> * Fix style Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Abhinav Khattar <[email protected]> Co-authored-by: ericharper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Co-authored-by: PeganovAnton <[email protected]> * NMESC speaker counting algorithm update (#4500) * initial commit Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * Default maj_vote = False, max_rp=0.25 Signed-off-by: Taejin Park <[email protected]> * doc strings and style fix Signed-off-by: Taejin Park <[email protected]> * Docstring minor edit Signed-off-by: Taejin Park <[email protected]> * Default False in the functions Signed-off-by: Taejin Park <[email protected]> * fixed repeated variable Signed-off-by: Taejin Park <[email protected]> * Default as maj_vote=False Signed-off-by: Taejin Park <[email protected]> * removed redundant part in wrtie_rttm func Signed-off-by: Taejin Park <[email protected]> * Removed unused function Signed-off-by: Taejin Park <[email protected]> * Updated and tested silence and very short samples Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * Style fix and removing unnecessary parts Signed-off-by: Taejin Park <[email protected]> * unused variables are removed Signed-off-by: Taejin Park 
<[email protected]> * Fixed commented torch.jit.script Signed-off-by: Taejin Park <[email protected]> * majority voting update Signed-off-by: Taejin Park <[email protected]> * cancelling the update on speaker_utils and clus_diarizer Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * bug fix Signed-off-by: Taejin Park <[email protected]> * Added fp32 converting for torch.mm Signed-off-by: Taejin Park <[email protected]> Co-authored-by: Nithin Rao <[email protected]> * Fix dataset parameter typo on tacotron2 example yaml (#4471) Signed-off-by: saarus72 <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> * Noam lr sched: do not force min_lr after max_steps (#4472) Signed-off-by: Adrian Lancucki <[email protected]> Co-authored-by: Adrian Lancucki <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> * Refactor for punctuation model (#4367) * Dataloader, collector, loss and metric for multiscale diarization decoder (#4187) * First commit Signed-off-by: Taejin Park <[email protected]> * Checked funtionality and imports Signed-off-by: Taejin Park <[email protected]> * fixed import issues Signed-off-by: Taejin Park <[email protected]> * Removed the changed made by mistake Signed-off-by: Taejin Park <[email protected]> * Style fix Signed-off-by: Taejin Park <[email protected]> * Fixed LGTM errors 001 Signed-off-by: Taejin Park <[email protected]> * Fixed LGTM and style fix Signed-off-by: Taejin Park <[email protected]> * Changed docstrings Signed-off-by: Taejin Park <[email protected]> * LGTM again Signed-off-by: Taejin Park <[email protected]> * Removed unnecessary torch setting lines Signed-off-by: Taejin Park <[email protected]> * Style fix and isort Signed-off-by: Taejin Park <[email protected]> * jbalam-nv comments reflected Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * Reflected comments and created _diar_label.py 
Signed-off-by: Taejin Park <[email protected]> * Typo fix and style fix Signed-off-by: Taejin Park <[email protected]> * Fixed target_spks[0] index error Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * LGTM unused import IterDataset Signed-off-by: Taejin Park <[email protected]> * revert collection doc year Signed-off-by: Taejin Park <[email protected]> * Code format error in collections.py Signed-off-by: Taejin Park <[email protected]> * fix collections space format error Signed-off-by: Taejin Park <[email protected]> * merged main correctly Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * Reflected all comments and tested Signed-off-by: Taejin Park <[email protected]> * style fix and LGTM Signed-off-by: Taejin Park <[email protected]> * rttm_filepath to rttm_file and removed self included funcs, tested Signed-off-by: Taejin Park <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * removed references to data_dir Signed-off-by: Matvei Novikov <[email protected]> * added missing parameters to data preparation script Signed-off-by: Matvei Novikov <[email protected]> * removed unnecessary file extension check Signed-off-by: Matvei Novikov <[email protected]> * Add ASR CTC Decoding module (#4342) * Initial commit Signed-off-by: smajumdar <[email protected]> * Full support for decoding strategy Signed-off-by: smajumdar <[email protected]> * Temp Signed-off-by: smajumdar <[email protected]> * Fix labels of y_sequence Signed-off-by: smajumdar <[email protected]> * Set support for sentencepiece subword merging Signed-off-by: smajumdar <[email protected]> * Fix char and word based token merge alignment Signed-off-by: smajumdar <[email protected]> * Revert incorrect change Signed-off-by: smajumdar <[email protected]> * Update docstring Signed-off-by: smajumdar <[email protected]> * Improve 
compatibility with greedy tokens and log probs Signed-off-by: smajumdar <[email protected]> * Update scripts to use decoding strategy Signed-off-by: smajumdar <[email protected]> * Add tests and docs Signed-off-by: smajumdar <[email protected]> * Add tests and docs Signed-off-by: smajumdar <[email protected]> * Fix speaker decoder timestamps Signed-off-by: smajumdar <[email protected]> * Fix speaker decoder timestamps Signed-off-by: smajumdar <[email protected]> * Fix decoding of ctc models Signed-off-by: smajumdar <[email protected]> * Address reviewer comments Signed-off-by: smajumdar <[email protected]> * Address reviewer comments Signed-off-by: smajumdar <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Option to disable mp in VAD via num_workers=1 (#4317) * Option to disable mp in VAD via num_workers=1 In certain environments python multiprocessing can deadlock. This adds a convenient version to disable by setting num_workers to 1. Signed-off-by: Georg Kucsko <[email protected]> * add none handling Signed-off-by: Georg Kucsko <[email protected]> * additional none handling Signed-off-by: Georg Kucsko <[email protected]> Co-authored-by: fayejf <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * remove redundant bias expand (#4382) * remove redundant bias expand Signed-off-by: Xiaowei Ren <[email protected]> * delete redundant code Signed-off-by: Xiaowei Ren <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * fixed style Signed-off-by: Matvei Novikov <[email protected]> * Add option for specifying wandb save_dir from config (#4379) * give option to user to specify wandb save dir via config Signed-off-by: Shantanu Acharya <[email protected]> * create save_dir directory for wandb logger if not exists Signed-off-by: Shantanu Acharya <[email protected]> * update save_dir get method with a default value Signed-off-by: Shantanu Acharya <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * 
Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. (#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * [Bugfix][TTS] wrong order of returned tuple for general_collate_fn. (#4388) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Merge r1.10.0 main (#4398) * update branch Signed-off-by: ericharper <[email protected]> * Set headscale false (#4364) Signed-off-by: MaximumEntropy <[email protected]> * Add wandb as dependency (#4365) Signed-off-by: smajumdar <[email protected]> * Raise trainer error (#4356) Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Set headscale false (#4364) (#4366) Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: smajumdar <[email protected]> * Finetuning changes for BART (#4003) * Temp Signed-off-by: MaximumEntropy <[email protected]> * Checkpoint converter to nemo for bart Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Make position embedding expansion specific to a batch to avoid checkpoint size mismatches (#4357) * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix logging warning Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Fix electronic bug, new time ITN rule (#4355) * fix electronic bug Signed-off-by: ekmb <[email protected]> * add new itn time rule Signed-off-by: ekmb <[email protected]> * revert domain changes Signed-off-by: ekmb <[email protected]> * remove repetition Signed-off-by: ekmb <[email protected]> * Correct support for dataclasses in default module dim (#4372) * Correct support for dataclasses in default module dim Signed-off-by: 
smajumdar <[email protected]> * Fix path for save of results Signed-off-by: smajumdar <[email protected]> * fix pad id bug (#4377) Signed-off-by: Yi Dong <[email protected]> * Question answering bug fix (#4381) * refactor dialogue state tracking for modelling/dataset interoperability Signed-off-by: Zhilin Wang <[email protected]> * fix style changes Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * fix style raised by lgtm Signed-off-by: Zhilin Wang <[email protected]> * fix style formatting Signed-off-by: Zhilin Wang <[email protected]> * update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]> * add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * remove dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]> * update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]> * rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * fix style Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add 
docstrings for assistant data processsor Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins for SGDGEN local checkpoint Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: Zhilin Wang <[email protected]> * use local vocab file for Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * patch for Jenkins CI using local file Signed-off-by: Zhilin Wang <[email protected]> * add slot filling prediction and metrics Signed-off-by: Zhilin Wang <[email protected]> * remove unused code Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * refactor metrics code out of Dialogue GPT Model Signed-off-by: Zhilin Wang <[email protected]> * integrate backward compatible support for IntentSlotClassificationModel (bert model) Signed-off-by: Zhilin Wang <[email protected]> * save prediction file for IntentSlotClassification Signed-off-by: Zhilin Wang <[email protected]> * update dialogue gpt model training for megatron gpt Signed-off-by: Zhilin Wang <[email protected]> * remove batch generate for HF GPT2, which causes lower performance Signed-off-by: Zhilin Wang <[email protected]> * add few shot capability to dialogue gpt model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile and remove unused import Signed-off-by: Zhilin Wang <[email protected]> * update code description and clarity Signed-off-by: Zhilin Wang <[email protected]> * address PR comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate compatibility with ZeroShotIntentModel Signed-off-by: Zhilin Wang <[email protected]> * rename folder to dialogue due to increased scope and further refactor for clarity Signed-off-by: Zhilin Wang <[email protected]> * added dialogue GPT for sequence generation task (e.g. 
answer extender) Signed-off-by: Zhilin Wang <[email protected]> * add CI test for DialogueGPTGenerationModel Signed-off-by: Zhilin Wang <[email protected]> * integrate DialogueS2SGenerationModel for generation task (e.g. answer extender) Signed-off-by: Zhilin Wang <[email protected]> * modify huggingface utils to support HF t5/BART models Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update bleu metric Signed-off-by: Zhilin Wang <[email protected]> * fix bleu metric style Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * update based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 2 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 3 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * integrate sgd generation based on user user utterance and system slot-values to generate system utterance Signed-off-by: Zhilin Wang <[email protected]> * add validation model saving capabilities Signed-off-by: Zhilin Wang <[email protected]> * cleaned up code for SGD Based Answer extender Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue Generation CI Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * fix Jenkins CI issue" Signed-off-by: Zhilin Wang <[email protected]> * add support for design dataset Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary 
imports Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support megatron for dialogue_s2s_generation_model Signed-off-by: Zhilin Wang <[email protected]> * reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update CI Signed-off-by: Zhilin Wang <[email protected]> * update checkpoint and predictions filename to include epoch number Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate HF BART MNLI into zero shot intent model Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Nearest Neighbour Model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * refactor Dialogue SGD Data Processor to make interface for models cleaner Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue S2S Generation model for DialogueSGDDataProcessor interface Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support sgd and drive thru datasets by zero shot model and nearest neighbour model Signed-off-by: Zhilin Wang <[email protected]> * add prediction saving code to nearest neighbour and zero shot intent models Signed-off-by: Zhilin Wang <[email protected]> * fix typo in sgd data processor Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Mellon QA Data Processor Signed-off-by: 
Zhilin Wang <[email protected]> * update mellon qa Signed-off-by: Zhilin Wang <[email protected]> * update dialogue.py to remove outdated info Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * add dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * address review comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix for cfg Signed-off-by: Zhilin Wang <[email protected]> * make dependency on apex optional Signed-off-by: Zhilin Wang <[email protected]> * change NLPDDPluggin calling logic to make it possible to run without apex Signed-off-by: Zhilin Wang <[email protected]> * add first draft of tutorial Signed-off-by: Zhilin Wang <[email protected]> * reduce ms marco size by removing lines without wellFormedAnswers Signed-off-by: Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update colab tutorial link in dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * include unit test and some refactor to facilitate unit test Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * address pr issues Signed-off-by: Zhilin Wang <[email protected]> * remove typos in dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * support larger files for question answering Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style 
fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary artifacts to reduce memory use Signed-off-by: Zhilin Wang <[email protected]> * put 0 tensor to device Signed-off-by: Zhilin Wang <[email protected]> * update link within dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * restore previously delete files Signed-off-by: Zhilin Wang <[email protected]> * update error handling when loss = nan Signed-off-by: Zhilin Wang <[email protected]> * update nan handling Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss func Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss Signed-off-by: Zhilin Wang <[email protected]> * fix type error raised in qa_dataset.py Signed-off-by: Zhilin Wang <[email protected]> * add error checking message Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update exp logging Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * limit number 
of negative samples Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * remove unused methods and style fix Signed-off-by: Zhilin Wang <[email protected]> * add more documentation Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * changes base on PR review Signed-off-by: Zhilin Wang <[email protected]> * set wandb logger falseby default Signed-off-by: Zhilin Wang <[email protected]> * style fix * style fix * correct typo * style fix * style fix Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> * Fix ASR Typos in tutorials (#4384) * Fix typos Signed-off-by: smajumdar <[email protected]> * Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. 
(#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> (cherry picked from commit 0322b158f26a0b690edca7a84714e33752283923) Co-authored-by: Travis Bartley <[email protected]> * Add Docs for NeMo Adapters (#4369) Signed-off-by: smajumdar <[email protected]> * Update NeMo docs (#4397) Signed-off-by: smajumdar <[email protected]> Co-authored-by: Eric Harper <[email protected]> * update branch Signed-off-by: ericharper <[email protected]> * remove Copy of Signed-off-by: ericharper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * [bugfix][TTS] pitch, voiced_mask, prob_voiced have the same values. (#4392) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Fixing import error in some cases (#4401) Signed-off-by: Boris Fomitchev <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Fixing bugs in calling method ctc_decoder_predictions_tensor. (#4414) * updated ctc decoding calls. Signed-off-by: Vahid <[email protected]> * fixed the ones for timestamp_utils.py Signed-off-by: Vahid <[email protected]> * fixed the ones for timestamp_utils.py Signed-off-by: Vahid <[email protected]> * fixed the ones for timestamp_utils.py Signed-off-by: Vahid <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> * Update with new conformer checkpoints. (#4417) Signed-off-by: Matvei Novikov <[email protected]> * [TTS] add static method decorator. 
(#4443) * [TTS] add static method decorator. Signed-off-by: Xuesong Yang <[email protected]> * remove protect prefix Signed-off-by: Xuesong Yang <[email protected]> * fixed style error Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Georg Kucsko <[email protected]> Co-authored-by: fayejf <[email protected]> Co-authored-by: Xiaowei Ren <[email protected]> Co-authored-by: Shantanu Acharya <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Vahid Noroozi <[email protected]> * bug fix - sample rate was being ignored in vocoder dataset when not loading mel Signed-off-by: Paarth Neekhara <[email protected]> * Add ITN pt (#4516) * Add ITN pt Signed-off-by: Guilherme Steinmann <[email protected]> * Fix style Signed-off-by: Guilherme Steinmann <[email protected]> * Fix style Signed-off-by: Guilherme Steinmann <[email protected]> * Update copyright year to 2022 on ITN pt rules and tests Signed-off-by: Guilherme Steinmann <[email protected]> * Fixed WER initialization in ASR_with_Nemo notebook (#4523) Signed-off-by: Ante Jukić <[email protected]> Co-authored-by: Ante Jukić <[email protected]> * Update cmudict (#4510) phoneme IY1 -> IH1 in NVIDIA Added 
phonemes for CUSTOMIZABLE Update cmudict file revision and its reference. Signed-off-by: Jason Roche <[email protected]> Co-authored-by: Jason Roche <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> * [Add] Support for Different LRs with Param Groups (#4508) * add support for param groups Signed-off-by: stevehuang52 <[email protected]> * make config more general Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Weighted bucketing (#4474) * Add silence handling for speaker diarization pipeline (#4512) * initial commit Signed-off-by: nithinraok <[email protected]> * fixed silence wav file issue causing clustering to evaluate on null embeddings Signed-off-by: nithinraok <[email protected]> * fixed zero duration issue Signed-off-by: nithinraok <[email protected]> * updated with comments Signed-off-by: nithinraok <[email protected]> * minor doc change Signed-off-by: nithinraok <[email protected]> * update log Signed-off-by: nithinraok <[email protected]> * Fix runtime check (#4501) * Runtime check refinements Signed-off-by: Boris Fomitchev <[email protected]> * Added fp32 casting for ASR nets export Signed-off-by: Boris Fomitchev <[email protected]> * style Signed-off-by: Boris Fomitchev <[email protected]> * Used torch.float32 for clarity Signed-off-by: Boris Fomitchev <[email protected]> * Fixing parameters passing Signed-off-by: Boris Fomitchev <[email protected]> * Update finetune label models (#4504) * initial_script Signed-off-by: nithinraok <[email protected]> * move old script Signed-off-by: nithinraok <[email protected]> * remove finetune func from label models Signed-off-by: nithinraok <[email protected]> * style clean Signed-off-by: nithinraok <[email protected]> * updated config Signed-off-by: nithinraok <[email protected]> * update tutorial Signed-off-by: nithinraok <[email protected]> * lgtm fixes Signed-off-by: nithinraok <[email protected]> * updated based on comments Signed-off-by: 
nithinraok <[email protected]> * update doc Signed-off-by: nithinraok <[email protected]> * [ASR][Breaking Change] Update signature of Hypothesis alignments (#4511) * Preserve logprobs when preserving alignments Signed-off-by: smajumdar <[email protected]> * Update tests for rnnt gredy and beam search Signed-off-by: smajumdar <[email protected]> * Update all dependents of alignments Signed-off-by: smajumdar <[email protected]> * Update docs Signed-off-by: smajumdar <[email protected]> * Weighted bucketing (#4530) * Additional sentencepiece args - Byte fallback, split digits, split_on_whitespace (#4525) * Fix geglu without fusion Signed-off-by: MaximumEntropy <[email protected]> * Add extra args Signed-off-by: MaximumEntropy <[email protected]> * Reset transformer Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix spm arg Signed-off-by: MaximumEntropy <[email protected]> * Fix help string Signed-off-by: MaximumEntropy <[email protected]> * Add support for ASR Adapter Auxiliary Losses (#4480) * Add support for access mixin registry of custom losses Signed-off-by: smajumdar <[email protected]> * add support for asr custom losses Signed-off-by: smajumdar <[email protected]> * Update for l2 loss Signed-off-by: smajumdar <[email protected]> * Add unittests Signed-off-by: smajumdar <[email protected]> * Add unittests Signed-off-by: smajumdar <[email protected]> * Add unittests Signed-off-by: smajumdar <[email protected]> * Update registration of tensors to reset after finishing step Signed-off-by: smajumdar <[email protected]> * Remove comment Signed-off-by: smajumdar <[email protected]> * Remove comment Signed-off-by: smajumdar <[email protected]> * Update SSL models Signed-off-by: smajumdar <[email protected]> * Add support for validation step properly registering tensors Signed-off-by: smajumdar <[email protected]> * Move reset of registry outside Signed-off-by: smajumdar <[email protected]> * update 
(#4520) Signed-off-by: stevehuang52 <[email protected]> * fix duplex inference with grammars (#4517) * fix duplex inference with grammars Signed-off-by: ekmb <[email protected]> * add ci test for duplex, fix electronic last sym bug Signed-off-by: ekmb <[email protected]> * test fix Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * update jenkins grammars Signed-off-by: ekmb <[email protected]> * add pt to the docs Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * disable test Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * jenkins refactor Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * test Signed-off-by: ekmb <[email protected]> * test Signed-off-by: ekmb <[email protected]> * test Signed-off-by: ekmb <[email protected]> * test Signed-off-by: ekmb <[email protected]> Co-authored-by: Yang Zhang <[email protected]> * Add Bucketing support to TarredAudioToClassificationLabelDataset (#4465) * Add Bucketing support to TarredAudioToClassificationLabelDataset Signed-off-by: Ewald Enzinger <[email protected]> * Add MTEncDec Finetune support (#4540) * add FT support Signed-off-by: Abhinav Khattar <[email protected]> * rm preproc Signed-off-by: Abhinav Khattar <[email protected]> * review changes Signed-off-by: Abhinav Khattar <[email protected]> * add CI Signed-off-by: Abhinav Khattar <[email protected]> * newline fix Signed-off-by: Abhinav Khattar <[email protected]> * CI fix 
Signed-off-by: Abhinav Khattar <[email protected]> * clean up Signed-off-by: Abhinav Khattar <[email protected]> * post training cleanup Signed-off-by: Abhinav Khattar <[email protected]> * test Signed-off-by: Abhinav Khattar <[email protected]> * revert Signed-off-by: Abhinav Khattar <[email protected]> * CI test Signed-off-by: Abhinav Khattar <[email protected]> * revert CI changes Signed-off-by: Abhinav Khattar <[email protected]> * original CI Signed-off-by: Abhinav Khattar <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> * Add nsys profiling (#4539) * add nsys profiling Signed-off-by: ericharper <[email protected]> * only access omegaconf in setup Signed-off-by: ericharper <[email protected]> * use robust get_rank function Signed-off-by: ericharper <[email protected]> * simplify Signed-off-by: ericharper <[email protected]> * Update megatron prompt learning interface to dialogue (#4545) * refactor dialogue state tracking for modelling/dataset interoperability Signed-off-by: Zhilin Wang <[email protected]> * fix style changes Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * fix style raised by lgtm Signed-off-by: Zhilin Wang <[email protected]> * fix style formatting Signed-off-by: Zhilin Wang <[email protected]> * update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]> * add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * remove dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]> * update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]> * rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang 
<[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * fix style Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add docstrings for assist…
* Temp Signed-off-by: MaximumEntropy <[email protected]> * Add megatron dataset Signed-off-by: MaximumEntropy <[email protected]> * Update config and fix global batch fetcher Signed-off-by: MaximumEntropy <[email protected]> * Add dataset class Signed-off-by: MaximumEntropy <[email protected]> * Update comments Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Update yaml Signed-off-by: MaximumEntropy <[email protected]> * Fix duplicate yaml key Signed-off-by: MaximumEntropy <[email protected]> * Translate method and preprocess script for raw text Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Remove pdb Signed-off-by: MaximumEntropy <[email protected]> * Fix arg name Signed-off-by: MaximumEntropy <[email protected]> * Fix other arg Signed-off-by: MaximumEntropy <[email protected]> * Change sampler back Signed-off-by: MaximumEntropy <[email protected]> * Move back to global batch fetcher to use distributed sampler Signed-off-by: MaximumEntropy <[email protected]> * Add text memmap data Signed-off-by: MaximumEntropy <[email protected]> * Update monitor Signed-off-by: MaximumEntropy <[email protected]> * Fixes for PP Signed-off-by: MaximumEntropy <[email protected]> * Remove unused import Signed-off-by: MaximumEntropy <[email protected]> * Truncate examples in text memmap Signed-off-by: MaximumEntropy <[email protected]> * NMT training batch interpolation key Signed-off-by: MaximumEntropy <[email protected]> * tarred data fix Signed-off-by: MaximumEntropy <[email protected]> * Change dataset type check Signed-off-by: MaximumEntropy <[email protected]> * Fix sampler Signed-off-by: MaximumEntropy <[email protected]> * Pass dataset cfg to determine type Signed-off-by: MaximumEntropy <[email protected]> * Log global step on validation step as well Signed-off-by: MaximumEntropy <[email protected]> * Fix NMT model saving with artifacts Signed-off-by: 
MaximumEntropy <[email protected]> * Initialize DDP in decode if not initialized. Needed for inference only mode Signed-off-by: MaximumEntropy <[email protected]> * Megatron NMT inference script Signed-off-by: MaximumEntropy <[email protected]> * Inference config file Signed-off-by: MaximumEntropy <[email protected]> * hardcode max delta temporarily Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * detokenizer if processor is not none Signed-off-by: MaximumEntropy <[email protected]> * Sampler config Signed-off-by: MaximumEntropy <[email protected]> * Compat with configs without sampler arg Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Comment for validation dataset type Signed-off-by: MaximumEntropy <[email protected]> * Fix tokenizer building Signed-off-by: MaximumEntropy <[email protected]> * CI test for megatron nmt Signed-off-by: MaximumEntropy <[email protected]> * Fix tokenizer in restore Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * O2 restore from fix Signed-off-by: MaximumEntropy <[email protected]> * Remove print Signed-off-by: MaximumEntropy <[email protected]> * Change tokenizer model name in config Signed-off-by: MaximumEntropy <[email protected]> * Logging Signed-off-by: MaximumEntropy <[email protected]> * Set seed for distributed sampler Signed-off-by: MaximumEntropy <[email protected]> * Cluster debugging messages Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix max generation delta Signed-off-by: MaximumEntropy <[email protected]> * No LM Init Signed-off-by: MaximumEntropy <[email protected]> * Use nlp save restore connector Signed-off-by: MaximumEntropy <[email protected]> * Remove useless infer args Signed-off-by: MaximumEntropy <[email protected]> * Typo Signed-off-by: MaximumEntropy <[email 
protected]> * UTF8 safe print of translation result Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Add save restore connector back with comment Signed-off-by: MaximumEntropy <[email protected]> * Refactor Signed-off-by: MaximumEntropy <[email protected]> * Fix CI test Signed-off-by: MaximumEntropy <[email protected]> * Add missing args Signed-off-by: MaximumEntropy <[email protected]> * Address comments Signed-off-by: MaximumEntropy <[email protected]> * Empty to restart * Fix CI test Signed-off-by: MaximumEntropy <[email protected]> * Check for test ds Signed-off-by: MaximumEntropy <[email protected]> * set fusion to false Signed-off-by: MaximumEntropy <[email protected]> * Initial perceiver encoder Signed-off-by: MaximumEntropy <[email protected]> * Perceiver with PP=1 Signed-off-by: MaximumEntropy <[email protected]> * Remove init cross attn Signed-off-by: MaximumEntropy <[email protected]> * CI test and remove init cross attn arg Signed-off-by: MaximumEntropy <[email protected]> * Remove init cross attn layers from file Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Clean up Signed-off-by: MaximumEntropy <[email protected]> * update branch Signed-off-by: ericharper <[email protected]> * Set headscale false (NVIDIA#4364) Signed-off-by: MaximumEntropy <[email protected]> * Add wandb as dependency (NVIDIA#4365) Signed-off-by: smajumdar <[email protected]> * Raise trainer error (NVIDIA#4356) Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Set headscale false (NVIDIA#4364) (NVIDIA#4366) Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: smajumdar <[email protected]> * Finetuning changes for BART (NVIDIA#4003) * Temp Signed-off-by: MaximumEntropy <[email protected]> * Checkpoint converter to nemo for bart Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: 
MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Make position embedding expansion specific to a batch to avoid checkpoint size mismatches (NVIDIA#4357) * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix logging warning Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> * Refactor bias act fusion Signed-off-by: MaximumEntropy <[email protected]> * Update NMT config Signed-off-by: MaximumEntropy <[email protected]> * Fix electronic bug, new time ITN rule (NVIDIA#4355) * fix electronic bug Signed-off-by: ekmb <[email protected]> * add new itn time rule Signed-off-by: ekmb <[email protected]> * revert domain changes Signed-off-by: ekmb <[email protected]> * remove repetition Signed-off-by: ekmb <[email protected]> * Update ci tests Signed-off-by: MaximumEntropy <[email protected]> * Correct support for dataclasses in default module dim (NVIDIA#4372) * Correct support for dataclasses in default module dim Signed-off-by: smajumdar <[email protected]> * Fix path for save of results Signed-off-by: smajumdar <[email protected]> * fix pad id bug (NVIDIA#4377) Signed-off-by: Yi Dong <[email protected]> * Question answering bug fix (NVIDIA#4381) * refactor dialogue state tracking for modelling/dataset interoperability Signed-off-by: Zhilin Wang <[email protected]> * fix style changes Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * fix style raised by lgtm Signed-off-by: Zhilin Wang <[email protected]> * fix style formatting Signed-off-by: Zhilin Wang <[email protected]> * update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]> * add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins 
Signed-off-by: Zhilin Wang <[email protected]> * remove dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]> * update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]> * rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * fix style Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add docstrings for assistant data processsor Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins for SGDGEN local checkpoint Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: Zhilin Wang <[email protected]> * use local vocab file for Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * patch for Jenkins CI using local file Signed-off-by: Zhilin Wang <[email protected]> * add slot filling prediction and metrics Signed-off-by: Zhilin Wang <[email protected]> * remove unused code Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * refactor metrics code out of Dialogue GPT Model Signed-off-by: Zhilin Wang <[email protected]> * integrate backward compatible support for IntentSlotClassificationModel (bert model) Signed-off-by: Zhilin Wang <[email protected]> * save prediction file for IntentSlotClassification Signed-off-by: Zhilin Wang <[email 
protected]> * update dialogue gpt model training for megatron gpt Signed-off-by: Zhilin Wang <[email protected]> * remove batch generate for HF GPT2, which causes lower performance Signed-off-by: Zhilin Wang <[email protected]> * add few shot capability to dialogue gpt model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile and remove unused import Signed-off-by: Zhilin Wang <[email protected]> * update code description and clarity Signed-off-by: Zhilin Wang <[email protected]> * address PR comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate compatibility with ZeroShotIntentModel Signed-off-by: Zhilin Wang <[email protected]> * rename folder to dialogue due to increased scope and further refactor for clarity Signed-off-by: Zhilin Wang <[email protected]> * added dialogue GPT for sequence generation task (e.g. answer extender) Signed-off-by: Zhilin Wang <[email protected]> * add CI test for DialogueGPTGenerationModel Signed-off-by: Zhilin Wang <[email protected]> * integrate DialogueS2SGenerationModel for generation task (e.g. 
answer extender) Signed-off-by: Zhilin Wang <[email protected]> * modify huggingface utils to support HF t5/BART models Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update bleu metric Signed-off-by: Zhilin Wang <[email protected]> * fix bleu metric style Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * update based on PR NVIDIA#3893 Signed-off-by: Zhilin Wang <[email protected]> * update 2 based on PR NVIDIA#3893 Signed-off-by: Zhilin Wang <[email protected]> * update 3 based on PR NVIDIA#3893 Signed-off-by: Zhilin Wang <[email protected]> * integrate sgd generation based on user user utterance and system slot-values to generate system utterance Signed-off-by: Zhilin Wang <[email protected]> * add validation model saving capabilities Signed-off-by: Zhilin Wang <[email protected]> * cleaned up code for SGD Based Answer extender Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue Generation CI Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * fix Jenkins CI issue" Signed-off-by: Zhilin Wang <[email protected]> * add support for design dataset Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary imports Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins 
Signed-off-by: Zhilin Wang <[email protected]> * support megatron for dialogue_s2s_generation_model Signed-off-by: Zhilin Wang <[email protected]> * reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update CI Signed-off-by: Zhilin Wang <[email protected]> * update checkpoint and predictions filename to include epoch number Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate HF BART MNLI into zero shot intent model Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Nearest Neighbour Model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * refactor Dialogue SGD Data Processor to make interface for models cleaner Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue S2S Generation model for DialogueSGDDataProcessor interface Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support sgd and drive thru datasets by zero shot model and nearest neighbour model Signed-off-by: Zhilin Wang <[email protected]> * add prediction saving code to nearest neighbour and zero shot intent models Signed-off-by: Zhilin Wang <[email protected]> * fix typo in sgd data processor Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Mellon QA Data Processor Signed-off-by: Zhilin Wang <[email protected]> * update mellon qa Signed-off-by: Zhilin Wang <[email protected]> * update dialogue.py to remove outdated info Signed-off-by: Zhilin Wang <[email protected]> * style fix 
Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * add dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * address review comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix for cfg Signed-off-by: Zhilin Wang <[email protected]> * make dependency on apex optional Signed-off-by: Zhilin Wang <[email protected]> * change NLPDDPluggin calling logic to make it possible to run without apex Signed-off-by: Zhilin Wang <[email protected]> * add first draft of tutorial Signed-off-by: Zhilin Wang <[email protected]> * reduce ms marco size by removing lines without wellFormedAnswers Signed-off-by: Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update colab tutorial link in dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * include unit test and some refactor to facilitate unit test Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * address pr issues Signed-off-by: Zhilin Wang <[email protected]> * remove typos in dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * support larger files for question answering Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary artifacts to reduce memory use Signed-off-by: Zhilin Wang <[email 
protected]> * put 0 tensor to device Signed-off-by: Zhilin Wang <[email protected]> * update link within dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * restore previously delete files Signed-off-by: Zhilin Wang <[email protected]> * update error handling when loss = nan Signed-off-by: Zhilin Wang <[email protected]> * update nan handling Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss func Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss Signed-off-by: Zhilin Wang <[email protected]> * fix type error raised in qa_dataset.py Signed-off-by: Zhilin Wang <[email protected]> * add error checking message Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update exp logging Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * limit number of negative samples Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email 
protected]> * remove unused methods and style fix Signed-off-by: Zhilin Wang <[email protected]> * add more documentation Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * changes base on PR review Signed-off-by: Zhilin Wang <[email protected]> * set wandb logger falseby default Signed-off-by: Zhilin Wang <[email protected]> * style fix * style fix * correct typo * style fix * style fix Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> * Fix ASR Typos in tutorials (NVIDIA#4384) * Fix typos Signed-off-by: smajumdar <[email protected]> * Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. (NVIDIA#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> (cherry picked from commit 0322b15) Co-authored-by: Travis Bartley <[email protected]> * Add Docs for NeMo Adapters (NVIDIA#4369) Signed-off-by: smajumdar <[email protected]> * Update NeMo docs (NVIDIA#4397) Signed-off-by: smajumdar <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Punctuation and capitalization tests race condition (NVIDIA#4399) * Add draft of race condition fixes Signed-off-by: PeganovAnton <[email protected]> * Minor improvements Signed-off-by: PeganovAnton <[email protected]> * More race condition fixes Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * bias act fusion changes Signed-off-by: MaximumEntropy <[email protected]> * Address comments Signed-off-by: 
MaximumEntropy <[email protected]> * Fix geglu without fusion Signed-off-by: MaximumEntropy <[email protected]> * Reset files to main Signed-off-by: MaximumEntropy <[email protected]> * Remove hidden blocks Signed-off-by: MaximumEntropy <[email protected]> * Fix style Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Abhinav Khattar <[email protected]> Co-authored-by: ericharper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Co-authored-by: PeganovAnton <[email protected]> Signed-off-by: David Mosallanezhad <[email protected]>
Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * add dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * address review comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix for cfg Signed-off-by: Zhilin Wang <[email protected]> * make dependency on apex optional Signed-off-by: Zhilin Wang <[email protected]> * change NLPDDPluggin calling logic to make it possible to run without apex Signed-off-by: Zhilin Wang <[email protected]> * add first draft of tutorial Signed-off-by: Zhilin Wang <[email protected]> * reduce ms marco size by removing lines without wellFormedAnswers Signed-off-by: Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update colab tutorial link in dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * include unit test and some refactor to facilitate unit test Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * address pr issues Signed-off-by: Zhilin Wang <[email protected]> * remove typos in dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * support larger files for question answering Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary artifacts to reduce memory use Signed-off-by: Zhilin Wang <[email 
protected]> * put 0 tensor to device Signed-off-by: Zhilin Wang <[email protected]> * update link within dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * restore previously delete files Signed-off-by: Zhilin Wang <[email protected]> * update error handling when loss = nan Signed-off-by: Zhilin Wang <[email protected]> * update nan handling Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss func Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss Signed-off-by: Zhilin Wang <[email protected]> * fix type error raised in qa_dataset.py Signed-off-by: Zhilin Wang <[email protected]> * add error checking message Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update exp logging Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * limit number of negative samples Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email 
protected]> * remove unused methods and style fix Signed-off-by: Zhilin Wang <[email protected]> * add more documentation Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * changes base on PR review Signed-off-by: Zhilin Wang <[email protected]> * set wandb logger falseby default Signed-off-by: Zhilin Wang <[email protected]> * style fix * style fix * correct typo * style fix * style fix Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> * Fix ASR Typos in tutorials (NVIDIA#4384) * Fix typos Signed-off-by: smajumdar <[email protected]> * Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. (NVIDIA#4383) Signed-off-by: tbartley94 <[email protected]> Co-authored-by: tbartley94 <[email protected]> (cherry picked from commit 0322b15) Co-authored-by: Travis Bartley <[email protected]> * Add Docs for NeMo Adapters (NVIDIA#4369) Signed-off-by: smajumdar <[email protected]> * Update NeMo docs (NVIDIA#4397) Signed-off-by: smajumdar <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Punctuation and capitalization tests race condition (NVIDIA#4399) * Add draft of race condition fixes Signed-off-by: PeganovAnton <[email protected]> * Minor improvements Signed-off-by: PeganovAnton <[email protected]> * More race condition fixes Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * Improve error message Signed-off-by: PeganovAnton <[email protected]> * bias act fusion changes Signed-off-by: MaximumEntropy <[email protected]> * Address comments Signed-off-by: 
MaximumEntropy <[email protected]> * Fix geglu without fusion Signed-off-by: MaximumEntropy <[email protected]> * Reset files to main Signed-off-by: MaximumEntropy <[email protected]> * Remove hidden blocks Signed-off-by: MaximumEntropy <[email protected]> * Fix style Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Abhinav Khattar <[email protected]> Co-authored-by: ericharper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Travis Bartley <[email protected]> Co-authored-by: PeganovAnton <[email protected]> Signed-off-by: Hainan Xu <[email protected]>
What does this PR do?
Implements a Megatron-based Perceiver encoder that supports tensor parallelism only.
Collection: NLP
Changelog
Add the megatron_perceiver_encoders.py file and the corresponding configuration options in the YAML files.
Usage
Set the following in aayn_base_megatron.yaml or megatron_t5_config.yaml:
encoder_arch: perceiver
hidden_steps: 32
num_self_attention_per_cross_attention: 2
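The same three settings could also be passed as command-line overrides instead of editing the YAML file. The snippet below is a minimal sketch of building such override strings. The helper name `perceiver_overrides` is hypothetical, and the `model` prefix is an assumption about where these keys sit in the config, so adjust it to match the actual YAML layout.

```python
def perceiver_overrides(hidden_steps=32,
                        num_self_attention_per_cross_attention=2,
                        prefix="model"):
    """Build key=value override strings for the perceiver encoder settings.

    `prefix` is an assumed config section name, not confirmed by this PR;
    pass an empty string if the keys live at the top level.
    """
    if hidden_steps <= 0:
        raise ValueError("hidden_steps must be a positive integer")
    if num_self_attention_per_cross_attention < 0:
        raise ValueError(
            "num_self_attention_per_cross_attention must be non-negative"
        )

    # The three keys listed in the Usage section of this PR.
    settings = {
        "encoder_arch": "perceiver",
        "hidden_steps": hidden_steps,
        "num_self_attention_per_cross_attention":
            num_self_attention_per_cross_attention,
    }
    return [
        f"{prefix}.{key}={value}" if prefix else f"{key}={value}"
        for key, value in settings.items()
    ]


if __name__ == "__main__":
    # Print one override per line, ready to append to a training command.
    for override in perceiver_overrides():
        print(override)
```

Keeping the values as function arguments makes it easy to sweep `hidden_steps` without touching the config file.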
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.
Additional Information