mT5 whole word masking and T5 finetuning config fixes #3776

MaximumEntropy · 2022-03-01T07:00:14Z

Signed-off-by: MaximumEntropy [email protected]

What does this PR do?

This PR makes two fixes to T5 before r1.7.0: (a) it sets async grad allreduce to false for bf16 O2, and (b) it undoes the whole word masking logic for sentencepiece.
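As background for fix (b), the following is a minimal sketch of why whole word masking depends on the tokenizer's word-boundary convention. It is not NeMo's implementation and the PR diff is not shown here. The function names `group_wordpiece` and `group_sentencepiece` are hypothetical. The sketch relies only on two general conventions: WordPiece marks continuation pieces with a `##` prefix, while SentencePiece marks the start of a word with `▁` (U+2581).

```python
from typing import List

WORDPIECE_CONTINUATION = "##"  # WordPiece: prefix on non-initial pieces of a word
SPM_WORD_START = "\u2581"      # SentencePiece: "▁" prefix on the first piece of a word


def group_wordpiece(tokens: List[str]) -> List[List[int]]:
    """Group token indices into words using the WordPiece '##' convention."""
    groups: List[List[int]] = []
    for i, tok in enumerate(tokens):
        if tok.startswith(WORDPIECE_CONTINUATION) and groups:
            groups[-1].append(i)  # continuation piece joins the current word
        else:
            groups.append([i])    # any other piece starts a new word
    return groups


def group_sentencepiece(tokens: List[str]) -> List[List[int]]:
    """Group token indices into words using the SentencePiece '▁' convention."""
    groups: List[List[int]] = []
    for i, tok in enumerate(tokens):
        if tok.startswith(SPM_WORD_START) or not groups:
            groups.append([i])    # marked piece (or very first piece) starts a word
        else:
            groups[-1].append(i)  # unmarked piece continues the current word
    return groups


# Hypothetical tokenizations of "unbelievable story", for illustration only.
wordpiece_tokens = ["un", "##believ", "##able", "story"]
sentencepiece_tokens = ["\u2581un", "believ", "able", "\u2581story"]

print(group_wordpiece(wordpiece_tokens))
print(group_sentencepiece(sentencepiece_tokens))
# Applying the WordPiece rule to SentencePiece output finds no '##' markers,
# so every piece is treated as its own word.
print(group_wordpiece(sentencepiece_tokens))
```

The last call shows the mismatch: a grouping rule written for one convention silently degrades to per-token masking on the other, which is one reason masking logic has to be handled per tokenizer type.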

Collection: NLP

Changelog

Add specific line by line info of high level changes in this PR.

Usage

N/A

# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

New Feature
Bugfix
Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

Related to # (issue)


ericharper

LGTM.

* Tn bug 1.7.0 (#3730)
* fix es and fr bug Signed-off-by: Yang Zhang <[email protected]>
* add file Signed-off-by: Yang Zhang <[email protected]>
* [TTS] Fix bugs in E2E TTS, Mixer-TTS and FastPitch (#3740)
* fix bugs Signed-off-by: Oktai Tatanov <[email protected]>
* fix bug in e2e tts and mixer tts Signed-off-by: Oktai Tatanov <[email protected]>
* Mirror AN4 data while servers are down (#3743) Signed-off-by: smajumdar <[email protected]>
* Bugfix for GPT eval (#3744)
* use tokens_cut not tokens Signed-off-by: ericharper <[email protected]>
* remove precision conversion and comment jit for bias gelu Signed-off-by: ericharper <[email protected]>
* revert comment update mbs in config Signed-off-by: ericharper <[email protected]>
* calculate micro_batch_size during complete and compute_logprobs Signed-off-by: ericharper <[email protected]>
* ASR SSL update (#3746)
* ssl update Signed-off-by: sam1373 <[email protected]>
* tutorial update Signed-off-by: sam1373 <[email protected]>
* Fix SSL configs for 1.7 (#3748)
* ssl update Signed-off-by: sam1373 <[email protected]>
* tutorial update Signed-off-by: sam1373 <[email protected]>
* revert configs Signed-off-by: sam1373 <[email protected]>
* revert configs Signed-off-by: sam1373 <[email protected]>
* punct process bug fix (#3747) Signed-off-by: ekmb <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]>
* updated conformer models. (#3741) Signed-off-by: Vahid <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]>
* Yuya/megatron t5 glue eval (#3751)
* Add megatron t5 glue eval-only script Signed-off-by: Yu Yao <[email protected]>
* Update megatron t5 glue eval default configs Signed-off-by: Yu Yao <[email protected]>
* Update megatron t5 glue eval configs Signed-off-by: Yu Yao <[email protected]>
* Update config comments Signed-off-by: Yu Yao <[email protected]> Co-authored-by: Yu Yao <[email protected]>
* Specify gpus in SSL notebook (#3753)
* ssl update Signed-off-by: sam1373 <[email protected]>
* tutorial update Signed-off-by: sam1373 <[email protected]>
* revert configs Signed-off-by: sam1373 <[email protected]>
* revert configs Signed-off-by: sam1373 <[email protected]>
* specify gpus Signed-off-by: sam1373 <[email protected]>
* Duplex model inference fix, money encoder fix (#3754) Signed-off-by: ekmb <[email protected]>
* Update docs for RNNT and overriding fused batch size (#3755) Signed-off-by: smajumdar <[email protected]>
* fix consumed samples calculation + PTune Model bugs (#3738)
* fix the way computing consumed samples Signed-off-by: Yi Dong <[email protected]>
* fixed ptune model Signed-off-by: Yi Dong <[email protected]>
* make sure notebook is working Signed-off-by: Yi Dong <[email protected]>
* added try-catch Signed-off-by: Yi Dong <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Eric Harper <[email protected]>
* fix directories in ssl notebook (#3758)
* ssl update Signed-off-by: sam1373 <[email protected]>
* tutorial update Signed-off-by: sam1373 <[email protected]>
* revert configs Signed-off-by: sam1373 <[email protected]>
* revert configs Signed-off-by: sam1373 <[email protected]>
* specify gpus Signed-off-by: sam1373 <[email protected]>
* update dirs Signed-off-by: sam1373 <[email protected]>
* TN docs update (#3735)
* TN docs update: audio based docs added, quick start, ref fixed, etc Signed-off-by: ekmb <[email protected]>
* add deployment script dir and Sp TN Signed-off-by: ekmb <[email protected]> Co-authored-by: Yang Zhang <[email protected]>
* Update Tacotron2_Training.ipynb (#3769) Signed-off-by: Jason <[email protected]>
* fix dockerfile (#3778) Signed-off-by: Yang Zhang <[email protected]>
* Prompt-Tuning-Documentation (#3777)
* Update megatron.rst
* Updated example prompt tuning script's doc string
* Update megatron.rst
* Update megatron.rst Co-authored-by: Eric Harper <[email protected]>
* Prompt tuning bug fix (#3780)
* Making updated code backwards compatible with previous prompt tuned models Signed-off-by: Virginia Adams <[email protected]>
* Fixed backward compatiablity bug Signed-off-by: Virginia Adams <[email protected]>
* Removed random import Signed-off-by: Virginia Adams <[email protected]> Co-authored-by: Eric Harper <[email protected]>
* update branch Signed-off-by: ericharper <[email protected]>
* revert changes (#3785) Signed-off-by: Yang Zhang <[email protected]>
* Fixed soft prompt eval loading bug (#3805) Signed-off-by: Virginia Adams <[email protected]>
* mT5 whole word masking and T5 finetuning config fixes (#3776)
* O2 and whole word masking changes Signed-off-by: MaximumEntropy <[email protected]>
* Style Signed-off-by: MaximumEntropy <[email protected]>
* Update yaml Signed-off-by: MaximumEntropy <[email protected]>
* Tok and O2 fix Signed-off-by: MaximumEntropy <[email protected]>
* Fix arg passing Signed-off-by: MaximumEntropy <[email protected]>
* Fix checkpoint path Signed-off-by: MaximumEntropy <[email protected]>
* Style fixes Signed-off-by: MaximumEntropy <[email protected]>
* Raise error if FP16 training is tried with O2 recipe. (#3806)
* raise error Signed-off-by: ericharper <[email protected]>
* update assert Signed-off-by: ericharper <[email protected]>
* update error message Signed-off-by: ericharper <[email protected]>
* update error message Signed-off-by: ericharper <[email protected]>
* update branch Signed-off-by: ericharper <[email protected]>
* remove test Signed-off-by: ericharper <[email protected]>
* revert bad merges Signed-off-by: ericharper <[email protected]>
* revert change partitions Signed-off-by: ericharper <[email protected]>

Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: Oktai Tatanov <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Samuel Kriman <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Vahid Noroozi <[email protected]>
Co-authored-by: yaoyu-33 <[email protected]>
Co-authored-by: Yu Yao <[email protected]>
Co-authored-by: Yi Dong <[email protected]>
Co-authored-by: Micha Livne <[email protected]>
Co-authored-by: Jason <[email protected]>
Co-authored-by: Virginia Adams <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
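The commit message above mentions two O2-related rules: this PR (#3776) sets async grad allreduce to false for bf16 O2, and #3806 raises an error if FP16 training is tried with the O2 recipe. The sketch below shows one way such rules could be expressed over a plain dictionary. It is an illustration under assumptions, not the NeMo code: the function name `apply_o2_overrides` and the keys `precision`, `amp_o2`, and `async_grad_allreduce` are hypothetical stand-ins, and the actual config keys and checks in NeMo may differ.

```python
from typing import Any, Dict


def apply_o2_overrides(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of cfg with O2-related settings made consistent.

    All keys used here are hypothetical placeholders for illustration.
    """
    out = dict(cfg)  # shallow copy so the caller's dict is left untouched

    if not out.get("amp_o2", False):
        return out  # nothing to adjust outside the O2 recipe

    if out.get("precision") == 16:
        # Mirrors the rule described in #3806: FP16 plus O2 is rejected.
        raise ValueError("FP16 training is not supported with the O2 recipe.")

    if out.get("precision") == "bf16":
        # Mirrors fix (a) of this PR: disable async grad allreduce for bf16 O2.
        out["async_grad_allreduce"] = False

    return out


settings = {"precision": "bf16", "amp_o2": True, "async_grad_allreduce": True}
print(apply_o2_overrides(settings))
```

Returning a modified copy rather than mutating the input keeps the override easy to test and makes the original settings available for logging.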

O2 and whole word masking changes

c7c7700

Signed-off-by: MaximumEntropy <[email protected]>

MaximumEntropy requested a review from ericharper March 1, 2022 07:00

MaximumEntropy added 4 commits March 1, 2022 10:20

Style

6919d4c

Signed-off-by: MaximumEntropy <[email protected]>

Update yaml

1502a23

Signed-off-by: MaximumEntropy <[email protected]>

Merge branch 't5_masking_and_o2_fixes' of github.com:NVIDIA/NeMo into t5_masking_and_o2_fixes

0f74221

Tok and O2 fix

344ffc2

Signed-off-by: MaximumEntropy <[email protected]>

MaximumEntropy changed the title ~~O2 and whole word masking changes~~ mT5 whole word masking and T5 finetuning config fixes Mar 2, 2022

MaximumEntropy changed the base branch from r1.7.0 to r1.7.1 March 2, 2022 18:28

MaximumEntropy added 7 commits March 2, 2022 10:28

Merge branch 'r1.7.1' into t5_masking_and_o2_fixes

82a34af

Merge branch 'r1.7.1' into t5_masking_and_o2_fixes

96c5daa

Fix arg passing

bc2580a

Signed-off-by: MaximumEntropy <[email protected]>

Merge branch 't5_masking_and_o2_fixes' of github.com:NVIDIA/NeMo into t5_masking_and_o2_fixes

aee1441

Fix checkpoint path

8ac107a

Signed-off-by: MaximumEntropy <[email protected]>

Style fixes

494f8ad

Signed-off-by: MaximumEntropy <[email protected]>

Merge branch 'r1.7.1' into t5_masking_and_o2_fixes

3710ce3

ericharper approved these changes Mar 7, 2022


ericharper merged commit 836d813 into r1.7.1 Mar 7, 2022

ericharper deleted the t5_masking_and_o2_fixes branch March 7, 2022 23:08
