Save model parallel .nemo in ExpManager #6115
@titu1994 I added the pleasefixme for the test.
LGTM. Thanks!
Looks correct, minor modification below
LGTM. I have a PR ready to remove the pleasefixme for RNNT.
* patch to allow using tokenizers without additional_special_tokens_ids attribute Signed-off-by: arendu <[email protected]>
* save tp pp > 1 .nemo in exp manager Signed-off-by: arendu <[email protected]>
* Better rank checking for model parallel > 1 .nemo saving Signed-off-by: MaximumEntropy <[email protected]>
* Safety check Signed-off-by: MaximumEntropy <[email protected]>
* check for nlp model Signed-off-by: arendu <[email protected]>
* custom on save checkpoint for NLPModel Signed-off-by: arendu <[email protected]>
* minor update Signed-off-by: arendu <[email protected]>
* minor updates Signed-off-by: arendu <[email protected]>
* reverting custom save logic Signed-off-by: arendu <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* reverting custom save logic Signed-off-by: arendu <[email protected]>
* reverting custom save logic Signed-off-by: arendu <[email protected]>
* reverting custom save logic Signed-off-by: arendu <[email protected]>
* reverting custom save logic Signed-off-by: arendu <[email protected]>
* reverting custom save logic Signed-off-by: arendu <[email protected]>
* added pleasefixme Signed-off-by: arendu <[email protected]>
* updated Signed-off-by: arendu <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
---------
Signed-off-by: arendu <[email protected]>
Signed-off-by: MaximumEntropy <[email protected]>
Co-authored-by: MaximumEntropy <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
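What does this PR do?
With this PR, save_best_model and always_save_nemo work when the tensor-parallel (TP) or pipeline-parallel (PP) size is greater than 1, so ExpManager can save model-parallel .nemo files.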
Collection: [NLP,ASR]
Changelog
Usage
# Add a code snippet demonstrating how to use this
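The PR description leaves the usage snippet unfilled. As a minimal sketch only, the following shows how the two options named above might be enabled in an ExpManager-style checkpoint configuration for a TP/PP > 1 run. The helper `build_exp_manager_cfg` is hypothetical and not part of NeMo; the `model_parallel_size` key and its TP × PP value are an assumption about how the config conveys the parallel layout, and the remaining values are placeholders.

```python
from typing import Any, Dict


def build_exp_manager_cfg(tensor_parallel: int, pipeline_parallel: int) -> Dict[str, Any]:
    """Build an exp_manager-style config dict (hypothetical helper, not a NeMo API)."""
    if tensor_parallel < 1 or pipeline_parallel < 1:
        raise ValueError("parallel sizes must be positive integers")
    return {
        "create_checkpoint_callback": True,
        "checkpoint_callback_params": {
            "monitor": "val_loss",  # placeholder metric name
            "mode": "min",
            "save_top_k": 3,  # placeholder value
            # The two options this PR makes usable with TP/PP > 1:
            "save_best_model": True,
            "always_save_nemo": True,
            # Assumption: the model-parallel size is passed as TP * PP.
            "model_parallel_size": tensor_parallel * pipeline_parallel,
        },
    }


if __name__ == "__main__":
    cfg = build_exp_manager_cfg(tensor_parallel=2, pipeline_parallel=2)
    print(cfg["checkpoint_callback_params"]["model_parallel_size"])  # prints 4
```

In practice this dictionary would be merged into the experiment's config under its exp_manager section; check the config files shipped with the model being trained for the exact keys.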