Megatron GPT model finetuning #6210
Conversation
Looks good. Left some comments.
```python
        return model

    def load_from_checkpoint_dir(cls, cfg, trainer, modify_confg_fn):
```
Or we could put this into a utility function. This pattern is used in a lot of other places to load from a checkpoint dir.
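The suggested refactor could look something like the following generic helper. This is only a sketch with hypothetical names (`load_with_modified_config`, the dict-based stand-ins), not NeMo's actual API; the real version would delegate to the model class's restore call.

```python
def load_with_modified_config(cfg, checkpoint_dir, modify_config_fn, restore_fn):
    """Hypothetical helper: apply a config-modifier callback, then delegate to a
    restore function, so each caller doesn't reimplement the checkpoint-dir
    loading boilerplate."""
    modified_cfg = modify_config_fn(cfg)  # e.g. override dropout or batch sizes
    return restore_fn(modified_cfg, checkpoint_dir)


# Usage with dummy stand-ins for the NeMo-specific pieces:
model = load_with_modified_config(
    {"dropout": 0.1},
    "/tmp/ckpt_dir",  # hypothetical path
    lambda c: {**c, "dropout": 0.0},  # config-modifier callback
    lambda c, d: {"cfg": c, "dir": d},  # stand-in for the real restore call
)
```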
```python
        text = self.prompt_template.replace('{input}', original_context).replace('{output}', output)

        if self.separate_prompt_and_response_with_newline and self.prompt_template is None:
            text = context + '\n' + output
```
Should we use user-provided separators?
I think the prompt_template should cover this case, right?
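To make the two formatting paths in the diff concrete, here is a minimal standalone sketch of that logic, with the `self.*` attributes replaced by plain parameters. The final space-separated fallback is an assumption for illustration; it is not shown in the diff.

```python
def format_example(context, output, prompt_template=None,
                   separate_prompt_and_response_with_newline=False):
    """Mirror of the dataset's text-construction logic: a user-supplied
    template with {input}/{output} placeholders takes precedence; otherwise
    fall back to a newline separator (or, assumed here, a space)."""
    if prompt_template is not None:
        # Template path: placeholders are substituted directly.
        return prompt_template.replace('{input}', context).replace('{output}', output)
    if separate_prompt_and_response_with_newline:
        # Newline-separator path from the diff.
        return context + '\n' + output
    # Assumed fallback, not shown in the diff.
    return context + ' ' + output
```

So a template like `"User: {input}\nAssistant: {output}"` already subsumes the newline-separator flag, which supports the comment above.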
```python
        if self.prompt_template is not None:
            import ipdb

            ipdb.set_trace()
```
Remove the debug statement?
```python
from argparse import ArgumentParser
from multiprocessing import Pool

from sacremoses import MosesDetokenizer
```
Is it part of the plan to release the NIv2 and T0 data preprocessing scripts? We would like others to be able to SFT GPT with the same instruction datasets.
LGTM. Thanks!
Squashed commit history:

* copy from sft_from_gpt
* [pre-commit.ci] auto fixes from pre-commit.com hooks (applied repeatedly throughout)
* Changed tokenization and example
* maybe remove (got from upstream)
* Eval metrics while finetuning
* Add missing args; Add arg
* Wrap in try except
* Add separate validation and test batch sizes
* Add assert
* Fix checkpoint name
* Explicit sampling args
* Update t0 script
* Add niv2 script
* Change workers
* Fix labels
* Ignore download
* Minor fixes
* Add dist opt support
* Allow skipping validation
* Fix tokenization and padding to max batch
* Adds several configurable flags for Megatron GPT models (NVIDIA#5991): check position embs for gpt prompt learning; add to CI test; update args; disable tts unit test; change the Jenkinsfile GPT optimizer from 'fused_adam' to 'distributed_fused_adam', then revert it; update config to use the correct key
* Fast glu activations (NVIDIA#6058): fast glu activations; clean up activation list
* Explicitly check for united embeddings when logging params (NVIDIA#6085)
* Option for model extracted dir
* Add index mapping dir
* Assistant prompt
* Remove ipdb
* Override dropout
* Change sampler
* Roll back again; Revert TTS; Reset TTS; Revert further; Revert more to main
* Fix Test DS
* Address PR comments
* Add the option to provide a prompt template via fstrings
* Add CI test; fix CI test
* Fix workers issue; Fix workers
* Numerous small fixes ("Fix", "Try fix", "Minor", "Fix CI") interspersed throughout

Signed-off-by: MaximumEntropy <[email protected]>
Signed-off-by: Adi Renduchintala <[email protected]>
Signed-off-by: khcs <[email protected]>
Signed-off-by: ericharper <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>
Co-authored-by: soares-f <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <[email protected]>
Co-authored-by: khcs <[email protected]>
Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: ericharper <[email protected]>
What does this PR do ?
Adds the ability to fine-tune Megatron GPT Models.
Collection: NLP
Changelog
Usage
# Add a code snippet demonstrating how to use this
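As an illustration, a fine-tuning run might configure its data section along these lines. All key names and paths here are illustrative, assembled from fields discussed in this PR (prompt_template, separate_prompt_and_response_with_newline, index mapping dir, separate validation batch sizes); check the shipped example config for the exact schema.

```yaml
model:
  data:
    train_ds:
      file_names: [/path/to/train.jsonl]   # illustrative path
      prompt_template: "User: {input}\nAssistant: {output}"
      separate_prompt_and_response_with_newline: False  # ignored when prompt_template is set
      index_mapping_dir: /path/to/index_cache  # illustrative
    validation_ds:
      file_names: [/path/to/val.jsonl]
      micro_batch_size: 4  # validation batch size can differ from training
```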
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information