
Adds several configurable flags for Megatron GPT models #5991

Merged: 33 commits merged into main from sandeepsub/megatron_lm_gpt_compat on Feb 18, 2023

Conversation

@MaximumEntropy (Contributor) commented Feb 10, 2023

What does this PR do?

This PR adds the following functionality to Megatron GPT models:

  1. Disable biases.
  2. Add different activation functions: geglu, swiglu, reglu, squared ReLU (sketched below).
  3. Add different transformer layer configurations: pre-LN, post-LN, NormFormer.
  4. Add rotary position embeddings (RoPE).
  5. Allow disabling both hidden and attention dropout.
  6. Add RMSNorm normalization.
  7. Untie the embedding and output layer.
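For reference, the GLU-variant activations in item 2 admit a compact definition. The following PyTorch sketch is illustrative only (it is not the NeMo implementation, which may fuse these operations); each GLU variant splits the feed-forward up-projection output in half along the last dimension and gates one half with the other.

    # Illustrative definitions only, not the NeMo kernels.
    import torch
    import torch.nn.functional as F

    def geglu(x: torch.Tensor) -> torch.Tensor:
        a, b = x.chunk(2, dim=-1)  # split the up-projection output in half
        return F.gelu(a) * b

    def swiglu(x: torch.Tensor) -> torch.Tensor:
        a, b = x.chunk(2, dim=-1)
        return F.silu(a) * b  # SiLU(a) gates b

    def reglu(x: torch.Tensor) -> torch.Tensor:
        a, b = x.chunk(2, dim=-1)
        return F.relu(a) * b

    def squared_relu(x: torch.Tensor) -> torch.Tensor:
        return F.relu(x) ** 2  # no gating; elementwise ReLU squared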

Collection: NLP

Changelog

  • Add config flags controlling biases, GLU-variant activations, transformer block type (pre-LN/post-LN/NormFormer), rotary position embeddings, hidden/attention dropout, normalization (LayerNorm/RMSNorm), and embedding/output-weight tying.
  • Move the word-embedding parameter-count adjustment from the last to the first pipeline stage so it also works with untied embeddings.
  • Add a CI test exercising the new flags.

Usage

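A usage sketch is shown below. The config keys are illustrative and should be confirmed against the YAML configs touched in this PR (for example, the rotary flag discussed in review as "rotary_percent(age)"); this is a sketch, not the authoritative schema.

    # Hypothetical usage sketch; key names are illustrative.
    from omegaconf import OmegaConf

    cfg = OmegaConf.load("examples/nlp/language_modeling/conf/megatron_gpt_config.yaml")

    cfg.model.bias = False                       # 1. disable biases
    cfg.model.activation = "swiglu"              # 2. or geglu / reglu / squared relu
    cfg.model.transformer_block_type = "pre_ln"  # 3. or post_ln / normformer
    cfg.model.position_embedding_type = "rope"   # 4. rotary position embeddings
    cfg.model.rotary_percentage = 1.0            #    fraction of head dim rotated
    cfg.model.hidden_dropout = 0.0               # 5. disable hidden dropout...
    cfg.model.attention_dropout = 0.0            #    ...and attention dropout
    cfg.model.normalization = "rmsnorm"          # 6. RMSNorm instead of LayerNorm
    cfg.model.share_embeddings_and_output_weights = False  # 7. untie embeddings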

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (e.g., Numba, Pynini, Apex)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

  • Related to # (issue)

@github-actions github-actions bot added the NLP label Feb 10, 2023
@MaximumEntropy MaximumEntropy marked this pull request as draft February 10, 2023 18:51
ericharper previously approved these changes Feb 10, 2023

@ericharper (Collaborator) left a comment:

LGTM. Thanks!

@MaximumEntropy MaximumEntropy marked this pull request as ready for review February 13, 2023 22:52
@github-actions github-actions bot added the CI label Feb 13, 2023
@khcs (Collaborator) left a comment:

Looks good as unified config flags. Only a minor comment on flag-naming consistency regarding "rotary_percent(age)". Thanks!

Two review threads on the Jenkinsfile were marked outdated and resolved.
@khcs khcs self-requested a review February 15, 2023 23:49
khcs previously approved these changes Feb 15, 2023

@khcs (Collaborator) left a comment:

Thanks for integrating the config changes and also adding the CI test, Sandeep!

@@ -508,16 +508,15 @@ def _get_total_params_across_model_parallel_groups_gpt_bert(self, model):
             num_parameters_on_device = sum(
                 [sum([p.nelement() for p in model_module.parameters()]) for model_module in model]
             )
-            if parallel_state.get_pipeline_model_parallel_world_size() > 1 and parallel_state.is_pipeline_last_stage(
+            if parallel_state.get_pipeline_model_parallel_world_size() > 1 and parallel_state.is_pipeline_first_stage(

A collaborator commented:

Should this be changed?

@MaximumEntropy (Contributor, Author) replied:

Yeah, because word embeddings are no longer present in the last pipeline stage if you untie the embeddings and output weights. But they should always be present in the first pipeline stage. Same for the comment below.

                 ignore_virtual=True
             ):
                 # subtract the embedding weights on the last virtual stage
                 num_word_embedding_parameters = sum([p.nelement() for p in model[-1].word_embeddings_weight()])
                 num_parameters_on_device -= num_word_embedding_parameters
         else:
             num_parameters_on_device = sum([p.nelement() for p in model.parameters()])

-        if parallel_state.get_pipeline_model_parallel_world_size() > 1 and parallel_state.is_pipeline_last_stage(
+        if parallel_state.get_pipeline_model_parallel_world_size() > 1 and parallel_state.is_pipeline_first_stage(

A collaborator commented:

Should this be changed?
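To make the intent of the change concrete, here is a minimal standalone sketch (not the NeMo implementation; names are illustrative): the duplicated word-embedding copy is subtracted on the first pipeline stage, which owns the embeddings whether or not they are tied to the output layer.

    # Illustrative sketch of the parameter accounting discussed above.
    def adjusted_param_count(stage_params: int,
                             word_embedding_params: int,
                             pp_world_size: int,
                             is_first_stage: bool) -> int:
        total = stage_params
        if pp_world_size > 1 and is_first_stage:
            # Subtract the embedding weights so the copy shared across
            # pipeline stages is only counted once in the global total.
            total -= word_embedding_params
        return total

    # Example: 2 pipeline stages with 10M params each, 1M of which are
    # word embeddings replicated on the first and last stage when tied.
    print(adjusted_param_count(10_000_000, 1_000_000, 2, True))   # 9000000
    print(adjusted_param_count(10_000_000, 1_000_000, 2, False))  # 10000000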

Commit "Update Jenkinsfile": changed the optimizer for GPT training from 'fused_adam' to 'distributed_fused_adam'. (Signed-off-by: khcs <[email protected]>)
@okuchaiev (Member) commented:

@khcs and @ericharper is this good to merge now?

@ericharper (Collaborator) replied:

> @khcs and @ericharper is this good to merge now?

No, the CI test with pp=2 is failing.

@ericharper (Collaborator) replied:

> No, the CI test with pp=2 is failing.

Update: it looks like we found the issue. Waiting to see if CI passes now.

@khcs (Collaborator) left a comment:

Looks good, and all CI tests are passing now! Thanks all!

@MaximumEntropy MaximumEntropy merged commit 4a56631 into main Feb 18, 2023
@MaximumEntropy MaximumEntropy deleted the sandeepsub/megatron_lm_gpt_compat branch February 18, 2023 05:13
MaximumEntropy added a commit that referenced this pull request Mar 4, 2023
* Initial
* Multiple fixes
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* Fix
* Add to CI test
* check position embs for gpt prompt learning
* Update args
* Disable tts unit test
* Empty
* Update Jenkinsfile (changed optimizer for GPT training from 'fused_adam' to 'distributed_fused_adam')
* update config to use correct key
* revert Jenkinsfile back to fused_adam

Signed-off-by: MaximumEntropy <[email protected]>
Signed-off-by: Adi Renduchintala <[email protected]>
Signed-off-by: khcs <[email protected]>
Signed-off-by: ericharper <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <[email protected]>
Co-authored-by: khcs <[email protected]>
Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: ericharper <[email protected]>
@hiyijian commented Mar 8, 2023:

It seems that RoPE cannot work with sequence_parallel.

titu1994 pushed a commit to titu1994/NeMo that referenced this pull request Mar 24, 2023
ericharper added a commit that referenced this pull request Apr 6, 2023
* copy from sft_from_gpt
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* Changed tokenization and example
* maybe remove (got from upstream)
* Eval metrics while finetuning
* Add missing args
* Add arg
* Wrap in try except
* Try fix
* Add separate validation and test batch sizes
* Add assert
* Fix checkpoint name
* Explicit sampling args
* Update t0 script
* Add niv2 script
* Change workers
* Fix labels
* Ignore download
* Minor fixes
* Add dist opt support
* Allow skipping validation
* Fix tokenization and padding to max batch
* Adds several configurable flags for Megatron GPT models (#5991)
* Fast glu activations (#6058): fast glu activations; clean up activation list
* Explicitly check for united embeddings when logging params (#6085)
* Option for model extracted dir
* Add index mapping dir
* Assistant prompt
* Remove ipdb
* Override dropout
* Change sampler
* Roll back again
* Revert TTS; Reset TTS; Revert further; Revert more to main
* Fix Test DS
* Address PR comments
* Add the option to provide a prompt template via fstrings
* Add CI test; fix CI test
* Fix workers issue

Signed-off-by: MaximumEntropy <[email protected]>
Signed-off-by: Adi Renduchintala <[email protected]>
Signed-off-by: khcs <[email protected]>
Signed-off-by: ericharper <[email protected]>
Co-authored-by: soares-f <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adi Renduchintala <[email protected]>
Co-authored-by: khcs <[email protected]>
Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: ericharper <[email protected]>
hsiehjackson pushed a commit to hsiehjackson/NeMo that referenced this pull request Jun 2, 2023
hsiehjackson pushed a commit to hsiehjackson/NeMo that referenced this pull request Jun 2, 2023
Labels: NLP, CI
6 participants