
Fix adapters and ptuning for amp O2 #7285

Merged
arendu merged 14 commits into main from guyueh1/fix_adapters_and_ptuning_for_ampO2_rebased on Aug 22, 2023

Conversation

@guyueh1 (Contributor) commented on Aug 21, 2023

What does this PR do ?

Fixes issues that arise when using megatron_amp_O2 with PEFT models.

Collection: NLP

Changelog

  • Cast adapters to fp16/bf16 when the base model uses megatron_amp_O2. Under amp O2 the base model's parameters are kept in fp16/bf16, so newly added adapter parameters must match. The previous PR Add precision and megatron_amp_O2 configs to adapters #7232 handled the precision of ColumnParallelLinear and RowParallelLinear in PEFT models; this PR handles the precision casting of the remaining modules, such as layernorm (a minimal sketch follows this list).
  • Explicitly set the p-tuning inference table to untrainable for FP16/BF16 models. The previous whitelist only contained the inference table's parameter name as it appears in an FP32 model; this PR also adds its name as it appears in FP16/BF16 models.
  • Explicitly set the p-tuning inference table to untrainable in AdapterPTuningModel, matching the existing behavior of PTuningModel.
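The following is a minimal, self-contained sketch (plain PyTorch, not the NeMo API) of the two ideas above: cast a freshly instantiated adapter, including its layernorm, to the base model's half precision when megatron_amp_O2 is enabled, and keep the p-tuning inference table untrainable. Names such as `Adapter` and `inference_table` are illustrative stand-ins, not NeMo identifiers.

```python
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Toy bottleneck adapter with a LayerNorm, standing in for a PEFT adapter module."""

    def __init__(self, hidden: int, bottleneck: int):
        super().__init__()
        self.layer_norm = nn.LayerNorm(hidden)
        self.down = nn.Linear(hidden, bottleneck, bias=False)
        self.up = nn.Linear(bottleneck, hidden, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(self.layer_norm(x))))


def build_adapter(hidden: int, bottleneck: int, megatron_amp_O2: bool, precision: str) -> Adapter:
    adapter = Adapter(hidden, bottleneck)
    if megatron_amp_O2:
        # With amp O2 the base model's parameters are already fp16/bf16, so the adapter
        # (its linear layers *and* its layernorm) must be cast to the same dtype right
        # after instantiation, otherwise mixed-dtype matmuls fail at runtime.
        dtype = torch.bfloat16 if precision == "bf16" else torch.float16
        adapter = adapter.to(dtype)
    return adapter


# Freezing the p-tuning inference table: keep it as a parameter (so it is saved/restored
# with the state dict) but exclude it from gradient updates, regardless of model precision.
inference_table = nn.Parameter(torch.empty(10, 1024, dtype=torch.bfloat16), requires_grad=False)

adapter = build_adapter(hidden=1024, bottleneck=32, megatron_amp_O2=True, precision="bf16")
print(next(adapter.parameters()).dtype)  # torch.bfloat16
```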

Usage

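A hedged usage sketch, expressed as an OmegaConf config fragment rather than a full training command. The key names (trainer.precision, model.megatron_amp_O2, model.peft.peft_scheme) follow the NeMo Megatron GPT PEFT tuning configs, but exact names and defaults may differ between NeMo versions.

```python
from omegaconf import OmegaConf

# Hedged config fragment: enable O2-style half precision for the base model while
# training a PEFT method; key names are assumptions based on the NeMo PEFT configs.
overrides = OmegaConf.create(
    {
        "trainer": {"precision": "bf16"},          # train in bf16 (or 16 for fp16)
        "model": {
            "megatron_amp_O2": True,               # keep base-model weights in half precision
            "peft": {"peft_scheme": "ptuning"},    # or "lora" / "adapter"
        },
    }
)
print(OmegaConf.to_yaml(overrides))
```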

Before your PR is "Ready for review"

Pre checks:

  • Make sure you have read and followed the Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (e.g. Numba, Pynini, Apex)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs to various areas.

Additional Information

* Under megatron_amp_O2, transform the adapter modules to low precision after instantiation

Signed-off-by: Guyue Huang <[email protected]>

Conflicts:
	nemo/collections/nlp/modules/common/megatron/adapters/parallel_adapters.py
* Fix the first_stage_of_pipeline detection for half models
* Fix the freezing of InferenceTable for half models

Signed-off-by: Guyue Huang <[email protected]>
* When unfreezing adapters, we explicitly set inference embedding table
in prompt encoder to be untrainable.

Signed-off-by: Guyue Huang <[email protected]>
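The "first_stage_of_pipeline detection for half models" and "freezing of InferenceTable for half models" fixes listed above both stem from the same issue: under megatron_amp_O2 the language model is wrapped in a half-precision wrapper, so code that probes attributes on the bare model misses the extra `.module` level, and parameter names gain a `module.` prefix (which is why the freeze whitelist needs both spellings). Below is a hedged sketch of the detection side; `HalfWrapper`, `unwrap`, and `ToyFirstStage` are stand-ins, not the actual NeMo/Megatron classes.

```python
import torch.nn as nn


class HalfWrapper(nn.Module):
    """Stand-in for the fp16/bf16 wrapper that amp O2 puts around the real model."""

    def __init__(self, module: nn.Module):
        super().__init__()
        self.module = module  # parameter names now carry a "module." prefix


def unwrap(model: nn.Module) -> nn.Module:
    # Peel off wrapper levels until we reach the module that actually owns the embeddings.
    while isinstance(getattr(model, "module", None), nn.Module):
        model = model.module
    return model


def first_stage_of_pipeline(model: nn.Module) -> bool:
    # Probe the unwrapped module, so the check works for both fp32 and "half" models.
    return hasattr(unwrap(model), "word_embeddings")


class ToyFirstStage(nn.Module):
    def __init__(self):
        super().__init__()
        self.word_embeddings = nn.Embedding(100, 16)


print(first_stage_of_pipeline(ToyFirstStage()))               # True
print(first_stage_of_pipeline(HalfWrapper(ToyFirstStage())))  # True: wrapper is unwrapped first
```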
github-actions bot added the core (Changes to NeMo Core) and NLP labels on Aug 21, 2023
guyueh1 marked this pull request as ready for review on August 21, 2023 19:10
github-actions bot removed the core (Changes to NeMo Core) label on Aug 22, 2023
@arendu (Collaborator) left a comment


LGTM!!

arendu merged commit 69af78c into main on Aug 22, 2023
15 checks passed
arendu deleted the guyueh1/fix_adapters_and_ptuning_for_ampO2_rebased branch on August 22, 2023 14:52
styagi130 pushed a commit to styagi130/NeMo that referenced this pull request Aug 23, 2023
* Transform adapter modules to fp16/bf16 under amp_O2

* Under megatron_amp_O2, transform the adapter modules to low precision after instantiation

Signed-off-by: Guyue Huang <[email protected]>

Conflicts:
	nemo/collections/nlp/modules/common/megatron/adapters/parallel_adapters.py

* Fix ptuning under amp O2

* Fix the first_stage_of_pipeline detection for half models
* Fix the freezing of InferenceTable for half models

Signed-off-by: Guyue Huang <[email protected]>

* Fix MegatronGPTAdapterPTuningModel

* When unfreezing adapters, we explicitly set inference embedding table
in prompt encoder to be untrainable.

Signed-off-by: Guyue Huang <[email protected]>

* Add comments for feature explanation

Signed-off-by: Guyue Huang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix ptuning and lora model_parallel_config

Signed-off-by: jasonwan <[email protected]>

* Put the casting of adapters in their instantiation

Signed-off-by: Guyue Huang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* small fix for state dict

Signed-off-by: jasonwan <[email protected]>

* optional model_parallel_config

Signed-off-by: jasonwan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Guyue Huang <[email protected]>
Signed-off-by: jasonwan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: jasonwan <[email protected]>

Signed-off-by: Siddharth Tyagi <[email protected]>
dorotat-nv pushed a commit to dorotat-nv/NeMo that referenced this pull request Aug 24, 2023
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
Labels: NLP
Projects: None yet
Development: Successfully merging this pull request may close these issues. None yet.
3 participants