Option to remove bias terms from Megatron transformers #3973
Conversation
Signed-off-by: MaximumEntropy <[email protected]>
This pull request introduces 1 alert when merging a53d663 into ef59a86 - view on LGTM.com new alerts:
This pull request introduces 1 alert when merging 9784ec0 into ef59a86 - view on LGTM.com new alerts:
This pull request introduces 4 alerts when merging d506340 into ef59a86 - view on LGTM.com new alerts:
This pull request introduces 4 alerts when merging ce455f6 into 1a0575b - view on LGTM.com new alerts:
This pull request introduces 4 alerts when merging 77d75d0 into d4408cc - view on LGTM.com new alerts:
This pull request introduces 4 alerts when merging c9ffe41 into 0be1e94 - view on LGTM.com new alerts:
LGTM! See my minor comment about the two config lines (if they are not independent, consider using one?).
@@ -69,6 +69,8 @@ model:
gradient_as_bucket_view: True # Allocate gradients in a contiguous bucket to save memory (less fragmentation and buffer memory)
bias_gelu_fusion: True # Use a kernel that fuses the bias addition from weight matrices with the subsequent gelu activation.
masked_softmax_fusion: True # Use a kernel that fuses the attention softmax with its mask.
bias_dropout_add_fusion: True # Use a kernel that fuses the bias addition, dropout and residual connection addition.
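A minimal sketch of what the two bias fusions in this config compute, written as unfused plain-PyTorch reference functions (the function names here are illustrative, not NeMo's actual kernel names). It also shows why removing the bias terms interacts with these flags: with no bias, there is nothing for the fused kernels to add.

```python
import torch
import torch.nn.functional as F

def bias_gelu(x, bias):
    # Unfused reference for bias_gelu_fusion: add the weight-matrix bias,
    # then apply the GELU activation.
    return F.gelu(x + bias)

def bias_dropout_add(x, bias, residual, p, training):
    # Unfused reference for bias_dropout_add_fusion: add the bias, apply
    # dropout, then add the residual connection.
    return residual + F.dropout(x + bias, p=p, training=training)

x = torch.randn(2, 4)
b = torch.zeros(4)          # a removed bias behaves like a zero bias
r = torch.randn(2, 4)

# With a zero (i.e. removed) bias and p=0, both reduce to the plain ops.
out = bias_dropout_add(x, b, r, p=0.0, training=False)
```

When `bias` is disabled, the fused variants degenerate to a plain GELU and a plain dropout-plus-residual, so the corresponding fusion flags would typically be turned off as well.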
Can you set the two independently to either True/False?
This pull request introduces 4 alerts when merging fedb582 into e4ee26b - view on LGTM.com new alerts:
LGTM. Thanks!
This pull request introduces 4 alerts when merging ae39532 into 9005f23 - view on LGTM.com new alerts:
What does this PR do ?
Adds an option to remove bias terms from Megatron transformer weight matrices.
Collection: NLP
Changelog
Usage
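A hedged sketch of how the new option might be set in the model config; the `bias` key name is an assumption based on this PR's title, and since the bias-related fused kernels have no bias to add, they are shown disabled alongside it:

```yaml
model:
  bias: False                     # hypothetical key: drop bias terms from transformer weight matrices
  bias_gelu_fusion: False         # no bias to fuse with the GELU activation
  bias_dropout_add_fusion: False  # no bias to fuse with dropout + residual add
```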
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs to various areas.
Additional Information