
Option to remove bias terms from Megatron transformers #3973

Merged
MaximumEntropy merged 23 commits from remove_bias_terms into main on Apr 15, 2022

Conversation

MaximumEntropy
Contributor

What does this PR do ?

Adds an option to remove bias terms from Megatron transformer weight matrices.

Collection: NLP

Changelog

Usage

  • Change the YAML config to:
bias: False
bias_gelu_fusion: False
bias_dropout_add_fusion: False
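The three flags above travel together: dropping the bias tensors also removes the operand that the fused bias+GELU and bias+dropout+add kernels consume, so both fusion flags must be disabled alongside `bias`. A minimal sanity-check sketch (the function name and validation logic are hypothetical, not NeMo code):

```python
def validate_bias_config(cfg: dict) -> dict:
    """Hypothetical sanity check for the flag combination shown above.

    With `bias: False` the weight matrices carry no bias tensors, so the
    kernels that fuse a bias addition with GELU or with dropout + residual
    add have nothing to fuse and must be disabled too. Illustrative only,
    not part of NeMo.
    """
    if not cfg.get("bias", True):
        for fused in ("bias_gelu_fusion", "bias_dropout_add_fusion"):
            if cfg.get(fused, True):
                raise ValueError(f"{fused}=True requires bias=True")
    return cfg

validate_bias_config({
    "bias": False,
    "bias_gelu_fusion": False,
    "bias_dropout_add_fusion": False,
})  # passes; flipping either fusion flag to True would raise
```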

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

  • Related to # (issue)

lgtm-com bot commented Apr 12, 2022

This pull request introduces 1 alert when merging a53d663 into ef59a86 - view on LGTM.com

new alerts:

  • 1 for Unused import

Signed-off-by: MaximumEntropy <[email protected]>
lgtm-com bot commented Apr 12, 2022

This pull request introduces 1 alert when merging 9784ec0 into ef59a86 - view on LGTM.com

new alerts:

  • 1 for Unused import

lgtm-com bot commented Apr 12, 2022

This pull request introduces 4 alerts when merging d506340 into ef59a86 - view on LGTM.com

new alerts:

  • 3 for Wrong number of arguments in a call
  • 1 for Unused import

lgtm-com bot commented Apr 13, 2022

This pull request introduces 4 alerts when merging ce455f6 into 1a0575b - view on LGTM.com

new alerts:

  • 3 for Wrong number of arguments in a call
  • 1 for Unused import

Signed-off-by: MaximumEntropy <[email protected]>
lgtm-com bot commented Apr 14, 2022

This pull request introduces 4 alerts when merging 77d75d0 into d4408cc - view on LGTM.com

new alerts:

  • 3 for Wrong number of arguments in a call
  • 1 for Unused import

lgtm-com bot commented Apr 14, 2022

This pull request introduces 4 alerts when merging c9ffe41 into 0be1e94 - view on LGTM.com

new alerts:

  • 3 for Wrong number of arguments in a call
  • 1 for Unused import

@michalivne (Collaborator) previously approved these changes Apr 14, 2022 and left a comment:

LGTM! See minor comment regarding the two config lines (if they are not independent, consider using one?)

@@ -69,6 +69,8 @@ model:
gradient_as_bucket_view: True # Allocate gradients in a contiguous bucket to save memory (less fragmentation and buffer memory)
bias_gelu_fusion: True # Use a kernel that fuses the bias addition from weight matrices with the subsequent gelu activation.
masked_softmax_fusion: True # Use a kernel that fuses the attention softmax with its mask.
bias_dropout_add_fusion: True # Use a kernel that fuses the bias addition, dropout and residual connection addition.
Collaborator

Can you set the two independently to either True/False?
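For context on what the two flags under discussion fuse, here is an unfused, plain-Python reference of the bias-dropout-add step (illustrative only; the real fused kernel operates on CUDA tensors):

```python
import random

def bias_dropout_add(x, bias, residual, p, training=True):
    """Unfused reference of out = dropout(x + bias) + residual.

    `bias_dropout_add_fusion: True` replaces this three-op sequence with a
    single fused kernel; plain-Python lists stand in for tensors here.
    """
    scale = 1.0 / (1.0 - p) if training and p < 1.0 else 1.0
    out = []
    for xi, bi, ri in zip(x, bias, residual):
        v = xi + bi
        if training and random.random() < p:
            v = 0.0          # element dropped
        else:
            v *= scale       # inverted-dropout rescaling
        out.append(v + ri)   # residual connection
    return out

# With p=0 the op is deterministic: (x + bias) + residual
print(bias_dropout_add([1.0, 2.0], [0.5, 0.5], [1.0, 1.0], p=0.0))  # → [2.5, 3.5]
```

When `bias: False`, the `+ bias` term disappears, which is why the fused variant of this op has to be turned off along with it.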

lgtm-com bot commented Apr 15, 2022

This pull request introduces 4 alerts when merging fedb582 into e4ee26b - view on LGTM.com

new alerts:

  • 3 for Wrong number of arguments in a call
  • 1 for Unused import

Collaborator

@ericharper left a comment

LGTM. Thanks!

lgtm-com bot commented Apr 15, 2022

This pull request introduces 4 alerts when merging ae39532 into 9005f23 - view on LGTM.com

new alerts:

  • 3 for Wrong number of arguments in a call
  • 1 for Unused import

@MaximumEntropy MaximumEntropy merged commit c978f4f into main Apr 15, 2022
@MaximumEntropy MaximumEntropy deleted the remove_bias_terms branch April 15, 2022 22:02

3 participants