Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Neural Machine Translation with Megatron Transformer Models (Tensor Parallel and Tarred Datasets Only) #3861

Merged
merged 41 commits into from
Mar 29, 2022

Conversation

MaximumEntropy
Copy link
Contributor

What does this PR do ?

Adds support to train Neural Machine Translation models with Megatron Transformer models that support tensor parallelism.

Collection: NLP

Changelog

  • Add a MegatronNMT class that inherits from MegatronLMEncoderDecoderModel and overrides appropriate functions and calls classmethods from MTEncDecModel
  • Change appropriate methods in MTEncDecModel to classmethods so they can be called in MegatronNMTModel.
  • Add a config file and training script.

Usage

examples/nlp/machine_translation/megatron_nmt_training.py -cn aayn_base_megatron model.tensor_model_parallel_size=2

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

@lgtm-com
Copy link

lgtm-com bot commented Mar 19, 2022

This pull request introduces 1 alert when merging 9d9bd77 into c4ec081 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

Signed-off-by: MaximumEntropy <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Mar 19, 2022

This pull request introduces 1 alert when merging 8968a5b into c4ec081 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

@lgtm-com
Copy link

lgtm-com bot commented Mar 19, 2022

This pull request introduces 1 alert when merging 5816680 into c4ec081 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

Signed-off-by: MaximumEntropy <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Mar 19, 2022

This pull request introduces 1 alert when merging fdac001 into c4ec081 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

@lgtm-com
Copy link

lgtm-com bot commented Mar 19, 2022

This pull request introduces 1 alert when merging 2dcb808 into c4ec081 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

Signed-off-by: MaximumEntropy <[email protected]>
Signed-off-by: MaximumEntropy <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Mar 19, 2022

This pull request introduces 4 alerts when merging af80f37 into c4ec081 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 1 for Signature mismatch in overriding method
  • 1 for Unreachable code

@lgtm-com
Copy link

lgtm-com bot commented Mar 20, 2022

This pull request introduces 4 alerts when merging 6becb3c into c4ec081 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 1 for Signature mismatch in overriding method
  • 1 for Unreachable code

Signed-off-by: MaximumEntropy <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Mar 20, 2022

This pull request introduces 4 alerts when merging 9e08de6 into c4ec081 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 1 for Signature mismatch in overriding method
  • 1 for Unreachable code

Signed-off-by: MaximumEntropy <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Mar 20, 2022

This pull request introduces 4 alerts when merging 21f32d1 into c4ec081 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 1 for Signature mismatch in overriding method
  • 1 for Unreachable code

@lgtm-com
Copy link

lgtm-com bot commented Mar 28, 2022

This pull request introduces 4 alerts when merging 3960a0f into 41c4ab7 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 1 for Signature mismatch in overriding method
  • 1 for Unreachable code

@lgtm-com
Copy link

lgtm-com bot commented Mar 28, 2022

This pull request introduces 4 alerts when merging 7f805fc into 41c4ab7 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 1 for Signature mismatch in overriding method
  • 1 for Unreachable code

Signed-off-by: MaximumEntropy <[email protected]>
Signed-off-by: MaximumEntropy <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Mar 28, 2022

This pull request introduces 2 alerts when merging 0dd0012 into 41c4ab7 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method
  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Mar 28, 2022

This pull request introduces 1 alert when merging b632b83 into 41c4ab7 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

@lgtm-com
Copy link

lgtm-com bot commented Mar 29, 2022

This pull request introduces 1 alert when merging 770a7a8 into 45f8f32 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

@lgtm-com
Copy link

lgtm-com bot commented Mar 29, 2022

This pull request introduces 1 alert when merging 7b0d8a5 into 45f8f32 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

@lgtm-com
Copy link

lgtm-com bot commented Mar 29, 2022

This pull request introduces 1 alert when merging 0ca7883 into 92dd783 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

@lgtm-com
Copy link

lgtm-com bot commented Mar 29, 2022

This pull request introduces 1 alert when merging 2268232 into 77acfee - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

ericharper
ericharper previously approved these changes Mar 29, 2022
Copy link
Collaborator

@ericharper ericharper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@lgtm-com
Copy link

lgtm-com bot commented Mar 29, 2022

This pull request introduces 1 alert when merging 86d7099 into 70963a2 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

Signed-off-by: MaximumEntropy <[email protected]>
Copy link
Collaborator

@michalivne michalivne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! see minor comments

@@ -296,7 +296,12 @@ def main():
src_lines = []
for line in src_f:
src_lines.append(line.strip().split('\t'))
assert len(all_reverse_scores) == len(all_lm_scores) == len(all_forward_scores) == len(src_lines)
assert len(all_reverse_scores) == len(all_lm_scores) == len(all_forward_scores) == len(src_lines), (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove printing out or add meaningful message

@lgtm-com
Copy link

lgtm-com bot commented Mar 29, 2022

This pull request introduces 1 alert when merging ea2e3a5 into 70963a2 - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

@lgtm-com
Copy link

lgtm-com bot commented Mar 29, 2022

This pull request introduces 1 alert when merging 0e2f636 into 679f14c - view on LGTM.com

new alerts:

  • 1 for Signature mismatch in overriding method

Copy link
Collaborator

@michalivne michalivne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@michalivne michalivne merged commit ff91628 into main Mar 29, 2022
@michalivne michalivne deleted the megatron_nmt branch March 29, 2022 23:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants