Megatron Encoder-Decoder Sampler Function #6095

Merged
ericharper merged 21 commits into main from megatron_encoder_decoder-sampling-decode
Mar 7, 2023

Collaborator

@michalivne michalivne commented Feb 23, 2023

What does this PR do ?

This PR adds support for an external token sampler function to the Megatron encoder-decoder model.

Collection: [Note which collection this PR will affect]

Changelog

  1. Added support for a custom token sampling function in MegatronLMEncoderDecoderModel.decode.
  2. Added greedy and top-k / top-p token sampling methods.
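
As a rough illustration of the two sampling methods listed above, here is a minimal PyTorch sketch. It is not the code from this PR: the function names `greedy_sample` and `topkp_sample` and their parameters are hypothetical. The only thing taken from the diff is the calling convention, where the sampler receives `logits` of shape `[batch, vocab]` and returns `(log_probs, token_ids)`.

```python
import torch
import torch.nn.functional as F


def greedy_sample(logits):
    """Pick the most likely token per row; returns (log_probs, token_ids), each of shape [batch]."""
    log_probs, token_ids = torch.max(F.log_softmax(logits, dim=-1), dim=-1)
    return log_probs, token_ids


def topkp_sample(logits, top_k=0, top_p=0.0, temperature=1.0):
    """Sample one token per row after optional top-k and/or top-p (nucleus) filtering.

    Hypothetical helper: top_k=0 and top_p=0.0 disable the respective filter.
    The returned log-probabilities are taken under the filtered distribution.
    """
    logits = logits.float() / temperature
    if top_k > 0:
        # Keep only logits at least as large as the k-th largest one in each row.
        k = min(top_k, logits.size(-1))
        kth = torch.topk(logits, k, dim=-1).values[..., -1, None]
        logits = logits.masked_fill(logits < kth, float("-inf"))
    if top_p > 0.0:
        sorted_logits, sorted_idx = torch.sort(logits, descending=True, dim=-1)
        sorted_probs = F.softmax(sorted_logits, dim=-1)
        # Probability mass that precedes each token in sorted order.
        mass_before = torch.cumsum(sorted_probs, dim=-1) - sorted_probs
        # Drop a token once the mass before it already exceeds top_p;
        # the most likely token is always kept.
        sorted_logits = sorted_logits.masked_fill(mass_before > top_p, float("-inf"))
        # Scatter the filtered logits back to their original vocabulary positions.
        logits = torch.full_like(logits, float("-inf")).scatter(-1, sorted_idx, sorted_logits)
    log_probs_all = F.log_softmax(logits, dim=-1)
    token_ids = torch.multinomial(log_probs_all.exp(), num_samples=1).squeeze(-1)
    log_probs = log_probs_all.gather(-1, token_ids.unsqueeze(-1)).squeeze(-1)
    return log_probs, token_ids
```

Either function can stand in for the `sample_token_fn(logits=...)` call shown in the diff; keyword arguments such as `top_k` could be bound ahead of time with `functools.partial(topkp_sample, top_k=10)`.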

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

Signed-off-by: Micha Livne <[email protected]>
@github-actions github-actions bot added the NLP label Feb 23, 2023
Contributor

@MaximumEntropy MaximumEntropy left a comment

LGTM!

MaximumEntropy
MaximumEntropy previously approved these changes Feb 28, 2023
2. Improved decode docstring.

Signed-off-by: Micha Livne <[email protected]>
MaximumEntropy
MaximumEntropy previously approved these changes Feb 28, 2023
log_probs, token_ids = torch.max(torch.nn.functional.log_softmax(output_tensor, dim=-1), dim=-1)
log_probs, token_ids = sample_token_fn(logits=output_tensor[:, -1, :])
# enforce valid range of token ids
token_ids = torch.clamp(token_ids, max=tokenizer.vocab_size - 1)
Collaborator

Not related to this PR, but this seems like an odd use of clamp. Shouldn't token_ids beyond the vocabulary size raise an error instead? @ericharper

Collaborator Author

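Out-of-range token ids can happen because of vocabulary padding: the padded output layer has more entries than the tokenizer vocabulary, so the model can predict a padding id (mostly in the early stages of training).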
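
To make the padding point concrete, here is a small sketch under stated assumptions: the vocabulary is padded up to a multiple of some divisor (the helper `pad_vocab_size` and the divisor of 128 are hypothetical, chosen only for illustration), so the logits have more columns than the tokenizer has tokens.

```python
import torch


def pad_vocab_size(vocab_size, divisible_by=128):
    """Round vocab_size up to the next multiple of divisible_by (illustrative helper)."""
    return ((vocab_size + divisible_by - 1) // divisible_by) * divisible_by


vocab_size = 1000                    # tokens the tokenizer actually knows
padded = pad_vocab_size(vocab_size)  # 1024 output slots in the model

logits = torch.zeros(2, padded)
logits[0, 5] = 1.0      # row 0 predicts a valid token id
logits[1, 1010] = 1.0   # row 1 predicts a padding slot, as an untrained model might

token_ids = logits.argmax(dim=-1)
# Same guard as in the diff: map any padding id to the last valid token id.
clamped = torch.clamp(token_ids, max=vocab_size - 1)
```

Clamping keeps decoding running instead of failing inside the tokenizer, at the cost of silently substituting the last valid id for the padded prediction.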

Collaborator

@ericharper ericharper left a comment

LGTM. Thanks!

@ericharper ericharper merged commit 609ad76 into main Mar 7, 2023
@ericharper ericharper deleted the megatron_encoder_decoder-sampling-decode branch March 7, 2023 19:06
titu1994 pushed a commit to titu1994/NeMo that referenced this pull request Mar 24, 2023
* 1. Work-in-progress.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added support for a custom token sampling function in MegatronLMEncoderDecoderModel.decode.
2. Added greedy and top-k / top-p token sampling methods.

Signed-off-by: Micha Livne <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* 1. Debugging.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* 1. Debugging.

* 1. Debugging.

* 1. Removed commented code.
2. Improved decode docstring.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed variable name.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed device assignment.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed a bug in PP decoding log probs shape.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed device assignment.

Signed-off-by: Micha Livne <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Micha Livne <[email protected]>
Co-authored-by: Micha Livne <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <[email protected]>
hsiehjackson pushed a commit to hsiehjackson/NeMo that referenced this pull request Jun 2, 2023
Signed-off-by: hsiehjackson <[email protected]>

4 participants