Micro batch loader for bert model #6046

shanmugamr1992 · 2023-02-17T07:55:12Z

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

Collection: [Note which collection this PR will affect]

Changelog

Add specific line by line info of high level changes in this PR.

Usage

You can potentially add a usage example below

# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

New Feature
Bugfix
Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

Related to # (issue)


yidong72

Looks good. Just have two questions about this PR.

yidong72 · 2023-02-28T21:16:53Z

nemo/collections/nlp/models/language_modeling/megatron_bert_model.py

@@ -359,21 +375,22 @@ def training_step(self, batch, batch_idx):
        if self.cfg.precision == 16:
            loss_scale = self.trainer.precision_plugin.scaler._scale
            if loss_scale is not None:
-                self.log('loss_scale', loss_scale)
+                self.log('loss_scale', loss_scale, batch_size=1)


What does this batch_size=1 do here?

shanmugamr1992 (Mar 1, 2023): It throws an error otherwise. I pretty much pulled it from the GPT micro batch loader code. @erhoo82 might know better.

ericharper (Mar 16, 2023): It's a hack to make PTL (PyTorch Lightning) happy; we need to raise an issue with them.
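
To illustrate the exchange above, here is a minimal toy sketch of why a logger that infers the batch size from the batch can fail when the training step is handed an iterator, and how an explicit batch_size avoids the inference. It uses only the standard library and is not PTL's implementation; infer_batch_size and log_metric are hypothetical names invented for this example, and the real error in this PR may have a different cause.

```python
from typing import Any, Iterator, Optional


def infer_batch_size(batch: Any) -> int:
    """Hypothetical stand-in for a framework's batch-size inference."""
    if hasattr(batch, "__len__"):
        return len(batch)
    raise ValueError("could not infer batch size; pass batch_size explicitly")


def log_metric(name: str, value: float, batch: Any, batch_size: Optional[int] = None) -> dict:
    """Record a metric, inferring the batch size only when none is given."""
    size = batch_size if batch_size is not None else infer_batch_size(batch)
    return {"name": name, "value": value, "batch_size": size}


# With a micro batch loader the step receives an iterator, which has no length.
dataloader_iter: Iterator[list] = iter([[1, 2], [3, 4]])

try:
    log_metric("loss_scale", 65536.0, dataloader_iter)
    inference_failed = False
except ValueError:
    inference_failed = True

# An explicit batch_size skips the inference step entirely.
record = log_metric("loss_scale", 65536.0, dataloader_iter, batch_size=1)
```

Because loss_scale is a single scalar rather than a per-sample quantity, a fixed size of 1 is a plausible placeholder in this toy model.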

yidong72 · 2023-02-28T21:19:35Z

nemo/collections/nlp/models/language_modeling/megatron_bert_model.py

+    def on_train_batch_end(self, outputs, dataloader_iter: Any, batch_idx: int, unused: Optional[int] = 0) -> None:
+        super().on_train_batch_end(outputs, dataloader_iter, batch_idx)
+
+        # TODO: Replace with newer override for scheduler.step() instead of


Why is this related to the new micro batch loader?

shanmugamr1992 (Mar 1, 2023): This was also pulled from the GPT-3 code. @erhoo82 might know better.

ericharper (Mar 16, 2023): We use that method, so we need to override it with the dataloader_iter signature.
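
The point about the signature can be sketched with a toy class hierarchy. This is an assumption-laden illustration, not NeMo or PTL code: BaseModule and IterModule are hypothetical classes, and only the hook name and parameter list mirror the diff above.

```python
from typing import Any, Iterator, List, Optional


class BaseModule:
    """Hypothetical base class whose hook does bookkeeping after each step."""

    def __init__(self) -> None:
        self.calls: List[int] = []

    def on_train_batch_end(self, outputs: Any, batch: Any, batch_idx: int) -> None:
        self.calls.append(batch_idx)


class IterModule(BaseModule):
    """Subclass whose training step consumes an iterator rather than a batch."""

    def on_train_batch_end(
        self, outputs: Any, dataloader_iter: Iterator, batch_idx: int, unused: Optional[int] = 0
    ) -> None:
        # Forward to the parent; the second positional slot now carries the iterator.
        super().on_train_batch_end(outputs, dataloader_iter, batch_idx)


module = IterModule()
module.on_train_batch_end(outputs=None, dataloader_iter=iter([]), batch_idx=3)
# A caller that passes a fourth positional argument is also accepted.
module.on_train_batch_end(None, iter([]), 4, 0)
```

Renaming the parameter keeps keyword callers working when the framework passes dataloader_iter by name, while the parent's bookkeeping still runs through super().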

github-actions · 2023-03-16T01:57:19Z

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

ericharper

LGTM. Thanks!

* Micro batch loader for bert model
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Added bug fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci

---------

Co-authored-by: Shanmugam Ramasamy <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Micro batch loader for bert model
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Added bug fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci

---------

Co-authored-by: Shanmugam Ramasamy <[email protected]>
Co-authored-by: Shanmugam Ramasamy <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Micro batch loader for bert model
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Added bug fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci

---------

Co-authored-by: Shanmugam Ramasamy <[email protected]>
Co-authored-by: Shanmugam Ramasamy <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: hsiehjackson <[email protected]>

Micro batch loader for bert model

6c4f157

github-actions bot added the NLP label Feb 17, 2023

pre-commit-ci bot and others added 9 commits February 17, 2023 07:56

[pre-commit.ci] auto fixes from pre-commit.com hooks

f610fff

for more information, see https://pre-commit.ci

Merge branch 'main' into micro_bert

abc3ecd

Added bug fix

2e6df43

Merge branch 'micro_bert' of github.com:NVIDIA/NeMo into micro_bert

6bfd9c1

[pre-commit.ci] auto fixes from pre-commit.com hooks

c2323f0

for more information, see https://pre-commit.ci

Merge branch 'main' into micro_bert

6a57c46

Merge branch 'main' into micro_bert

0c5162e

Merge branch 'main' into micro_bert

95ffbe6

Merge branch 'main' into micro_bert

0d06ef7

okuchaiev requested a review from yidong72 February 27, 2023 18:27

yidong72 reviewed Feb 28, 2023


github-actions bot added the stale label Mar 16, 2023

shanmugamr1992 force-pushed the micro_bert branch from ee01230 to 0d06ef7 Compare March 16, 2023 19:49

shanmugamr1992 changed the base branch from main to r1.17.0 March 16, 2023 19:50

Merge branch 'r1.17.0' into micro_bert

34914e9

ericharper approved these changes Mar 16, 2023


shanmugamr1992 merged commit fa949a3 into r1.17.0 Mar 17, 2023

shanmugamr1992 deleted the micro_bert branch March 17, 2023 00:26

