Fix DeepSpeedPlugin with IterableDataset #7362

leezu · 2021-05-04T23:21:57Z

Before submitting

Was this discussed/approved via a GitHub issue? (not for typos and docs)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

codecov · 2021-05-04T23:23:06Z

Codecov Report

Merging #7362 (e91c4be) into master (28103c6) will decrease coverage by 5%.
The diff coverage is 10%.

@@           Coverage Diff           @@
##           master   #7362    +/-   ##
=======================================
- Coverage      92%     87%    -5%     
=======================================
  Files         200     200            
  Lines       12983   12992     +9     
=======================================
- Hits        11937   11359   -578     
- Misses       1046    1633   +587

SeanNaren · 2021-05-05T09:23:38Z

Thanks @leezu! Let me know your thoughts on the proposed change, and I can help make a test for this to get this merged.

SeanNaren · 2021-05-06T10:28:48Z

I added a test and changed the code to use `auto`. With `auto`, the batch size is taken from the train dataloader's batch sampler if one exists; otherwise it defaults to 1.
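For readers following along, the fallback described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `resolve_micro_batch_size` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real dataloaders. It also assumes a loader's `batch_sampler` may be missing or `None`.

```python
from types import SimpleNamespace


def resolve_micro_batch_size(train_dataloader, default=1):
    """Return the loader's batch size, or `default` if it cannot be inferred.

    Hypothetical helper for illustration; not part of DeepSpeedPlugin.
    """
    # A loader may have no batch sampler at all, or one set to None.
    batch_sampler = getattr(train_dataloader, "batch_sampler", None)
    batch_size = getattr(batch_sampler, "batch_size", None)
    return default if batch_size is None else batch_size


# Stand-ins for real dataloader objects, used only to exercise the helper.
with_sampler = SimpleNamespace(batch_sampler=SimpleNamespace(batch_size=32))
without_sampler = SimpleNamespace(batch_sampler=None)

print(resolve_micro_batch_size(with_sampler))     # 32
print(resolve_micro_batch_size(without_sampler))  # 1
```

Using `getattr` with a default for both lookups keeps the fallback to 1 in a single place, at the cost of silently ignoring loaders that expose the batch size some other way.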

SeanNaren · 2021-05-06T10:30:04Z

pytorch_lightning/plugins/training_type/deepspeed.py

+        if hasattr(self.lightning_module, 'train_dataloader'):
+            train_dataloader = self.lightning_module.train_dataloader()
+            if hasattr(train_dataloader, 'batch_sampler'):
+                batch_size = train_dataloader.batch_sampler.batch_size


If anyone can suggest anything cleaner please do :)

Probably not much, since `train_dataloader()` is a callable, not just an attribute...

Borda · 2021-05-06T11:13:10Z

@leezu please use and follow the bullet list from the PR template.

tchaton

LGTM!

tchaton · 2021-05-06T11:51:03Z

pytorch_lightning/plugins/training_type/deepspeed.py

+        if hasattr(self.lightning_module, 'train_dataloader'):
+            train_dataloader = self.lightning_module.train_dataloader()
+            if hasattr(train_dataloader, 'batch_sampler'):
+                batch_size = train_dataloader.batch_sampler.batch_size


This could break if the user provides several dataloaders to the CombinedLoader.
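One possible way to guard against this, assuming `train_dataloader()` may return a dict or list of loaders rather than a single loader, is to walk the collection and take the first batch size that can be inferred. The names `iter_loaders` and `infer_batch_size` are hypothetical, and picking the first match is only one of several reasonable policies; this is a sketch, not what the PR merged.

```python
from collections.abc import Mapping, Sequence
from types import SimpleNamespace


def iter_loaders(loaders):
    """Yield every leaf loader from a loader or a nested dict/list/tuple of loaders."""
    if isinstance(loaders, Mapping):
        for value in loaders.values():
            yield from iter_loaders(value)
    elif isinstance(loaders, Sequence) and not isinstance(loaders, (str, bytes)):
        for value in loaders:
            yield from iter_loaders(value)
    else:
        yield loaders


def infer_batch_size(loaders, default=1):
    """Return the first batch size found among the loaders, else `default`."""
    for loader in iter_loaders(loaders):
        batch_sampler = getattr(loader, "batch_sampler", None)
        batch_size = getattr(batch_sampler, "batch_size", None)
        if batch_size is not None:
            return batch_size
    return default


# Stand-in objects for illustration; real code would pass actual dataloaders.
loaders = {
    "a": SimpleNamespace(batch_sampler=None),
    "b": [SimpleNamespace(batch_sampler=SimpleNamespace(batch_size=8))],
}
print(infer_batch_size(loaders))  # 8
```

If the loaders disagree on batch size, a stricter variant could raise an error instead of silently taking the first value.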

leezu · 2021-05-06T13:43:32Z

Thank you @SeanNaren. LGTM

SeanNaren · 2021-05-07T09:46:18Z

Thanks @leezu :)

* deepspeed add train_micro_batch_size_per_gpu argument
* Update naming and doc
* Modify to use auto naming convention, add test
* Add iterable tests
* Fix tests, attempt by mocking
* Import correct package
* Fix comparison
* Set as special test
* Remove import
* Add Changelog

Co-authored-by: SeanNaren <[email protected]>
(cherry picked from commit 98b94b8)

leezu added 2 commits May 4, 2021 13:05

deepspeed add train_micro_batch_size_per_gpu argument

7c76ba0

Update naming and doc

0507f6e

leezu requested review from awaelchli, justusschock, SeanNaren and tchaton as code owners May 4, 2021 23:21

kaushikb11 reviewed May 4, 2021

pytorch_lightning/plugins/training_type/deepspeed.py (outdated, resolved)

Borda added the bug Something isn't working label May 4, 2021

Borda added this to the v1.3 milestone May 4, 2021

Modify to use auto naming convention, add test

b51f330

leezu requested review from Borda, carmocca and williamFalcon as code owners May 6, 2021 10:20

Add iterable tests

937efd5

SeanNaren approved these changes May 6, 2021

SeanNaren reviewed May 6, 2021

tchaton approved these changes May 6, 2021

leezu mentioned this pull request May 6, 2021

release 1.3.0 #7404

Merged

SeanNaren added 2 commits May 6, 2021 14:53

Fix tests, attempt by mocking

99e9b12

Import correct package

5cf6de7

ananthsub approved these changes May 6, 2021

SeanNaren added 3 commits May 6, 2021 15:31

Fix comparison

bb9c86d

Set as special test

c31c8e0

Remove import

68850db

Borda removed this from the v1.3 milestone May 6, 2021

Borda added this to the v1.3.x milestone May 6, 2021

SeanNaren added 2 commits May 7, 2021 10:08

Merge branch 'master' into deepspeed

2b5dfb2

Add Changelog

e91c4be

SeanNaren merged commit 98b94b8 into Lightning-AI:master May 7, 2021

leezu deleted the deepspeed branch May 7, 2021 14:24

carmocca mentioned this pull request May 10, 2021

1.3.x cherries 🍒 #7467

Closed
