DDP sampler #1513
Conversation
Hello @justusschock! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-04-17 08:15:31 UTC
Codecov Report
@@           Coverage Diff           @@
##           master   #1513   +/-   ##
=======================================
  Coverage      91%     91%
=======================================
  Files          67      67
  Lines        3784    3786    +2
=======================================
+ Hits         3439    3441    +2
  Misses        345     345
Is there a way to do it without adding extra parameters to the API?
I think this is difficult, since for custom dataloaders it will be hard to determine when it should be set and when not. Previously we had a check for whether a sampler was set, which was always false, since the sampler is set internally by PyTorch.
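For context, a minimal illustration (not from the PR) of why checking whether a sampler was set is unreliable: PyTorch's DataLoader always assigns a default sampler when none is passed.

import torch
from torch.utils.data import DataLoader, TensorDataset

# DataLoader installs a default sampler (SequentialSampler here, RandomSampler
# when shuffle=True) even if the user never passed one, so a check such as
# `loader.sampler is None` cannot tell a user-provided sampler apart from the
# one PyTorch created internally.
dataset = TensorDataset(torch.arange(10))
loader = DataLoader(dataset, batch_size=2)
print(type(loader.sampler))  # <class 'torch.utils.data.sampler.SequentialSampler'>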
  need_dist_sampler = (self.use_ddp or self.use_ddp2 or self.use_tpu)
- if need_dist_sampler:
+ if self.replace_sampler_ddp and need_dist_sampler:
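For reference, a minimal sketch (not part of the diff) of how the flag checked above would be toggled from user code, assuming the 0.7-era Trainer signature with gpus and distributed_backend arguments:

import pytorch_lightning as pl

# Opt out of the automatic DistributedSampler replacement so a custom sampler
# passed to the DataLoader is left untouched.
trainer = pl.Trainer(gpus=2, distributed_backend='ddp', replace_sampler_ddp=False)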
Rather than adding a new flag for this, is it possible to just check for the default samplers:
(isinstance(dataloader.sampler, RandomSampler) or isinstance(dataloader.sampler, SequentialSampler) or isinstance(dataloader.sampler, _InfiniteConstantSampler)) and need_dist_sampler? (A sketch of this check follows below.)
I had a hard-to-find regression that happened because my custom sampler was overridden after #1425. Considering replace_sampler_ddp is set to True by default, I think a lot of users will run into similar issues, both in terms of regressions and in new projects. E.g. users are going to expect that when they write a new sampler and pass it to the dataloader, it'll be used without having to change a setting somewhere.
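A rough sketch of the check suggested above (names taken from the comment; _InfiniteConstantSampler is a private PyTorch class, so this is illustrative only, and _has_default_sampler is a hypothetical helper):

from torch.utils.data import RandomSampler, SequentialSampler
from torch.utils.data.dataloader import _InfiniteConstantSampler

def _has_default_sampler(dataloader):
    # Treat the sampler as replaceable only when it is one of the defaults
    # that PyTorch installs automatically; custom samplers are left alone.
    return isinstance(
        dataloader.sampler,
        (RandomSampler, SequentialSampler, _InfiniteConstantSampler),
    )

# Replacement would then be gated on:
#     _has_default_sampler(dataloader) and need_dist_sampler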
I understand that, but I don't think this is a good idea. If I explicitly set a standard sampler and don't want it to be replaced, there is nothing I can do this way.
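For example (hypothetical user code), an explicitly passed standard sampler is indistinguishable from the default one under an isinstance check:

import torch
from torch.utils.data import DataLoader, SequentialSampler, TensorDataset

dataset = TensorDataset(torch.arange(10))
# The user chooses sequential order on purpose; an isinstance-based check
# cannot tell this explicitly passed sampler apart from the default one,
# so it would still be replaced.
loader = DataLoader(dataset, sampler=SequentialSampler(dataset), batch_size=2)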
Fair enough. I'd prefer the trainer flag default to False at least, but let's leave it up to the opinion of others @PyTorchLightning/core-contributors
@justusschock awesome addition. Leaving it True means DDP etc. work without code changes and without having to remember anything else. Maybe a solution is a warning in one case or the other?
a warning would be nice...
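A minimal sketch of what such a warning could look like (hypothetical placement and helper name, using Python's standard warnings module rather than any Lightning-specific utility):

import warnings
from torch.utils.data import RandomSampler, SequentialSampler

def _warn_if_replacing_custom_sampler(dataloader):
    # Alert the user before a non-default sampler is swapped out for a
    # DistributedSampler, instead of overriding it silently.
    if not isinstance(dataloader.sampler, (RandomSampler, SequentialSampler)):
        warnings.warn(
            'Your sampler will be replaced by a DistributedSampler; '
            'pass replace_sampler_ddp=False to the Trainer to keep it.',
            UserWarning,
        )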
Before submitting
What does this PR do?
Fixes #1506.
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃