Pass args to ShardedDataParallel #9483
Conversation
Codecov Report
@@           Coverage Diff           @@
##           master   #9483   +/-   ##
=======================================
  Coverage      93%     93%
=======================================
  Files         180     180
  Lines       15066   15066
=======================================
+ Hits        13986   13991      +5
+ Misses       1080    1075      -5
@@ -41,6 +41,7 @@ def configure_ddp(self) -> None:
             sharded_optimizer=self.lightning_module.trainer.optimizers,
             # For multi-node training, enabling bucketing will improve performance.
             reduce_buffer_size=self._REDUCE_BUFFER_SIZE_DEFAULT if self.num_nodes > 1 else 0,
+            **self._ddp_kwargs
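With the kwargs forwarded, extra fairscale `ShardedDataParallel` options can be supplied through the plugin constructor. A minimal usage sketch follows; it assumes the `DDPShardedPlugin` constructor collects unrecognized keyword arguments into `self._ddp_kwargs` (as the base `DDPPlugin` does), and uses `reduce_fp16` purely as an illustrative fairscale option:

```python
# Sketch only: forward an extra fairscale ShardedDataParallel option
# (reduce_fp16) through the Lightning plugin. Assumes the plugin stores
# unrecognized kwargs in self._ddp_kwargs and unpacks them into the wrapper.
from pytorch_lightning import Trainer
from pytorch_lightning.plugins import DDPShardedPlugin

trainer = Trainer(
    gpus=2,
    plugins=[DDPShardedPlugin(reduce_fp16=True)],
)
```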
@SeanNaren should we explicitly check if `reduce_buffer_size` is part of the kwargs to avoid errors with it being configured twice?
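A follow-up guard could look roughly like the sketch below (hypothetical, not code from this PR): prefer a user-supplied `reduce_buffer_size` from the kwargs so the wrapper is never handed the argument twice.

```python
# Hypothetical follow-up sketch: avoid passing reduce_buffer_size to
# ShardedDataParallel twice when a user also supplies it via the plugin kwargs.
import logging

log = logging.getLogger(__name__)


def _resolve_reduce_buffer_size(ddp_kwargs: dict, default: int, multi_node: bool) -> int:
    """Prefer a user-provided reduce_buffer_size, else fall back to the default."""
    if "reduce_buffer_size" in ddp_kwargs:
        # Pop it so **ddp_kwargs no longer contains the key when unpacked later.
        value = ddp_kwargs.pop("reduce_buffer_size")
        log.info("Using user-provided reduce_buffer_size=%s", value)
        return value
    return default if multi_node else 0
```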
great point, will do a followup here!
What does this PR do?
Fixes #9467
Passes kwargs through to the ShardedDataParallel wrapper, just as the DDP plugin does with the DistributedDataParallel wrapper. A simplified sketch of that analogous DDP pattern follows.
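For comparison, this is roughly the pattern the DDP plugin follows: extra kwargs collected by the plugin are unpacked into torch's `DistributedDataParallel`. The helper name `wrap_with_ddp` is hypothetical and only illustrates the shape of the call, not the exact Lightning source:

```python
# Simplified sketch of the analogous pattern used for regular DDP:
# extra kwargs collected by the plugin are unpacked into the torch wrapper.
from torch.nn.parallel import DistributedDataParallel


def wrap_with_ddp(model, device_ids, ddp_kwargs):
    # ddp_kwargs might carry e.g. find_unused_parameters or gradient_as_bucket_view.
    return DistributedDataParallel(model, device_ids=device_ids, **ddp_kwargs)
```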
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines.
Did you have fun?
Make sure you had fun coding 🙃