Support gradient accumulation using Horovod's backward_passes_per_step
#11911
for more information, see https://pre-commit.ci
Co-authored-by: Rohit Gupta <[email protected]>
…krshrimali/pytorch-lightning into feature/11732_grad_accumulation_horovod
Co-authored-by: ananthsub <[email protected]>
…and devices=count
Always try to keep tests as minimal as possible
Co-authored-by: Carlos Mocholí <[email protected]>
Thanks, @carmocca, for the review and suggestions. I'll merge the CPU and GPU tests into one as you suggested and update this PR again. I've already committed some of your suggestions directly, and the rest will come with my next commit. Thank you again for the suggestions! :))
LGTM!
As a side note, we should open an issue about rethinking Horovod testing.
…ep` (#11911) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Rohit Gupta <[email protected]> Co-authored-by: ananthsub <[email protected]> Co-authored-by: Carlos Mocholí <[email protected]>
What does this PR do?
This PR attempts to fix #11732.
Uses the `backward_passes_per_step` kwarg in Horovod's `DistributedOptimizer` for the purpose described in the issue.
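A minimal sketch of the idea, not the code merged in this PR: the gradient accumulation count is forwarded to Horovod as `backward_passes_per_step` when the optimizer is wrapped. `hvd.DistributedOptimizer` and its `backward_passes_per_step` argument are Horovod's; the helper names `horovod_optimizer_kwargs` and `wrap_with_horovod`, and the parameter name `accumulate_grad_batches` as used here, are assumptions made for illustration.

```python
from typing import Any, Dict


def horovod_optimizer_kwargs(accumulate_grad_batches: int) -> Dict[str, Any]:
    """Build the extra keyword arguments for Horovod's DistributedOptimizer.

    Hypothetical helper for illustration; not taken from the PR.
    """
    if accumulate_grad_batches < 1:
        raise ValueError("accumulate_grad_batches must be >= 1")
    # Horovod expects this many backward passes before step() is called,
    # so gradients are accumulated locally before being reduced and applied.
    return {"backward_passes_per_step": accumulate_grad_batches}


def wrap_with_horovod(optimizer, named_parameters, accumulate_grad_batches=1):
    """Wrap a torch optimizer with Horovod (hypothetical helper)."""
    # Imported lazily so this module can be loaded without Horovod installed.
    import horovod.torch as hvd

    return hvd.DistributedOptimizer(
        optimizer,
        named_parameters=named_parameters,
        **horovod_optimizer_kwargs(accumulate_grad_batches),
    )
```

With `accumulate_grad_batches=4`, for example, the training loop would call `loss.backward()` four times before each `optimizer.step()`.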
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the Review guidelines. In short, see the following bullet-list:
Did you have fun?
Make sure you had fun coding 🙃