Update the logic to check for accumulation steps with deepspeed #9826
Build Error! No Linked Issue found. Please link an issue or mention it in the body using #<issue_id>
Codecov Report
@@           Coverage Diff           @@
##           master   #9826    +/-   ##
=======================================
- Coverage      93%     93%    -0%
=======================================
  Files         177     178     +1
  Lines       15527   15584    +57
=======================================
+ Hits        14386   14431    +45
- Misses       1141    1153    +12
Thanks @rohitgr7
LGTM!
What does this PR do?
Just like IPUs, DeepSpeed does not support dict values or GradAccumScheduler for gradient accumulation.
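The kind of check named in the PR title can be pictured with a minimal sketch. This is not the code from this PR: the function name `resolve_accumulation_steps` is hypothetical, and the sketch assumes that a dict schedule maps a starting epoch to an accumulation factor, with epochs before the first key using a factor of 1. It accepts a fixed factor and rejects a schedule whose factor changes between epochs.

```python
from typing import Dict, Union


def resolve_accumulation_steps(accumulate_grad_batches: Union[int, Dict[int, int]]) -> int:
    """Return a single accumulation factor, or raise if it varies by epoch.

    Hypothetical helper for illustration only; not part of any library.
    """
    if isinstance(accumulate_grad_batches, int):
        if accumulate_grad_batches < 1:
            raise ValueError("accumulate_grad_batches must be >= 1")
        return accumulate_grad_batches

    if isinstance(accumulate_grad_batches, dict):
        schedule = dict(accumulate_grad_batches)
        # Assumption: epochs before the first scheduled key use a factor of 1.
        schedule.setdefault(0, 1)
        if len(set(schedule.values())) > 1:
            raise ValueError(
                "Different accumulation factors at different epochs are not supported."
            )
        return schedule[0]

    raise TypeError("accumulate_grad_batches must be an int or a dict")
```

Under these assumptions, `resolve_accumulation_steps(4)` and `resolve_accumulation_steps({0: 4})` both yield a constant factor, while `resolve_accumulation_steps({4: 2})` raises because the factor would switch from 1 to 2 at epoch 4.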
Does your PR introduce any breaking changes? If yes, please list them.
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the Review guidelines. In short, see the following bullet-list:
Did you have fun?
Make sure you had fun coding 🙃