[bugfix] Reduce memory leaks #8490
Conversation
for more information, see https://pre-commit.ci
Codecov Report

@@           Coverage Diff            @@
##           master   #8490    +/-   ##
=======================================
- Coverage      92%     88%       -4%
=======================================
  Files         217     217
  Lines       14367   14390       +23
=======================================
- Hits        13260   12669      -591
- Misses       1107    1721      +614
Co-authored-by: Ethan Harris <[email protected]>
pytorch_lightning/trainer/connectors/logger_connector/result.py
This PR is breaking TPU training. Looking into it.
# while training on 8 or more cores.
for opt in self.optimizers:
    for p, v in opt.state.items():
        opt.state[p] = apply_to_collection(v, torch.Tensor, move_data_to_device, self.root_device)
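The snippet above relies on `apply_to_collection` from `pytorch_lightning.utilities.apply_func`, which applies a function to every element of a (possibly nested) collection that matches a given type. A minimal pure-Python sketch of that recursion, simplified from the real helper (the actual implementation also handles namedtuples, dataclasses, and other container types):

```python
def apply_to_collection(data, dtype, function, *args):
    """Recursively apply `function` to every element of `data` matching `dtype`.

    Simplified sketch: recurses into dicts, lists, and tuples only; anything
    else that does not match `dtype` is returned unchanged.
    """
    if isinstance(data, dtype):
        return function(data, *args)
    if isinstance(data, dict):
        return {k: apply_to_collection(v, dtype, function, *args) for k, v in data.items()}
    if isinstance(data, (list, tuple)):
        return type(data)(apply_to_collection(v, dtype, function, *args) for v in data)
    return data
```

With this shape, an optimizer state dict whose values are nested lists/dicts of tensors can be moved device-to-device with a single call, without knowing its exact structure.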
@kaushikb11 here you are calling self.root_device anyway, despite the comment above?
Yes, I'm not sure I understand the reasoning either.
What does this PR do?
This PR moves the optimizer states back to CPU on teardown, and moves the ResultCollection extra to CPU as well.
Fixes #8463
Fixes #8430
Memory leak investigation:
Does your PR introduce any breaking changes? If yes, please list them.
No.
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines. In short, see the following bullet list:
Did you have fun?
Make sure you had fun coding 🙃