move device-specific teardown logic from training loop to accelerator #5973

awaelchli · 2021-02-14T23:25:17Z

What does this PR do?

Follow-up to #5743.

Device-specific teardown in on_train_end should be handled by the accelerator.
This makes the training loop device-agnostic.
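
For illustration only, here is a minimal sketch of the pattern described above, assuming a simplified accelerator interface. The names `FakeModel`, `Accelerator`, `GPUAccelerator`, and `TrainLoop` are hypothetical stand-ins, not the actual Lightning classes, and the model only records a device string instead of holding real tensors.

```python
class FakeModel:
    """Hypothetical stand-in for a model; only tracks a device string."""

    def __init__(self, device="cuda:0"):
        self.device = device

    def cpu(self):
        # Mimics moving the model back to host memory.
        self.device = "cpu"
        return self


class Accelerator:
    """Hypothetical base class: the default teardown does nothing."""

    def __init__(self, model):
        self.model = model

    def on_train_end(self):
        pass


class GPUAccelerator(Accelerator):
    def on_train_end(self):
        # Device-specific cleanup lives in the accelerator, not in the loop.
        self.model.cpu()


class TrainLoop:
    def __init__(self, accelerator):
        self.accelerator = accelerator

    def on_train_end(self):
        # No device-type check here: the accelerator decides what teardown means.
        self.accelerator.on_train_end()


gpu_model = FakeModel(device="cuda:0")
TrainLoop(GPUAccelerator(gpu_model)).on_train_end()

cpu_model = FakeModel(device="cpu")
TrainLoop(Accelerator(cpu_model)).on_train_end()
```

The loop calls the same hook in both cases; only the accelerator subclass knows whether anything has to be moved or freed.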

Before submitting

Was this discussed/approved via a GitHub issue? (not for typos and docs)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines. In short, see the following list:

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified
Check that target branch and milestone match!

Did you have fun?

Make sure you had fun coding 🙃

codecov · 2021-02-14T23:27:23Z

Codecov Report

Merging #5973 (b6e2fbd) into master (ae4dca9) will decrease coverage by 0%.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #5973   +/-   ##
======================================
- Coverage      90%     90%   -0%     
======================================
  Files         170     170           
  Lines       11789   11786    -3     
======================================
- Hits        10669   10637   -32     
- Misses       1120    1149   +29

pytorch_lightning/accelerators/gpu.py

tchaton

LGTM!

tchaton · 2021-02-15T13:21:17Z

pytorch_lightning/accelerators/gpu.py

@@ -27,6 +27,7 @@ def on_train_start(self):

    def on_train_end(self):
        # clean up memory
+        self.model.cpu()


Should we do this for TPU too?
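
The thread does not settle this question. As one possible option (an assumption, not what this PR implements), the shared step could live in a base class so that every device-specific accelerator inherits it. All names below (`SketchModel`, `BaseAccelerator`, `SketchGPUAccelerator`, `SketchTPUAccelerator`, `cache_cleared`) are hypothetical.

```python
class SketchModel:
    """Hypothetical model stand-in that records where it currently lives."""

    def __init__(self, device):
        self.device = device

    def cpu(self):
        self.device = "cpu"
        return self


class BaseAccelerator:
    """Hypothetical base class holding the teardown step shared by all devices."""

    def __init__(self, model):
        self.model = model

    def on_train_end(self):
        # Shared step: move the model off the device.
        self.model.cpu()


class SketchGPUAccelerator(BaseAccelerator):
    def __init__(self, model):
        super().__init__(model)
        self.cache_cleared = False

    def on_train_end(self):
        super().on_train_end()
        # Placeholder for GPU-only cleanup such as emptying the CUDA cache.
        self.cache_cleared = True


class SketchTPUAccelerator(BaseAccelerator):
    # Inherits the shared teardown unchanged.
    pass


gpu = SketchGPUAccelerator(SketchModel("cuda:0"))
tpu = SketchTPUAccelerator(SketchModel("xla:0"))
for accelerator in (gpu, tpu):
    accelerator.on_train_end()
```

Whether moving the model to CPU is actually desirable on TPU is a separate question that this sketch does not answer.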

tchaton · 2021-02-15T13:21:48Z

pytorch_lightning/trainer/training_loop.py

-        self.trainer.accelerator_backend.on_train_end()
-
-        # clear mem
-        if self.trainer._device_type == DeviceType.GPU:


Nice cleanup!

awaelchli added 2 commits February 15, 2021 00:18

on train end

f1aa929

switch order

47ec3fd

awaelchli added the refactor label Feb 14, 2021

awaelchli requested review from Borda, carmocca, justusschock, SeanNaren, tchaton and williamFalcon as code owners February 14, 2021 23:25

carmocca reviewed Feb 15, 2021


pytorch_lightning/accelerators/gpu.py

carmocca approved these changes Feb 15, 2021


Borda added this to the 1.2 milestone Feb 15, 2021

Borda approved these changes Feb 15, 2021


Borda enabled auto-merge (squash) February 15, 2021 07:53

Borda added the ready PRs ready to be merged label Feb 15, 2021

tchaton approved these changes Feb 15, 2021


mergify bot added 7 commits February 15, 2021 13:22

Merge branch 'master' into refactor/teardown

04ac93c

Merge branch 'master' into refactor/teardown

2ae1bd9

Merge branch 'master' into refactor/teardown

c22f9ae

Merge branch 'master' into refactor/teardown

23d770d

Merge branch 'master' into refactor/teardown

1275aca

Merge branch 'master' into refactor/teardown

c61735b

Merge branch 'master' into refactor/teardown

72ad287

Borda merged commit aa60c08 into master Feb 15, 2021

Borda deleted the refactor/teardown branch February 15, 2021 22:38
