Add Trainer max_time argument + Callback #6823

awaelchli · 2021-04-04T18:30:49Z

What does this PR do?

Partial #6795

RFC about naming suggestions and API (class, arguments, methods).

Proposed solution:

# set max_time in Trainer (common use case):
trainer = Trainer(max_time="HH:MM:SS")

# or via callback with customizable options
from pytorch_lightning.callbacks import Timer
timer = Timer(duration="HH:MM:SS", interval="step", verbose=True)  # interval: "step" or "epoch"; verbose: True or False
trainer = Trainer(callbacks=[timer])

# user can access time elapsed for different stages
trainer.fit(model)
timer.elapsed("train")

trainer.validate()
timer.elapsed("validate")

trainer.test()
timer.elapsed("test")
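
The "HH:MM:SS" strings above are placeholders. A minimal sketch of how such a string could be converted into a `datetime.timedelta`, assuming exactly three colon-separated integer fields. `parse_duration` is a hypothetical helper for illustration and is not necessarily the parser this PR implements:

```python
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse an "HH:MM:SS" string into a timedelta (hypothetical helper)."""
    parts = text.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 'HH:MM:SS', got {text!r}")
    # int() raises ValueError on non-numeric fields, which is the desired behavior here.
    hours, minutes, seconds = (int(part) for part in parts)
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


budget = parse_duration("01:30:00")
print(budget.total_seconds())
```

Converting to a `timedelta` once, up front, means a malformed string fails at construction time rather than partway through training.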

TODO:

Finalize API (request for suggestions)
Add docs, example
Add more tests

Comment on the PR with preferences/opinion about the following questions:

Should max_time be exposed in the Trainer?
Should the Timer be a callback or be built into the Training loop?
Should there be Timers for val/test?

Before submitting

Was this discussed/approved via a GitHub issue? (not for typos and docs)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

pep8speaks · 2021-04-04T18:30:52Z

Hello @awaelchli! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-04-16 11:00:49 UTC

codecov · 2021-04-04T18:32:11Z

Codecov Report

Merging #6823 (3989fc7) into master (f645df5) will decrease coverage by 2%.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master   #6823    +/-   ##
=======================================
- Coverage      92%     90%    -2%     
=======================================
  Files         194     195     +1     
  Lines       12386   12904   +518     
=======================================
+ Hits        11414   11611   +197     
- Misses        972    1293   +321
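
The rounded percentages in the report can be reproduced from the hit and line counts in the table. A small sketch, assuming coverage is computed as hits divided by total lines:

```python
# Figures copied from the Codecov table above.
before_hits, before_lines = 11414, 12386  # master
after_hits, after_lines = 11611, 12904    # this PR

before_pct = 100 * before_hits / before_lines
after_pct = 100 * after_hits / after_lines

print(f"master: {before_pct:.2f}%")
print(f"#6823:  {after_pct:.2f}%")
print(f"change: {after_pct - before_pct:+.2f} points")
```

Rounded to whole numbers, these give the 92%, 90%, and -2% shown in the report.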

jlperla · 2021-04-05T13:32:25Z

I am actually worried this is not the right design. @edenafek had suggested this should be built into the core of the max_time, and we definitely don't want to have a second early stopping callback. These things are too fragile.

jlperla · 2021-04-05T13:48:57Z

Ah @awaelchli, sorry, I missed the stuff at the top about "We can also make a max_time Trainer argument and inject the Callback internally."

If that is what is planned, and this is not a user-enabled argument, then the internal design is none of my business of course :-)
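
The idea of a `max_time` argument that injects the callback internally can be sketched with toy classes. Everything below is illustrative: `ToyTrainer` is not `pytorch_lightning.Trainer`, the `Timer` here only stores its duration, and the rule of keeping a user-supplied `Timer` over `max_time` is an assumption rather than the behavior confirmed in this PR:

```python
from typing import List, Optional


class Timer:
    """Stand-in for the proposed callback; it only stores the duration."""

    def __init__(self, duration: Optional[str] = None) -> None:
        self.duration = duration


class ToyTrainer:
    """Toy trainer showing the injection; not the real Trainer."""

    def __init__(
        self,
        max_time: Optional[str] = None,
        callbacks: Optional[List[object]] = None,
    ) -> None:
        self.callbacks = list(callbacks or [])
        has_timer = any(isinstance(cb, Timer) for cb in self.callbacks)
        # Assumed policy: a Timer passed explicitly by the user wins over max_time.
        if max_time is not None and not has_timer:
            self.callbacks.append(Timer(duration=max_time))


trainer = ToyTrainer(max_time="00:10:00")
print([type(cb).__name__ for cb in trainer.callbacks])
```

With this shape the user only sees one argument, while the stopping logic still lives in a single callback class.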

ananthsub · 2021-04-13T05:48:06Z

pytorch_lightning/callbacks/timer.py

+    def on_train_start(self, trainer, *args, **kwargs) -> None:
+        self._start_time = datetime.now()
+
+    def on_train_batch_end(self, trainer, *args, **kwargs) -> None:
+        if self._interval != Interval.step:
+            return
+        self._check_time_remaining(trainer)
+
+    def on_train_epoch_end(self, trainer, *args, **kwargs) -> None:
+        if self._interval != Interval.epoch:
+            return
+        self._check_time_remaining(trainer)


In case people extend this callback, *args/**kwargs makes it harder to provide type hints for these hooks.

awaelchli · 2021-04-14

I'm not sure I understand, because I am a typehint noob.
The arguments are unused, therefore they can be of type Any.
If I specified all the args, the linter would complain that they are unused.
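
The trade-off discussed in this thread can be made concrete with a toy example. The base class and the hook's parameter names below are illustrative assumptions, not the exact Lightning signatures; the sketch only shows what a subclass author or a tool can learn from each style of override:

```python
import inspect
from typing import Any, List


class Callback:
    """Toy base class with one fully typed hook (names are illustrative)."""

    def on_train_batch_end(self, trainer: Any, outputs: Any, batch: Any, batch_idx: int) -> None:
        pass


class CatchAllTimer(Callback):
    # Accepts anything, so the signature no longer documents the hook's arguments.
    def on_train_batch_end(self, trainer: Any, *args: Any, **kwargs: Any) -> None:
        pass


class ExplicitTimer(Callback):
    # Same parameters as the base class; the unused ones are simply ignored.
    def on_train_batch_end(self, trainer: Any, outputs: Any, batch: Any, batch_idx: int) -> None:
        pass


def hook_params(cls: type) -> List[str]:
    """Return the hook's parameter names, excluding self."""
    signature = inspect.signature(cls.on_train_batch_end)
    return [name for name in signature.parameters if name != "self"]


print(hook_params(CatchAllTimer))
print(hook_params(ExplicitTimer))
```

Both overrides behave the same at runtime. Whether unused explicit arguments get flagged depends on the linter and its configuration.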

carmocca

Pushed a few commits with improvements.

LGTM

ananthsub · 2021-04-15T16:33:33Z

pytorch_lightning/callbacks/timer.py

+    def on_train_start(self, *args, **kwargs) -> None:
+        self._start_time[RunningStage.TRAINING] = datetime.now()
+
+    def on_train_end(self, *args, **kwargs) -> None:
+        self._end_time[RunningStage.TRAINING] = datetime.now()
+
+    def on_validation_start(self, *args, **kwargs) -> None:
+        self._start_time[RunningStage.VALIDATING] = datetime.now()
+
+    def on_validation_end(self, *args, **kwargs) -> None:
+        self._end_time[RunningStage.VALIDATING] = datetime.now()
+
+    def on_test_start(self, *args, **kwargs) -> None:
+        self._start_time[RunningStage.TESTING] = datetime.now()
+
+    def on_test_end(self, *args, **kwargs) -> None:
+        self._end_time[RunningStage.TESTING] = datetime.now()


n00b q: would this fail with daylight saving time? Should we use time.monotonic() in case system clocks are reset or rewound?

awaelchli · 2021-04-15

I switched everything to time.monotonic(), but note that time_elapsed() etc. now return seconds.
Hope that's fine.

awaelchli added 3 commits April 4, 2021 20:05

add timer class

c98321b

add simple test

6a39660

shorter name

2ac1cb6

awaelchli added the callback and feature labels Apr 4, 2021

awaelchli added this to the 1.3 milestone Apr 4, 2021

awaelchli mentioned this pull request Apr 4, 2021

Add more early stopping options #6795

Closed

awaelchli added 15 commits April 7, 2021 00:34

trainer callback configuration

cdda687

interval default to step

a291b89

handle unsupported interval choice

8aa979d

handle load and save

7399e7b

add start time property

27caeff

add time elapsed test

ffba8c3

complete test

f243b94

add trainer docs

6ae6654

update docs

b8ff17c

more tests

8bd7be7

Merge branch 'master' into feature/timer

89aa9a7

fix min steps timer test

ad4cf80

add resume test

4bd2c6e

add changelog

5e45883

yapf + isort

8b95dfa

awaelchli changed the title ~~Callback for setting max training duration~~ Add Trainer max_time argument + Callback Apr 7, 2021

awaelchli added 2 commits April 7, 2021 11:30

update trainer docs

36f4906

add more docs

e4dcf07

awaelchli marked this pull request as ready for review April 7, 2021 10:29

awaelchli added 4 commits April 12, 2021 18:06

add dict example

d5b9074

update timer docstring

8e505fb

udpate typehint

1519333

track val/test/predict/times

f52935e

ananthsub reviewed Apr 13, 2021

pytorch_lightning/callbacks/timer.py

ananthsub reviewed Apr 13, 2021

awaelchli and others added 12 commits April 14, 2021 13:04

track time for all stages

a89ac0c

enum nonsense

c8b22fb

make duration optional

8ef429d

fix duration=None

4410909

add test

4d4e22d

add None test

1f1c982

Merge branch 'master' into feature/timer

baae22b

Improve coverage

f8cfb23

Typo

fb54c4b

Refactor enum usage

48d8fa1

Typing

96b2c78

Docs

b2ebba9

carmocca approved these changes Apr 15, 2021

ananthsub reviewed Apr 15, 2021

awaelchli added 2 commits April 16, 2021 00:41

seconds

fe4fae0

fix stage key in checkpoint

24b0d3a

awaelchli commented Apr 15, 2021

pytorch_lightning/callbacks/timer.py

awaelchli added the ready label Apr 15, 2021

skip windows

df4ddce

akihironitta reviewed Apr 16, 2021

pytorch_lightning/callbacks/timer.py (outdated)

SeanNaren approved these changes Apr 16, 2021

Update pytorch_lightning/callbacks/timer.py

3989fc7

Co-authored-by: Akihiro Nitta

awaelchli merged commit 67d2160 into master Apr 16, 2021

awaelchli deleted the feature/timer branch April 16, 2021 11:38
