Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tracking of basic states in Trainer [wip - to-be-merged after v0.9] #2541

Merged

Conversation

zerogerc
Copy link
Contributor

@zerogerc zerogerc commented Jul 7, 2020

What does this PR do?

Fixes #1633

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@mergify mergify bot requested a review from a team July 7, 2020 16:54
tests/trainer/test_states.py Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Jul 7, 2020

Codecov Report

Merging #2541 into master will increase coverage by 32%.
The diff coverage is 97%.

@@           Coverage Diff            @@
##           master   #2541     +/-   ##
========================================
+ Coverage      58%     90%    +32%     
========================================
  Files          79      80      +1     
  Lines        7281    7282      +1     
========================================
+ Hits         4245    6574   +2329     
+ Misses       3036     708   -2328     

@Borda Borda changed the title feature/1633 Add tracking of basic states in Trainer. Add tracking of basic states in Trainer. Jul 7, 2020
@Borda Borda added the feature Is an improvement or enhancement label Jul 7, 2020
@Borda Borda added this to the 0.9.0 milestone Jul 7, 2020
@mergify mergify bot requested a review from a team July 7, 2020 17:42
@mergify mergify bot requested a review from a team July 7, 2020 17:43
@mergify mergify bot requested a review from a team July 7, 2020 17:43
@mergify
Copy link
Contributor

mergify bot commented Jul 8, 2020

This pull request is now in conflict... :(

@zerogerc zerogerc force-pushed the feature/1633_trainer_phases branch from 68bddde to 17947ab Compare July 8, 2020 08:35
@@ -397,6 +399,7 @@ def train(self):
# user could press ctrl+c many times... only shutdown once
if not self.interrupted:
self.interrupted = True
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that self.interrupted can be removed in the future?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ye, more of them could be cleaned...

@awaelchli
Copy link
Contributor

I was thinking maybe it would be better and cleaner if we made these state updates in decorators and add the decorators to methdods like the training loop, validation loop etc.
Haven't thought it through in detail. What do you think?
#1633 (comment)

@zerogerc
Copy link
Contributor Author

Hi @awaelchli! Yes I think it could look nicer with decorators, I try to implement it and mention you as soon as it's ready.

@williamFalcon
Copy link
Contributor

ummm. can you give an example?
i’d prefer to avoid decorators as much as possible

@zerogerc
Copy link
Contributor Author

zerogerc commented Jul 15, 2020

@williamFalcon my idea is something like this to wrap run functions (fit, test). However it requires a little bit of restructuring =)

def decorate_run(run_fn):
    def wrapper():
        try:
            state = RUNNING
            run_fn()
            state = FINISHED
        except KeyboardInterrupt as e:
            state = INTERRUPTED
    return wrapper

@zerogerc zerogerc force-pushed the feature/1633_trainer_phases branch from 17947ab to 4d533ed Compare July 16, 2020 11:36
pytorch_lightning/trainer/states.py Outdated Show resolved Hide resolved
pytorch_lightning/trainer/states.py Outdated Show resolved Hide resolved
pytorch_lightning/trainer/states.py Show resolved Hide resolved
pytorch_lightning/trainer/states.py Outdated Show resolved Hide resolved
tests/trainer/test_states.py Outdated Show resolved Hide resolved
tests/trainer/test_states.py Outdated Show resolved Hide resolved
tests/trainer/test_states.py Outdated Show resolved Hide resolved
@mergify mergify bot requested a review from a team July 18, 2020 02:02
@zerogerc zerogerc requested review from awaelchli and removed request for a team July 20, 2020 09:20
@mergify mergify bot requested a review from a team July 20, 2020 09:21
@mergify
Copy link
Contributor

mergify bot commented Jul 20, 2020

This pull request is now in conflict... :(

@zerogerc zerogerc force-pushed the feature/1633_trainer_phases branch from c94404f to 47b3c6a Compare July 21, 2020 08:18
@zerogerc zerogerc removed the request for review from a team July 21, 2020 08:19
@zerogerc
Copy link
Contributor Author

zerogerc commented Aug 7, 2020

Sure. I could do it as well. Next PR

Copy link
Member

@SkafteNicki SkafteNicki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mergify mergify bot requested a review from a team August 7, 2020 14:29
@Borda Borda force-pushed the feature/1633_trainer_phases branch from 4af5fdd to b512b05 Compare August 7, 2020 14:56
@zerogerc zerogerc force-pushed the feature/1633_trainer_phases branch from b512b05 to f9f762e Compare August 7, 2020 14:57
@Borda
Copy link
Member

Borda commented Aug 7, 2020

@zerogerc one test was failing, but it seems unrelated, so I have just rebased it again, let's see if it helps...

Copy link
Contributor

@nateraw nateraw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - just a quick question...

@@ -396,6 +397,7 @@ def __init__(
self.interrupted = False
self.should_stop = False
self.running_sanity_check = False
self.state = TrainerState.INITIALIZING
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you set trainer's self.state to TrainerState.INITIALIZING...(see next comment)

return self.state

trainer = Trainer(default_root_dir=tmpdir)
trainer.state = TrainerState.INITIALIZING
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you setting trainer.state = TrainerState.INITIALIZING if its set in the init of Trainer? Should this be ==?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
trainer.state = TrainerState.INITIALIZING

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be wrong, just let me know what's going on here 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good point by nate, we should make the state read only property! and only the trainer should to write to the internal attribute trainer._state

@awaelchli
Copy link
Contributor

@williamFalcon said he wants to wait with this, so we should probably wait for his approval before automerging

@Borda
Copy link
Member

Borda commented Aug 7, 2020

@williamFalcon said he wants to wait with this, so we should probably wait for his approval before auto merging

yes, this goes after v0.9, so I ll put there wip just in case...

@Borda Borda changed the title Add tracking of basic states in Trainer. Add tracking of basic states in Trainer [wip - to-be-merged after v0.9] Aug 7, 2020
@Borda Borda modified the milestones: 0.9.0, 0.9.x Aug 7, 2020
@mergify
Copy link
Contributor

mergify bot commented Aug 8, 2020

This pull request is now in conflict... :(

@awaelchli awaelchli modified the milestones: 0.9.x, 1.0.0 Aug 8, 2020
@williamFalcon
Copy link
Contributor

this looks great! let's merge it

@williamFalcon williamFalcon merged commit e9846dd into Lightning-AI:master Aug 9, 2020
@Borda
Copy link
Member

Borda commented Aug 9, 2020

this looks great! let's merge it

I know that's why we made it =) just you we a bit conservative and leave it for 0.9.1
@zerogerc let's update all tests checks accordingly #2541 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Is an improvement or enhancement ready PRs ready to be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add Trainer states
7 participants