add current_epoch to dumped_params #3261
Conversation
Codecov Report

```diff
@@           Coverage Diff           @@
##           master   #3261   +/-   ##
=======================================
+ Coverage      86%     87%      +1%
=======================================
  Files         117     117
  Lines        9353    9618     +265
=======================================
+ Hits         8074    8357     +283
+ Misses       1279    1261      -18
```
Nice!! @awaelchli will it support user-defined callbacks for
I'm not sure, but we recently added support for persisting the state of callbacks, so we should be able to dump their state too and restore it easily, if that's the problem. For further questions, better to ask directly here: #3160
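For context, a minimal sketch of what persisting callback state can look like (hook names follow the Callback checkpoint hooks referenced in #3160; exact signatures vary across Lightning versions, so treat this as illustrative):

```python
from pytorch_lightning.callbacks import Callback

class CounterCallback(Callback):
    """Toy callback whose internal state survives checkpointing."""

    def __init__(self):
        self.batches_seen = 0

    def on_train_batch_end(self, *args, **kwargs):
        self.batches_seen += 1

    def on_save_checkpoint(self, *args, **kwargs):
        # Return the state to be stored inside the checkpoint.
        return {"batches_seen": self.batches_seen}

    def on_load_checkpoint(self, checkpointed_state):
        # Restore the state that was stored above.
        self.batches_seen = checkpointed_state["batches_seen"]
```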
The Tuner class looks really promising!
Just to update, the
This pull request is now in conflict... :(
@maxjeblick mind adding a test to check that it is fixed...
LGTM
there are probably lots of other trainer attributes that need to be dumped, for example global_step?
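A rough sketch of the dump/restore pattern under discussion (the attribute list and helper names here are illustrative, not Lightning's actual internals):

```python
# Trainer attributes the tuner temporarily mutates and should restore afterwards.
ATTRS_TO_DUMP = ["max_steps", "current_epoch", "global_step"]

def dump_params(trainer):
    """Snapshot the trainer attributes before the tuner runs."""
    return {name: getattr(trainer, name) for name in ATTRS_TO_DUMP}

def restore_params(trainer, dumped):
    """Put the snapshotted values back once the tuner is done."""
    for name, value in dumped.items():
        setattr(trainer, name, value)
```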
@awaelchli you are probably right. I think the tuner algorithms need a refactoring after v1.0.
Shouldn't the tuner be refactored in such a way that it won't call .fit again? Then we won't need to dump and restore anything. What it should do is just suggest, and the user should change the values and rerun.
@rohitgr7 we use
@SkafteNicki I am not suggesting to rewrite anything. The reason we need to dump and restore is that these attributes are not re-initialized, and the current workflow is something like:

```python
trainer.tuner.lr_find()
# suggest
trainer.fit()
```

What I am suggesting is:

```python
trainer.tuner.lr_find()
# suggest
# and tell the user to reinitialize the trainer, LM, and LDM (if any) with the updated hparams from the suggestions.
```
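A sketch of how that suggested workflow could look from the user's side (`MyLightningModule` is hypothetical; `lr_find` returning an object with a `suggestion()` method matches Lightning's documented behavior):

```python
# Run the finder once, read the suggestion, then start from a clean slate.
lr_finder = trainer.tuner.lr_find(model)
new_lr = lr_finder.suggestion()

model = MyLightningModule(learning_rate=new_lr)  # re-initialize LM with the updated hparam
trainer = Trainer()                              # fresh trainer, nothing to restore
trainer.fit(model)
```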
This is a good alternative, but it will work only for the Trainer and not for the LM and LDM, since there is a chance that some attributes might be changed during
What does this PR do?
Fixes #3260 by restoring the current_epoch after finishing the batch size finder.
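In spirit, the fix looks like the following (the helper and attribute handling are paraphrased as a sketch, not the actual Lightning source):

```python
def scale_batch_size(trainer, model):
    # Snapshot state the finder will clobber; this PR adds current_epoch to the set.
    dumped = {"current_epoch": trainer.current_epoch, "max_steps": trainer.max_steps}
    try:
        ...  # run short fit() trials with growing batch sizes
    finally:
        # Restore so a subsequent trainer.fit() starts from the right epoch (#3260).
        trainer.current_epoch = dumped["current_epoch"]
        trainer.max_steps = dumped["max_steps"]
```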