Keep track of the best model's path saved by ModelCheckpoint #1799
Conversation
This pull request is now in conflict... :(
…est model's path

Currently, only the best metric value is directly tracked. This new attribute will help in use cases where the trained model needs to be used or tracked right after training.
f16a602 to ed3e6f3
Codecov Report

```diff
@@           Coverage Diff           @@
##           master   #1799   +/-   ##
======================================
- Coverage      88%     88%     -0%
======================================
  Files          74      74
  Lines        4654    4664    +10
======================================
+ Hits         4076    4083     +7
- Misses        578     581     +3
```
@kepler nice!

Thanks, @williamFalcon.

Hello @kepler! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-05-31 10:39:39 UTC
@williamFalcon let me know if this is enough.

@kepler that's great!

@williamFalcon are you sure this is already merged to …?

@kepler you're right! Sorry about that. Let's merge this PR then :) Let's just get the tests to pass.

@williamFalcon no problem. I have to say I don't understand why the tests failed (for Ubuntu and macOS but passed for Windows). Could you please rerun them, just in case? Otherwise, I'll trigger it with a small commit (in about two hours).
https://github.com/PyTorchLightning/pytorch-lightning/pull/1799/checks?check_run_id=712909136 they don't pass on Windows either; for Windows, this is just not indicated correctly. On Linux and macOS, you have a print statement in your doctests which produces some output although no output is expected, and thus the tests fail.

So it was indeed a problem with the doctest (I totally missed it). Thanks, @justusschock.
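The failure mode described here is easy to reproduce: a doctest example whose call prints output that the expected block does not declare. A minimal sketch (the `save_model` helper is hypothetical, not Lightning code):

```python
import doctest

def save_model(path):
    """Pretend to save a checkpoint.

    >>> save_model("best.ckpt")
    """
    # Stray print: the doctest example above declares no expected output,
    # so this output makes the doctest fail -- the same issue as in the PR.
    print(f"saved to {path}")

# Run just this function's doctests and count failures explicitly.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(save_model):
    runner.run(test)

print(runner.failures)  # the undeclared output makes the example fail
```

Removing the `print` (or declaring its output in the example) makes the doctest pass again.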
Win is pure and does not have all functions implemented properly...
@Borda This should be ready now. |
pls clarify the `best` var and add a note to the changelog
```diff
@@ -265,7 +276,8 @@ def _do_check_save(self, filepath, current, epoch):
             self.kth_value = self.best_k_models[self.kth_best_model]

         _op = min if self.mode == 'min' else max
-        self.best = _op(self.best_k_models.values())
+        self.best_model = _op(self.best_k_models, key=self.best_k_models.get)
+        self.best = self.best_k_models[self.best_model]
```
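The hunk above can be exercised in isolation. This sketch uses made-up paths and scores, not Lightning's internals, to show how the best path and its score are recovered from `best_k_models`:

```python
# Stand-in for ModelCheckpoint's bookkeeping: best_k_models maps
# checkpoint paths to their monitored metric values.
best_k_models = {
    "epoch=1.ckpt": 0.42,
    "epoch=3.ckpt": 0.31,
    "epoch=7.ckpt": 0.55,
}
mode = "min"  # 'min' when lower is better (e.g. val_loss), 'max' otherwise

# Mirrors the diff: pick the path whose value is optimal,
# then look up that path's score.
_op = min if mode == "min" else max
best_model = _op(best_k_models, key=best_k_models.get)
best = best_k_models[best_model]

print(best_model, best)  # epoch=3.ckpt 0.31
```

Iterating the dict with `key=best_k_models.get` returns the optimal *key* (the path), whereas the old `_op(best_k_models.values())` could only return the score.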
Here `best` means `best_score`? Otherwise, it is a bit confusing to have `best` and `best_model`.

pls add types to the init 🐰
Yes, `self.best` is the best "monitored quantity". But that's existing naming; changing it will break backwards compatibility. If that's OK, I can surely rename it. Otherwise, `best_model` could be `best_model_path`. I used `best_model` to keep consistency with the existing `kth_best_model` attribute.

As for type hints, there are no changes to the arguments. Where specifically would I add them?
@PyTorchLightning/core-contributors are we fine to rename `best` >> `best_score`?
I would prefer it (`best_score`), but if we do that we need to provide our users with a utility to upgrade their checkpoints (basically switch `best` to `best_score`) and temporarily have a warning which catches old checkpoints at load time and alerts users to the conversion utility. It's a very simple utility, but we should provide it for our users if we move forward with this.
@jeremyjordan you mean to stay compatible with past saved checkpoints which contain `best`? That should be simple: on loading, we will map `best` >> `best_score` ... and yes, we should wrap the deprecated `best` with a warning :]
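The load-time mapping discussed here can be sketched as follows. The `upgrade_checkpoint` helper is hypothetical, and the key names besides `best`/`best_model_score` are illustrative, not Lightning's actual checkpoint schema:

```python
import warnings

def upgrade_checkpoint(ckpt: dict) -> dict:
    """Map the deprecated 'best' key to 'best_model_score' in place."""
    if "best" in ckpt and "best_model_score" not in ckpt:
        warnings.warn(
            "Checkpoint uses the deprecated 'best' key; "
            "mapping it to 'best_model_score'.",
            DeprecationWarning,
        )
        ckpt["best_model_score"] = ckpt.pop("best")
    return ckpt

# An old-format checkpoint stores the monitored metric under 'best'.
old_ckpt = {"best": 0.31}
new_ckpt = upgrade_checkpoint(old_ckpt)
print(new_ckpt)  # {'best_model_score': 0.31}
```

New-format checkpoints pass through unchanged, so the helper is safe to call unconditionally at load time.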
OK, I renamed `best` to `best_model_score` and `best_model` to `best_model_path`, since having `best_score` and `best_model` would still be a bit confusing (IMHO). For consistency, I also renamed `kth_best_model` to `kth_best_model_path`.

I added properties for `best` and `kth_best_model` that log a deprecation warning and return the correct value.

When loading a checkpoint in the old format, the value for `best` is simply assigned to `best_model_score`. In my opinion, adding a warning in this part will not really help the user, as there's not much they can do.

Let me know of any further changes.
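The deprecation properties described above follow a standard pattern. This is a minimal stand-in class with made-up values, not the real `ModelCheckpoint`:

```python
import warnings

class ModelCheckpointSketch:
    """Stand-in showing deprecated attribute names backed by properties."""

    def __init__(self):
        self.best_model_score = 0.31
        self.best_model_path = "epoch=3.ckpt"
        self.kth_best_model_path = "epoch=1.ckpt"

    @property
    def best(self):
        warnings.warn(
            "'best' is deprecated; use 'best_model_score'.", DeprecationWarning
        )
        return self.best_model_score

    @property
    def kth_best_model(self):
        warnings.warn(
            "'kth_best_model' is deprecated; use 'kth_best_model_path'.",
            DeprecationWarning,
        )
        return self.kth_best_model_path

ckpt = ModelCheckpointSketch()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the deprecation noise for the demo
    print(ckpt.best, ckpt.kth_best_model)  # 0.31 epoch=1.ckpt
```

Old code that reads `ckpt.best` keeps working but is nudged toward the new names.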
Also rename `ModelCheckpoint.best_model` (added in this PR) to `ModelCheckpoint.best_model_path`, for consistency, and `kth_best_model` to `kth_best_model_path`.
This pull request is now in conflict... :(
just minor things
Co-authored-by: Jirka Borovec <[email protected]>
This pull request is now in conflict... :(
* Add an additional attribute to ModelCheckpoint to keep track of the best model's path. Currently, only the best metric value is directly tracked. This new attribute will help in use cases where the trained model needs to be used or tracked right after training.
* Add small description and usage example to docs
* Fix PEP8 issues
* Fix doctest example
* Fix expected output in doctest
* Apply suggestions from code review
* Show example as code block instead of doctest
* Apply suggestions from code review
* Update CHANGELOG.md
* Rename `ModelCheckpoint.best` to `ModelCheckpoint.best_model_score`. Also rename `ModelCheckpoint.best_model` (added in this PR) to `ModelCheckpoint.best_model_path`, for consistency, and `kth_best_model` to `kth_best_model_path`.
* Update pytorch_lightning/trainer/training_io.py
* Apply suggestions from code review
* Add warning when loading checkpoint from an old version

Co-authored-by: Jirka Borovec <[email protected]>
Before submitting
What does this PR do?
Add an additional attribute to `ModelCheckpoint` to keep track of the best model's path. Currently, only the best metric value is directly tracked. When training finishes, if we want to know where the best model is, we need one of two workarounds, which are both somewhat cumbersome.
Since the best value is already tracked, this PR simply tracks the saved path as well.
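With this change, the path is available directly on the callback after training. A sketch with a stand-in object (the attribute names match the PR; `CheckpointStandIn` and its values are hypothetical, not `pytorch_lightning` code):

```python
# Mock of the checkpoint callback as it looks after training finishes.
class CheckpointStandIn:
    best_model_score = 0.31
    best_model_path = "checkpoints/epoch=3.ckpt"

checkpoint_callback = CheckpointStandIn()

# Before this PR, recovering the path meant re-deriving the filename or
# scanning best_k_models for the entry matching `best`. Now it is one lookup:
print(checkpoint_callback.best_model_path)  # checkpoints/epoch=3.ckpt
```

This makes "train, then immediately load the best checkpoint" a one-liner instead of a reconstruction exercise.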
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Glad to make PTL even more practical.