
Raise a WARNING when someone tries to load the best checkpoint when one has not been set. #12501

Closed
aprbw opened this issue Mar 29, 2022 · 3 comments
Labels
callback: model checkpoint · design (Includes a design discussion) · won't fix (This will not be worked on) · working as intended (Working as intended)

Comments

aprbw commented Mar 29, 2022

🚀 Feature

Raise a WARNING when someone calls:
Trainer.test(ckpt_path='best')
if there is no user-defined ModelCheckpoint in callbacks.

Motivation

I'm frustrated when a deadline is approaching, I need to update my PL to the latest version, and the update breaks a bunch of things. One of the things that breaks silently is model checkpointing.

If someone is calling Trainer.test(ckpt_path='best'), they expect the best model to be loaded, not the latest model. If the latest model is loaded instead (because a user-defined ModelCheckpoint was not set), please raise a warning rather than silently loading it. This freaked me out badly; I literally lost sleep over it.

Pitch

Raise a WARNING when someone calls:
Trainer.test(ckpt_path='best')
if there is no user-defined ModelCheckpoint in callbacks.
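One possible shape for the proposed check, sketched as plain Python rather than actual Lightning internals (the function name, the `user_defined_checkpoint` parameter, and the message text are all illustrative, not part of the real API):

```python
import warnings

def resolve_ckpt_path(ckpt_path, user_defined_checkpoint):
    """Sketch of the proposed behavior: warn when 'best' is requested
    but the user never configured a ModelCheckpoint themselves."""
    if ckpt_path == "best" and not user_defined_checkpoint:
        warnings.warn(
            "ckpt_path='best' was passed, but no user-defined ModelCheckpoint "
            "is configured; the internally created default (monitor=None) "
            "tracks the last saved checkpoint, not a metric-based 'best' one.",
            UserWarning,
        )
    return ckpt_path
```

The checkpoint path would still resolve as before; the warning only surfaces the mismatch between what the user asked for and what the default callback actually tracks.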

Alternatives

None

Additional context

None

cc @tchaton @justusschock @awaelchli @Borda @ananthsub @ninginthecloud @rohitgr7 @carmocca @jjenniferdai

@aprbw aprbw added the needs triage Waiting to be triaged by maintainers label Mar 29, 2022
@ananthsub ananthsub added checkpointing Related to checkpointing callback: model checkpoint and removed needs triage Waiting to be triaged by maintainers labels Mar 29, 2022
@carmocca
Contributor

Hi @aprbw! Sorry that you had a bad experience with this.

When you create a Trainer(), we add a ModelCheckpoint() instance internally, as long as you don't set Trainer(enable_checkpointing=False).

This ModelCheckpoint() instance defaults to monitor=None, which means it doesn't track any specific logged value; it simply saves the last model. Instances with this configuration still set best_model_path, the attribute consulted by test(ckpt_path='best').
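A minimal sketch of the bookkeeping just described (this toy class is not the real ModelCheckpoint; it only illustrates that with monitor=None, best_model_path simply tracks the most recent save):

```python
class MiniModelCheckpoint:
    """Toy stand-in for the described behavior (not Lightning's class).
    With monitor=None there is no metric to compare, so every save
    overwrites best_model_path with the latest checkpoint path."""

    def __init__(self, monitor=None):
        self.monitor = monitor
        self.best_model_path = ""
        self.best_score = None

    def save(self, path, metrics=None):
        if self.monitor is None:
            # No metric tracked: "best" is just the last checkpoint saved.
            self.best_model_path = path
        else:
            score = metrics[self.monitor]
            # Assume lower-is-better for this illustration.
            if self.best_score is None or score < self.best_score:
                self.best_score = score
                self.best_model_path = path

ckpt = MiniModelCheckpoint()  # default configuration: monitor=None
ckpt.save("epoch=0.ckpt")
ckpt.save("epoch=1.ckpt")
# best_model_path is the latest save, not a metric-based "best"
```

So test(ckpt_path='best') always has something to resolve to, even under the default configuration; it just happens to coincide with the last checkpoint.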

Even though this was confusing in your case, it is the expected behavior for users who want to pass 'best' and forget about it, without having to account for their particular checkpointing configuration.

I am not sure we can show a warning here, because this is intended behavior.

On a similar vein: #12485

@carmocca carmocca added working as intended Working as intended design Includes a design discussion and removed checkpointing Related to checkpointing labels Mar 30, 2022
@aprbw
Author

aprbw commented Mar 31, 2022

Sorry that you had a bad experience with this.

Sorry about my rant.

Instances with this configuration still set best_model_path, the attribute checked with test(ckpt_path='best').

I see, so there are cases where this is desired behavior. However, I still find it confusing, and I'm not sure what the correct solution would be.

@stale

stale bot commented Apr 30, 2022

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, PyTorch Lightning Team!

@stale stale bot added the won't fix This will not be worked on label Apr 30, 2022
@carmocca carmocca closed this as completed May 2, 2022