ModelCheckpoint fails at garbage collecting checkpoint passed to Trainer.resume_from_checkpoint #5090
Labels
bug (Something isn't working)
help wanted (Open to be worked on)
priority: 1 (Medium priority task)
won't fix (This will not be worked on)
🐛 Bug
When a checkpoint is passed to Trainer via resume_from_checkpoint, it is not tracked or garbage collected by the ModelCheckpoint class. Instead, a new checkpoint is created and garbage collected/updated as usual.
Please reproduce using the BoringModel and post here
https://colab.research.google.com/drive/1QJrLngpOZg1MOgAtZH5kRo_s6u-Hjh0n?usp=sharing
Expected behavior
Checkpoint passed to Trainer.resume_from_checkpoint is garbage collected.
If this is not the desired behavior, I think a sentence or two should be added to the docs describing the intended behavior.
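To make the observed versus expected behavior concrete, here is a minimal pure-Python sketch of the situation. It is not Lightning's implementation: `TopKTracker` and `run_demo` are hypothetical names, and the class is only a toy stand-in for a save_top_k-style callback that, by assumption, deletes only the files it saved itself. Under that assumption, a tracker created for the resumed run never learns about the checkpoint it was resumed from, so that file is left on disk.

```python
import os
import tempfile


class TopKTracker:
    """Toy stand-in for a save_top_k-style checkpoint callback (hypothetical)."""

    def __init__(self, dirpath, save_top_k=1):
        self.dirpath = dirpath
        self.save_top_k = save_top_k
        self.tracked = {}  # path -> monitored score (lower is better)

    def save(self, name, score):
        path = os.path.join(self.dirpath, name)
        with open(path, "w") as f:
            f.write(str(score))
        self.tracked[path] = score
        # Garbage collect: remove the worst tracked files beyond save_top_k.
        # Only files saved through this instance are ever considered.
        while len(self.tracked) > self.save_top_k:
            worst = max(self.tracked, key=self.tracked.get)
            os.remove(worst)
            del self.tracked[worst]
        return path


def run_demo():
    with tempfile.TemporaryDirectory() as d:
        # First run: produces the checkpoint we later resume from.
        first = TopKTracker(d, save_top_k=1)
        resume_path = first.save("epoch=0.ckpt", score=1.0)

        # Resumed run: a fresh tracker that knows nothing about resume_path.
        second = TopKTracker(d, save_top_k=1)
        second.save("epoch=1.ckpt", score=0.5)
        second.save("epoch=2.ckpt", score=0.3)

        return sorted(os.listdir(d)), os.path.basename(resume_path)


files, resumed = run_demo()
print(files)    # two files remain even though save_top_k=1
print(resumed)  # the resumed-from checkpoint is one of them
```

In this toy model the resumed run cleans up its own `epoch=1.ckpt` but leaves `epoch=0.ckpt` behind, which mirrors the behavior reported above. The expected behavior would correspond to the second tracker registering the resumed-from path before training starts.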
Environment
Additional context
If this is due to an epoch mismatch rather than a design choice, #5007, #4655, and #2401 could possibly be related.