Run `on_train_epoch_end` after the LightningModule for callbacks that monitor (#16567)
What does this PR do?
Unblocks #16520.

`EarlyStopping` and `ModelCheckpoint` run their checks in `on_train_epoch_end`. After #16520 lands, anyone who was logging in `LightningModule.training_epoch_end` needs to move that logging to `LightningModule.on_train_epoch_end`. But `Callback`s run their hooks before the `LightningModule`, so these two callbacks fail to monitor a key logged in `on_train_epoch_end`: the monitored key does not exist yet when they run.

This was not a pressing problem before #16520, because `training_epoch_end`, which runs after `on_train_epoch_end`, was a working alternative.
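For context, a minimal sketch of the failure mode (the model, dataset, and metric name below are illustrative placeholders, not taken from this PR):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def on_train_epoch_end(self):
        # Logged here, but the callbacks' `on_train_epoch_end` hooks have
        # already run by the time this executes.
        self.log("my_metric", 1.0)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


data = DataLoader(TensorDataset(torch.randn(8, 4), torch.randn(8, 1)), batch_size=4)
# With no validation loop, EarlyStopping checks "my_metric" in its own
# `on_train_epoch_end` — before the module has logged it — and fails on
# the missing key.
trainer = pl.Trainer(max_epochs=2, callbacks=[EarlyStopping(monitor="my_metric")])
trainer.fit(LitModel(), data)
```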
This PR suggests a hacky workaround: call this hook later, only for these two `Callback` classes.
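In spirit, the change amounts to something like the sketch below. This is not the actual diff: `run_train_epoch_end_hooks` and the loop structure are simplified stand-ins for Trainer internals, though the hook names mirror the real ones.

```python
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint

# The two callbacks whose epoch-end checks read monitored keys.
_MONITORING_CALLBACKS = (EarlyStopping, ModelCheckpoint)


def run_train_epoch_end_hooks(trainer, pl_module):
    deferred = []
    for cb in trainer.callbacks:
        if isinstance(cb, _MONITORING_CALLBACKS):
            # Defer the callbacks that need the monitored key.
            deferred.append(cb)
        else:
            cb.on_train_epoch_end(trainer, pl_module)

    # The LightningModule logs its epoch-level metrics here.
    pl_module.on_train_epoch_end()

    # The monitored keys now exist, so these checks can run safely.
    for cb in deferred:
        cb.on_train_epoch_end(trainer, pl_module)
```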
Other suggestions are welcome.
Does your PR introduce any breaking changes? If yes, please list them.
Yes, the hook call order has changed: `EarlyStopping` and `ModelCheckpoint` now run their `on_train_epoch_end` hooks after `LightningModule.on_train_epoch_end`.
cc @awaelchli @Borda @justusschock @carmocca