-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove memory-retaining epoch-end hooks #16520
Conversation
2f301a1
to
55a5a51
Compare
⚡ Required checks status: All passing 🟢Groups summary🟢 pytorch_lightning: Tests workflowThese checks are required after the changes to 🟢 pytorch_lightning: Azure GPU
These checks are required after the changes to 🟢 pytorch_lightning: Azure HPU
These checks are required after the changes to 🟢 pytorch_lightning: Azure IPU
These checks are required after the changes to 🟢 pytorch_lightning: Docs
These checks are required after the changes to 🟢 lightning_app: Tests workflow
These checks are required after the changes to 🟢 lightning_app: Examples
These checks are required after the changes to 🟢 lightning_app: Azure
These checks are required after the changes to 🟢 lightning_app: Docs
These checks are required after the changes to 🟢 mypy
These checks are required after the changes to 🟢 installThese checks are required after the changes to 🟢 link-check
These checks are required after the changes to Thank you for your contribution! 💜
|
e7b6f0f
to
b27175e
Compare
according to [Lightning PR 16520](Lightning-AI/pytorch-lightning#16520).
Is there any way to get the old behavior without adding boiler plate? |
I am wondering the following:
|
@42elenz I had to add |
Migration guide
training_epoch_end -> on_train_epoch_end
The same suggestions apply to those implementing
Callback.training_epoch_end
validation_epoch_end -> on_validation_epoch_end
The same suggestions apply to those implementing
Callback.validation_epoch_end
test_epoch_end -> on_test_epoch_end
The same suggestions apply to those implementing
Callback.test_epoch_end
Example with two DataLoaders
Example with strategy="dp" (DataParallel)
If you have questions about how to migrate your use case, you can ask in this PR.
What does this PR do?
Removes the
training_epoch_end
,validation_epoch_end
, andtest_epoch_end
hooks.In favor of
on_train_epoch_end
,on_validation_epoch_end
, andon_test_epoch_end
.These hooks were becoming problematic as just implementing them could lead to memory issues if the user was unaware of their implementation.
They also increased the loop's complexity and were hard to hack or customize externally.
At runtime, we check whether the old hooks are overridden, and fail if they are with an error message that points to the migration guide above
Blocked by #16567
Fixes #8731
Closes #9380
Closes #9968
Closes #10878
Closes #11554
Follow-up things to address:
#8479: need to remove outputs from
on_predict_epoch_end
Does your PR introduce any breaking changes? If yes, please list them.
Removes the hooks described above.
cc @Borda @justusschock @carmocca @awaelchli