-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove is_global_zero
check in training_epoch_loop
#12134
Conversation
This might break third-party loggers whose integration is not part of the PL codebase though and who didn't decorate the save function but still implement it. Also: is it documented somewhere, that this is required? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM ! But I still see value in @justusschock comment. This would have to be explicit in the documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok with me as an intermediate step for #11676 (would address the comment of @justusschock and shift the responsibility to the log method directly)
@justusschock I agree, this is a breaking change. @daniellepintz could you update the changelog to document the behavior change here? The motivation is the same as #8589 . In this case, users may want to have a logger per process to publish metrics. This can be very useful in distributed training for detecting outliers. Currently, the loop is hardcoded to only flush the metrics on rank 0. In v1.7 this behavior will change given |
I updated the changelog! |
What does this PR do?
This check in
training_epoch_loop
is unnecessary so we can remove it.This is because
save
on theLightningLoggerBase
API is not implemented:https://github.com/PyTorchLightning/pytorch-lightning/blob/b29b07e9788311326bca4779d70e89eb36bfc57f/pytorch_lightning/loggers/base.py#L173-L174
And
save
is only overridden for CSV and TB loggers, and both implementations are decorated with @rank_zero_only:https://github.com/PyTorchLightning/pytorch-lightning/blob/b29b07e9788311326bca4779d70e89eb36bfc57f/pytorch_lightning/loggers/csv_logs.py#L204-L207
https://github.com/PyTorchLightning/pytorch-lightning/blob/b29b07e9788311326bca4779d70e89eb36bfc57f/pytorch_lightning/loggers/tensorboard.py#L254-L264
Fixes #12127
Does your PR introduce any breaking changes? If yes, please list them.
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:
Did you have fun?
Make sure you had fun coding 🙃