Correctly reset metric objects in self.log #7055
Codecov Report

@@           Coverage Diff            @@
##           master    #7055   +/-   ##
========================================
  Coverage        ?      87%
========================================
  Files           ?      196
  Lines           ?    12577
  Branches        ?        0
========================================
  Hits            ?    10949
  Misses          ?     1628
  Partials        ?        0
pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py (outdated review thread; resolved)
if self._internal_type == ResultStoreType.INSIDE_BATCH_TRAIN_LOOP:
    for opt_idx in list(epoch_metrics):
        epoch_metrics[opt_idx].reset()
Could you explain this check? On the surface, INSIDE_BATCH_TRAIN_LOOP reads like we're not at epoch end yet.
I am not completely sure what is going on myself, but apparently self._internal_type is equal to ResultStoreType.INSIDE_BATCH_TRAIN_LOOP even when we are at epoch end, for all training metrics. For validation metrics, self._internal_type is equal to ResultStoreType.OUTSIDE_BATCH_TRAIN_LOOP (at least when this reset function is called).
Correct. I think BATCH should be removed. But I'd also like to clear all this entirely at some point.
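To make that concrete, here is a hedged sketch of how the two store types could drive the reset at epoch end. The INSIDE_BATCH_TRAIN_LOOP branch is quoted from the diff above; the OUTSIDE_BATCH_TRAIN_LOOP branch is an assumption about the validation path, not verified against the actual code base:

# Sketch only: epoch_metrics comes from the surrounding reset() loop over dataloaders.
if self._internal_type == ResultStoreType.INSIDE_BATCH_TRAIN_LOOP:
    # Training metrics land here even at epoch end, keyed by optimizer index.
    for opt_idx in list(epoch_metrics):
        epoch_metrics[opt_idx].reset()
else:  # ResultStoreType.OUTSIDE_BATCH_TRAIN_LOOP, i.e. validation metrics
    # Assumed: a single Result object per dataloader, reset directly.
    epoch_metrics.reset()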
pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py (outdated review thread; resolved)
Thanks for the fix @SkafteNicki!
Co-authored-by: ananthsub <[email protected]>
Working for me 😃
Nice job @SkafteNicki, thanks for the swift fix.
Hello @SkafteNicki! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-04-19 08:46:31 UTC
lgtm
pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py (review thread; resolved)
LGTM!
@@ -587,6 +578,14 @@ def get_non_metrics_keys(self):
        """
        return [k for k, v in self.items() if not isinstance(v, Metric)]

    def reset(self) -> None:
Suggested change:
-    def reset(self) -> None:
+    def reset_metrics(self) -> None:
        """
        Call at the end of epoch to reset Result objects
        """
        for dl_idx in range(self.num_dataloaders):
            epoch_metrics = self._internals[dl_idx] if not self.has_reduced else self._internals_reduced[dl_idx]
In current usage, self.has_reduced should always be True, right? Better to add assert self.has_reduced == True there and use self._internals_reduced directly?
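A hedged sketch of what that suggested refactor might look like, stitched together from the snippets in this thread and not verified against the actual code base:

def reset(self) -> None:
    """
    Call at the end of epoch to reset Result objects.
    """
    # Per the suggestion: reduction should already have happened by now.
    assert self.has_reduced
    for dl_idx in range(self.num_dataloaders):
        epoch_metrics = self._internals_reduced[dl_idx]
        ...  # then branch on self._internal_type as in the sketch earlier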
What does this PR do?
Fixes #7052
Since commit Lightning-AI/torchmetrics@19b77cc in torchmetrics, the integration with lightning has been broken whenever a metric object is logged directly. This is because lightning's logging implicitly depends on the self._computed variable not being reset whenever reset is called. This PR tries to solve this by making sure that reset of metric objects is only called once, at the very end. Please note that I am not so familiar with the logger connector part of the code base.
As an alternative, we could revert the changes in torchmetrics. However, it does make some sense that the self._computed variable should be reset when reset is called.
cc: @ethanwharris
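To illustrate the failure mode described above, a minimal sketch. The Counter metric is invented for illustration; _computed is the torchmetrics cache attribute named in the description, and the behavior shown assumes a torchmetrics version that includes 19b77cc:

import torch
from torchmetrics import Metric


class Counter(Metric):
    """Toy metric that counts how many times update() is called."""

    def __init__(self):
        super().__init__()
        self.add_state("count", default=torch.tensor(0), dist_reduce_fx="sum")

    def update(self):
        self.count += 1

    def compute(self):
        return self.count


metric = Counter()
metric.update()
value = metric.compute()    # compute() caches its result in metric._computed
metric.reset()              # since 19b77cc, reset() also clears _computed
# Lightning's logging implicitly read _computed after reset; it now gets
# None instead of the cached epoch value, which is what this PR works around.
print(metric._computed)     # -> None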
PR review
Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the review guidelines.
Did you have fun?
Make sure you had fun coding 🙃