
Fix segmentation.MeanIoU #2698

Merged 8 commits into Lightning-AI:master on Sep 11, 2024
Conversation

@vkinakh (Contributor) commented on Aug 21, 2024

What does this PR do?

Fixes #2558
Fixes a few typos

  • Was this discussed/agreed via a GitHub issue? (not required for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

This PR fixes the segmentation.MeanIoU metric.

Issues

  • when the metric is updated by calling self.update, self.score is accumulated with each call, so when self.compute is called, the accumulated sum is returned rather than the mean
  • when the metric is updated by calling self.forward, the metric works correctly

This happens because when MeanIoU is updated via self.forward, it calls the self._reduce_states method, which updates the score using the following formula:

reduced = ((self._update_count - 1) * global_state + local_state).float() / self._update_count

where global_state is the score accumulated over previous steps and local_state is the score on the current batch.
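
A minimal sketch (not the torchmetrics source) of the faulty accumulation path: update keeps adding per-batch scores into score, so compute returns the running sum rather than the mean, while forward's averaging in _reduce_states masks the problem.

import torch

class BuggyMeanIoU:
    """Simplified illustration of the pre-fix behavior."""

    def __init__(self) -> None:
        self.score = torch.tensor(0.0)  # accumulated state

    def update(self, batch_iou: torch.Tensor) -> None:
        # each call adds the batch mean on top of the previous total
        self.score = self.score + batch_iou.mean()

    def compute(self) -> torch.Tensor:
        # returns the accumulated sum, not the mean over batches
        return self.score

metric = BuggyMeanIoU()
metric.update(torch.tensor([0.5, 0.5]))  # true mean so far: 0.5
metric.update(torch.tensor([0.5, 0.5]))  # true mean still: 0.5
print(metric.compute())  # tensor(1.) -- twice the expected value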

Solution

To solve this issue (a sketch of the resulting logic follows the list):

  • use sum reduce function for self.score
  • add state num_batches to keep number of processed batches
  • add increment of num_batches in every self.update call
  • in self.compute return sum of scores divided by number of processed batches
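
A hedged sketch of the fixed accumulation scheme described above, assuming a simplified update that receives precomputed per-sample IoU values (the real metric computes IoU from preds and target); the class and attribute names mirror this description, not necessarily the merged code.

import torch
from torchmetrics import Metric

class FixedMeanIoU(Metric):
    def __init__(self) -> None:
        super().__init__()
        # sum-reduced states accumulate identically via update() and forward()
        self.add_state("score", default=torch.tensor(0.0), dist_reduce_fx="sum")
        self.add_state("num_batches", default=torch.tensor(0), dist_reduce_fx="sum")

    def update(self, batch_iou: torch.Tensor) -> None:
        self.score += batch_iou.mean()
        self.num_batches += 1  # count processed batches

    def compute(self) -> torch.Tensor:
        # mean over batches instead of the raw accumulated sum
        return self.score / self.num_batches

metric = FixedMeanIoU()
metric.update(torch.tensor([0.5, 0.5]))
metric.update(torch.tensor([0.5, 0.5]))
print(metric.compute())  # tensor(0.5000), the per-batch mean as expected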

📚 Documentation preview 📚: https://torchmetrics--2698.org.readthedocs.build/en/2698/

@vkinakh vkinakh closed this Aug 21, 2024
@vkinakh vkinakh reopened this Aug 21, 2024
@vkinakh vkinakh marked this pull request as draft August 21, 2024 16:42
@vkinakh vkinakh marked this pull request as ready for review August 21, 2024 16:59
codecov bot commented on Aug 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69%. Comparing base (4ce2278) to head (162761d).
Report is 88 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #2698    +/-   ##
=======================================
- Coverage      69%     69%    -0%     
=======================================
  Files         329     316    -13     
  Lines       18071   17909   -162     
=======================================
- Hits        12497   12335   -162     
  Misses       5574    5574            

@Borda Borda changed the title Fix segmentation.MeanIoU (#2558) Fix segmentation.MeanIoU Aug 22, 2024
@Borda (Member) left a comment


Great work! Can we please add a test to prevent this issue in the future?

(review comment on src/torchmetrics/segmentation/mean_iou.py, resolved)
@vkinakh (Contributor, Author) commented on Aug 24, 2024

I have also noticed that the names and descriptions in the tests for GeneralizedDiceScore are just copy-pasted from the MeanIoU tests. The tests themselves are correct, but the names and descriptions are incorrect and misleading:

Should I create a new issue?

class TestMeanIoU(MetricTester):
    """Test class for `MeanIoU` metric."""

    @pytest.mark.parametrize("ddp", [pytest.param(True, marks=pytest.mark.DDP), False])
    def test_mean_iou_class(self, preds, target, input_format, include_background, ddp):
        """Test class implementation of metric."""
        self.run_class_metric_test(
            ddp=ddp,
            preds=preds,
            target=target,
            metric_class=GeneralizedDiceScore,
            reference_metric=partial(
                _reference_generalized_dice,
                input_format=input_format,
                include_background=include_background,
                reduce=True,
            ),
            metric_args={
                "num_classes": NUM_CLASSES,
                "include_background": include_background,
                "input_format": input_format,
            },
        )

    def test_mean_iou_functional(self, preds, target, input_format, include_background):
        """Test functional implementation of metric."""
        self.run_functional_metric_test(
            preds=preds,
            target=target,
            metric_functional=generalized_dice_score,
            reference_metric=partial(
                _reference_generalized_dice,
                input_format=input_format,
                include_background=include_background,
                reduce=False,
            ),
            metric_args={
                "num_classes": NUM_CLASSES,
                "include_background": include_background,
                "per_class": False,
                "input_format": input_format,
            },
        )

@Borda Borda added the bug / fix Something isn't working label Aug 28, 2024
@Borda (Member) commented on Sep 2, 2024

> I have also noticed that the names and descriptions in the tests for GeneralizedDiceScore are just copy-pasted from the MeanIoU tests.

that was a typo, fixing it in #2709

@Borda Borda self-requested a review September 3, 2024 10:02
@SkafteNicki SkafteNicki added this to the v1.4.x milestone Sep 3, 2024
@SkafteNicki (Member) commented

Added generalized testing that forward/update works as expected (comment #2698 (comment)) in commit 3d3d7b5. Hopefully this check will pass for all other metrics; otherwise, let's move this to another PR.
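
For reference, a hypothetical sketch of what such a generalized forward/update check could look like (the function name, shapes, and the input_format argument are illustrative assumptions, not taken from commit 3d3d7b5):

import torch
from torchmetrics.segmentation import MeanIoU

def check_forward_matches_update(batches, **metric_args):
    # two fresh copies of the metric: one fed via forward, one via update
    m_fwd = MeanIoU(**metric_args)
    m_upd = MeanIoU(**metric_args)
    for preds, target in batches:
        m_fwd(preds, target)         # forward: updates state, returns batch value
        m_upd.update(preds, target)  # plain state update, no return value
    # after the fix, both accumulation paths must agree
    torch.testing.assert_close(m_fwd.compute(), m_upd.compute())

batches = [
    (torch.randint(0, 3, (2, 16, 16)), torch.randint(0, 3, (2, 16, 16)))
    for _ in range(3)
]
check_forward_matches_update(batches, num_classes=3, input_format="index")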

@vkinakh (Contributor, Author) commented on Sep 3, 2024

> Added generalized testing that forward/update works as expected (comment #2698 (comment)) in commit 3d3d7b5. Hopefully this check will pass for all other metrics; otherwise, let's move this to another PR.

I suggest adding a test case to verify the metric aggregation behavior: the aggregated metric of a single batch with 2*N predictions should match the aggregated metric from two separate updates of N predictions each (a sketch of such a test follows). Are there any metrics that are not expected to behave this way?
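
A hypothetical sketch of that suggestion (names and shapes are illustrative; input_format="index" is assumed, as in the tests above):

import torch
from torchmetrics.segmentation import MeanIoU

def test_single_vs_split_update():
    # 2*N integer class masks (here N = 2, 3 classes, 16x16 resolution)
    preds = torch.randint(0, 3, (4, 16, 16))
    target = torch.randint(0, 3, (4, 16, 16))

    single = MeanIoU(num_classes=3, input_format="index")
    single.update(preds, target)  # one update with 2*N predictions

    split = MeanIoU(num_classes=3, input_format="index")
    split.update(preds[:2], target[:2])  # first N predictions
    split.update(preds[2:], target[2:])  # remaining N predictions

    # with equal-sized batches, the mean of batch means equals the overall mean
    torch.testing.assert_close(single.compute(), split.compute())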

@mergify mergify bot added the ready label Sep 4, 2024
@Borda Borda enabled auto-merge (squash) September 6, 2024 09:05
vkinakh and others added 6 commits September 11, 2024 17:16
- use sum reduce function for score
- add state `num_batches` to keep number of processed batches
- add increment of `num_batches` in every `update` call
- in `compute` return sum of scores divided by number of processed batches
@Borda Borda disabled auto-merge September 11, 2024 22:10
@Borda Borda merged commit cb1ab37 into Lightning-AI:master Sep 11, 2024
52 of 57 checks passed
Borda pushed a commit that referenced this pull request Sep 11, 2024
- use sum reduce function for score
- add state `num_batches` to keep number of processed batches
- add increment of `num_batches` in every `update` call
- in `compute` return sum of scores divided by number of processed batches

---------

Co-authored-by: Nicki Skafte <[email protected]>
(cherry picked from commit cb1ab37)
Borda pushed a commit that referenced this pull request Sep 13, 2024 (the same cherry-pick of commit cb1ab37).
Successfully merging this pull request may close: segmentation.MeanIoU is wrong (#2558)