
The compute_groups may lead to unexpected behavior #2596

Closed
YicunDuanUMich opened this issue Jun 12, 2024 · 2 comments
Labels
bug / fix (Something isn't working) · help wanted (Extra attention is needed)

Comments

@YicunDuanUMich

🐛 Bug

When I use MetricCollection to store similar metrics, the compute_groups feature automatically merges their states without any warning. I have two similar metrics, "star classification accuracy" and "galaxy classification accuracy", whose only difference is a string attribute source_type_filter ("star" vs. "galaxy") that tells them to apply different filters. During validation, I find that "star classification accuracy" is always equal to "galaxy classification accuracy" because MetricCollection merges their states.

To Reproduce

I attach my Hydra config here to show my setup:

metrics:
  _target_: torchmetrics.MetricCollection
  _convert_: "partial"
  metrics:
    source_type_accuracy:
      _target_: bliss.encoder.metrics.SourceTypeAccuracy
      flux_bin_cutoffs: [200, 400, 600, 800, 1000]
    source_type_accuracy_star:
      _target_: bliss.encoder.metrics.SourceTypeAccuracy
      flux_bin_cutoffs: [200, 400, 600, 800, 1000]
      source_type_filter: "star"
    source_type_accuracy_galaxy:
      _target_: bliss.encoder.metrics.SourceTypeAccuracy
      flux_bin_cutoffs: [200, 400, 600, 800, 1000]
      source_type_filter: "galaxy"

Expected behavior

I would expect the default value of compute_groups in MetricCollection to be False, so that states are never merged without an explicit opt-in.

Environment

  • TorchMetrics version (and how you installed TM, e.g. conda, pip, build from source): 0.11.3
  • Python & PyTorch Version (e.g., 1.0): 2.0
  • Any other relevant information such as OS (e.g., Linux): Linux

Additional context

@YicunDuanUMich added the bug / fix and help wanted labels on Jun 12, 2024

Hi! Thanks for your contribution, great first issue!

@SkafteNicki
Member

Hi @YicunDuanUMich, sorry for the late reply.
I am pretty sure that the bug you are seeing was fixed in later versions of torchmetrics. In particular, I think this PR contains the solution to your issue: #2571.
As an example, here is a script that initializes the classification metric Precision with both average="macro" and average="weighted", i.e. two metrics whose states are identical and that differ only in how the values are aggregated at the end.

import torch
import torchmetrics

# Two Precision metrics with identical states; they differ only in how the
# state is aggregated at compute time (average="weighted" vs. average="macro").
collection = torchmetrics.MetricCollection({
    "weighted_precision": torchmetrics.Precision(task="multiclass", average="weighted", num_classes=3),
    "macro_precision": torchmetrics.Precision(task="multiclass", average="macro", num_classes=3),
})

for _ in range(3):
    x = torch.randn(10, 3).softmax(dim=1)  # random predictions over 3 classes
    y = torch.randint(0, 3, (10,))         # random integer targets
    collection.update(x, y)
    out = collection.compute()
    print(out)

In v0.11.3 of torchmetrics I see roughly the behavior you are describing, i.e. the second metric is not updated correctly. In the newest version of torchmetrics, however, everything works as expected.
Therefore, please update to the newest version of torchmetrics.
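
If you want to double-check what the collection does with your metrics, newer releases let you inspect the detected grouping after the first update. A small sketch using the compute_groups property (treat the exact attribute as an assumption if you are on an older release):

import torch
import torchmetrics

collection = torchmetrics.MetricCollection({
    "weighted_precision": torchmetrics.Precision(task="multiclass", average="weighted", num_classes=3),
    "macro_precision": torchmetrics.Precision(task="multiclass", average="macro", num_classes=3),
})

# The grouping is only decided after the first update call.
collection.update(torch.randn(10, 3).softmax(dim=1), torch.randint(0, 3, (10,)))

# Both metrics share a state, so they should end up in a single compute group,
# e.g. something like {0: ['weighted_precision', 'macro_precision']}.
print(collection.compute_groups)

And if you cannot upgrade right away, passing compute_groups=False to MetricCollection disables the grouping entirely, at the cost of some extra memory and compute.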
Closing issue.
