Odd behaviour with macro averaging #1664

Closed
irisdum opened this issue Mar 28, 2023 · 2 comments · Fixed by #1821
Labels: bug / fix (Something isn't working), help wanted (Extra attention is needed)

Comments

irisdum commented Mar 28, 2023

🐛 Bug

I have a doubt about the macro averaging implementation. I find it odd that, given the same `preds` and `target` tensors, increasing the `num_classes` parameter decreases the metric value. It seems that the score of a class that does not appear in the sample is set to 0, and when "macro" averaging is applied, this 0 is still included in the per-class mean.
Here is an example with the F1 score, but the same behaviour occurs with other classification metrics such as accuracy and recall.

To Reproduce

import torch
import torchmetrics
from torchmetrics import F1Score

print("Torchmetrics version {}".format(torchmetrics.__version__))

target = torch.tensor([2, 1, 0, 0])
preds = torch.tensor([2, 1, 0, 1])

# The same predictions and targets, scored with an increasing number of classes.
for i in range(3, 9):
    f1_score = F1Score(task="multiclass", num_classes=i, average="macro")
    torch_met = f1_score(preds, target)
    print("num class {} f1 score {}".format(i, torch_met))

Output:
Torchmetrics version 0.11.4
num class 3 f1 score 0.7777777910232544
num class 4 f1 score 0.5833333730697632
num class 5 f1 score 0.46666669845581055
num class 6 f1 score 0.3888888955116272
num class 7 f1 score 0.3333333432674408
num class 8 f1 score 0.2916666865348816
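
The cause is visible in the per-class scores: with average=None (one score per class), the classes that never occur in either tensor come out as 0 and are still included in the macro mean. A minimal check (the exact printed values are what I would expect, not copied from a run):

import torch
from torchmetrics import F1Score

target = torch.tensor([2, 1, 0, 0])
preds = torch.tensor([2, 1, 0, 1])

# Per-class F1 with num_classes=6: the three classes that never occur are
# scored 0.0, which drags the macro mean down to
# (0.6667 + 0.6667 + 1.0 + 0 + 0 + 0) / 6 ≈ 0.3889, matching the output above.
per_class = F1Score(task="multiclass", num_classes=6, average=None)(preds, target)
print(per_class)  # expected: tensor([0.6667, 0.6667, 1.0000, 0.0000, 0.0000, 0.0000])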

Expected behavior

To my mind, the F1 score should stay constant even as the num_classes value increases. If a class is not present in the sample, it should not affect the classification metric value.
To avoid this effect, we could set the score of classes that are absent from the sample to NaN and then apply a nanmean.
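
A rough sketch of that idea, computed outside the library (the masking rule here, "a class counts as present if it appears in either preds or target", is one possible choice, not torchmetrics' built-in behaviour):

import torch
from torchmetrics import F1Score

target = torch.tensor([2, 1, 0, 0])
preds = torch.tensor([2, 1, 0, 1])
num_classes = 8

# Per-class scores instead of the built-in macro average.
per_class = F1Score(task="multiclass", num_classes=num_classes, average=None)(preds, target)

# Mark classes that appear in neither preds nor target as NaN ...
seen = torch.zeros(num_classes, dtype=torch.bool)
seen[target] = True
seen[preds] = True
per_class = per_class.clone()
per_class[~seen] = float("nan")

# ... and average only over the classes that are actually present.
print(per_class.nanmean())  # expected: tensor(0.7778), independent of num_classes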

Environment

  • TorchMetrics version: 0.11.4, installed with pip
  • Python 3.10 & PyTorch 2.0.0
irisdum added the bug / fix and help wanted labels on Mar 28, 2023
@github-actions commented

Hi! Thanks for your contribution, great first issue!

SkafteNicki self-assigned this on Mar 29, 2023
@SkafteNicki (Member) commented

Hi @irisdum,
Thanks for reporting this, I will look into it :)
