Improve AUC numeric stability #224

SkafteNicki · 2021-05-04T13:54:42Z

Before submitting

Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
Did you read the contributor guideline, Pull Request section?
Did you make sure to update the docs?
Did you write any new necessary tests?

What does this PR do?

Fixes #219
In rare cases when the input is very large, the dx computed here
https://github.com/PyTorchLightning/metrics/blob/cb6899bb47c7d30f8626d6ef8c28cc29efce82d6/torchmetrics/functional/classification/auc.py#L41-L49
will contain a mix of positive and negative entries even for monotonically increasing/decreasing input (input that is already sorted), which makes the algorithm raise an error. This happens when x[i] and x[i+1] are the same number, but due to limited numerical precision x[i+1]-x[i] comes out as either a small negative or a small positive number.

This PR solves it by adding a small tolerance to the checks.

I checked with the provided repo/script in the discussion that it correctly solves the error.
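
To make the failure mode concrete, here is a minimal plain-Python sketch of the check before and after adding a tolerance. It is illustrative only: the real code works on torch tensors, the function names and the value of `TOL` are assumptions made for this example, and the fall-through to `direction = 1` is assumed from the linked lines rather than shown in this PR's diff.

```python
# Illustrative sketch in plain Python; the real code operates on torch tensors.
# `TOL` and both function names are assumptions made for this example.
TOL = 1e-6


def direction_strict(dx):
    """Direction check without a tolerance."""
    if any(d < 0 for d in dx):
        if all(d <= 0 for d in dx):
            return -1.0
        raise ValueError("x is neither increasing nor decreasing")
    return 1.0


def direction_with_tol(dx, tol=TOL):
    """Direction check with the tolerance-shifted conditions."""
    if any(d + tol < 0 for d in dx):
        if all(d <= tol for d in dx):
            return -1.0
        raise ValueError("x is neither increasing nor decreasing")
    return 1.0


# Differences of an already sorted input in which two repeated values
# produced tiny rounding noise of either sign.
noisy_dx = [0.5, -1e-9, 2e-9, 0.25]

try:
    direction_strict(noisy_dx)
    strict_raised = False
except ValueError:
    strict_raised = True

tolerant_direction = direction_with_tol(noisy_dx)
```

With these inputs the strict check raises because of the single `-1e-9` entry, while the tolerant check treats it as zero and reports an increasing input.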

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

pep8speaks · 2021-05-04T13:54:45Z

Hello @SkafteNicki! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-05-05 17:35:31 UTC

codecov · 2021-05-04T13:56:30Z

Codecov Report

Merging #224 (9692400) into master (33864db) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #224   +/-   ##
=======================================
  Coverage   96.80%   96.80%           
=======================================
  Files          92       92           
  Lines        3005     3005           
=======================================
  Hits         2909     2909           
  Misses         96       96

Flag	Coverage Δ
Linux	`79.06% <66.66%> (ø)`
Windows	`79.06% <66.66%> (ø)`
cpu	`96.80% <100.00%> (ø)`
macOS	`96.80% <100.00%> (ø)`
pytest	`96.80% <100.00%> (ø)`
python3.6	`95.74% <100.00%> (ø)`
python3.8	`96.77% <100.00%> (+0.09%)`	⬆️
python3.9	`96.67% <100.00%> (ø)`
torch1.3.1	`95.74% <100.00%> (ø)`
torch1.4.0	`95.87% <100.00%> (?)`
torch1.8.1	`96.67% <100.00%> (ø)`

Impacted Files	Coverage Δ
torchmetrics/functional/classification/auc.py	`87.50% <100.00%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 33864db...9692400.

maximsch2 · 2021-05-04T16:00:28Z

torchmetrics/functional/classification/auc.py

-        if (dx < 0).any():
-            if (dx <= 0).all():
+        if (dx + tol < 0).any():
+            if (dx <= tol).all():


Am I reading it correctly that if I have

dx = [-1e-7, 5e-7, 5e-7, ... 10000 times ...., 5e-7] then we'll deduce the incorrect direction here?

Borda · 2021-05-04

Yes, it seems that the direction/sign is changed, so shall we preserve it...

SkafteNicki · 2021-05-04

Maybe I am misunderstanding the question...
The tensor
dx = [1e-7, 5e-7, 5e-7, ... 10000 times ...., 5e-7]
implies that direction=1, right? So if we say that numerical instability leads to a change of sign in the first element,
dx = [-1e-7, 5e-7, 5e-7, ... 10000 times ...., 5e-7]
then it will still be direction=1 with the change.

maximsch2 · 2021-05-04

Sorry, let's maybe use this instead:
dx = [-1.0, 5e-7, 5e-7, ... 10000 times ...., 5e-7]
Now both (dx + tol < 0).any() and (dx <= tol).all() are true and we'll detect the direction as -1, whereas the whole thing is incorrectly sorted.
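
The two cases discussed so far can be checked with a small plain-Python sketch. It is illustrative only: the thread does not state the tolerance, so `TOL = 1e-6` is an assumed value, and `direction_with_tol` is a hypothetical stand-in for the tensor-based check in the diff.

```python
# Illustrative sketch; `TOL` is an assumed value and the function name is hypothetical.
TOL = 1e-6


def direction_with_tol(dx, tol=TOL):
    """Plain-Python stand-in for the tolerance-shifted check in the diff."""
    if any(d + tol < 0 for d in dx):
        if all(d <= tol for d in dx):
            return -1.0
        raise ValueError("x is neither increasing nor decreasing")
    return 1.0


# Case 1: rounding noise flips the sign of a tiny first step.
noise_dx = [-1e-7] + [5e-7] * 10000
noise_direction = direction_with_tol(noise_dx)

# Case 2: one large genuine decrease followed by many tiny increases.
bad_dx = [-1.0] + [5e-7] * 10000
any_below = any(d + TOL < 0 for d in bad_dx)   # the -1.0 entry triggers this
all_within = all(d <= TOL for d in bad_dx)     # every 5e-7 entry is below TOL
bad_direction = direction_with_tol(bad_dx)
```

Under this assumed tolerance, case 1 is still classified as increasing, while case 2 satisfies both conditions and is classified as decreasing even though the input is not monotone.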

SkafteNicki · 2021-05-05

Thanks, I see the problem now...
Do you have any way to solve this?

maximsch2 · 2021-05-05

Numerical issues are usually tricky. I'm suggesting we remove checks for the internal callers of this function, let me quickly show an example.

maximsch2 · 2021-05-05

@SkafteNicki, check out #230

SkafteNicki · 2021-05-06

@maximsch2 looks good, closing this in favour of yours
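
For readers following the thread, the suggestion to skip the monotonicity check for internal callers might look roughly like the sketch below. This is not the code from #230: it is a plain-Python illustration, and every function name in it is hypothetical.

```python
# Hypothetical sketch of splitting validation from computation.
# None of these names are taken from the library or from #230.


def _trapz(y, x):
    """Trapezoidal rule over paired samples."""
    return sum(
        (x[i + 1] - x[i]) * (y[i + 1] + y[i]) / 2.0 for i in range(len(x) - 1)
    )


def auc_without_check(x, y, direction):
    """Internal entry point: the caller guarantees ordering and passes the direction."""
    return direction * _trapz(y, x)


def auc_with_check(x, y):
    """Public entry point: validates monotonicity, then delegates."""
    dx = [b - a for a, b in zip(x, x[1:])]
    if any(d < 0 for d in dx):
        if all(d <= 0 for d in dx):
            direction = -1.0
        else:
            raise ValueError("x is neither increasing nor decreasing")
    else:
        direction = 1.0
    return auc_without_check(x, y, direction)


# An internal caller that already knows its x values are sorted ascending
# can bypass the check, so rounding noise in dx cannot trigger an error.
area_checked = auc_with_check([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])
area_internal = auc_without_check([0.0, 0.5, 1.0], [0.0, 0.5, 1.0], 1.0)
area_decreasing = auc_with_check([1.0, 0.5, 0.0], [1.0, 0.5, 0.0])
```

The design choice here is that the strict check stays in place for user-supplied input, while code paths that produce sorted values themselves never run it.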

SkafteNicki added 2 commits May 4, 2021 11:04

num_stability

e2033f9

update

9134732

SkafteNicki added the bug / fix Something isn't working label May 4, 2021

SkafteNicki requested review from ananyahjha93, Borda, justusschock and tchaton as code owners May 4, 2021 13:54

changelog

b3eb4cf

Borda approved these changes May 4, 2021

Merge branch 'master' into auc_numeric_stability

794439e

Borda enabled auto-merge (squash) May 4, 2021 13:59

Borda added the ready label May 4, 2021

Merge branch 'master' into auc_numeric_stability

de3655c

maximsch2 reviewed May 4, 2021

Borda disabled auto-merge May 4, 2021 16:32

mergify bot added 2 commits May 4, 2021 18:42

Merge branch 'master' into auc_numeric_stability

c3890e8

Merge branch 'master' into auc_numeric_stability

61fccd9

Borda requested a review from a team May 4, 2021 22:16

Merge branch 'master' into auc_numeric_stability

0c13511

SkafteNicki removed the ready label May 5, 2021

Merge branch 'master' into auc_numeric_stability

9692400

SkafteNicki closed this May 6, 2021

Borda deleted the auc_numeric_stability branch May 10, 2021 08:24
