Fix: nDCG cannot be called with negative relevance targets #378
Conversation
Can we also add a test to cover this behaviour, as nothing was failing until now...
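A minimal sketch of such a regression test, assuming the `retrieval_normalized_dcg` functional entry point (the fixture setup in the actual test suite may differ):

```python
import torch
from torchmetrics.functional import retrieval_normalized_dcg

def test_ndcg_with_negative_relevance_targets():
    # Hypothetical regression test: negative relevance targets previously
    # raised; after the fix they should produce a finite score.
    preds = torch.tensor([0.2, 0.8, 0.5])
    target = torch.tensor([-1, 0, 1])
    result = retrieval_normalized_dcg(preds, target)
    assert torch.isfinite(result)
```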
Codecov Report
```
@@           Coverage Diff           @@
##           master     #378   +/-   ##
=======================================
  Coverage   96.44%   96.45%
=======================================
  Files         120      120
  Lines        3801     3804     +3
=======================================
+ Hits         3666     3669     +3
  Misses        135      135
```
@paul-grundmann could you try changing the
@paul-grundmann mind checking the failing tests and #378 (comment)?
It seems that the random generation of inputs can lead to invalid inputs for the calculation of the DCG (e.g. a single target with 0 relevance). This leads to a division by zero.
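For illustration, a minimal sketch of the failure mode (a toy `dcg` helper, not the torchmetrics code itself):

```python
import torch

def dcg(target: torch.Tensor) -> torch.Tensor:
    # Toy DCG over an already-ranked relevance vector.
    positions = torch.arange(target.shape[-1], dtype=torch.float)
    return (target / torch.log2(positions + 2.0)).sum(dim=-1)

target = torch.tensor([0.0])          # single target with 0 relevance
ideal_dcg = dcg(target.sort(descending=True).values)
print(dcg(target) / ideal_dcg)        # tensor(nan): 0 / 0
```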
Should I add a check in the nDCG calculation for whether the ideal DCG is zero, and return 0.0 in that case?
It seems to me that the nDCG should be changed to account for division by zero. Looking at sklearn's implementation:
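Roughly, scikit-learn's `_ndcg_sample_scores` sidesteps the zero ideal DCG by masking those rows out. A paraphrased sketch (not the verbatim sklearn source):

```python
import numpy as np

def ndcg_from_dcg(dcg: np.ndarray, ideal_dcg: np.ndarray) -> np.ndarray:
    # Rows whose ideal DCG is 0 contain no relevant documents at all;
    # scikit-learn assigns them a score of 0 instead of dividing by zero.
    gain = np.zeros_like(dcg, dtype=float)
    relevant = ideal_dcg > 0
    gain[relevant] = dcg[relevant] / ideal_dcg[relevant]
    return gain
```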
Yes, I think the functional implementation should behave like the sklearn implementation.
- Use the scikit-learn implementation of nDCG
- Removed the test for non-binary targets in test_ndcg.py and replaced the default parameters in the error test with a custom one that does not check for binary targets
- Set the `_input_retrieval_scores_non_binary_target` low to -1 to reduce the test failure rate
OK, I had some time to play around, since the recent code change introduced a lot more failing tests. The other issue is the current architecture of the tests. I think the first problem is the more drastic one, because the tests can actually fail randomly: with relevance targets of only -1 instead of -2 I had a lot of successful test runs, but still one or two failing at some point. Maybe the inputs need to be generated differently, or at least with a check that the targets with k=1 contain at least one relevant sample; see the sketch below.
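A sketch of such a guarded generator, assuming targets are drawn uniformly at random (the helper name and value ranges are made up for illustration):

```python
import torch

def gen_targets_with_relevant(num_queries: int, k: int,
                              low: int = -1, high: int = 4) -> torch.Tensor:
    # Hypothetical helper: redraw any row without a strictly positive
    # relevance, so every query has at least one relevant sample and the
    # ideal DCG cannot collapse to zero (even at k=1).
    target = torch.randint(low, high, (num_queries, k))
    bad = ~(target > 0).any(dim=-1)
    while bad.any():
        target[bad] = torch.randint(low, high, (int(bad.sum()), k))
        bad = ~(target > 0).any(dim=-1)
    return target
```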
LGTM 🐰
Hello @paul-grundmann! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-07-28 16:07:39 UTC
Before submitting
What does this PR do?
Fixes #377
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Of course I had fun fixing this issue :)