average_precision_score does not return correct AP when all negative ground truth labels #8245
Comments
If someone can help me understand how to correctly do the False Positive calculation, I can submit a Pull Request. For this bug, recall will be 1 since there are 0 True Positives and 0 False Negatives. However, to calculate precision, I need to understand the correct way to find False Positives. The above sample results in `nan`, and I am not sure I quite understand how the threshold calculation is being performed here.
Can you check if #7356 fixes this?
No, it doesn't. I think we just need to do something like this to handle this edge case: `recall = 1 if tps[-1] == 0 else tps / tps[-1]`. @varunagrawal if you do a PR, please add a non-regression test with only zeros in `y_true`.
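A small sketch of the edge case and the proposed guard, using made-up cumulative counts rather than scikit-learn's internals:

```python
# Hypothetical cumulative true/false positive counts, ordered by decreasing
# score threshold, for an all-negative y_true (no true positives anywhere).
import numpy as np

tps = np.array([0, 0, 0, 0])
fps = np.array([1, 2, 3, 4])

precision = tps / (tps + fps)  # well defined whenever tps + fps > 0
# tps[-1] is the total number of positive labels; dividing by it gives nan
# here, hence the proposed guard:
recall = np.ones_like(tps, dtype=float) if tps[-1] == 0 else tps / tps[-1]

print(precision, recall)  # [0. 0. 0. 0.] [1. 1. 1. 1.]
```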
@lesteve can you please also specify what needs to be done for precision? Or should that stay as is?
I believe the code works as it is for precision. You can add a test with only 1s in `y_true`.
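For reference, a sketch of the kind of test suggested here (the test name and values are illustrative, not the actual scikit-learn test):

```python
import numpy as np
from sklearn.metrics import average_precision_score

def test_average_precision_all_positive_labels():
    # With only positive labels every prediction is a true positive, so
    # precision is 1 at every threshold and the average precision is 1.
    y_true = np.ones(4, dtype=int)
    y_score = np.array([0.1, 0.4, 0.35, 0.8])
    assert average_precision_score(y_true, y_score) == 1.0
```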
@lesteve hoping you can take a look at the PR.
Any updates on this? Is this merged?
The current functionality of average_precision has changed. I'm planning to submit a new PR for that. Will close this when the other PR is ready.
The standard TREC Eval is able to compute AP and other metrics on the same data.
So how did you solve it?
What's the status on this?
Same problem here with sklearn 0.23.2.
Finally got the PR in. Sorry about the delay.
Any update on this issue?
Just updated sklearn and this still appears to be an issue?
It's been 5 years now with this issue. I've opened the PR and updated it countless times, but the only blocker is the approving review.
Hi @lesteve, are you planning to approve this PR? Please let us know. Thank you!
@varunagrawal thanks for submitting a fix :) Can you link your update so that I can copy it to my own code base? Also, I think this issue also presents itself when a class has no examples; for example, a multilabel `y_true` with an all-zero column will obviously trigger this bug.
But I think it will still be present even after removing the all-0 vector, because the final column is all 0s. Do you also find this to be the case? If so, does your PR cover this case? If not, perhaps there's something wrong with my own code.
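A hedged illustration of the multilabel situation described above; the exact `y_true` from that comment is not available, so this data is made up:

```python
import numpy as np
from sklearn.metrics import average_precision_score

# Three samples, three classes; the last class has no positive examples.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 1, 0]])
y_score = np.array([[0.9, 0.2, 0.1],
                    [0.1, 0.8, 0.3],
                    [0.7, 0.6, 0.2]])

# Macro averaging takes the per-class AP, so the all-zero third column hits
# the same no-positive-labels edge case as the binary example in this issue.
print(average_precision_score(y_true, y_score, average="macro"))
```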
Hey there. Is this issue going to be fixed?
Hi, it's 2022 now and this is still an issue, any plans for fixing it? This issue persists when `y_true` contains only negative labels.
Hi, it's 2022 now and this issue has been fixed (see #19085). I encourage you to test the release candidate for version 1.1.0: `pip install scikit-learn==1.1.0rc1`.
OK, this works both with all -1 and all 0. Thanks!
Description
`average_precision_score` does not return correct AP when `y_true` is all negative labels.
Steps/Code to Reproduce
One can run this piece of dummy code:
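A minimal sketch of the kind of dummy code described, assuming an all-zero `y_true` and arbitrary scores (the exact values are illustrative):

```python
import numpy as np
from sklearn.metrics import average_precision_score

# No positive labels at all in the ground truth.
y_true = np.array([0, 0, 0, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

print(average_precision_score(y_true, y_score))
```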
It returns `nan` instead of the correct value, together with an error.
Expected Results
As per this Stack Overflow answer, Recall = 1 when FN = 0, since 100% of the TP were discovered, and Precision = 1 when FP = 0, since there were no spurious results.
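That convention can be written out as a small sketch; the helper names below are made up for illustration and are not part of scikit-learn:

```python
def recall_with_convention(tp, fn):
    # When there are no positives to recover (TP + FN == 0), treat recall as 1.
    return 1.0 if (tp + fn) == 0 else tp / (tp + fn)

def precision_with_convention(tp, fp):
    # When nothing was predicted positive (TP + FP == 0), treat precision as 1.
    return 1.0 if (tp + fp) == 0 else tp / (tp + fp)

# All-negative y_true with no predicted positives: TP = FP = FN = 0.
print(recall_with_convention(0, 0))     # 1.0
print(precision_with_convention(0, 0))  # 1.0
```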
Actual Results
Current output is `nan`, with the error described above.
Versions
Linux-4.4.0-59-generic-x86_64-with-Ubuntu-16.04-xenial
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609]
NumPy 1.12.0
SciPy 0.18.1
Scikit-Learn 0.18.1