[WIP] Eleven point average precision #9091
Remove the eleven average precision score
Add better tests.
Based on work by @ndingwall
I will not finish this PR, as I have no need for the eleven-point metric. I am just opening the PR so that the code doesn't get lost. @ndingwall, if you're interested, please do take over.
I wrote a simpler reference implementation which does not always agree with the one here, see this gist.
@amueller @jnothman @ndingwall can we please make some progress on this PR? This issue has gotten a bit out of hand at this point.
What is your response to @vene's gist?
@jnothman the gist seems to be a to-the-point implementation of the AP metric defined in the Pascal VOC paper. I'd like to hear comments from @ndingwall.
I took a look at @vene's gist and the discrepancies seem to come from weird behavior of To check what was going on, I printed the out Fixing that is probably out of scope for this PR!
I don't like interpolated average precision because it seems like cheating to me (as discussed in #7356); I only included it in the previous incarnation of this PR since it was an easy addition to a function I was already rewriting and might have satisfied #4577. But as I noted in #7356 (comment), the Pascal VOC evaluation code doesn't actually use 11-point AP, and it's not clear whether we should be implementing something that agrees with their code, or something that agrees with their definition (on page 11 of the challenge documentation), or both. There's discussion of that in #4577. In any case, given my views about
@ndingwall speaking on behalf of the Computer Vision community, the VOC AP is still a widely used metric for reporting average precision. Here's the kicker: I took a look at Ross Girshick's AP code and the comments state that the interpolated version is the correct one and not the 11 pt version. You can see it here. That being said, I think it would be immensely useful for scikit-learn to implement a Pascal VOC Average Precision metric since most modern workflows use Python. I agree with your point that having a separate function for VOC specific Average Precision computation (with clarifying documentation) would be the right thing to do. I am of course willing to help see this through, both in terms of code as well as management.
@varunagrawal that sounds good. I have implementations of both 11-point interpolation and the interpolation you linked to if you want them. 11-point is in the existing PR, and the other one is just something like EDIT: here's a 5x faster implementation to compute an interpolated AP:
(Although you should hold off until @amueller / @jnothman / @GaelVaroquaux give their thoughts on making a new function).
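For readers following the thread, here is a minimal sketch of the interpolated AP being discussed. It is not the implementation referred to in the comment above. The function names `precision_recall_points` and `interpolated_ap` are hypothetical, the code uses only NumPy, and it assumes binary labels with at least one positive and no tied scores. Each precision value is replaced by the best precision reachable at that recall or higher, and the result is summed over the recall steps.

```python
import numpy as np


def precision_recall_points(y_true, y_score):
    """Precision and recall after each item of the ranking (highest score first).

    Hypothetical helper; assumes binary labels, at least one positive,
    and no tied scores.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    order = np.argsort(-y_score, kind="stable")  # descending score
    hits = y_true[order]
    true_positives = np.cumsum(hits)
    precision = true_positives / np.arange(1, len(hits) + 1)
    recall = true_positives / hits.sum()
    return precision, recall


def interpolated_ap(y_true, y_score):
    """Area under the interpolated precision-recall curve (illustrative sketch)."""
    precision, recall = precision_recall_points(y_true, y_score)
    # Recall never decreases along the ranking, so "max precision at any
    # recall >= r" is a running maximum taken from the end of the ranking.
    interpolated = np.maximum.accumulate(precision[::-1])[::-1]
    # Width of each recall step, starting from recall 0.
    recall_steps = np.diff(np.concatenate(([0.0], recall)))
    return float(np.sum(recall_steps * interpolated))


y_true = [1, 0, 1, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.5]
print(interpolated_ap(y_true, y_score))
```

For this ranking the interpolated value is 2.5/3 (about 0.833), while the uninterpolated AP is 29/36 (about 0.806), which illustrates why interpolation can only raise the score.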
I am alright with including standard metrics if the fact that they are problematic is clearly noted and referenced.
@amueller @GaelVaroquaux ping!
Any updates on this? It's been a month since the last activity on this issue.
I don't think you should anticipate @GaelVaroquaux finishing this, but you're welcome to open a PR completing the work and try to persuade reviewers that it's sufficiently useful and well documented given the discussion here.
> (Although you should hold off until @amueller / @jnothman / @GaelVaroquaux give their thoughts on making a new function).

No strong thoughts. I am not against this, as long as it is clear, well documented, and tested.
However, don't expect me to invest time in it. I am too busy and it's not high on my priority list.
Seems like we're not really very enthusiastic about inclusion of the algorithm in sklearn anymore.
Continuation of #9017 (itself a continuation of #7356).
Adds an eleven-point interpolation method for average precision, as described in the IR book.
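The eleven-point interpolation mentioned here can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the function name `eleven_point_ap` is hypothetical, it uses only NumPy, and it assumes binary labels with at least one positive and no tied scores. The interpolated precision is read off at the recall levels 0.0, 0.1, ..., 1.0 and the eleven values are averaged.

```python
import numpy as np


def eleven_point_ap(y_true, y_score):
    """Eleven-point interpolated average precision (illustrative sketch).

    Hypothetical function; assumes binary labels, at least one positive,
    and no tied scores.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    order = np.argsort(-y_score, kind="stable")  # descending score
    hits = y_true[order]
    true_positives = np.cumsum(hits)
    precision = true_positives / np.arange(1, len(hits) + 1)
    recall = true_positives / hits.sum()

    levels = np.linspace(0.0, 1.0, 11)
    # Interpolated precision at level r: the best precision at any recall >= r.
    # The small tolerance guards against floating-point noise in the levels.
    interpolated = [precision[recall >= r - 1e-9].max() for r in levels]
    return float(np.mean(interpolated))


y_true = [1, 0, 1, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.5]
print(eleven_point_ap(y_true, y_score))
```

For this ranking the four lowest recall levels get precision 1.0 and the remaining seven get 0.75, so the result is 9.25/11 (about 0.841). Because only eleven fixed recall levels are sampled, the value generally differs from the area under the interpolated curve, which is the source of the disagreement between definitions discussed in the conversation.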