kaggle AUC != sklearn AUC · Issue #6711 · scikit-learn/scikit-learn · GitHub

kaggle AUC != sklearn AUC #6711

Closed
IraKorshunova opened this issue Apr 25, 2016 · 14 comments

@IraKorshunova

I've written code to compare Kaggle's and sklearn's ROC AUC, and the two scores differ substantially.
The code to reproduce my results is here:
https://github.com/IraKorshunova/metrics_test
It gives:

Public Kaggle AUC: 0.77880025032
Public Sklearn AUC 0.757001519802

The AUC package in R gives the same score as Kaggle.

@jnothman
Member
jnothman commented Apr 25, 2016

May be due to #3864, i.e. it relates to the handling of small differences between scores.
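
To illustrate how the handling of small score differences could move an AUC, here is a minimal pure-Python sketch. It is not sklearn's or Kaggle's code: `pairwise_auc` and its `tol` parameter are hypothetical names made up for this example. It computes AUC as the fraction of (positive, negative) pairs ranked correctly and optionally treats scores closer than `tol` as tied.

```python
def pairwise_auc(y_true, y_score, tol=0.0):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Pairs whose scores differ by at most `tol` count as ties (half credit).
    `tol` is a hypothetical knob used only for this illustration.
    Labels are assumed to be 0 (negative) and 1 (positive).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    total = 0.0
    for p in pos:
        for n in neg:
            if abs(p - n) <= tol:
                total += 0.5   # treated as a tie
            elif p > n:
                total += 1.0   # positive ranked above negative
    return total / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
y_score = [1e-9, 3e-9, 2e-9, 4e-9]   # very small but distinct scores

exact = pairwise_auc(y_true, y_score)              # every difference counts
merged = pairwise_auc(y_true, y_score, tol=1e-8)   # tiny differences become ties
print(exact, merged)
```

With these toy scores the exact comparison gives 0.75, while merging near-equal scores collapses every pair into a tie and gives 0.5. Whether this is what happens on the Kaggle data is exactly what the thread is trying to establish.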

@amueller
Member

Or interpolation? Probably not, though. It would be nice to know the definition of AUC that Kaggle uses without diving into the code...

@IraKorshunova
Author

@mblondel
Member
mblondel commented May 2, 2016

I think small differences are to be expected although here the difference seems large.

BTW, in our unit tests, we test our implementation against this alternative implementation:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/tests/test_ranking.py#L87
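
For anyone who wants an independent cross-check without opening the test file, one common formulation is the rank-sum (Mann-Whitney U) version of AUC. The sketch below is an assumption-laden illustration, not the helper from `test_ranking.py` and not Kaggle's definition; `rank_auc` is a hypothetical name, labels are assumed to be 0/1, and only exactly equal scores are treated as ties.

```python
def rank_auc(y_true, y_score):
    """AUC from the rank-sum (Mann-Whitney U) statistic.

    Tied scores receive the average of the ranks they span. Only exact
    equality counts as a tie; labels are assumed to be 0 or 1.
    """
    n = len(y_score)
    order = sorted(range(n), key=lambda i: y_score[i])

    # Assign 1-based ranks, averaging over runs of equal scores.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


print(rank_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Running the same labels and scores through this function and through `roc_auc_score` is a quick way to see whether a discrepancy comes from the metric implementation or from the data pipeline.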

@IraKorshunova
Author

The alternative implementation gives an AUC that is very close to Kaggle's:

Public Kaggle AUC: 0.77880025032
Public Sklearn AUC 0.757001519802
Public Alternative sklearn AUC 0.776839407575

@mblondel
Member
mblondel commented May 2, 2016

Could you plot the ROC curve?

@IraKorshunova
Author

[Image: figure_1 (ROC curve plot)]

@chenhe95
Contributor
chenhe95 commented Oct 5, 2016

I will take a look into this.

@chenhe95
Contributor
chenhe95 commented Oct 21, 2016

This issue seems to be fixed as of sklearn version 0.18.
It was fixed by this merge: 49d126f

@amueller
Member

Thanks for checking @chenhe95. Closing as fixed by #7331

@mblondel
Member
mblondel commented Oct 25, 2016 via email

@chenhe95
Contributor

You might have read my original comment (which was incorrect) before I edited my post, since your reply says it was sent via email.

@jnothman
Member

@chenhe95 just to be sure, you retracted and no longer stand by your comment that blamed some difference on drop_intermediate?

@chenhe95
Contributor

Yes. drop_intermediate was not the cause.
The cause was the problem addressed by 49d126f, which has already been taken care of.
