kaggle AUC != sklearn AUC #6711

IraKorshunova · 2016-04-25T14:08:01Z

I've written code to compare Kaggle's and sklearn's ROC AUC, and they appear to be very different.
The code to reproduce my results is here:
https://github.com/IraKorshunova/metrics_test
It gives:

Public Kaggle AUC: 0.77880025032
Public Sklearn AUC 0.757001519802

The AUC package in R gives the same score as Kaggle.

jnothman · 2016-04-25T14:17:55Z

May be due to #3864, i.e. it relates to the handling of small differences between scores.
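To make the suspected effect concrete, here is a minimal sketch of how merging nearly equal scores into ties can shift a trapezoidal ROC AUC. It is not sklearn's implementation: the function name `roc_auc_with_tolerance` and its `tol` parameter are hypothetical, chosen only to show the mechanism.

```python
def roc_auc_with_tolerance(y_true, y_score, tol=0.0):
    """Trapezoidal ROC AUC for 0/1 labels.

    Hypothetical illustration: adjacent sorted scores that differ by
    at most `tol` are merged into a single threshold (treated as ties).
    """
    # Sort samples by decreasing score.
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    labels = [y_true[i] for i in order]
    scores = [y_score[i] for i in order]

    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")

    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    area = 0.0
    for k, label in enumerate(labels):
        tp += label
        fp += 1 - label
        is_last = k == len(labels) - 1
        # Emit a ROC point only where the next score is "distinct enough".
        if is_last or scores[k] - scores[k + 1] > tol:
            tpr, fpr = tp / n_pos, fp / n_neg
            area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
            prev_tpr, prev_fpr = tpr, fpr
    return area


y_true = [0, 1, 0, 1]
y_score = [1e-9, 2e-9, 3e-9, 4e-9]  # distinct, but very close together
print(roc_auc_with_tolerance(y_true, y_score))            # 0.75
print(roc_auc_with_tolerance(y_true, y_score, tol=1e-8))  # 0.5
```

With exact comparison the four scores are all distinct and the AUC is 0.75. With a tolerance larger than the gaps, every sample collapses into one tie and the AUC drops to 0.5.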

amueller · 2016-04-25T15:56:22Z

Or interpolation? Probably not, though. It would be nice to know which definition of AUC Kaggle uses without diving into the code...

IraKorshunova · 2016-04-25T16:12:25Z

it's here: https://github.com/benhamner/Metrics/blob/master/Python/ml_metrics/auc.py

mblondel · 2016-05-02T08:31:19Z

I think small differences are to be expected although here the difference seems large.

BTW, in our unit tests, we test our implementation against this alternative implementation:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/tests/test_ranking.py#L87
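For anyone who doesn't want to follow the link, a pairwise-comparison AUC can be sketched as below. This is a minimal illustration, not the code in test_ranking.py: `pairwise_auc` is a name made up here, and counting ties as half a correct pair is an assumption (conventions differ between implementations).

```python
def pairwise_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Assumes 0/1 labels. A tied pair counts as 0.5 (an assumption of this
    sketch, not necessarily what any particular library does).
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


print(pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

This is O(n_pos * n_neg), so it is only practical as a cross-check on small or subsampled data, but it uses exact comparisons and no thresholds or interpolation.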

IraKorshunova · 2016-05-02T08:48:16Z

The alternative gives an AUC that is very close to Kaggle's:

Public Kaggle AUC: 0.77880025032
Public Sklearn AUC 0.757001519802
Public Alternative sklearn AUC 0.776839407575

mblondel · 2016-05-02T09:29:58Z

Could you plot the ROC curve?

IraKorshunova · 2016-05-02T09:54:11Z

chenhe95 · 2016-10-05T16:16:29Z

I will take a look into this.

chenhe95 · 2016-10-21T01:23:51Z

This issue seems to be fixed as of sklearn version 0.18.
The fix came in with merge 49d126f.

amueller · 2016-10-21T18:21:14Z

Thanks for checking @chenhe95. Closing as fixed by #7331

mblondel · 2016-10-25T18:44:30Z

Try to find what commit introduced this change. Also, are there test failures when you comment out these lines?

chenhe95 · 2016-10-25T18:54:59Z

You may have read my original comment (which was incorrect) before I edited it, since your reply is marked as sent by mail.

jnothman · 2016-10-25T20:51:48Z

@chenhe95 just to be sure, you retracted and no longer stand by your comment that blamed some difference on drop_intermediate?

chenhe95 · 2016-10-25T21:36:57Z

Yes. drop_intermediate was not the cause.
The cause was 49d126f, which has already been taken care of.

jnothman mentioned this issue Apr 25, 2016

Bug in metrics.roc_auc_score #3864

Closed

nelson-liu mentioned this issue May 29, 2016

roc_auc_score computation is wrong for large samples #6842

Closed

amueller added the Bug label Aug 15, 2016

amueller added this to the 0.18 milestone Aug 31, 2016

amueller mentioned this issue Sep 6, 2016

error in average_precision_score #5379

Closed

amueller modified the milestones: 0.18, 0.19 Sep 22, 2016

amueller modified the milestone: 0.19 Sep 29, 2016

amueller closed this as completed Oct 21, 2016
