DOC: sklearn.metrics.auc_score should mention that using probabilities will give better scores #1393

Closed
tjanez opened this issue Nov 22, 2012 · 16 comments

Comments

@tjanez (Contributor) commented Nov 22, 2012

The documentation at http://scikit-learn.org/dev/modules/generated/sklearn.metrics.auc_score.html#sklearn.metrics.auc_score
says that y_score can be either probability estimates of the positive class or binary decisions.

It should warn the reader that when binary decisions are passed, the AUC is computed as if the classifier only returned probabilities of 0 and 1, and thus does not reflect the "real" AUC.

Here is an example:

from sklearn.linear_model import LogisticRegression
from sklearn import metrics
from sklearn import cross_validation
from sklearn import datasets

data = datasets.load_digits()
X, y = data.data, data.target
# make the classification problem binary
X = X[(y == 8) | (y == 6)]
y = y[(y == 8) | (y == 6)]

clf = LogisticRegression(C=0.001)

k_fold = cross_validation.KFold(len(y), k=10, indices=True, shuffle=True, random_state=18)

AUCs = []
AUCs_proba = []
for train, test in k_fold:
    clf.fit(X[train], y[train])
    # AUC computed from hard 0/1 predictions
    AUCs.append(metrics.auc_score(y[test], clf.predict(X[test])))
    # AUC computed from predicted probabilities of the positive class
    AUCs_proba.append(metrics.auc_score(y[test], clf.predict_proba(X[test])[:, 1]))

print "AUCs: "
print AUCs
print "AUCs (with probabilities): "
print AUCs_proba

This is the output:

AUCs: 
[1.0, 0.97222222222222221, 1.0, 0.97058823529411764, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
AUCs (with probabilities): 
[1.0, 1.0, 1.0, 0.99673202614379086, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

I admit this is not the best example, since the difference between AUCs and AUCs_proba can be much bigger in practice, but I wanted to use a built-in data set.

Note that in every fold the AUC computed from binary decisions is never higher than the AUC computed from probability estimates.

@tjanez (Contributor, Author) commented Nov 22, 2012

Great!

Please reference the relevant pull request here, so I can give you my input.

@mblondel (Member)

I think that what is meant by binary decision is not the output of predict but the output of decision_function. And binary refers to binary classification, not binary values. This is indeed a bit unclear. AUC is the area under the ROC curve, and the ROC curve is built by computing the true positive and false positive rates at different decision thresholds. So y_score needs to contain real values. We could check whether np.unique(y_score) contains only 2 values and raise an exception in that case.
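To make the threshold point concrete, here is a minimal sketch (made-up labels and scores, using sklearn.metrics.roc_curve): hard 0/1 predictions yield only a couple of distinct thresholds, while real-valued scores yield one threshold per distinct value.

import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 1, 1, 1, 0])

# Hard 0/1 predictions: only two distinct score values, so the ROC
# "curve" collapses to a single point joined to (0, 0) and (1, 1).
fpr, tpr, thresholds = roc_curve(y_true, np.array([0, 1, 1, 1, 0, 0]))
print(len(thresholds))

# Real-valued scores (decision_function or predict_proba): one
# threshold per distinct score value, i.e. a full ROC curve.
fpr, tpr, thresholds = roc_curve(y_true, np.array([-0.5, 1.2, 0.8, 2.1, -0.1, 0.3]))
print(len(thresholds))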

@GaelVaroquaux (Member)

We could check if np.unique(y_score) contains only 2 values and raise an exception in that case.

I'd rather have a warning than an exception: a two-valued y_score could be legitimate. That said, a warning would be useful.

@tjanez (Contributor, Author) commented Nov 23, 2012

This is a bit unclear.

Yes, I agree. Isn't decision_function a method of regression models? Computing AUC for such models doesn't make sense.
Anyhow, the documentation should be clearer about this.

We could check if np.unique(y_score) contains only 2 values

I agree with @mblondel that we should check for this case and with @GaelVaroquaux that it should only be a warning.
For example, you could have a classifier that doesn't give you probabilities, only 0s and 1s. In this case, when computing the AUC, you would interpret 0s as probability 0.0 and 1s as probability 1.0.
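As a side note on that single-threshold case (an illustration, not from the thread): with only 0/1 scores, the ROC curve has one interior point, and the AUC reduces to (TPR + TNR) / 2, i.e. the balanced accuracy. A quick check with made-up labels:

import numpy as np
from sklearn.metrics import roc_auc_score  # named auc_score at the time of this issue

y_true = np.array([0, 0, 1, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])  # hard 0/1 predictions

tpr = y_pred[y_true == 1].mean()        # sensitivity
tnr = (1 - y_pred[y_true == 0]).mean()  # specificity
print((tpr + tnr) / 2)                  # 0.666...
print(roc_auc_score(y_true, y_pred))    # same value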

@mblondel (Member)

Yes, I agree. Isn't decision_function a method of regression models? Computing AUC for such models doesn't make sense.

decision_function gives you the dot product between the coefficient vectors and the data. It can be interpreted as a score, hence it can be used for AUC.

@tjanez (Contributor, Author) commented Nov 23, 2012

decision_function gives you the dot product between the coefficient vectors and the data. It can be interpreted as a score, hence it can be used for AUC.

Yes, it can be interpreted as a score, but you also need the true class labels to be able to compute the ROC curve and, from that, the AUC.

That's why I said computing AUC for regression models doesn't make sense.

@mblondel (Member)
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_classes=2)
>>> from sklearn.svm import SVC
>>> clf = SVC(kernel="rbf")
>>> clf.fit(X, y)
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
  kernel='rbf', max_iter=-1, probability=False, shrinking=True, tol=0.001,
  verbose=False)
>>> from sklearn.metrics import auc_score
>>> y_score = clf.decision_function(X).ravel()
>>> auc_score(y, y_score)
0.998

@tjanez (Contributor, Author) commented Nov 24, 2012

@mblondel, what did you try to demonstrate with your example?

I think we actually agree on when and how to compute the AUC. Please, read my previous comment.

@amueller (Member)

@tjanez decision_function is not a function for regression models; it is a function that some classifiers have. There are no regression models with a decision_function.

@mblondel (Member)

I think we actually agree on when and how to compute the AUC. Please, read my previous comment.

Well, you wrote that decision_function doesn't make sense for computing the AUC...

@tjanez (Contributor, Author) commented Nov 24, 2012

Well, you wrote that decision_function doesn't make sense for computing the AUC...

Ok, this is a misunderstanding then. I said that "computing AUC for regression models doesn't make sense". I agree with you that you can use the output of the decision function instead of probabilities for computing the AUC.

To return to the original problem of this issue, do we agree that the auc_score method should check if np.unique(y_score) contains only 2 values and raise a warning in that case?
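A minimal sketch of the proposed check (the helper name and warning message are hypothetical):

import warnings
import numpy as np

def _warn_if_binary_scores(y_score):
    # Hypothetical helper sketching the proposal: warn rather than
    # raise when y_score takes only two distinct values, since the
    # AUC is then computed from a single-threshold, degenerate ROC curve.
    if np.unique(y_score).size == 2:
        warnings.warn("y_score contains only two distinct values; "
                      "pass continuous scores (predict_proba or "
                      "decision_function) for a meaningful AUC.")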

@tjanez (Contributor, Author) commented Nov 24, 2012

There are no regression models with a decision_function.

@amueller, for example, LinearRegression has a decision_function: http://scikit-learn.org/dev/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression.decision_function

@mblondel (Member)

Yes, +1 from me. We might want to implement a small Cython utility function check_binary (to be put in utils/arrayfuncs.pyx) instead of using np.unique, which could be expensive for big arrays. check_binary could be useful in BernoulliNB, for instance.
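Sketching the idea in plain Python (the real helper would be written in Cython, as proposed above):

import numpy as np

def check_binary(a):
    # Pure-Python sketch of the proposed utility: scan once and stop
    # as soon as a third distinct value appears, avoiding the full
    # sort/copy that np.unique performs on large arrays.
    seen = set()
    for x in np.asarray(a).ravel():
        seen.add(x)
        if len(seen) > 2:
            return False
    return True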

@amueller (Member)

+1 for raising a warning.
About LinearRegression: I consider this a bug. It is an artifact of the inheritance structure. I'll open an issue.

@BassT commented Apr 6, 2016

+1 also for warning.


@amueller (Member)

Fixed in master.
