MRG add log loss (cross-entropy loss) to metrics #2013
Changes from all commits
@@ -36,6 +36,7 @@
     hamming_loss,
     hinge_loss,
     jaccard_similarity_score,
+    log_loss,
     matthews_corrcoef,
     mean_squared_error,
     mean_absolute_error,
@@ -1801,3 +1802,30 @@ def test__column_or_1d():
         assert_array_equal(_column_or_1d(y), np.ravel(y))
     else:
         assert_raises(ValueError, _column_or_1d, y)
+
+
+def test_log_loss():
+    # binary case with symbolic labels ("no" < "yes")
+    y_true = ["no", "no", "no", "yes", "yes", "yes"]
A review thread attached to this line:

- **Reviewer:** Is supporting string format necessary for metrics?
- **Author:** I don't believe so, but it fell out of …
- **Reviewer:** The user could use …
- **Author:** Of course, but in the binary case, … Inconsistencies have crept into the scikit-learn API, and I'm dealing with them here so the user doesn't have to :(
- **Reviewer:** When classes are inferred, do we consider that the class order is ascending, e.g. ("no" < "yes")?
- **Reviewer:** Answering my own question …
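On that last point, a quick check (a sketch, assuming the class order comes from scikit-learn's `LabelBinarizer`, which sorts inferred labels in ascending order, so "no" maps to column 0 and "yes" to column 1):

```python
from sklearn.preprocessing import LabelBinarizer

# LabelBinarizer stores inferred classes in ascending sorted order,
# so column 1 of the probability matrix corresponds to "yes".
lb = LabelBinarizer().fit(["no", "no", "no", "yes", "yes", "yes"])
print(lb.classes_)  # ['no' 'yes']
```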
+    y_pred = np.array([[0.5, 0.5], [0.1, 0.9], [0.01, 0.99],
+                       [0.9, 0.1], [0.75, 0.25], [0.001, 0.999]])
+    loss = log_loss(y_true, y_pred)
+    assert_almost_equal(loss, 1.8817971)
+
+    # multiclass case; adapted from http://bit.ly/RJJHWA
+    y_true = [1, 0, 2]
+    y_pred = [[0.2, 0.7, 0.1], [0.6, 0.2, 0.2], [0.6, 0.1, 0.3]]
+    loss = log_loss(y_true, y_pred, normalize=True)
+    assert_almost_equal(loss, 0.6904911)
+
+    # check that we got all the shapes and axes right
+    # by doubling the length of y_true and y_pred
+    y_true *= 2
+    y_pred *= 2
+    loss = log_loss(y_true, y_pred, normalize=False)
+    assert_almost_equal(loss, 0.6904911 * 6, decimal=6)
+
+    # check eps and handling of absolute zero and one probabilities
+    y_pred = np.asarray(y_pred) > .5
+    loss = log_loss(y_true, y_pred, normalize=True, eps=.1)
+    assert_almost_equal(loss, log_loss(y_true, np.clip(y_pred, .1, .9)))
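As a sanity check on the constants asserted above: the normalized log loss is just the mean negative log of the probability assigned to each sample's true class. A minimal NumPy sketch (an illustration, not this PR's implementation) reproduces the multiclass value:

```python
import numpy as np

# Pick each sample's predicted probability for its true class,
# take -log, and average over samples.
y_true = np.array([1, 0, 2])
y_pred = np.array([[0.2, 0.7, 0.1], [0.6, 0.2, 0.2], [0.6, 0.1, 0.3]])
p_true = y_pred[np.arange(len(y_true)), y_true]  # [0.7, 0.6, 0.3]
print(-np.log(p_true).mean())  # ~0.6904911, matching the test
```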
**Reviewer:** I think I would prefer to use the output of `decision_function`. This way, we will be able to compute the loss value of linear classifiers without `predict_proba` support. As indicated in Eq. (4.104), we just need to apply softmax to the output of `decision_function`.
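A sketch of that suggestion (illustrative only; `clf`, `X`, and `y` in the commented lines are placeholders for any linear classifier and dataset, and this softmax is not part of the PR):

```python
import numpy as np

def softmax(scores):
    # Shift each row by its max for numerical stability, then
    # normalize the exponentiated scores to sum to one per row.
    scores = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

print(softmax(np.array([[2.0, 1.0, 0.1]])))  # [[0.659 0.242 0.099]]

# scores = clf.decision_function(X)    # shape (n_samples, n_classes)
# loss = log_loss(y, softmax(scores))  # probabilities per Eq. (4.104)
```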
**Author:** I've thought of doing that, but it would only produce meaningful results for LR-like models. When used on, say, an `SVC(probability=True)`, an `SGDClassifier` with the modified Huber loss, or AdaBoost, the results come out wrong. If the function takes `predict_proba` output, it can work for any probability model, and it can be used in a self-training EM loop, which is why I wanted this function in the first place.
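For example (a sketch; any estimator exposing `predict_proba` works, here an `SVC` with Platt scaling enabled, whose raw decision values a softmax would misinterpret):

```python
from sklearn.datasets import load_iris
from sklearn.metrics import log_loss
from sklearn.svm import SVC

iris = load_iris()
X, y = iris.data, iris.target

# probability=True enables Platt-scaled predict_proba output; the raw
# decision_function values of an RBF SVM are not log-odds.
clf = SVC(probability=True).fit(X, y)
print(log_loss(y, clf.predict_proba(X)))
```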
**Reviewer:** I see, thanks. For the record, the version I implemented is Eq. (12) (without the regularization term) in http://www.mblondel.org/publications/mblondel-mlj2013.pdf.
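For readers without the paper to hand, the unregularized multinomial logistic loss has the standard form below (whether this matches the paper's Eq. (12) exactly should be checked against the paper):

$$
-\frac{1}{n} \sum_{i=1}^{n} \log \frac{\exp(w_{y_i}^\top x_i)}{\sum_{c=1}^{k} \exp(w_c^\top x_i)}
$$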