Documentation section 3.3.1.1 has incorrect description of brier_score_loss #13887

Sycor4x · 2019-05-15T16:03:31Z

In the documentation, section 3.3.1.1. "Common cases: predefined values" includes the remark

All scorer objects follow the convention that higher return values are better than lower return values.

As far as I can tell, this is true for all of the listed metrics except brier_score_loss. In the case of brier_score_loss, a lower loss value is better. This is because brier_score_loss measures the mean squared difference between a predicted probability and a categorical outcome; the Brier score is minimized at 0.0 because all summands are either (0 - 0)^2 = 0 or (1 - 1)^2 = 0 when the model is making perfect predictions. On the other hand, the Brier score is maximized at 1.0 when all predictions are opposite the correct label, as all summands are either (0 - 1)^2 = 1 or (1 - 0)^2 = 1.
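The arithmetic above can be checked with a minimal sketch. The helper `brier_score` below is a hypothetical pure-Python stand-in written for illustration, not scikit-learn's implementation, and the labels and probabilities are made-up toy values.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    Hypothetical helper for illustration; not scikit-learn's brier_score_loss.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy binary labels (assumed for this example).
y_true = [0, 1, 1, 0]

# Perfect predictions: every summand is (0 - 0)^2 or (1 - 1)^2.
perfect = brier_score(y_true, [0.0, 1.0, 1.0, 0.0])

# Predictions opposite the label: every summand is (1 - 0)^2 or (0 - 1)^2.
worst = brier_score(y_true, [1.0, 0.0, 0.0, 1.0])

# A constant 0.5 prediction lands in between.
uninformed = brier_score(y_true, [0.5] * len(y_true))

print(perfect, worst, uninformed)
```

A lower value is better here, which is the opposite of the "higher is better" convention quoted from the documentation.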

Therefore, the definition of the brier_score_loss is not consistent with the quotation from section 3.3.1.1.

I suggest making 2 changes to relieve this confusion.

1. Implement a function neg_brier_score_loss which simply negates the value of brier_score_loss; this is a direct analogy to what is done in the case of neg_log_loss. A better model has a lower value of log-loss (categorical cross-entropy loss), therefore a larger value of the negative log-loss implies a better model. Naturally, the same is true for the Brier score, where it is also the case that a better model is assigned a lower loss.
2. Remove the reference to brier_score_loss from section 3.3.1.1. The Brier score is useful in lots of ways; however, because it does not have the property that a larger value implies a better model, it seems confusing to mention it in the context of section 3.3.1.1. References to brier_score_loss can be replaced with neg_brier_score_loss, which has the property that better models have larger values, just like accuracy, ROC AUC and the rest of the listed metrics.
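To illustrate why the negation matters, here is a minimal sketch of the proposed convention. The names `brier_loss` and `neg_brier_loss_score`, and the candidate "models", are hypothetical and assumed for this example; this is plain Python, not the scikit-learn scorer API. The point is only that once the loss is negated, picking the maximum score selects the better model, as it would for accuracy or ROC AUC.

```python
def brier_loss(y_true, y_prob):
    """Hypothetical helper: mean squared error between probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def neg_brier_loss_score(y_true, y_prob):
    """Hypothetical 'higher is better' score: the negated Brier loss."""
    return -brier_loss(y_true, y_prob)


# Toy labels and predicted probabilities from two imaginary models.
y_true = [0, 1, 1, 0]
candidates = {
    "sharp": [0.1, 0.9, 0.8, 0.2],  # confident and mostly right
    "vague": [0.5, 0.5, 0.5, 0.5],  # uninformative constant prediction
}

scores = {name: neg_brier_loss_score(y_true, probs) for name, probs in candidates.items()}

# With the negated loss, "maximize the score" picks the better-calibrated model.
best = max(scores, key=scores.get)
print(scores, best)
```

Using the raw loss with the same `max` selection would pick the worse model, which is the inconsistency this issue describes.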


ogrisel · 2019-05-16T07:26:38Z

Indeed, this is probably the right course of action. Please feel free to open a PR if you wish.

qdeffense · 2019-05-16T10:53:42Z

@Sycor4x I'll gladly work on it if you're not already doing it

Sycor4x · 2019-05-17T16:22:05Z

@qdeffense Thank you. I had planned to start these revisions if this suggestion were well-received; however, I've just come down with a cold and won't be able to write coherent code at the moment. If you want to take a stab at this, I support your diligence.

It occurred to me after I wrote this that it is possible for the verbal description in 3.3.1.1 to be incorrect while the behavior of the scorer objects called via the strings in 3.3.1.1 might work correctly in the sense that internally, brier_score_loss behaves in the same manner as neg_log_loss and therefore is consistent with the statement

All scorer objects follow the convention that higher return values are better than lower return values.

If this is the case, then the documentation is the only thing that needs to be tweaked: just make it explicit that some kind of reversal is applied to brier_score_loss such that the block quote is true.

I haven't been able to check -- I'm basically incapacitated right now.

jnothman · 2019-05-21T11:18:44Z

No, we should be using neg_brier_loss or something. We made a similar change for other losses in 7e079c0fd2^...2ba3478.

qinhanmin2014 · 2019-05-21T12:45:14Z

Apologies, the mistake was introduced by me.
I agree that we should introduce neg_brier_score_loss and deprecate brier_score_loss.

stefan-matcovici · 2019-06-16T20:49:03Z

Hi! I would like to start working on this.

jnothman · 2019-06-19T00:01:17Z

Thanks @stefan-matcovici

nityamd · 2019-08-24T18:36:47Z

I'll work on this.

amueller · 2019-08-24T18:48:39Z

@nityamd I think @stefan-matcovici is working on that already in #14123.

bharatr21 mentioned this issue May 21, 2019

[MRG] DOC: Fix description of brier_score_loss #13918

Closed

qinhanmin2014 added this to the 0.22 milestone May 21, 2019

qinhanmin2014 added the Bug, Easy (well-defined and straightforward way to resolve), good first issue (easy with clear instructions to resolve), and help wanted labels Jun 13, 2019

This was referenced Jun 19, 2019

[MRG] DOC: Fix description of brier_score_loss stefan-matcovici/scikit-learn#1

Closed

[MRG] API: Fix description of brier_score_loss #14123

Closed

amueller removed the help wanted label Jul 12, 2019

nityamd mentioned this issue Aug 24, 2019

Brier patch #14785

Closed

stefan-matcovici mentioned this issue Sep 6, 2019

[MRG] API Replace scorer brier_score_loss with neg_brier_score #14898

Merged

qinhanmin2014 closed this as completed in #14898 Sep 19, 2019

zahs123 mentioned this issue Aug 7, 2020

why is brier score for grid search now 'neg_brier_score'? #18117

Closed

artificialfintelligence mentioned this issue Jul 4, 2024

Documentation section 3.4.1.1 has incorrect description that would be correct if the max_loss metric were to be tweaked and renamed #29417

Closed
