8000 ENH Add Multiclass Brier Score Loss by ogrisel · Pull Request #22046 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

ENH Add Multiclass Brier Score Loss #22046

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 71 commits into from
Mar 20, 2025

Conversation

ogrisel
Copy link
Member
@ogrisel ogrisel commented Dec 21, 2021

Resolves #16055.
This PR updates #18699 by @aggvarun01 8000 after a merge with main and resolves merge conflicts. I do not have the permissions to push directly in the original branch and opening a sub-PR pointing to #18699 would lead to an unreadable diff because of the one-year merge sync.

I also added a changelog entry and demonstrate the new function in the multiclass calibration example.

@aggvarun01 if you want feel free to pull the last commit from this commit from this branch to your branch. Alternatively we can finalize the review here.

Copy link
Member Author
@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for merge once https://github.com/scikit-learn/scikit-learn/pull/22046/files#r1943364780 is addressed (and conflicts are resolved).

@lorentzenchr do you agree with the way this PR evolved, in particular the points I raised in #22046 (comment)?

Copy link
Member Author
@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still +1 for merge (I cannot approve the PR in github because I am the creator of the PR).

Copy link
Member
@lorentzenchr lorentzenchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have preferred to have all the non-related input validation and test thing changes in a separate PR.

if name in METRICS_REQUIRE_POSITIVE_Y:
y_true, y_pred = _require_positive_targets(y_true, y_pred)
always_symmetric = True
for _ in range(5):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment would help: why the loop? (make lucky test passes very unlikely)

Comment on lines 662 to 663
if always_symmetric: # pragma: no cover
raise ValueError(f"{name} seems to be symmetric")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if always_symmetric: # pragma: no cover
raise ValueError(f"{name} seems to be symmetric")
if not always_symmetric:
raise ValueError(f"{name} seems to be asymmetric")

There should be a test for this, e.g. applying test_not_symmetric_metric to log_loss and test for a fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meta test test_symmetry_tests in 6de5e13 checks test_symmetric_metric and test_not_symmetric_metric.

Comment on lines 3493 to 3497
For :math:`N` observations labeled from :math:`C` possible classes, the Brier
score is defined as:

.. math::
\\frac{1}{N}\\sum_{i=1}^{N}\\sum_{c=1}^{C}(y_{ic} - \\hat{p}_{ic})^{2}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, we try to avoid LaTeX in docstrings and just link to the user guide. If LaTeX then only in the the Notes section (this may be a numpy thing) @glemaitre my know better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The math are moved to the Notes section in 58e5f18. If you feel that's too redundant with the User Guide I can remove the Notes section.

@antoinebaker
Copy link
Contributor

Thanks for the reviews @thomasjpfan and @lorentzenchr. I think the PR is ready for a final round of reviews.

@antoinebaker
Copy link
Contributor

I would have preferred to have all the non-related input validation and test thing changes in a separate PR.

Yes, sorry that we mix refactoring the brier_score_loss and log_loss input validation with adding multiclass support for Brier score. There will be a follow up PR to harmonize further the log_loss and brier_score_loss API and testing.

10000
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
@lorentzenchr lorentzenchr merged commit 318a282 into scikit-learn:main Mar 20, 2025
33 checks passed
@lorentzenchr
Copy link
Member

There will be a follow up PR to harmonize further the log_loss and brier_score_loss API and testing.

If you intent to do that, I would very much like to have the classification metrics better structured, i.e. putting log loss and brier score at the top where they belong. This might also help with not needing to define things twice.

SwathiR1999 pushed a commit to SwathiR1999/scikit-learn that referenced this pull request Mar 21, 2025
Co-authored-by: Varun Aggarwal <varunaggarwal@Varuns-MBP.fios-router.home>
Co-authored-by: Antoine Baker <antoine.baker59@gmail.com>
@ogrisel ogrisel deleted the multiclass_brier_score_loss branch March 24, 2025 15:34
lucyleeow pushed a commit to lucyleeow/scikit-learn that referenced this pull request Apr 2, 2025
Co-authored-by: Varun Aggarwal <varunaggarwal@Varuns-MBP.fios-router.home>
Co-authored-by: Antoine Baker <antoine.baker59@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add multiclass support for brier_score_loss
6 participants
0