FIX f1_score with zero_division=1 on binary classes by OmarManzoor · Pull Request #27165 · scikit-learn/scikit-learn · GitHub


Closed
OmarManzoor wants to merge 9 commits

Conversation

OmarManzoor (Contributor)

Reference Issues/PRs

Fixes #26965

What does this implement/fix? Explain your changes.

  • Fixes some incorrect behavior observed with the F1 score on binary classification inputs (a sketch of the edge case in question follows below).
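
A minimal sketch of the kind of input involved (the arrays below are illustrative assumptions, not the exact reproducer from issue #26965): the positive class appears in neither y_true nor y_pred, so tp = fp = fn = 0 and the F-score is undefined; it should then fall back to the zero_division value.

from sklearn.metrics import f1_score

# Positive class (1) is absent from both the true and the predicted labels,
# so precision, recall, and hence the F-score are all undefined.
y_true = [0, 0, 0, 0]
y_pred = [0, 0, 0, 0]

# Under the confusion-matrix formulation, the denominator is 0, so the
# result should be the zero_division value, i.e. 1.0 here.
print(f1_score(y_true, y_pred, zero_division=1))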

Any other comments?

CC: @glemaitre Could you kindly have a look to see if this makes sense? I am not totally sure this is the correct fix, so I am marking the PR as a draft.

github-actions bot commented Aug 25, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 612c77e.

@glemaitre (Member) commented on this excerpt of the proposed diff (the opening of the call is truncated in this view):

    pred_sum + true_sum, 0
)
denom[denom_mask] = 1  # avoid division by 0
f_score = numer / denom

Considering the issue reported in #27189, I think that we can express it without relying on the precision and recall:

# The score is defined as:
#     score = (1 + beta**2) * precision * recall / (beta**2 * precision + recall)
# Therefore, you can express the score in terms of confusion matrix entries as:
#     score = (1 + beta**2) * tp / ((1 + beta**2) * tp + beta**2 * fn + fp)
denom = beta2 * true_sum + pred_sum
f_score = _prf_divide(
    (1 + beta2) * tp_sum,
    denom,
    "fscore",
    "true nor predicted",
    average,
    warn_for,
    zero_division,
)

In this case, we handle it like any other zero division.
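
As a sanity check of the identity in the comment above, here is a minimal NumPy sketch (with made-up tp/fp/fn counts, not scikit-learn code) verifying that the precision/recall formulation and the confusion-matrix formulation agree whenever the denominators are nonzero:

import numpy as np

beta = 1.0
tp, fp, fn = 30, 10, 20  # hypothetical confusion-matrix entries

precision = tp / (tp + fp)
recall = tp / (tp + fn)

# Classic formulation in terms of precision and recall.
f_pr = (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

# Formulation in terms of confusion-matrix entries only; note that
# true_sum = tp + fn and pred_sum = tp + fp, so
# beta**2 * true_sum + pred_sum == (1 + beta**2) * tp + beta**2 * fn + fp.
f_cm = (1 + beta**2) * tp / ((1 + beta**2) * tp + beta**2 * fn + fp)

assert np.isclose(f_pr, f_cm)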

@glemaitre glemaitre left a comment (Member)


The fix that I propose should solve the issue in #27189 as well as the original issue #26965.

@glemaitre (Member)

The warning raised in _prf_divide may be wrong, because the F1 score is not necessarily undefined now.

@glemaitre (Member)

By using my proposed formulation, two tests start to fail because, instead of the value provided by zero_division, we now get an actual value.

@OmarManzoor OmarManzoor marked this pull request as ready for review September 18, 2023 07:32
@OmarManzoor (Contributor, Author)

@glemaitre Should we remove the warning that we are specifically raising for the F-score?

# warn for f-score only if zero_division is "warn", it is in warn_for,
# and BOTH prec and rec are ill-defined
if zero_division == "warn" and ("f-score",) == warn_for:
    if (pred_sum[true_sum == 0] == 0).any():
        _warn_prf(average, "true nor predicted", "F-score is", len(true_sum))

@glemaitre (Member)

Indeed, since the warning will now be handled in the new _prf_divide. Previously, if either precision or recall was undefined, we could not provide a score; that is no longer the case, since we now base the check on the denominator only.
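
For illustration, a simplified, hypothetical sketch of what a _prf_divide-style helper does (the real scikit-learn helper is private and also handles warnings and averaging): divide element-wise, and wherever the denominator is zero, substitute the zero_division value.

import numpy as np

def prf_divide_sketch(numerator, denominator, zero_division=0.0):
    # Hypothetical simplified helper, not the scikit-learn implementation.
    denominator = denominator.astype(float).copy()
    mask = denominator == 0.0
    denominator[mask] = 1.0  # avoid dividing by zero
    result = numerator / denominator
    result[mask] = zero_division  # undefined entries fall back to zero_division
    return result

# With the confusion-matrix formulation, tp = fp = fn = 0 yields a zero
# denominator, so the F-score falls back to zero_division directly.
print(prf_divide_sketch(np.array([0.0]), np.array([0.0]), zero_division=1.0))  # [1.]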

@glemaitre glemaitre self-requested a review October 11, 2023 19:14
Successfully merging this pull request may close these issues:

Wrong behaviour when calculating f1_score with zero_division=1