8000 ENH add zero_division=nan for classification metrics by marctorsoc · Pull Request #23183 · scikit-learn/scikit-learn · GitHub

ENH add zero_division=nan for classification metrics #23183


Closed
marctorsoc wants to merge 53 commits

Conversation

@marctorsoc (Contributor) commented Apr 21, 2022

Reference Issues/PRs

Fixes #22625

What does this implement/fix? Explain your changes.

This is an extension of #14900, where I added the parameter zero_division for precision, recall, and f1. Afterwards, it was added for jaccard as well.

Here, we add the ability to set zero_division to np.nan, so that np.nan is returned when the metric is undefined. In addition to this (see the example after this list):

  • when an average is computed, per-class values that are np.nan (undefined, then replaced via zero_division) are excluded from the average.
  • when beta=0, return precision.
  • when exactly one of (precision, recall) is defined and it equals 0, return fscore=0, even if the other metric is undefined.
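For illustration, this is the intended behaviour on a minimal example (assuming a scikit-learn version in which this feature has landed; the follow-up PRs shipped it in 1.3):

import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = [0, 0, 0, 1]
y_pred = [0, 0, 0, 0]

# No positive predictions -> pred_sum is 0, so precision is undefined:
# with zero_division=np.nan we get nan instead of 0, 1, or a warning.
precision_score(y_true, y_pred, zero_division=np.nan)  # nan

# There is one actual positive, so recall is well defined (and 0 here).
recall_score(y_true, y_pred, zero_division=np.nan)  # 0.0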

Specifically:

Precision:

  • If pred_sum = 0, the metric is undefined.
  • If average != None, any per-class value that is np.nan is excluded from the average.

Recall:

  • If true_sum = 0, the metric is undefined.
  • If average != None, any per-class value that is np.nan is excluded from the average.

F-score:

  • If beta=inf, return recall; if beta=0, return precision.
  • Elif precision=0 or recall=0 (or both), return 0. <------------- this is a change
  • Else, return zero_division.
  • If average != None, any per-class value that is np.nan is excluded from the average.

Jaccard:

  • If all labels and predictions are 0, return zero_division.
  • If average != None, any per-class value that is np.nan is excluded from the average.

A worked example of this averaging rule follows.
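To make the averaging rule concrete (again assuming a release that includes this feature):

import numpy as np
from sklearn.metrics import recall_score

# Class 2 never appears in y_true, so its recall is undefined (true_sum = 0).
y_true = [0, 1, 0, 1]
y_pred = [0, 1, 2, 1]

recall_score(y_true, y_pred, labels=[0, 1, 2], average=None,
             zero_division=np.nan)
# array([0.5, 1. , nan])

# The nan class is excluded from the macro average:
# (0.5 + 1.0) / 2 = 0.75, rather than a mean over all 3 classes.
recall_score(y_true, y_pred, labels=[0, 1, 2], average="macro",
             zero_division=np.nan)
# 0.75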

Any other comments?

marctorsoc and others added 16 commits September 7, 2019 10:17, among them:

  • merge commits resolving conflicts in doc/whats_new/v0.21.rst, sklearn/metrics/classification.py, and sklearn/metrics/tests/test_classification.py
  • F-score only warns if both prec and rec are ill-defined
  • new private method to simplify _prf_divide
  • merge commits resolving conflicts in sklearn/metrics/_classification.py and sklearn/metrics/tests/test_classification.py
  • add weights casting to np.array
  • further merge commits resolving conflicts in sklearn/metrics/_classification.py and sklearn/metrics/tests/test_classification.py
def _nan_average(scores: np.ndarray, weights: Optional[np.ndarray]):
    """
    Wrapper for np.average, with np.nan values being ignored from the average.
    This is similar to np.nanmean, but allows passing weights as in np.average.
    """
Contributor Author

Submitted an issue to numpy for this: numpy/numpy#21375, but let me know if there's a better solution than this wrapper!

Member

What is the performance difference for running _nan_average and np.average when there are no nans?

Contributor Author

I don't know offhand, but I updated this so that if no weights are passed it just does np.nanmean, which should be very fast. Otherwise, it creates the mask and does the extra work. To be honest, I cannot foresee a big degradation since the operations here are very quick, but 🤷‍♂️
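For context, a minimal sketch of what such a wrapper could look like given the behaviour described above (the signature matches the snippet under review, but the body is an assumption; the PR's actual helper may differ):

from typing import Optional

import numpy as np


def _nan_average(scores: np.ndarray, weights: Optional[np.ndarray] = None):
    """Average scores, ignoring np.nan values (np.average has no nan handling)."""
    if weights is None:
        # Fast path: with no weights, np.nanmean does exactly what we need.
        return np.nanmean(scores)
    # Otherwise, drop the nan entries and their weights, then defer to
    # np.average for the weighted mean of the remaining values.
    mask = ~np.isnan(scores)
    return np.average(np.asarray(scores)[mask], weights=np.asarray(weights)[mask])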

@marctorsoc changed the title from "[WIP] [FEAT] Zero division nan" to "[FEAT] Zero division nan" on Apr 22, 2022
@marctorsoc (Contributor Author)

@thomasjpfan and everyone interested, this is now ready to review :)

@thomasjpfan (Member) left a comment

Thank you for the PR!

I think it would be good to see what @jnothman thinks of this np.nan behavior.

Comment on lines 166 to 167
if (weights == 0).all():
    return np.average(scores)
Member

Checking for weights == 0 adds more computation for an edge case. Can we pass weights directly into np.average and not do this check?

Contributor Author

Unfortunately:

    ZeroDivisionError
        When all weights along axis are zero.

See https://numpy.org/doc/stable/reference/generated/numpy.average.html. But I will change it to a try/except.
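In other words, the weights == 0 pre-check can be replaced by catching the error that np.average itself raises, roughly like this (a sketch with a hypothetical helper name, not the PR's final code):

import numpy as np


def _weighted_average(scores, weights):
    try:
        return np.average(scores, weights=weights)
    except ZeroDivisionError:
        # np.average raises ZeroDivisionError when all weights along the
        # axis are zero; fall back to an unweighted average in that case.
        return np.average(scores)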

Comment on lines 758 to 759
Note that if zero_division is np.nan, such values will be excluded
from the average.
Member

I think we should move this into the zero_division description. Currently, when reading that description, the reader needs to scroll up to see what zero_division=np.nan does.

Comment on lines 1306 to 1307
Note that if zero_division is np.nan, such values will be excluded
from the average.
Member

Same here, I think we can place this in the zero_division description.

Similar comment for the other docstrings.

- move comment to zero_division
- try/except in nan_average
@glemaitre self-requested a review January 23, 2023 09:30
Comment on lines 728 to 760
@pytest.mark.parametrize("zero_division", [0, 1, np.nan])
@pytest.mark.parametrize(
    "y_true, y_pred",
    [
        ([0], [1]),
        ([0, 0], [1, 1]),
        ([], []),
    ],
)
def test_matthews_corrcoef_nan(zero_division, y_true, y_pred):
    with warnings.catch_warnings(record=True) as record:
        mcc = matthews_corrcoef(y_true, y_pred, zero_division=zero_division)
    assert not record

    if np.isnan(zero_division):
        assert np.isnan(mcc)
    else:
        assert mcc == zero_division


@pytest.mark.parametrize(
    "y_true, y_pred",
    [
        ([0], [1]),
        ([0, 0], [1, 1]),
        ([], []),
    ],
)
def test_matthews_corrcoef_nan_warn(y_true, y_pred):
    with warnings.catch_warnings(record=True) as record:
        mcc = matthews_corrcoef(y_true, y_pred, zero_division="warn")
    assert len(record) == 1
    assert mcc == 0.0
Member

What I meant is the following general test:

@pytest.mark.parametrize("zero_division", [0, 1, np.nan])
@pytest.mark.parametrize("y_true, y_pred", [([0], [0]), ([], [])])
@pytest.mark.parametrize("metric", [
    jaccard_score,
    matthews_corrcoef,
    f1_score,
    partial(fbeta_score, beta=1),
    precision_score,
    recall_score,
])
def test_zero_division_nan_no_warning(metric, y_true, y_pred, zero_division):
    """Check the behaviour of `zero_division` when setting to 0, 1 or np.nan.
    No warnings should be raised.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        result = metric(y_true, y_pred, zero_division=zero_division)

    if np.isnan(zero_division):
        assert np.isnan(result)
    else:
        assert result == zero_division


@pytest.mark.parametrize("y_true, y_pred", [([0], [0]), ([], [])])
@pytest.mark.parametrize("metric", [
    jaccard_score,
    matthews_corrcoef,
    f1_score,
    partial(fbeta_score, beta=1),
    precision_score,
    recall_score,
])
def test_zero_division_nan_warning(metric, y_true, y_pred):
    """Check the behaviour of `zero_division` when setting to "warn".
    A `UndefinedMetricWarning` should be raised.
    """
    with pytest.warns(UndefinedMetricWarning):
        result = metric(y_true, y_pred, zero_division="warn")
    assert result == 0.0

Here, I don't include precision_recall_fscore_support and classification_report because they are a bit different.

The idea is not to test all potential cases that trigger the warning but the general behaviour when the UndefinedMetricWarning is triggered.

@@ -1678,36 +1712,53 @@ def test_precision_recall_f1_score_multilabel_2():


@ignore_warnings
-@pytest.mark.parametrize("zero_division", ["warn", 0, 1])
+@pytest.mark.parametrize("zero_division", ["warn", 0, 1, np.nan])
Member

I think that we can directly set the expected behaviour in the parametrization, without any programming.

Suggested change
-@pytest.mark.parametrize("zero_division", ["warn", 0, 1, np.nan])
+@pytest.mark.parametrize(
+    "zero_division, zero_division_expected",
+    [("warn", 0), (0, 0), (1, 1), (np.nan, np.nan)],
+)
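The test body can then branch on the expected value rather than re-deriving it, along these lines (a hypothetical sketch of the assertion, needed because nan compares unequal to itself):

import numpy as np


def _assert_expected(result, zero_division_expected):
    if np.isnan(zero_division_expected):
        assert np.isnan(result)
    else:
        assert result == zero_division_expected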

Comment on lines 127 to 128
:mod:`sklearn.metrics`
.................................
Member

Suggested change
-:mod:`sklearn.metrics`
-.................................
+:mod:`sklearn.metrics`
+......................

@glemaitre (Member)

So I looked a bit at the other classification metrics and I have the following questions:

  • accuracy_score and balanced_accuracy will raise a warning when y_true and y_pred are empty. We are not really consistent with the other metrics then.
  • cohen_kappa_score seems to be a good candidate that should be added to the list of metrics accepting a zero_division parameter.
  • class_likelihood_ratio has a raise_warning parameter that is inconsistent with the current PR. It should not be addressed in this PR, but we should investigate whether the deprecation cost is lower than making this metric consistent (and check that it indeed makes sense).

@thomasjpfan could you provide your input on those three questions?
Otherwise the PR is in a good state on my side.

@marctorsoc (Contributor Author)

> So I looked a bit at the other classification metrics and I have the following questions:
>
>   • accuracy_score and balanced_accuracy will raise a warning when y_true and y_pred are empty. We are not really consistent with the other metrics then.
>   • cohen_kappa_score seems to be a good candidate that should be added to the list of metrics accepting a zero_division parameter.
>   • class_likelihood_ratio has a raise_warning parameter that is inconsistent with the current PR. It should not be addressed in this PR, but we should investigate whether the deprecation cost is lower than making this metric consistent (and check that it indeed makes sense).
>
> @thomasjpfan could you provide your input on those three questions? Otherwise the PR is in a good state on my side.

Added (2). (1) and (3) seemed more convoluted. Let me know your thoughts.

@marctorsoc (Contributor Author)

@glemaitre @thomasjpfan kind reminder about this :)

@glemaitre (Member)

The CI is broken. @marctorsoc, could you solve that?

@marctorsoc (Contributor Author)

> The CI is broken. @marctorsoc, could you solve that?

@glemaitre done

@thomasjpfan (Member) commented Jan 31, 2023

@glemaitre I'd say to standardize on zero_division as much as possible, which includes all the metrics you suggested in #23183 (comment). I think it's worth deprecating raise_warning in class_likelihood_ratio.

As for this PR, I think it covers a lot, which makes it harder to review. If we can break up this PR into smaller PRs that add zero_division one metric at a time, I think it will speed up the process.

@marctorsoc (Contributor Author)

> @glemaitre I'd say to standardize on zero_division as much as possible, which includes all the metrics you suggested in #23183 (comment). I think it's worth deprecating raise_warning in class_likelihood_ratio.
>
> As for this PR, I think it covers a lot, which makes it harder to review. If we can break up this PR into smaller PRs that add zero_division one metric at a time, I think it will speed up the process.

@thomasjpfan I don't think one metric per PR is (a) feasible or (b) worth it.

(a) because e.g. precision and recall both affect classification_report; (b) because it would mean 10+ PRs with very few changes each.

I propose the following:

  1. PR to add zero_division=np.nan to precision, recall, f1, fbeta_score, precision_recall_fscore_support and classification_report
  2. PR to add zero_division to matthews_corrcoef and cohen_kappa_score
  3. PR to fix classification_report so that empty input will return np.nan
  4. PR to add zero_division for accuracy_score and balanced_accuracy, which currently raise a warning when y_true and y_pred are empty
  5. PR to add zero_division for class_likelihood_ratios and remove raise_warning

I can take care of most if not all of these, but first I want a green light, as this will take me some time.

@glemaitre (Member)

I can commit time to reviewing individual PRs as proposed by @marctorsoc. True that it is difficult to dissociate all metrics.

@marctorsoc changed the title from "ENH add zero_division parameter to classification metrics" to "ENH add zero_division=nan for classification metrics" on Feb 2, 2023
@marctorsoc (Contributor Author)

@glemaitre @thomasjpfan please see #25531

@erikhuck

@glemaitre are we able to get this merged at some point or is it being split into different pull requests?


@glemaitre (Member)

It has been split into different pull requests.

@marctorsoc (Contributor Author)

I'll close this PR to avoid confusion.

Successfully merging this pull request may close these issues:

  • Add zero_division None or np.nan

4 participants