ENH add zero_division=nan for classification metrics #23183
Conversation
# Conflicts:
#   doc/whats_new/v0.21.rst
#   sklearn/metrics/classification.py
#   sklearn/metrics/tests/test_classification.py

- F-score only warns if both prec and rec are ill-defined
- new private method to simplify _prf_divide

# Conflicts:
#   sklearn/metrics/_classification.py
#   sklearn/metrics/tests/test_classification.py
- add weights casting to np.array
# Conflicts:
#   sklearn/metrics/_classification.py
#   sklearn/metrics/tests/test_classification.py
sklearn/metrics/_classification.py
Outdated
def _nan_average(scores: np.ndarray, weights: Optional[np.ndarray]):
    """
    Wrapper for np.average, with np.nan values being ignored from the average.

    This is similar to np.nanmean, but allows passing weights as in np.average.
    """
I submitted an issue to numpy for this: numpy/numpy#21375, but let me know if there's a better solution than this wrapper!
What is the performance difference for running _nan_average and np.average when there are no nans?
I don't know, but I updated this so that if no weights are passed it just does np.nanmean, which should be very fast. Otherwise, it creates the mask and does the extra work. To be honest, I cannot foresee a big degradation since the operations here are very quick, but 🤷‍♂️
@thomasjpfan and everyone interested, this is now ready to review :)
Thank you for the PR! I think it would be good to see what @jnothman thinks of this np.nan behavior.
sklearn/metrics/_classification.py
Outdated
if (weights == 0).all():
    return np.average(scores)
Checking for weights == 0 adds more computation for an edge case. Can we pass weights directly into np.average and not do this check?
Unfortunately, np.average raises ZeroDivisionError "when all weights along axis are zero" (see https://numpy.org/doc/stable/reference/generated/numpy.average.html). But I will change this into a try/except.
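For reference, a minimal sketch of such a wrapper, combining the np.nanmean fast path mentioned above with a try/except fallback for all-zero weights (illustrative only, not the exact code from this PR):

import numpy as np
from typing import Optional


def _nan_average(scores: np.ndarray, weights: Optional[np.ndarray] = None):
    """Average scores, ignoring np.nan values.

    Similar to np.nanmean, but supports weights as in np.average.
    """
    if weights is None:
        # Fast path: with no weights, np.nanmean does exactly what we need.
        return np.nanmean(scores)

    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mask = np.isnan(scores)
    try:
        return np.average(scores[~mask], weights=weights[~mask])
    except ZeroDivisionError:
        # np.average raises when the remaining weights sum to zero;
        # fall back to an unweighted nan-aware mean in that edge case.
        return np.nanmean(scores)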
sklearn/metrics/_classification.py
Outdated
Note that if zero_division is np.nan, such values will be excluded
from the average.
I think we should move this into the zero_division description. Currently, when reading the zero_division description, the reader needs to scroll up to see what zero_division=np.nan does.
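For example, the zero_division entry could read something like this (hypothetical phrasing, not the exact wording that was merged):

zero_division : "warn", 0.0, 1.0, or np.nan, default="warn"
    Sets the value to return when there is a zero division. If set to
    "warn", this acts like 0, but a warning is also raised. If set to
    np.nan, the corresponding per-class values are excluded from the average.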
sklearn/metrics/_classification.py
Outdated
Note that if zero_division is np.nan, such values will be excluded
from the average.
Same here, I think we can place this in the zero_division description. Similar comment for the other docstrings.
- move comment to zero_division
- try/except in nan_average

# Conflicts:
#   doc/whats_new/v1.3.rst
@pytest.mark.parametrize("zero_division", [0, 1, np.nan]) | ||
@pytest.mark.parametrize( | ||
"y_true, y_pred", | ||
[ | ||
([0], [1]), | ||
([0, 0], [1, 1]), | ||
([], []), | ||
], | ||
) | ||
def test_matthews_corrcoef_nan(zero_division, y_true, y_pred): | ||
with warnings.catch_warnings(record=True) as record: | ||
mcc = matthews_corrcoef(y_true, y_pred, zero_division=zero_division) | ||
assert not record | ||
|
||
if np.isnan(zero_division): | ||
assert np.isnan(mcc) | ||
else: | ||
assert mcc == zero_division | ||
|
||
|
||
@pytest.mark.parametrize( | ||
"y_true, y_pred", | ||
[ | ||
([0], [1]), | ||
([0, 0], [1, 1]), | ||
([], []), | ||
], | ||
) | ||
def test_matthews_corrcoef_nan_warn(y_true, y_pred): | ||
with warnings.catch_warnings(record=True) as record: | ||
mcc = matthews_corrcoef(y_true, y_pred, zero_division="warn") | ||
assert len(record) == 1 | ||
assert mcc == 0.0 |
What I meant is the following general test:
@pytest.mark.parametrize("zero_division", [0, 1, np.nan])
@pytest.mark.parametrize("y_true, y_pred", [([0], [0]), ([], [])])
@pytest.mark.parametrize("metric", [
jaccard_score,
matthews_corrcoef,
f1_score,
partial(fbeta_score, beta=1),
precision_score,
recall_score,
])
def test_zero_division_nan_no_warning(metric, y_true, y_pred, zero_division):
"""Check the behaviour of `zero_division` when setting to 0, 1 or np.nan.
No warnings should be raised.
"""
with warnings.catch_warnings():
warnings.simplefil
A3E2
ter("error")
result = metric(y_true, y_pred, zero_division=zero_division)
if np.isnan(zero_division):
assert np.isnan(result)
else:
assert result == zero_division
@pytest.mark.parametrize("y_true, y_pred", [([0], [0]), ([], [])])
@pytest.mark.parametrize("metric", [
jaccard_score,
matthews_corrcoef,
f1_score,
partial(fbeta_score, beta=1),
precision_score,
recall_score,
])
def test_zero_division_nan_warning(metric, y_true, y_pred):
"""Check the behaviour of `zero_division` when setting to "warn".
A `UndefinedMetricWarning` should be raised.
"""
with pytest.warns(UndefinedMetricWarning):
result = metric(y_true, y_pred, zero_division="warn")
assert result == 0.0
Here, I don't include precision_recall_f1_score_support and classification_report because they are a bit different. The idea is not to test all potential cases that trigger the warning, but the general behaviour when the UndefinedMetricWarning is triggered.
@@ -1678,36 +1712,53 @@ def test_precision_recall_f1_score_multilabel_2():
 @ignore_warnings
-@pytest.mark.parametrize("zero_division", ["warn", 0, 1])
+@pytest.mark.parametrize("zero_division", ["warn", 0, 1, np.nan])
I think we can set the expected behaviour directly in the parametrization, without any extra logic in the test:
@pytest.mark.parametrize("zero_division", ["warn", 0, 1, np.nan]) | |
@pytest.mark.parametrize( | |
"zero_division, zero_division_expected", | |
[("warn", 0), (0, 0), (1, 1), (np.nan, np.nan)] | |
) |
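For illustration, the test body could then consume the paired parameters like this (a sketch assuming the imports already used in this test module; the actual test in the PR may differ):

@pytest.mark.parametrize(
    "zero_division, zero_division_expected",
    [("warn", 0), (0, 0), (1, 1), (np.nan, np.nan)],
)
def test_zero_division_expected(zero_division, zero_division_expected):
    # No positive predictions here, so precision is undefined (zero division).
    result = precision_score([1, 1], [0, 0], zero_division=zero_division)
    if np.isnan(zero_division_expected):
        assert np.isnan(result)
    else:
        assert result == zero_division_expected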
doc/whats_new/v1.3.rst
Outdated
:mod:`sklearn.metrics`
.................................
-:mod:`sklearn.metrics`
-.................................
+:mod:`sklearn.metrics`
+......................
So I looked a bit at the other classification metrics and I have the following questions:

@thomasjpfan could you provide your input on those three different questions?

Added (2). (1) and (3) seemed more convoluted. Let me know your thoughts.

@glemaitre @thomasjpfan kind reminder about this :)

The CI is broken. @marctorsoc, could you solve that?
# Conflicts:
#   doc/whats_new/v1.3.rst

@glemaitre done
@glemaitre I'd say to standardize over np.nan. As for this PR, I think it covers a lot, which makes it harder to review. If we can break this PR up into smaller PRs, each adding zero_division=np.nan to a subset of the metrics, it would be easier to review.
@thomasjpfan I don't think one metric per PR is a) feasible, or b) worth it: a) because e.g. precision and recall both affect classification_report; b) because it would mean 10+ PRs with very few changes each. I propose the following:

I can take care of most if not all of these, but first I want a green light, as this will take me some time.
I can commit time to reviewing individual PRs as proposed by @marctorsoc. True that it is difficult to dissociate all metrics.
@glemaitre @thomasjpfan please see #25531

@glemaitre are we able to get this merged at some point, or is it being split into different pull requests?
It has been split into different pull requests.

I'll close this PR to avoid confusion.
Reference Issues/PRs
Fixes #22625
What does this implement/fix? Explain your changes.
This is an extension of #14900, where I added the parameter zero_division for precision, recall, and F1. Afterwards, it was added for jaccard as well. Here, we add the ability to set zero_division to np.nan, so that np.nan is returned when the metric is undefined. In addition to this:
Specifically (see the usage sketch after this list):

- Precision:
- Recall:
  - If true_sum = 0, the metric is undefined.
  - If average != None, ignore from the average any per-class metric that is np.nan.
- F-score:
  - If beta=inf, return recall; if beta=0, return precision.
  - Elif precision=0 or recall=0 (or both), return 0. <- this is a change
  - Else, return zero_division.
  - If average != None, ignore from the average any metric that is np.nan.
- Jaccard:
  - If all labels and predictions are 0, return zero_division.
  - If average != None, ignore from the average any metric that is np.nan.
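To illustrate the intended end-user behaviour (a hedged example; the exact semantics are those of the final merged follow-up PRs):

import numpy as np
from sklearn.metrics import precision_score

y_true = [0, 0, 1, 1]
y_pred = [0, 0, 0, 0]  # class 1 is never predicted

# Precision for class 1 is undefined (no predicted positives), so with
# zero_division=np.nan the result is np.nan rather than 0 or 1.
print(precision_score(y_true, y_pred, zero_division=np.nan))  # nan

# With averaging, np.nan per-class values are excluded from the average
# instead of pulling it towards 0 or 1: only class 0 contributes here.
print(precision_score(y_true, y_pred, average="macro", zero_division=np.nan))  # 0.5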
Any other comments?