Add zero_division None or np.nan · Issue #22625 · scikit-learn/scikit-learn · GitHub

Add zero_division None or np.nan #22625

Closed
marctorsoc opened this issue Feb 27, 2022 · 3 comments

Comments

@marctorsoc
Contributor

Describe the workflow you want to enable

This would be an extension of my previous addition #14900, where I added the parameter zero_division for precision, recall, f1. Afterwards, it was added for jaccard as well.

Here, we would return np.nan when the metric is undefined. To request this, the user would pass zero_division=None or zero_division=np.nan. To be clear, in both cases the result would be np.nan (even when passing None).

This would be useful when the metric is used downstream in some average, so that undefined values can be excluded, as e.g. pandas' Series.mean does by default. If the metric itself includes an average, the same behaviour would be implemented here, ignoring the values that are undefined.
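To illustrate the downstream use case, here is a minimal sketch, not taken from scikit-learn. The helper `nanmean` and the fold scores are hypothetical, and `math.nan` stands in for `np.nan` so the example has no dependencies.

```python
import math


def nanmean(values):
    """Mean of the values that are not NaN; NaN if none are defined.

    Hypothetical helper for illustration only.
    """
    defined = [v for v in values if not math.isnan(v)]
    return sum(defined) / len(defined) if defined else math.nan


# Hypothetical per-fold precision scores. In the second fold the model made
# no positive predictions, so precision is undefined there.
scores_with_nan = [0.8, math.nan, 0.6]   # what zero_division=np.nan would give
scores_with_zero = [0.8, 0.0, 0.6]       # what zero_division=0 gives

print(nanmean(scores_with_nan))   # the undefined fold is excluded
print(nanmean(scores_with_zero))  # the undefined fold counts as 0
```

With np.nan the undefined fold drops out of the average, whereas a substituted 0 pulls the average down.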

Describe your proposed solution

Allow zero_division to be None or np.nan. In both cases, disable the warning as we already do for zero_division 0 and

The behaviour for each metric when passing either None or np.nan would be as follows. For all other values, the behaviour stays the same as today, i.e. backward compatible.

Precision:

  • If pred_sum = 0, return np.nan
  • If average != None, exclude from the average any per-class metric that is undefined
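The precision rules above could be sketched roughly as follows. This is an illustration of the proposal under stated assumptions, not scikit-learn's implementation: `precision_per_class` and `macro_ignore_nan` are hypothetical names, and `math.nan` stands in for `np.nan`. Recall would be the mirror image, dividing by the number of true samples per class instead.

```python
import math


def precision_per_class(y_true, y_pred, labels):
    """Per-class precision; NaN when a class is never predicted (pred_sum = 0)."""
    scores = []
    for label in labels:
        pred_sum = sum(1 for p in y_pred if p == label)
        true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        scores.append(true_pos / pred_sum if pred_sum else math.nan)
    return scores


def macro_ignore_nan(scores):
    """Macro average that leaves undefined (NaN) classes out of the mean."""
    defined = [s for s in scores if not math.isnan(s)]
    return sum(defined) / len(defined) if defined else math.nan


y_true = [0, 1, 2, 2]
y_pred = [0, 1, 1, 0]  # class 2 is never predicted

per_class = precision_per_class(y_true, y_pred, labels=[0, 1, 2])
print(per_class)                    # last entry is nan
print(macro_ignore_nan(per_class))  # averaged over classes 0 and 1 only
```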

Recall:

  • If true_sum = 0, return np.nan
  • If average != None, exclude from the average any per-class metric that is undefined

F-score:

  • If beta=inf, return recall; if beta=0, return precision
  • Else, if precision=0 or recall=0 (or both), return 0
  • Else, return np.nan
  • If average != None, exclude from the average any per-class metric that is undefined
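One possible reading of the F-score rules above, as a minimal sketch. It assumes that the final "return np.nan" branch applies only when precision or recall is itself undefined, and that the usual F-beta formula applies otherwise; that interpretation, the helper name `fbeta_from_pr`, and the use of `math.nan` in place of `np.nan` are assumptions, not something the issue states.

```python
import math


def fbeta_from_pr(precision, recall, beta=1.0):
    """F-beta from precision and recall, either of which may be NaN.

    Hypothetical helper; the branch order mirrors the bullet list.
    """
    if math.isinf(beta):
        return recall
    if beta == 0:
        return precision
    # NaN == 0 is False, so an undefined value never triggers this branch.
    if precision == 0 or recall == 0:
        return 0.0
    # Assumed reading: the score is undefined if either input is undefined.
    if math.isnan(precision) or math.isnan(recall):
        return math.nan
    beta2 = beta ** 2
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)


print(fbeta_from_pr(0.5, 0.5))                     # both defined: usual formula
print(fbeta_from_pr(math.nan, 0.0))                # one side is 0: returns 0.0
print(fbeta_from_pr(math.nan, 1.0))                # undefined input: nan
print(fbeta_from_pr(math.nan, 0.3, beta=math.inf)) # beta=inf: recall
```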

Jaccard:

  • If all true labels and predictions are 0, return np.nan
  • If average != None, exclude from the average any per-class metric that is undefined
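For the binary case, the Jaccard rule above might look like the sketch below. The function name `jaccard_binary` is hypothetical, `math.nan` stands in for `np.nan`, and this is not scikit-learn's code.

```python
import math


def jaccard_binary(y_true, y_pred):
    """Binary Jaccard score; NaN when there are no positives on either side."""
    pairs = list(zip(y_true, y_pred))
    intersection = sum(1 for t, p in pairs if t == 1 and p == 1)
    union = sum(1 for t, p in pairs if t == 1 or p == 1)
    # An empty union means all labels and predictions are 0: undefined.
    return intersection / union if union else math.nan


print(jaccard_binary([1, 1, 0], [1, 0, 0]))  # 1 shared positive out of 2
print(jaccard_binary([0, 0], [0, 0]))        # nan: nothing to compare
```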

Describe alternatives you've considered, if relevant

No response

Additional context

#14876
#14900

@marctorsoc marctorsoc added Needs Triage Issue requires triage New Feature labels Feb 27, 2022
@thomasjpfan thomasjpfan added module:metrics and removed Needs Triage Issue requires triage labels Apr 14, 2022
@thomasjpfan
Member

From an API point of view, I prefer to only have zero_division=np.nan since we are always returning np.nan.

The ignoring behavior for np.nans seems okay to me.

@marctorsoc
Contributor Author

ok, cool. I'll work on this and tag you in a PR to review when ready

@adrinjalali
Copy link
Member

I think we can keep the discussion here: #29048 (comment)
