log_loss giving nan when input is np.float32 and eps is default #24315

Closed
gsiisg opened this issue Sep 1, 2022 · 9 comments · Fixed by #24354

gsiisg commented Sep 1, 2022

Describe the bug

When the input is a NumPy array of np.float32, 1 - eps (with the default eps=1e-15) evaluates to 1.0 in float32, so log_loss() ends up computing log(1 - p) with p = 1.0 and returns nan.

Steps/Code to Reproduce

from sklearn.metrics import log_loss
import numpy as np
input = np.array([1],dtype=np.float32)

# when the input is an array of np.float32, explicitly passing eps=np.finfo(np.float32).eps makes log_loss work fine
result = log_loss([[0,1]],input,eps=np.finfo(np.float32).eps)
print('with eps=np.finfo(np.float32).eps:',result)

# with input cast as np.float64, log_loss is also fine with the default eps=1e-15
result = log_loss([[0,1]],input.astype(np.float64))
print('with input as np.float64:',result)

# however, passing the np.float32 array with the default eps=1e-15 gives nan
result = log_loss([[0,1]],input)
print('with eps=1e-15 (default):',result)

Expected Results

not nan

Actual Results

with eps=1e-15 (default): nan

/Users/gso/anaconda3/lib/python3.9/site-packages/sklearn/metrics/_classification.py:2442: RuntimeWarning: divide by zero encountered in log
  loss = -(transformed_labels * np.log(y_pred)).sum(axis=1)
/Users/gso/anaconda3/lib/python3.9/site-packages/sklearn/metrics/_classification.py:2442: RuntimeWarning: invalid value encountered in multiply
  loss = -(transformed_labels * np.log(y_pred)).sum(axis=1)

Versions

System:
    python: 3.9.12 (main, Apr  5 2022, 01:53:17)  [Clang 12.0.0 ]
executable: /Users/gso/anaconda3/bin/python
   machine: macOS-10.16-x86_64-i386-64bit

Python dependencies:
          pip: 21.2.4
   setuptools: 61.2.0
      sklearn: 1.0.2
        numpy: 1.21.5
        scipy: 1.7.3
       Cython: 0.29.28
       pandas: 1.4.2
   matplotlib: 3.5.1
       joblib: 1.1.0
threadpoolctl: 2.2.0

Built with OpenMP: True
gsiisg added the Bug and Needs Triage labels on Sep 1, 2022
gsiisg (Author) commented Sep 1, 2022

I traced the behavior in sklearn.metrics.log_loss to np.clip not giving the correct answer for 1-eps: when the input is a NumPy array of np.float32 and eps is the default 1e-15, 1 - eps evaluates to 1.0, which causes log(1-p) with p=1.0 to give nan. I initially went to report this to NumPy, but they say scikit-learn is not supporting np.float32 arrays properly. Please see:
numpy/numpy#22192
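
For reference, a minimal standalone sketch (plain NumPy, not scikit-learn code) of the rounding behavior that makes the clipping a no-op for float32 inputs:

import numpy as np

eps = 1e-15  # scikit-learn's default at the time of this report

# 1 - 1e-15 is not representable in float32; it rounds back to exactly 1.0
print(np.float32(1.0) - np.float32(eps) == np.float32(1.0))  # True
print(np.float64(1.0) - np.float64(eps) == np.float64(1.0))  # False

# so clipping a float32 prediction of 1.0 into [eps, 1 - eps] changes nothing
y_pred = np.array([1.0], dtype=np.float32)
clipped = np.clip(y_pred, eps, 1 - eps)
print(clipped)  # [1.] -- still exactly 1.0

# and log(1 - p) with p == 1.0 is -inf, which becomes nan once it is
# multiplied by a zero entry of the one-hot label inside log_loss
print(np.log(1.0 - clipped))        # [-inf]
print(0.0 * np.log(1.0 - clipped))  # [nan]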

Micky774 (Contributor) commented Sep 2, 2022

Seems reasonable enough. Would you like to open a PR updating the value to the float32 epsilon (i.e. np.finfo(np.float32).eps)?

Micky774 added the help wanted label and removed the Needs Triage label on Sep 2, 2022
Safikh (Contributor) commented Sep 3, 2022

Hi, Can I take this?

Micky774 (Contributor) commented Sep 4, 2022

Hi, Can I take this?

Yes, go ahead :)

@Safikh
Copy link
Contributor
Safikh commented Sep 4, 2022

@Micky774 We would face the same issue if we used a default epsilon of np.finfo(np.float32).eps and the user provided float16 input.
So should that be handled as well, by having a different eps value for every dtype? Or should we go with the float16 epsilon by default, since that is the largest?

Micky774 (Contributor) commented Sep 4, 2022

@Micky774 We would face the same issue if we used a default epsilon of np.finfo(np.float32).eps and the user provided float16 input.
So should that be handled as well, by having a different eps value for every dtype? Or should we go with the float16 epsilon by default, since that is the largest?

AFAIK we don't generally support FP16, so it would be preferable to stick with the FP32 epsilon.

Edit: The 'auto' option mentioned below handles this well.
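
For reference, the machine epsilons involved (a quick standalone check, not scikit-learn code); 1 - np.finfo(dtype).eps stays strictly below 1.0 in each dtype:

import numpy as np

for dtype in (np.float16, np.float32, np.float64):
    eps = np.finfo(dtype).eps
    below_one = dtype(1.0) - dtype(eps) < dtype(1.0)
    print(dtype.__name__, eps, below_one)

# float16 0.000977 True
# float32 1.1920929e-07 True
# float64 2.220446049250313e-16 True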

gsiisg (Author) commented Sep 4, 2022

pull request: #24357

  • added a check to see if the input is np.float32 or np.float16
  • if detected, change eps to match the input precision
  • added documentation describing the change
  • added a warning message to the user when this happens

The change @Safikh proposed would fix my immediate problem (thank you!), but:

  • as Safikh alluded to, setting a fixed new default does not account for np.float16 (this may not matter if everyone knows sklearn doesn't support float16, but I doubt that's common knowledge)
  • it would drastically decrease accuracy when the input is a plain list of numbers, because internally those are treated as float64 by Python
  • I would leave the default as is and just warn the user when we dynamically change eps because the input lacks 64-bit precision (see the sketch below)
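
A rough sketch of that dynamic-eps idea (hypothetical helper name, not the actual code in the pull request):

import warnings
import numpy as np


def _resolve_eps(y_pred, eps=1e-15):
    # Illustrative only: keep the float64-friendly default unless the
    # prediction array's dtype is too coarse for it, in which case fall
    # back to that dtype's machine epsilon and warn the user.
    y_pred = np.asarray(y_pred)
    if np.issubdtype(y_pred.dtype, np.floating):
        dtype_eps = np.finfo(y_pred.dtype).eps
        if eps < dtype_eps:
            warnings.warn(
                f"eps={eps} is below the precision of {y_pred.dtype}; "
                f"using {dtype_eps} instead."
            )
            return dtype_eps
    # plain Python lists and integer arrays end up as float64 internally
    return eps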

gsiisg (Author) commented Sep 5, 2022

Not sure how to fix this in the pull request
[Screenshot from the pull request: Screen Shot 2022-09-04 at 5 16 02 PM]

glemaitre (Member) commented

Just mentioning what @ogrisel proposed in one of the PRs.

It would be better to introduce an "auto" option that switches eps depending on the dtype, using np.finfo(y_pred.dtype).eps.
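
A minimal sketch of how such an "auto" default could resolve (illustrative only; the actual change landed via #24354 and may differ):

import numpy as np


def _clip_probabilities(y_pred, eps="auto"):
    # Illustrative only: clip predicted probabilities into [eps, 1 - eps],
    # picking eps from the array's own dtype when eps="auto".
    y_pred = np.asarray(y_pred)
    if eps == "auto":
        if np.issubdtype(y_pred.dtype, np.floating):
            eps = np.finfo(y_pred.dtype).eps
        else:
            eps = np.finfo(np.float64).eps
    return np.clip(y_pred, eps, 1 - eps)


# a float32 prediction of exactly 1.0 now clips to a value strictly below 1.0,
# so log(1 - p) stays finite
print(_clip_probabilities(np.array([1.0], dtype=np.float32)))  # [0.9999999]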
