Description
I found a problem where the mean_squared_log_error() function fails to raise its intended error for negative input.
Steps/Code to Reproduce
Consider a simple situation where a regression model predicts a negative value:
import numpy as np
from sklearn.metrics import mean_squared_log_error

y_true = np.array([1, 2, 3])
y_pred = np.array([1, -2, 3])  # one negative prediction
mean_squared_log_error(y_true, y_pred)
This happened when my regression model predicted a single negative value (among thousands of positive values).
Expected behavior:
The expected behavior is that the exception
ValueError("Mean Squared Logarithmic Error cannot be used when targets contain negative values.")
should be raised, so that the user is correctly informed about the underlying problem.
Exact location of the bug
The problematic code is in sklearn/metrics/regression.py (around line 313):
if not (y_true >= 0).all() and not (y_pred >= 0).all():
    raise ValueError("Mean Squared Logarithmic Error cannot be used when "
                     "targets contain negative values.")
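The condition is incorrect: because the two checks are joined with `and`, it is True only when both y_true and y_pred contain a negative value. In the example above, y_true is entirely non-negative, so `not (y_true >= 0).all()` is False and the whole condition evaluates to False.
It should evaluate to True and raise the exception.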
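To make the failure concrete, the two halves of the condition can be evaluated separately on the example arrays. This is a minimal sketch that uses only NumPy; the variable names `true_has_negative`, `pred_has_negative`, `original_condition`, and `intended_condition` are illustrative and do not appear in scikit-learn.

```python
import numpy as np

y_true = np.array([1, 2, 3])
y_pred = np.array([1, -2, 3])

# Each flag is True when the corresponding array contains a negative value.
true_has_negative = not (y_true >= 0).all()  # y_true is all non-negative
pred_has_negative = not (y_pred >= 0).all()  # y_pred contains -2

# Condition as written in the quoted code: needs BOTH arrays to be negative somewhere.
original_condition = true_has_negative and pred_has_negative

# Condition the error message implies: EITHER array containing a negative value.
intended_condition = true_has_negative or pred_has_negative

print("original condition:", original_condition)
print("intended condition:", intended_condition)
```

With these inputs the `and` form is False while the `or` form is True, which is why the check is skipped for the example.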
Suggested solution:
Change the condition to:
if (y_true < 0).any() or (y_pred < 0).any():
    raise ValueError("Mean Squared Logarithmic Error cannot be used when "
                     "targets contain negative values.")
so that it catches the problem whenever either y_true or y_pred contains a negative value.
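The suggested check can be tried in isolation before touching the library. The sketch below wraps it in a standalone function; `check_non_negative_targets` is a hypothetical helper written for this illustration, not an existing scikit-learn function, and it only mirrors the condition and message quoted above.

```python
import numpy as np


def check_non_negative_targets(y_true, y_pred):
    """Hypothetical helper: raise if either array contains a negative value."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # `or` ensures a negative value in either array triggers the error.
    if (y_true < 0).any() or (y_pred < 0).any():
        raise ValueError("Mean Squared Logarithmic Error cannot be used when "
                         "targets contain negative values.")


# Non-negative inputs pass silently.
check_non_negative_targets([1, 2, 3], [1, 2, 3])

# The example from this report is now rejected.
try:
    check_non_negative_targets([1, 2, 3], [1, -2, 3])
except ValueError as exc:
    print("Raised:", exc)
```

A regression test for the fix could follow the same pattern: one call with a negative value only in y_pred, one with a negative value only in y_true, and one with valid inputs.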