Description
Describe the issue linked to the documentation
The current function that checks the response values of an estimator
assumes that if the estimator is neither a classifier nor an outlier detector, then it is a regressor.
In the latter case, if the requested response_method is not "predict",
it raises an error stating that the estimator is a regressor: ValueError: ... Got a regressor....
However, the estimator might be something other than a regressor.
For example, it can be a classification Pipeline with an MRO issue, see NeuroTechX/moabb#815
Many other GitHub repositories have reported this ambiguous error, see list in #33084 (comment)
The goal of this issue is to discuss how to correct the message by making it clearer,
specifically by no longer stating that it is a regressor.
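
To illustrate the ambiguity described above, here is a minimal, self-contained sketch that mimics the fall-through logic without importing scikit-learn. The `kind` attribute, the `Unrecognized` class and the `current_style_check` function are hypothetical stand-ins for the real type checks; they are not part of scikit-learn's API.

```python
class Unrecognized:
    """Hypothetical estimator that is not tagged as any known estimator type."""

    kind = None  # assumption: stand-in for scikit-learn's estimator-type detection

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


def current_style_check(estimator, response_method):
    """Simplified imitation of the current branching (not the real function)."""
    kind = getattr(estimator, "kind", None)
    if kind == "classifier":
        return "classifier branch"
    if kind == "outlier_detector":
        return "outlier detector branch"
    # Fall-through: anything else is assumed to be a regressor.
    if response_method != "predict":
        raise ValueError(
            f"{estimator.__class__.__name__} should either be a classifier to be "
            f"used with response_method={response_method} or the response_method "
            "should be 'predict'. Got a regressor with response_method="
            f"{response_method} instead."
        )
    return "regressor branch"


message = None
try:
    current_style_check(Unrecognized(), "predict_proba")
except ValueError as exc:
    message = str(exc)
    print(message)
```

The printed message claims "Got a regressor" even though nothing identified `Unrecognized` as a regressor. This is the confusion reported in the linked repositories.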
Suggest a potential alternative/fix
The current function _get_response_values in scikit-learn/sklearn/utils/_response.py contains:

    else:  # estimator is a regressor
        if response_method != "predict":
            raise ValueError(
                f"{estimator.__class__.__name__} should either be a classifier to be "
                f"used with response_method={response_method} or the response_method "
                "should be 'predict'. Got a regressor with response_method="
                f"{response_method} instead."
            )
        prediction_method = estimator.predict
        y_pred, pos_label = prediction_method(X), None

This could be corrected into:

    elif is_regressor(estimator):
        prediction_method = estimator.predict
        y_pred, pos_label = prediction_method(X), None
    else:
        raise ValueError(
            f"{estimator.__class__.__name__} is not recognized as a classifier, "
            "an outlier detector or a regressor."
        )

and, if necessary, we could add the following before the else:

    elif is_clusterer(estimator):
        prediction_method = ...
        y_pred, pos_label = ...
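
The proposed branching can be tried in isolation with the rough sketch below. It is not a patch for scikit-learn: the `kind` attribute, the toy classes and `proposed_style_response` are hypothetical names that stand in for `is_regressor` and related checks, and the classifier and outlier-detector branches are left out on purpose.

```python
class ToyRegressor:
    """Hypothetical estimator explicitly tagged as a regressor."""

    kind = "regressor"  # assumption: stand-in for is_regressor(estimator)

    def predict(self, X):
        return [0.0 for _ in X]


class Unrecognized:
    """Hypothetical estimator with no recognizable type."""

    kind = None


def proposed_style_response(estimator, X):
    """Simplified imitation of the proposed branching (not the real function)."""
    kind = getattr(estimator, "kind", None)
    if kind in ("classifier", "outlier_detector"):
        # These branches are unchanged by the proposal and omitted here.
        raise NotImplementedError("branch omitted in this sketch")
    elif kind == "regressor":
        prediction_method = estimator.predict
        y_pred, pos_label = prediction_method(X), None
    else:
        # The error no longer claims that the estimator is a regressor.
        raise ValueError(
            f"{estimator.__class__.__name__} is not recognized as a classifier, "
            "an outlier detector or a regressor."
        )
    return y_pred, pos_label


print(proposed_style_response(ToyRegressor(), [[1.0], [2.0]]))

try:
    proposed_style_response(Unrecognized(), [[1.0]])
except ValueError as exc:
    print(exc)
```

With this structure, an estimator that matches none of the known types gets an error that says so directly, instead of being reported as a regressor.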