DOC error message when checking response values of an estimator not clear · Issue #33093 · scikit-learn/scikit-learn

DOC error message when checking response values of an estimator not clear #33093

@qbarthelemy

Describe the issue linked to the documentation

The current function that checks the response values of an estimator
assumes that if the estimator is neither a classifier nor an outlier detector, then it is a regressor.

In that case, if the requested response_method is not "predict",
it raises an error stating that the estimator is a regressor: ValueError: ... Got a regressor ...

However, the estimator might be something other than a regressor.

For example, it can be a classification Pipeline with an MRO issue, see NeuroTechX/moabb#815.
Many other GitHub repositories have reported this ambiguous error, see the list in #33084 (comment).

The goal of this issue is to discuss how to correct the message by making it clearer,
specifically by no longer stating that it is a regressor.
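
To make the fall-through concrete, here is a minimal, self-contained sketch of the behavior described above. It is a simplified stand-in, not the scikit-learn source: `ProbaOnlyModel`, `estimator_kind` and `get_response_values_current` are hypothetical names introduced only for illustration, and the real library decides the estimator type through its own helpers. Only the error message is copied from the current code.

```python
# Simplified stand-in for the fall-through branch; NOT the scikit-learn source.
class ProbaOnlyModel:
    """Hypothetical estimator: exposes predict_proba but is not tagged as a classifier."""

    estimator_kind = None  # hypothetical tag, used only by this sketch

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


def get_response_values_current(estimator, X, response_method):
    kind = getattr(estimator, "estimator_kind", None)
    if kind in ("classifier", "outlier_detector"):
        return getattr(estimator, response_method)(X)
    else:  # anything else is assumed to be a regressor
        if response_method != "predict":
            raise ValueError(
                f"{estimator.__class__.__name__} should either be a classifier to be "
                f"used with response_method={response_method} or the response_method "
                "should be 'predict'. Got a regressor with response_method="
                f"{response_method} instead."
            )
        return estimator.predict(X)


current_message = None
try:
    get_response_values_current(ProbaOnlyModel(), [[0.0], [1.0]], "predict_proba")
except ValueError as exc:
    current_message = str(exc)
    print(current_message)
```

The model here is not a regressor in any sense, yet the message reports "Got a regressor", which is the ambiguity this issue is about.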

Suggest a potential alternative/fix

The current function _get_response_values in scikit-learn/sklearn/utils/_response.py

    else:  # estimator is a regressor
        if response_method != "predict":
            raise ValueError(
                f"{estimator.__class__.__name__} should either be a classifier to be "
                f"used with response_method={response_method} or the response_method "
                "should be 'predict'. Got a regressor with response_method="
                f"{response_method} instead."
            )
        prediction_method = estimator.predict
        y_pred, pos_label = prediction_method(X), None

could be changed to

    elif is_regressor(estimator):
        prediction_method = estimator.predict
        y_pred, pos_label = prediction_method(X), None
    else:
        raise ValueError(
            f"{estimator.__class__.__name__} is not recognized as a classifier, "
            "an outlier detector or a regressor."
        )

and, if necessary, we could add the following before the final else

    elif is_clusterer(estimator):
        prediction_method = ...
        y_pred, pos_label = ...
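
As a rough illustration of how the suggested dispatch would behave, the sketch below mirrors the proposed branches with stand-ins. `ToyModel`, `estimator_kind` and `get_response_values_proposed` are hypothetical names, not scikit-learn APIs, and the classifier and outlier detector branches are stubbed out because the proposal leaves them unchanged. Only the new error message is taken from the suggestion.

```python
# Simplified stand-in for the proposed dispatch; NOT the scikit-learn source.
class ToyModel:
    """Hypothetical estimator whose kind is set explicitly for this sketch."""

    def __init__(self, estimator_kind):
        self.estimator_kind = estimator_kind  # hypothetical tag

    def predict(self, X):
        return [0.0 for _ in X]


def get_response_values_proposed(estimator, X):
    kind = getattr(estimator, "estimator_kind", None)
    if kind in ("classifier", "outlier_detector"):
        # These branches are untouched by the proposal and omitted here.
        raise NotImplementedError("not part of this sketch")
    elif kind == "regressor":
        prediction_method = estimator.predict
        return prediction_method(X), None
    else:
        # Unrecognized estimators are rejected explicitly instead of
        # being reported as regressors.
        raise ValueError(
            f"{estimator.__class__.__name__} is not recognized as a classifier, "
            "an outlier detector or a regressor."
        )


X = [[0.0], [1.0]]
y_pred, pos_label = get_response_values_proposed(ToyModel("regressor"), X)

proposed_message = None
try:
    get_response_values_proposed(ToyModel(None), X)
except ValueError as exc:
    proposed_message = str(exc)
    print(proposed_message)
```

With this structure the message no longer claims anything about the estimator beyond what was actually checked.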
