-
-
Notifications
You must be signed in to change notification settings - Fork 26.6k
ENH update code to check response values of an estimator #33126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
ENH update code to check response values of an estimator #33126
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR @qbarthelemy!
A few remarks:
- To avoid confusion, please change the title to start with "ENH" or "FIX" instead of "DOC" (as this PR does not change the documentation).
- Please add a changelog entry like this.
- Your tests are currently failing. In general, we need two new tests here:
- one that checks that no error is thrown for all types of estimators with their supported
response_methods, and - one that checks that the expected error is thrown for unsupported
response_methods.
Note thatresponse_methodcan be a string or a list of strings, both cases should be part of the tests.
All existing tests should not be changed (or only adapted according to the new error message), to make sure nothing else is affected by your change.
- one that checks that no error is thrown for all types of estimators with their supported
Let me know if you need more help with designing the tests.
Co-authored-by: Anne Beyer <anne.beyer@mailbox.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for addressing the comments so fast! Here is another round of feedback:
doc/whats_new/upcoming_changes/sklearn.utils/33126.enhancement.rst
Outdated
Show resolved
Hide resolved
| - for multilabel classification, it is a 2d array of shape `(n_samples, n_outputs)`; | ||
| - for outlier detection, it is a 1d array of shape `(n_samples,)`; | ||
| - for regression, it is a 1d array of shape `(n_samples,)`. | ||
| - for outlier detection, a regressor or a clusterer, it is a 1d array of shape |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please also add corresponding test cases (for binary and multiclass) for a clusterer here:
| def test_response_values_output_shape_( |
| ["predict_proba", "decision_function", ["predict_proba", "predict"]], | ||
| ["predict_proba", "decision_function", ["predict_proba", "decision_function"]], | ||
| ) | ||
| def test_regressor_unsupported_response(pyplot, response_method): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please extend this test (and rename it to test_estimator_unsupported_response) by adding a parametrization with regressor, outlier detector and clusterer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also be in favor of moving this test to sklearn/utils/tests/test_response.py, to have everything related to testing response values in the same file.
sklearn/utils/tests/test_response.py
Outdated
|
|
||
| @pytest.mark.parametrize( | ||
| "response_method", | ||
| ["predict", "score", ["predict", "score"]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please extend the list with ["not-a-valid-method", "predict"] (which should also result in predict being selected)? The logic below would also have to be adapted slightly, or you can change the parametrization into a tuple to explicitly specify the expected method as well.
sklearn/utils/tests/test_response.py
Outdated
| _get_response_values(my_estimator, X, response_method=response_method) | ||
|
|
||
|
|
||
| @pytest.mark.parametrize("response_method", ["predict"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as below
Co-authored-by: Anne Beyer <anne.beyer@mailbox.org>
Reference Issues/PRs
Fixes #33093. See also #33084.
What does this implement/fix? Explain your changes.
It implements suggestions made here #33093 (comment)
AI usage disclosure
I did not use AI assistance.
Any other comments?