8000 FIX Add common test to check for unfitted behaviour in classifiers and fix RadiusNeighborsClassifier and ClassifierChain correspondingly by OmarManzoor · Pull Request #27051 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

FIX Add common test to check for unfitted behaviour in classifiers and fix RadiusNeighborsClassifier and ClassifierChain correspondingly #27051

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

OmarManzoor
Copy link
Contributor
@OmarManzoor OmarManzoor commented Aug 11, 2023

Reference Issues/PRs

Relates to #26828

What does this implement/fix? Explain your changes.

  • Adds a common test to check for unfitted issues in classifiers.
  • Adds the check_is_fitted check at the beginning of predict_proba in RadiusNeighborsClassifier.
  • Adds the check_is_fitted check in predict_proba and decision_function in ClassifierChain.

Any other comments?

CC: @jjerphan

@OmarManzoor OmarManzoor added the Quick Review For PRs that are quick to review label Aug 11, 2023
@github-actions
Copy link
github-actions bot commented Aug 11, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 2398ba7. Link to the linter CI: here

Copy link
Member
@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

We might want to change the title of this PR for including FIX, I think. How about:

FIX Check that `RadiusNeighborsClassifier` is fit in `predict{,_proba}` 

@glemaitre: I am ignorant here, but aren there common tests checking that estimators are fit before predict and predict_proba are called? 🙂

Comment on lines 2309 to 2310
"""Check that :class:`RadiusNeighborsClassifier` is fitted before
calling the :meth:`predict` and :meth:`predict_proba`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check_is_fitted also must be called in RadiusNeighborsClassifier.predict (for consistency with RadiusNeighborsRegressor.predict) and test this change accordingly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to do that here because the first thing that predict calls is predict_proba.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we need at least to test predict on unfit estimator.

@OmarManzoor OmarManzoor changed the title MAINT call in RadiusNeighborsClassifer predict_proba FIX Check that RadiusNeighborsClassifier is fit in predict{,_proba} Aug 11, 2023
@glemaitre
Copy link
Member
glemaitre commented Aug 11, 2023

I am ignorant here, but aren there common tests checking that estimators are fit before predict and predict_proba are called?

Indeed we have common tests that check that we raise a NotFittedError if the estimator was not fitted.

Copy link
Member
@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you, @OmarManzoor.

Comment on lines 2309 to 2310
"""Check that :class:`RadiusNeighborsClassifier` is fitted before
calling the :meth:`predict` and :meth:`predict_proba`.
Copy link
Member
10000

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we need at least to test predict on unfit estimator.

@OmarManzoor
Copy link
Contributor Author

@jjerphan I wonder if we should remove the test considering the common test and maybe add this classifier there if it is missing.

@jjerphan
Copy link
Member

If this is doable, it's preferable, yes.

* Add a common test to check for unfitted classifiers
* Fix ClassifierChain by adding check_is_fitted inside predict_proba and decision_function methods
@OmarManzoor
Copy link
Contributor Author

@jjerphan Could you kindly have a look?

Copy link
Member
@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I would wait other maintainers with more authority on common tests to complement my comment and to approve this PR.

Comment on lines +608 to +616
@pytest.mark.parametrize(
"classifier",
_tested_estimators(type_filter="classifier"),
ids=_get_check_estimator_ids,
)
def test_classifiers_unfitted(classifier):
classifier_name = classifier.__class__.__name__
_set_checking_parameters(classifier)
check_estimators_unfitted(classifier_name, classifier)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am surprised that a test like this one is not already existing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes made to this file enlarge the scope of this PR.

Could you change the title of this PR accordingly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Feel free to modify if it looks too long.

@OmarManzoor OmarManzoor changed the title FIX Check that RadiusNeighborsClassifier is fit in predict{,_proba} FIX Add common test to check for unfitted behaviour in classifiers and fix RadiusNeighborsClassifier and ClassifierChain correspondingly Aug 15, 2023
@glemaitre
Copy link
Member

This is already the job of the check_fit_check_is_fitted check that is used in _yield_all_checks.

What should be checked instead is the reason why we would skip this test for the model here.
Usually, one possibility is that we face a meta-estimator that requires parameter and we did not modify the code to construct the instance.

@glemaitre
Copy link
Member

OK so the reason here is that ClassifierChain has a tag "_skip_test": True meaning we will not run most of the test. We should probably revisit what is the reason for not running the checks.

@OmarManzoor
Copy link
Contributor Author
OmarManzoor commented Aug 15, 2023

OK so the reason here is that ClassifierChain has a tag "_skip_test": True meaning we will not run most of the test. We should probably revisit what is the reason for not running the checks.

What is the reason that RadiusNeighborsClassifier did not get picked up in this test if it applicable? The main purpose of this PR initially was to check for is fitted in RadiusNeighborsClassifier predict_proba so that it means this was not being checked by the test you mentioned.

@OmarManzoor
Copy link
Contributor Author
OmarManzoor commented Aug 15, 2023

Thanks @glemaitre! I explored a bit and found that the check check_estimators_unfitted is already being applied on RadiusNeighborsClassifier. The reason it passes is that we are already checking for check_is_fitted inside the method radius_neighbors.

indices. In general, multiple points can be queried at the same time.
"""
check_is_fitted(self)
if sort_results and not return_distance:
raise ValueError("return_distance must be True if sort_results is True.")

@jjerphan I don't think we need a dedicated PR for this as all we are doing in #26828 is shifting this check a bit earlier as required.

@jjerphan
Copy link
Member

OK, feel free to close this PR.

@OmarManzoor OmarManzoor deleted the radius_neighbors_classifier_is_fitted branch August 18, 2023 10:05
@glemaitre
Copy link
Member

Thanks @OmarManzoor for investigating.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0