CalibratedClassifierCV doesn't support groups parameter in fit method · Issue #12052 · scikit-learn/scikit-learn · GitHub

Closed
ian-perry-ironnet opened this issue Sep 10, 2018 · 6 comments

Comments

@ian-perry-ironnet

Description

Classes such as sklearn.model_selection.GridSearchCV, and even skopt.BayesSearchCV, take a cv parameter that can be any object that produces train/test splits. When sklearn.model_selection.GroupKFold is used as the cv object, a groups array has to be supplied so the splitter can build its folds. Both of these classes accept groups in fit and pass it on to the underlying GroupKFold. However, sklearn.calibration.CalibratedClassifierCV doesn't accept groups in fit, even though it says it accepts any cv object.
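For reference, the behaviour described above for GridSearchCV can be sketched as follows. This is only an illustration on synthetic data: the dataset, the group layout (12 groups of 10 samples) and the parameter grid are assumptions made up for the example, not part of the original report.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, GroupKFold

# Synthetic data (assumption): 120 samples in 12 groups of 10.
X, y = make_classification(n_samples=120, n_features=5, random_state=0)
groups = np.repeat(np.arange(12), 10)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0]},
    cv=GroupKFold(n_splits=3),
)

# GridSearchCV.fit accepts `groups` and hands it to GroupKFold.split,
# so samples from one group never land on both sides of a fold.
search.fit(X, y, groups=groups)
```

The request in this issue is for CalibratedClassifierCV.fit to accept the same groups argument.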

@adrinjalali
Member

I guess this is another example of what could potentially be solved by PR #9566. You can check issues #11429 and #4497 for more detailed discussion on the matter. Developers are mostly busy with the 0.20 release at the moment. I'm pretty sure we'll get back to this soon after the release.

In the meantime, you may find the answer I gave to the question "sklearn GridSearchCV not using sample_weight in score function" on Stack Overflow useful. I haven't applied it to GroupKFold, but I suspect it may be applicable there as well. You may need to write a wrapper around GroupKFold which doesn't expect groups as input.
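A minimal sketch of such a wrapper, under some assumptions: the class name PresetGroupKFold is hypothetical, the data is synthetic, and this is not the code from the Stack Overflow answer. The wrapper stores the groups array at construction time and ignores whatever groups value the caller passes to split, so it only makes sense when it is always used with the same X it was built for.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupKFold


class PresetGroupKFold:
    """Hypothetical wrapper: a GroupKFold that carries its own groups."""

    def __init__(self, groups, n_splits=3):
        self.groups = np.asarray(groups)
        self.n_splits = n_splits
        self._cv = GroupKFold(n_splits=n_splits)

    def split(self, X, y=None, groups=None):
        # Ignore the `groups` argument and use the stored array instead.
        return self._cv.split(X, y, self.groups)

    def get_n_splits(self, X=None, y=None, groups=None):
        return self.n_splits


# Synthetic data (assumption): 120 samples in 12 groups of 10.
X, y = make_classification(n_samples=120, n_features=5, random_state=0)
groups = np.repeat(np.arange(12), 10)

cv = PresetGroupKFold(groups, n_splits=3)
calibrated = CalibratedClassifierCV(LogisticRegression(max_iter=1000), cv=cv)
calibrated.fit(X, y)  # no `groups` argument needed here
```

Another option along the same lines is to precompute the folds, for example cv=list(GroupKFold(n_splits=3).split(X, y, groups)), since cv also accepts an iterable of train/test index pairs.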

@DanLeopold
DanLeopold commented Apr 3, 2019

Do you know if there's been any progress on sample property routing PR #7646 (i.e., passing groups to fit functions)? My dilemma has been with GroupShuffleSplit and cross_val_score, but it seems these issues are one and the same. I tried adapting the answer you (Adrin) provided to GroupShuffleSplit, but my ability to do so is limited at best. Thus, I'm wondering if Ian found/created a successful workaround or if there's another thread/answer I've been missing. Thanks in advance!

@adrinjalali
Member

Since then scikit-learn/enhancement_proposals#16 has started, and we've had long discussions about this. It is moving forward, but slowly.

@jnothman
Member
jnothman commented Apr 4, 2019 via email

@mdbecker
Contributor
mdbecker commented Nov 5, 2021

For anyone else who runs into this same issue before SLEP006 is finished, I've uploaded a very simple fix based on version 0.24.1 of sklearn here. This works the same way as GridSearchCV and fails gracefully if the cv passed in does not support the groups parameter (which may not be desirable depending on your use case). @DanLeopold

@adrinjalali
Member

Closing this since we now have metadata routing for CalibratedClassifierCV.
