[WIP] API: allow cross validation to work with partial_fit by stsievert · Pull Request #11266 · scikit-learn/scikit-learn

Open: stsievert wants to merge 7 commits into base: main

stsievert (Contributor) commented Jun 14, 2018

Reference Issues/PRs

Addresses (but doesn't fix) #1626: "API Proposal: Generalized Cross-Validation and Early Stopping". This PR only addresses early stopping on models that have partial_fit (#1626 (comment)); it does not address "when more computation doesn't improve the result." (#1626 (comment)).

I will use this in dask/dask-searchcv#72. Briefly, I want to repeatedly use cross_validate with a specified number of partial_fit calls.

What does this implement/fix? Explain your changes.

This PR implements early stopping for cross validation when the estimator has partial_fit. This requires that

  • _fit_and_score have a partial_fit keyword argument that is either a bool or an int giving the number of times est.partial_fit is called.
    • I also add a partial_fit keyword argument to cross_validate, which relies on _fit_and_score.
  • a different train/validation set can be specified if _fit_and_score is called repeatedly.
    • Yes, it's possible to keep the train and validation sets separated across repeated calls, but doing so is not really user-friendly and is not cleanly documented.
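
A minimal sketch of the idea in the list above, not the code in this PR. The helper `fit_and_score_partial`, the toy `TinySGDRegressor`, and the result key `n_partial_fit_calls` are all hypothetical names made up for illustration; the only assumption about the estimator is that it exposes `partial_fit(X, y)` and `score(X, y)`. It shows a `partial_fit` argument that accepts `True` or an int, and repeated calls on the same estimator that reuse one fixed train/validation split.

```python
class TinySGDRegressor:
    """Toy 1-D linear model (hypothetical): y ~ w * x, trained by SGD."""

    def __init__(self, lr=0.05):
        self.lr = lr
        self.w_ = 0.0

    def partial_fit(self, X, y):
        # One pass over the data, one gradient step per sample.
        for x, target in zip(X, y):
            self.w_ += self.lr * (target - self.w_ * x) * x
        return self

    def score(self, X, y):
        # Negative mean squared error, so that higher is better.
        return -sum((t - self.w_ * x) ** 2 for x, t in zip(X, y)) / len(y)


def fit_and_score_partial(estimator, X, y, train, test, partial_fit=True):
    """Hypothetical helper: call partial_fit a fixed number of times, then score.

    partial_fit=True means one call; an int means that many calls.
    """
    if partial_fit is False:
        raise ValueError("this sketch only covers the partial_fit path")
    n_calls = 1 if partial_fit is True else int(partial_fit)
    if n_calls < 1:
        raise ValueError("partial_fit must be True or a positive int")

    X_train, y_train = [X[i] for i in train], [y[i] for i in train]
    X_test, y_test = [X[i] for i in test], [y[i] for i in test]

    # The estimator is not reset, so training continues across repeated calls.
    for _ in range(n_calls):
        estimator.partial_fit(X_train, y_train)

    return {
        "test_score": estimator.score(X_test, y_test),
        "n_partial_fit_calls": n_calls,
    }


# Toy data with y = 2 * x and one fixed split reused on every call.
X = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
y = [2 * x for x in X]
train, test = [0, 1, 2, 3], [4, 5]

est = TinySGDRegressor()
scores = []
for _ in range(3):
    result = fit_and_score_partial(est, X, y, train, test, partial_fit=2)
    scores.append(result["test_score"])
```

Because the split is held fixed, the validation samples never leak into training, and a caller could stop as soon as the scores stop improving.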

TODO

@stsievert stsievert changed the title WIP: API: allow partial fit for cross validation WIP: API: allow cross validation to work with partial_fit Jun 14, 2018
jnothman (Member) commented Jun 14, 2018 via email

stsievert (Contributor, Author) commented

I think this is ready for review.

On partial_fit for GridSearchCV: I think that integration is future work. It requires significant cleanup and refactoring because BaseSearchCV.fit is almost a re-implementation of cross_validate. I expect a future PR to refactor the two.

@stsievert stsievert changed the title WIP: API: allow cross validation to work with partial_fit API: allow cross validation to work with partial_fit Jun 15, 2018
@stsievert stsievert changed the title API: allow cross validation to work with partial_fit [MRG] API: allow cross validation to work with partial_fit Jun 16, 2018
stsievert (Contributor, Author) commented

I'd like this to work better with repeated calls and the train/test split. That's where the main advantage comes from.

@stsievert stsievert changed the title [MRG] API: allow cross validation to work with partial_fit [WIP] API: allow cross validation to work with partial_fit Jun 21, 2018
Base automatically changed from master to main January 22, 2021 10:50
@cmarmo cmarmo added Needs Decision Requires decision and removed Waiting for Reviewer labels Feb 14, 2022
4 participants