Description
Discussed in #22042
Originally posted by PGijsbers December 21, 2021
I want to obtain scores for a random forest with an increasing number of estimators on the same folds, in a reproducible manner. I essentially want something like:
```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

x, y = make_classification(random_state=0)
rf = RandomForestClassifier(n_estimators=10, warm_start=True, random_state=0)
step_size = 10
for i in range(3):
    scores = cross_val_score(rf, x, y, cv=3)
    print(scores)
    rf.n_estimators += step_size
```
However, from my understanding the forests are trained from scratch on every call in this scenario, since the original `rf` object is cloned. Instead I would want to save the progress for each fold and use it in the next iteration. It is possible to get the fitted estimators with `cross_validate(..., return_estimator=True)`, but I don't see any compatible way to pass an estimator for each fold.

Is there a way? Or should I instead open an issue for a feature request? :)
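In the meantime, one workaround (a sketch of my own, not part of the issue) is to skip `cross_val_score` entirely: fix the splits with a CV splitter up front, keep one warm-started forest per fold, and drive the fit/score loop manually. Everything below uses only public scikit-learn API; the variable names are mine.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold

x, y = make_classification(random_state=0)

# Materialize the splits once so the exact same folds are reused each iteration.
cv = StratifiedKFold(n_splits=3)
splits = list(cv.split(x, y))

base = RandomForestClassifier(n_estimators=10, warm_start=True, random_state=0)
forests = [clone(base) for _ in splits]  # one independent estimator per fold

step_size = 10
for i in range(3):
    scores = []
    for forest, (train, test) in zip(forests, splits):
        forest.fit(x[train], y[train])  # warm_start keeps the already-built trees
        scores.append(forest.score(x[test], y[test]))
    print(scores)
    for forest in forests:
        forest.n_estimators += step_size  # grow each forest on the next pass
```

Because the per-fold forests are never cloned between iterations, each `fit` only adds `step_size` new trees instead of retraining from scratch.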
Suggested change: allow `cross_validate`'s `estimator` parameter to optionally be a list of estimators whose length matches the number of folds. From the user's side it could work something like:
```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

x, y = make_classification(random_state=0)
rf = RandomForestClassifier(n_estimators=10, warm_start=True, random_state=0)
k_folds = 3
forests = [rf] * k_folds
step_size = 10
for i in range(3):
    result = cross_validate(forests, x, y, cv=k_folds, return_estimator=True)
    forests = result["estimator"]
    for forest in forests:
        forest.n_estimators += step_size
```
Internally you could presumably just `zip` the estimators with the folds here and be done, as I assume the returned `estimator` list is already in matching order with the folds.
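To make the zip idea concrete, here is a rough sketch of what such a per-fold variant could look like. This is not scikit-learn's actual implementation, and `cross_validate_per_fold` is a hypothetical name; note that, unlike the real `cross_validate`, it deliberately fits the passed estimators in place rather than cloning them, since cloning would discard the fitted trees and defeat `warm_start`.

```python
from sklearn.base import is_classifier
from sklearn.model_selection import check_cv


def cross_validate_per_fold(estimators, x, y, cv):
    """Sketch of the proposed API: one (possibly pre-fitted) estimator per fold.

    Estimators are fitted in place (no cloning), so warm_start can pick up
    where the previous call left off.
    """
    cv = check_cv(cv, y, classifier=is_classifier(estimators[0]))
    splits = list(cv.split(x, y))
    if len(estimators) != len(splits):
        raise ValueError("need exactly one estimator per fold")
    test_scores = []
    # The core of the proposal: zip the estimators with the splits.
    for est, (train, test) in zip(estimators, splits):
        est.fit(x[train], y[train])
        test_scores.append(est.score(x[test], y[test]))
    return {"estimator": estimators, "test_score": test_scores}
```

With this sketch, the loop from the proposal above works as intended: passing `result["estimator"]` back in (after bumping `n_estimators`) grows each fold's forest instead of refitting it.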