-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
Initial implementation of GridFactory #21784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
I think that I did not comment on the issue but I really like what this factory remove as cognitive overhead on the user. The I would be happy to review and give feedback whenever you want. Just ping me directly. |
Wouldn't it make sense to add methods directly to the estimators them-selves? >>> pipeline = make_pipeline(
... PCA().with_search_space(n_components=[2, 5, 10, 20, 50, 100]),
... LogisticRegression().with_search_space(C=np.logspace(-6, 6, num=13)),
... )
>>> GridSearchCV(pipeline, cv=5).fit(X, y).best_params_
{"pca__n_components"=100, "logisiticregression__C"=1e-3} EDIT: One could even special-case pipelines to make it possible to accept tuple of estimators to express a disjunctions in the search space: >>> pipeline = make_pipeline(
... (
... PCA().with_search_space(n_components=[2, 5, 10, 20, 50, 100]),
... NMF().with_search_space(n_components=[2, 5, 10, 20, 50, 100]),
... ),
... LogisticRegression().with_search_space(C=np.logspace(-6, 6, num=13)),
... )
>>> GridSearchCV(pipeline, cv=5).fit(X, y).best_params_
{"nmf__n_components": 100, "logisiticregression__C": 1e-3} If you call This is a bit implicit, but given the gain in readability that might be worth it. |
BTW, another source of inspiration to tackle this problem: |
Thanks for those references, I'll enjoy reading them. I don't mind setting directly on the estimators, either with a method on the estimators or with an external function that just assumes it can write to a certain attribute of the estimator. Do we think this would be better? |
Setting an attribute could be a way to go toward an API where estimators could define their own default hyper parameters space in the future?
|
I agree it would probably be more intuitive to use. |
Neuraxle has some nice solutions, recognising some of the longstanding questionable design choices in Scikit-learn. One of our troubles is again going to be |
I agree. Let's put this API design on the agenda of the next dev meeting if you plan to attend @jnothman. |
Setting attributes not in For this PR, we can use the |
They support lists of search spaces, which is the same as GridSearchCV, right? |
I think what I find slightly counter-intuitive about that is that |
Reference Issues/PRs
Fixes #19045
What does this implement/fix? Explain your changes.
This implements a GridFactory with functionality comparable to searchgrid to allow easier specification of parameter search spaces, especially as a composite estimator is built.
Any other comments?
A complete version of this will:
cv_results_
in a__
-free wayI note that supporting a factory to be passed to
GridSearchCV
instead of a parameter grid doesn't save the user much code. Maybe we don't need to support it?