sklearn.feature_selection.SequentialFeatureSelector Select features as long as score gets better. #20137
Comments
…es as long as score gets better. scikit-learn#20137 Add `censored_rate` parameter in SequentialFeatureSelector. If `censored_rate` is not None, fitting is aborted if new_score is lower than old_score*(1+censored_rate).
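The stopping rule proposed in the linked PR can be sketched as a greedy forward selection that aborts once the new score fails to beat the old score by a relative margin. This is an illustrative sketch only, not scikit-learn's implementation; the names `forward_select`, `score_subset`, and `tol` are hypothetical stand-ins for the proposed `censored_rate` idea:

```python
def forward_select(all_features, score_subset, max_features, tol=0.0):
    """Greedy forward selection with a relative-improvement early stop.

    Stops when new_score < old_score * (1 + tol), mirroring the rule
    discussed in the thread. Note the relative rule assumes
    non-negative scores.
    """
    selected = []
    old_score = float("-inf")
    while len(selected) < max_features:
        remaining = [f for f in all_features if f not in selected]
        if not remaining:
            break
        # Greedily pick the candidate feature that maximizes the score.
        best = max(remaining, key=lambda f: score_subset(selected + [f]))
        new_score = score_subset(selected + [best])
        # Early stop: relative improvement fell below the tolerance.
        if selected and new_score < old_score * (1 + tol):
            break
        selected.append(best)
        old_score = new_score
    return selected
```

With a toy additive score where feature "c" contributes almost nothing, a 5% tolerance stops the search after two features even though `max_features` allows three.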
I agree. The problem is how to manage backward compat with the current behavior without introducing too many spurious deprecation warnings and too many new parameters... @norbusan suggested the following strategy in a private conversation: add a new We will
What is good about the solution above is:
There are several things I find itchy with the solution proposed above:
Maybe we could implement the following with an additional
Optionally we could make WDYT @NicolasHug and others?
Just another comment which I forgot from the initial request: it may be the case that adding one more feature will always improve the score (not sure about this). So some sort of threshold for how much the score must improve in order to continue adding features might be needed...
I agree, this is the purpose of the
@ogrisel, you mentioned the following, and I'm not sure I properly understand this point:
Regarding this one:
The current default for
Either way, this is a form of early stopping. I think I would prefer what I just proposed, but if we're going to introduce more parameters, we might need to also introduce an
That's an option. I think I like it. The only remaining question would be: do we plan to enable the early stopping behavior by default in the future? (I think that would be nice). How do we handle the deprecation?
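The deprecation question raised here is typically handled with a sentinel default, so existing code keeps the old behavior while emitting a FutureWarning until the new default takes over. The class and method names below (`SelectorSketch`, `_resolve_n_features`) are hypothetical, written only to illustrate the pattern under discussion:

```python
import warnings


class SelectorSketch:
    """Illustrative sketch of a 'warn' sentinel default for a parameter
    whose default is planned to change from None to 'auto'."""

    def __init__(self, n_features_to_select="warn"):
        self.n_features_to_select = n_features_to_select

    def _resolve_n_features(self, n_total):
        n = self.n_features_to_select
        if n == "warn":
            warnings.warn(
                "The default of n_features_to_select will change from "
                "None (half of the features) to 'auto' in a future "
                "release. Set it explicitly to silence this warning.",
                FutureWarning,
            )
            n = None  # keep the current behavior during the deprecation
        if n is None:
            return n_total // 2  # current default: half of the features
        if n == "auto":
            return n_total  # upper bound; early stopping decides when to quit
        return n
```

The advantage of the sentinel is that users who already pass the parameter explicitly never see the warning; only reliance on the changing default triggers it.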
@NicolasHug do you agree with the plan above? If so, we can implement it as part of #20145.
Yes, this sounds good. An alternative that I slightly prefer would be to directly introduce
The reason the default is currently None is for consistency with RFE, but 'auto' is a more accurate name. Since we have to deprecate something anyway, we might as well introduce a better name in the process. I'm fine with either strategy, so feel free to implement the one you like best.
Dear Developers,
Currently one can give
sklearn.feature_selection.SequentialFeatureSelector
the parameter 'n_features_to_select'
to get the desired number of features.
It would be desirable that, instead of giving a fixed 'n_features_to_select', one could use it as an upper limit and select features only up to the point where the score still improves (so the number of selected features could be smaller than 'n_features_to_select').
Best regards, Markus