[MRG] DOC Add dropdowns to Module 1.13 Feature Selection by greyisbetter · Pull Request #26662 · scikit-learn/scikit-learn · GitHub

[MRG] DOC Add dropdowns to Module 1.13 Feature Selection #26662


Merged
merged 17 commits on Aug 28, 2023
Changes from all commits
65 changes: 38 additions & 27 deletions doc/modules/feature_selection.rst
@@ -201,31 +201,36 @@ alpha parameter, the fewer features selected.

.. _compressive_sensing:

.. topic:: **L1-recovery and compressive sensing**

For a good choice of alpha, the :ref:`lasso` can fully recover the
exact set of non-zero variables using only a few observations, provided
certain specific conditions are met. In particular, the number of
samples should be "sufficiently large", or L1 models will perform at
random, where "sufficiently large" depends on the number of non-zero
coefficients, the logarithm of the number of features, the amount of
noise, the smallest absolute value of non-zero coefficients, and the
structure of the design matrix X. In addition, the design matrix must
display certain specific properties, such as not being too correlated.

There is no general rule to select an alpha parameter for recovery of
non-zero coefficients. It can be set by cross-validation
(:class:`~sklearn.linear_model.LassoCV` or
:class:`~sklearn.linear_model.LassoLarsCV`), though this may lead to
under-penalized models: including a small number of non-relevant variables
is not detrimental to prediction score. BIC
(:class:`~sklearn.linear_model.LassoLarsIC`) tends, on the contrary, to set
high values of alpha.

**Reference** Richard G. Baraniuk "Compressive Sensing", IEEE Signal
|details-start|
**L1-recovery and compressive sensing**
|details-split|

For a good choice of alpha, the :ref:`lasso` can fully recover the
exact set of non-zero variables using only a few observations, provided
certain specific conditions are met. In particular, the number of
samples should be "sufficiently large", or L1 models will perform at
random, where "sufficiently large" depends on the number of non-zero
coefficients, the logarithm of the number of features, the amount of
noise, the smallest absolute value of non-zero coefficients, and the
structure of the design matrix X. In addition, the design matrix must
display certain specific properties, such as not being too correlated.

There is no general rule to select an alpha parameter for recovery of
non-zero coefficients. It can be set by cross-validation
(:class:`~sklearn.linear_model.LassoCV` or
:class:`~sklearn.linear_model.LassoLarsCV`), though this may lead to
under-penalized models: including a small number of non-relevant variables
is not detrimental to prediction score. BIC
(:class:`~sklearn.linear_model.LassoLarsIC`) tends, on the contrary, to set
high values of alpha.

.. topic:: Reference

Richard G. Baraniuk "Compressive Sensing", IEEE Signal
Processing Magazine [120] July 2007
http://users.isr.ist.utl.pt/~aguiar/CS_notes.pdf

|details-end|
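
A minimal illustrative sketch (not part of this diff) of the workflow described
above: ``alpha`` is chosen by cross-validation with
:class:`~sklearn.linear_model.LassoCV`, and
:class:`~sklearn.feature_selection.SelectFromModel` then keeps the features
with non-zero coefficients. The synthetic data set and its parameters are
assumptions made only for the example::

    from sklearn.datasets import make_regression
    from sklearn.feature_selection import SelectFromModel
    from sklearn.linear_model import LassoCV

    # Sparse ground truth: only 5 of the 50 features are informative.
    X, y = make_regression(n_samples=200, n_features=50, n_informative=5,
                           noise=1.0, random_state=0)

    # LassoCV picks alpha by cross-validation; SelectFromModel then keeps
    # the features whose Lasso coefficients are non-zero.
    selector = SelectFromModel(LassoCV(cv=5, random_state=0)).fit(X, y)
    X_selected = selector.transform(X)
    print(X_selected.shape)  # expected to have far fewer columns than X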

Tree-based feature selection
----------------------------
@@ -282,6 +287,10 @@ instead of starting with no features and greedily adding features, we start
with *all* the features and greedily *remove* features from the set. The
`direction` parameter controls whether forward or backward SFS is used.

|details-start|
**Detail on Sequential Feature Selection**
|details-split|

In general, forward and backward selection do not yield equivalent results.
Also, one may be much faster than the other depending on the requested number
of selected features: if we have 10 features and ask for 7 selected features,
@@ -299,16 +308,18 @@ cross-validation requires fitting `m * k` models, while
:class:`~sklearn.feature_selection.SelectFromModel` always just does a single
fit and requires no iterations.

.. topic:: Examples

* :ref:`sphx_glr_auto_examples_feature_selection_plot_select_from_model_diabetes.py`

.. topic:: References:
.. topic:: Reference

.. [sfs] Ferri et al, `Comparative study of techniques for
large-scale feature selection
<https://citeseerx.ist.psu.edu/doc_view/pid/5fedabbb3957bbb442802e012d829ee0629a01b6>`_.

|details-end|
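
A minimal illustrative sketch (not part of this diff) of the ``direction``
parameter discussed above, using
:class:`~sklearn.feature_selection.SequentialFeatureSelector`; the estimator
and data set are assumptions made only for the example::

    from sklearn.datasets import load_diabetes
    from sklearn.feature_selection import SequentialFeatureSelector
    from sklearn.linear_model import Ridge

    X, y = load_diabetes(return_X_y=True)  # 10 features

    # Forward SFS: start with no features and greedily add them.
    sfs_forward = SequentialFeatureSelector(
        Ridge(), n_features_to_select=5, direction="forward"
    ).fit(X, y)

    # Backward SFS: start with all features and greedily remove them.
    sfs_backward = SequentialFeatureSelector(
        Ridge(), n_features_to_select=5, direction="backward"
    ).fit(X, y)

    # In general, the two directions do not select the same subset.
    print(sfs_forward.get_support())
    print(sfs_backward.get_support())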

.. topic:: Examples

* :ref:`sphx_glr_auto_examples_feature_selection_plot_select_from_model_diabetes.py`

Feature selection as part of a pipeline
=======================================
