[MRG+1] Fix to documentation and docstring of randomized lasso and randomized logistic regression #6498
@@ -187,9 +187,13 @@ def _randomized_lasso(X, y, weights, mask, alpha=1., verbose=False,
 class RandomizedLasso(BaseRandomizedLinearModel):
     """Randomized Lasso.

-    Randomized Lasso works by resampling the train data and computing
-    a Lasso on each resampling. In short, the features selected more
-    often are good features. It is also known as stability selection.
+    Randomized Lasso works by subsampling the training data and
+    computing a Lasso estimate where the penalty of a random subset of
+    coefficients has been scaled. By performing this double
+    randomization several times, the method assigns high scores to
+    features that are repeatedly selected across randomizations. This
+    is known as stability selection. In short, features selected more
+    often are considered good features.

     Read more in the :ref:`User Guide <randomized_l1>`.

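The double randomization described in the new docstring can be sketched in a few lines. The helper below is hypothetical (it is not part of the scikit-learn API): it subsamples rows and randomly shrinks a subset of columns before fitting a Lasso, which is equivalent to inflating the L1 penalty of those features, then scores each feature by how often it receives a nonzero coefficient.

```python
import numpy as np
from sklearn.linear_model import Lasso

def stability_selection_scores(X, y, alpha=0.1, scaling=0.5,
                               sample_fraction=0.75, n_resampling=200,
                               random_state=0):
    """Hypothetical sketch of stability selection: returns, for each
    feature, the fraction of resamplings in which it was selected."""
    rng = np.random.RandomState(random_state)
    n_samples, n_features = X.shape
    n_sub = int(sample_fraction * n_samples)
    counts = np.zeros(n_features)
    for _ in range(n_resampling):
        # First randomization: subsample the training rows.
        idx = rng.choice(n_samples, n_sub, replace=False)
        # Second randomization: shrink a random subset of columns by
        # `scaling`, which is equivalent to inflating the L1 penalty
        # of those coefficients by 1/scaling.
        col_scale = np.where(rng.rand(n_features) < 0.5, scaling, 1.0)
        coef = Lasso(alpha=alpha).fit(X[idx] * col_scale, y[idx]).coef_
        counts += coef != 0
    return counts / n_resampling
```

Features with scores near 1 are selected under almost every perturbation of the data and penalty, which is the "features selected more often are good features" intuition from the docstring.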
@@ -201,8 +205,9 @@ class RandomizedLasso(BaseRandomizedLinearModel):
         article which is scaling.

     scaling : float, optional
-        The alpha parameter in the stability selection article used to
-        randomly scale the features. Should be between 0 and 1.
+        The s parameter used to randomly scale the penalty of different
+        features (see :ref:`User Guide <randomized_l1>` for details).
+        Should be between 0 and 1.

     sample_fraction : float, optional
         The fraction of samples to be used in each randomized design.

@@ -226,11 +231,11 @@ class RandomizedLasso(BaseRandomizedLinearModel):
         If True, the regressors X will be normalized before regression.
         This parameter is ignored when `fit_intercept` is set to False.
         When the regressors are normalized, note that this makes the
-        hyperparameters learnt more robust and almost independent of the number
-        of samples. The same property is not valid for standardized data.
-        However, if you wish to standardize, please use
-        `preprocessing.StandardScaler` before calling `fit` on an estimator
-        with `normalize=False`.
+        hyperparameters learned more robust and almost independent of
+        the number of samples. The same property is not valid for
+        standardized data. However, if you wish to standardize, please
+        use `preprocessing.StandardScaler` before calling `fit` on an
+        estimator with `normalize=False`.

     precompute : True | False | 'auto'
         Whether to use a precomputed Gram matrix to speed up

@@ -307,7 +312,7 @@ class RandomizedLasso(BaseRandomizedLinearModel):

     See also
     --------
-    RandomizedLogisticRegression, LogisticRegression
+    RandomizedLogisticRegression, Lasso, ElasticNet
     """
     def __init__(self, alpha='aic', scaling=.5, sample_fraction=.75,
                  n_resampling=200, selection_threshold=.25,

Review discussion on the "See also" change:

- I don't think it would hurt to mention
- My rationale is based on the data type that needs to be modeled. So methods for categorical and metric responses are kind of far away. This separation is maintained in the "See also" section of the lasso and logistic sides in
- @agramfort, do you have any opinions?
- agreed. +1 for the change.
@@ -378,9 +383,13 @@ def _randomized_logistic(X, y, weights, mask, C=1., verbose=False,
 class RandomizedLogisticRegression(BaseRandomizedLinearModel):
     """Randomized Logistic Regression

-    Randomized Regression works by resampling the train data and computing
-    a LogisticRegression on each resampling. In short, the features selected
-    more often are good features. It is also known as stability selection.
+    Randomized Logistic Regression works by subsampling the training
+    data and fitting an L1-penalized LogisticRegression model where the
+    penalty of a random subset of coefficients has been scaled. By
+    performing this double randomization several times, the method
+    assigns high scores to features that are repeatedly selected across
+    randomizations. This is known as stability selection. In short,
+    features selected more often are considered good features.

     Read more in the :ref:`User Guide <randomized_l1>`.

|
@@ -390,8 +399,9 @@ class RandomizedLogisticRegression(BaseRandomizedLinearModel): | |
The regularization parameter C in the LogisticRegression. | ||
|
||
scaling : float, optional, default=0.5 | ||
The alpha parameter in the stability selection article used to | ||
randomly scale the features. Should be between 0 and 1. | ||
The s parameter used to randomly scale the penalty of different | ||
features (See :ref:`User Guide <randomized_l1>` for details ). | ||
Should be between 0 and 1. | ||
|
||
sample_fraction : float, optional, default=0.75 | ||
The fraction of samples to be used in each randomized design. | ||
|
@@ -484,7 +494,7 @@ class RandomizedLogisticRegression(BaseRandomizedLinearModel):

     See also
     --------
-    RandomizedLasso, Lasso, ElasticNet
+    RandomizedLasso, LogisticRegression
     """
     def __init__(self, C=1, scaling=.5, sample_fraction=.75,
                  n_resampling=200,

Review discussion on the notation used in the docstrings:

- Let's be consistent with the docstrings of the functions. `s` is described as `alpha` in the docstrings:
- I thought about this but I am not sure. `alpha` is used throughout sklearn linear models as the penalty parameter. The unfortunate thing is that `alpha` is the scaling factor in the paper. Probably using `alpha` in the equations will be more confusing. Maybe the solution is to change the docstring for scaling and reference that this is `s` in the equation. What do you think?
- Very good observation. Though, to stay true to the source code, I would say it's better to replace `alpha` with `C` and `s` with `alpha`. We would ideally want someone to look at the model's description and understand the model by looking at its objective function. But you do make a good point about trying to stay true to the paper; I would put in parentheses what the variables `C` and `alpha` represent in the original paper. (Making the point that `w` here is `beta` in the paper, and `s_j` here is `w_k` in the paper, seems like overkill.)
- Yup, I agree with you. Staying consistent internally is better than matching the paper's notation. Though, there is an additional "issue": `alpha` and `C` in the code refer to the penalization in the Lasso and Logistic regression models, respectively. And in the code, `alpha` is not used as a scaler; a variable called `weights` is used instead. I just submitted a change that does not call the scaling parameter `alpha`, but just `s`. This might be better. What do you think?