Description
The algorithm descriptions for RandomizedLogisticRegression and RandomizedLasso are as follows:
Randomized Logistic Regression
Randomized Regression works by resampling the train data and computing a LogisticRegression on each resampling. In short, the features selected more often are good features. It is also known as stability selection.
Randomized Lasso
Randomized Lasso works by resampling the train data and computing a Lasso on each resampling. In short, the features selected more often are good features. It is also known as stability selection.
I don't think these descriptions are accurate. According to the original paper here, the description of the randomized lasso (and by association, the randomized logistic regression) is as follows:
(We would then find multiple values of beta-hat using randomly chosen values for W)
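For reference, and assuming the paper linked above is Meinshausen and Bühlmann's stability selection paper, the randomized Lasso estimator can be written as follows (my own transcription, so the notation may differ slightly from the paper's):

$$
\hat{\beta}^{\lambda, W} = \arg\min_{\beta \in \mathbb{R}^p} \; \lVert Y - X\beta \rVert_2^2 + \lambda \sum_{k=1}^{p} \frac{\lvert \beta_k \rvert}{W_k}
$$

Here the $W_k$ are i.i.d. random variables taking values in $[\alpha, 1]$ for some weakness parameter $\alpha \in (0, 1]$. A smaller $W_k$ means a heavier penalty on feature $k$ in that particular fit.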
In other words, the algorithm randomly redraws a weight for each feature's penalty on every fit; it doesn't resample the training set and fit to those samples (i.e., it doesn't bootstrap).
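To make the reweighting concrete, here is a minimal sketch of what I mean. It is not scikit-learn's implementation: the function name `randomized_lasso_frequencies`, its parameters, and the two-point choice of weights (each W_k is either `weakness` or 1) are my own assumptions for illustration. It relies on the fact that penalizing |beta_k| / W_k is equivalent to scaling column k of X by W_k and fitting an ordinary Lasso.

```python
import numpy as np
from sklearn.linear_model import Lasso


def randomized_lasso_frequencies(X, y, alpha=0.1, weakness=0.5,
                                 n_runs=100, seed=0):
    """Hypothetical helper: fraction of fits in which each feature is selected.

    Every fit uses the full training set; only the per-feature
    penalty weights W_k change from one run to the next.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    counts = np.zeros(n_features)
    for _ in range(n_runs):
        # Assumption: W_k is drawn uniformly from {weakness, 1}.
        w = rng.choice([weakness, 1.0], size=n_features)
        # Scaling column k by W_k has the same effect on the support
        # as penalizing |beta_k| / W_k in the Lasso objective.
        model = Lasso(alpha=alpha).fit(X * w, y)
        counts += model.coef_ != 0
    return counts / n_runs


# Synthetic data: only the first two features carry signal.
data_rng = np.random.default_rng(42)
X = data_rng.standard_normal((200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * data_rng.standard_normal(200)

freq = randomized_lasso_frequencies(X, y)
print(freq)
```

Features with a high selection frequency across the random reweightings are the ones to keep. Note that no row of the training set is ever dropped or duplicated in this sketch.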
As currently written, the documentation makes it sound like we're resampling the training set, as in a bootstrap approach. It should instead clarify that we're reweighting each feature every time we fit Lasso / LogisticRegression to the data.
Thoughts, @agramfort, @GaelVaroquaux ?