Normalize only applies if fit_intercept=True #3020
I agree, it's a bit awkward. I think normalize=True should also work with fit_intercept=False. I am not sure how to fix this without breaking people's code, though.
Maybe just document the current behavior?
One option would be to rename normalize to standardize or scale and
standardize actually sounds good :)
Looks like it's going to be a fun deprecation cycle...
I think I can handle this one, but has a final decision been made on this?
I think renaming normalize to standardize is the way to go (using a deprecation) and then making it normalize no matter what the value of fit_intercept is.
Got it. Also, when making a fix that requires a deprecation, do you:
Thanks again.
No, the deprecation and the fix should be in the same PR.
Do methods which are public to a module but not to the containing package need to support the old name? Of course this could break any code (outside sklearn) which uses such a method directly, but it is also not public to the package.
The issue is that this is not just a renaming, since the behavior of normalize changes too.
Yeah, it is a bit tricky. So you say it is not worth trying to remove the
It's looking like it is possible (so far) to remove normalize.
My question here is whether
I think we should rename normalize to standardize.
This has been hanging around a long time and may deserve a fresh look.
Another solution could be to deprecate normalize in linear models altogether and suggest using a StandardScaler or some other appropriate scaler in a Pipeline. The advantage is that it:
makes switching models more natural (no need to think about normalization, by re-using a common pre-processing step)
makes users think about the right scaler for their data
lowers the cognitive load when reading linear model parameters
The issue is that there is a constant factor difference between scaling by the standard deviation and scaling by the L2 norm, so the two would not be strictly equivalent. Suggesting
make_pipeline(
    StandardScaler(with_std=False),
    Normalizer()
)
to get strictly identical results could work, but is not very elegant.
In terms of API consistency, it's not clear why linear models have a normalize parameter while most work somewhat without it. At the same time, say, SVMs don't have it, while they cannot be used without feature scaling.
Also, even within linear models it's not consistent. Out of 32 linear estimators, 22 have the normalize parameter, while these 10 don't:
HuberRegressor
LogisticRegression
LogisticRegressionCV
PassiveAggressiveClassifier
PassiveAggressiveRegressor
Perceptron
RANSACRegressor
SGDClassifier (ref #5248)
SGDRegressor
TheilSenRegressor
And of those that do have it, most default to normalize=False, while the following default to normalize=True:
OrthogonalMatchingPursuit
Lars
LassoLars
LarsCV
LassoLarsCV
One solution would be to rename normalize to standardize, and we should ensure that
make_pipeline(
    StandardScaler(),
    LinearModel(standardize=False)
)
and
LinearModel(standardize=True)
are equivalent models, even if fit_intercept=True and no matter whether X
is sparse or dense.
What is not clear is whether it's even possible, as StandardScaler will not
center the data when X is sparse, while our linear models store the X_offset
to implicitly work with centered data.
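The proposed standardize parameter does not exist; as a sketch of the equivalence being asked for, here is how the scaling learned by a StandardScaler + Ridge pipeline can be folded back into coefficients and an intercept expressed in the original feature units, which is essentially what an internal standardize=True (or the old normalize=True) has to do:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=100, n_features=5, noise=1.0, random_state=0)

pipe = make_pipeline(StandardScaler(), Ridge(alpha=1.0)).fit(X, y)
scaler, ridge = pipe[0], pipe[-1]

# Fold the scaler back into coefficients in original feature units:
coef = ridge.coef_ / scaler.scale_
intercept = ridge.intercept_ - scaler.mean_ @ coef

# Predictions from the folded-in form match the pipeline exactly:
assert np.allclose(X @ coef + intercept, pipe.predict(X))
```

The sparse case is the hard part: the fold above needs scaler.mean_, and computing it is fine, but actually subtracting it from a sparse X would densify the data, hence the X_offset bookkeeping mentioned above.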
Removing this from the milestone for now. It needs a fair amount of love to be solved.
Ok, so after a long discussion with @maikia and @glemaitre, here is our summary.
Initial problem: we want the dense and sparse cases to match for linear models (e.g. Lasso) in the case where we standardize the features.
Proposal from @rth: use a
Explanation:
y = (x - x_mean).dot(coef) + intercept
is equivalent to:
y = x.dot(coef) + intercept - x_mean.dot(coef)
Summary: let's remove the normalize parameter from the API, suggesting to use a pipeline instead, and let's just make sure we can fit intercepts properly without regularizing them (like liblinear still does :( ).
Sounds like a plan, @maikia @glemaitre @rth @jnothman?
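The algebra in the explanation can be checked numerically: centering X is equivalent to shifting the intercept by X_mean.dot(coef), which is why a sparse X never needs to be densified for centering. A small sketch with synthetic data:

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
X_mean = X.mean(axis=0)
coef = rng.normal(size=4)
intercept = 0.5

# Predicting on explicitly centered data ...
pred_centered = (X - X_mean) @ coef + intercept
# ... equals predicting on raw data with a shifted intercept:
pred_shifted = X @ coef + (intercept - X_mean @ coef)
assert np.allclose(pred_centered, pred_shifted)

# The shifted form works unchanged when X is stored sparse:
Xs = sparse.csr_matrix(X)
assert np.allclose(Xs @ coef + (intercept - X_mean @ coef), pred_centered)
```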
Thanks @agramfort for the explanation. I will work on this.
As this is quite a large PR, after discussing with @glemaitre we decided it is better to subdivide it into smaller PRs.
Rewrite an example based on #2601 (comment) to show the impact of standardization with StandardScaler on the optimal alpha for Ridge and Lasso, depending on normalization.
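To illustrate what such an example would show: the alpha that works for Lasso depends directly on the feature scale, so standardizing (or not) changes which alpha is optimal. A rough sketch on synthetic data (exact coefficient counts depend on the data, hence only coarse assertions):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=100, n_features=10, noise=5.0, random_state=0)

# The same alpha behaves very differently at different feature scales:
coef_unit = Lasso(alpha=1.0).fit(X, y).coef_           # unit-variance features
coef_tiny = Lasso(alpha=1.0).fit(X / 1000.0, y).coef_  # shrunken features

# At unit scale alpha=1 keeps informative features; after shrinking X the
# penalty dominates and every coefficient is driven to exactly zero.
assert np.any(coef_unit != 0)
assert np.all(coef_tiny == 0)
```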
The fact that Pipeline currently doesn't support
What do you suggest?
In #3020 (comment), the last point is G. documentation. What exactly is expected in order to close this issue (and #2601)?
I think we already have clear warning messages and good what's new entries... :)
A number of linear models have a normalize option to scale all features to equal variance. However, this only has any effect if fit_intercept=True. This interaction does not appear to be documented.
So this issue requests that one of the following actions be taken:
the fit_intercept=True requirement should be noted in the comment for normalize; or
normalize should apply when fit_intercept=False, assuming the data is already centered
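A pure-numpy sketch of the reported behavior (illustrative only, not scikit-learn's actual preprocessing code; the function name is made up): scaling was nested under the fit_intercept branch, so normalize=True was silently a no-op when fit_intercept=False:

```python
import numpy as np

def preprocess(X, fit_intercept, normalize):
    """Simplified sketch of the old linear-model preprocessing."""
    X = X.copy()
    if fit_intercept:
        X -= X.mean(axis=0)          # center only when fitting an intercept
        if normalize:                # ... and only then is normalize honored
            norms = np.linalg.norm(X, axis=0)
            norms[norms == 0] = 1.0  # avoid dividing a constant column by 0
            X /= norms
    return X

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# normalize=True changes nothing without an intercept:
assert np.allclose(preprocess(X, False, True), preprocess(X, False, False))
# but it does change the design matrix once fit_intercept=True:
assert not np.allclose(preprocess(X, True, True), preprocess(X, True, False))
```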