[MRG]: MAINT center_data for linear models #5357
Conversation
(force-pushed from 23273f5 to 876c8c8)
Regarding point (a). Edited from this test.
(force-pushed from 492c5cb to 73b30a5)
Were you able to figure out the bugs in …
Which issue are you referring to?
In the Travis build. We were trying to fix this issue before, but since …
(force-pushed from 682b219 to f1356ff)
Question: can we just implement the …
(force-pushed from 05a9d0e to 6e8290b)
Down to 81 errors.
I found …
(force-pushed from 7176457 to 610c21b)
Down to 35.
If LassoLars has normalize=True, LassoLarsCV hopefully has normalize=True too, right?
Yes, all classes in least_angle.py.
(force-pushed from 610c21b to c183cf2)
@GaelVaroquaux as you are around, do you have a better idea than to warn by default in lars? I don't like it.
What's the question? (I have a grant proposal to write before the sprint, …
We are deprecating …
@GaelVaroquaux Also good luck :)
Thanks. I hate grant writing.
As I mentioned in my comment on that issue, this is really a bad idea. I …
The whole value of this option is that it makes linear models robust in …
We convinced ourselves that we shouldn't do this change and keep the …
I am sorry, I said 👍 to this change a few days ago. But Giorgio hit …
OK, so we keep the fact that if fit_intercept=False, normalize=True has no effect?
sample_weight = np.sqrt(sample_weight)
sw_matrix = sparse.dia_matrix((sample_weight, 0),
                              shape=(n_samples, n_samples))
sw_matrix = np.diag(sample_weight)
This will blow up the memory if you have a large number of samples.
Indeed, what was the problem with the existing implementation in master?
Sorry my bad, I don't remember what I was thinking at that time.
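The memory concern above can be illustrated with a short NumPy/SciPy sketch. The variable names follow the diff; the broadcasting variant at the end is a suggestion of mine, not something from the PR. `np.diag` materialises a full `(n_samples, n_samples)` array, while the sparse diagonal in master, or plain broadcasting, stays linear in `n_samples`:

```python
import numpy as np
from scipy import sparse

rng = np.random.RandomState(0)
n_samples, n_features = 5, 3
X = rng.rand(n_samples, n_features)
sample_weight = rng.rand(n_samples)

sw = np.sqrt(sample_weight)

# Dense diagonal: allocates n_samples ** 2 entries, which blows up for large n.
X_dense = np.diag(sw).dot(X)

# Sparse diagonal (as in master): only the n_samples diagonal entries are stored.
sw_matrix = sparse.dia_matrix((sw, 0), shape=(n_samples, n_samples))
X_sparse = sw_matrix.dot(X)

# Broadcasting avoids building any matrix at all.
X_bcast = X * sw[:, np.newaxis]

assert np.allclose(X_dense, X_sparse)
assert np.allclose(X_dense, X_bcast)
```

All three produce the same rescaled design matrix; only their memory footprints differ.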
That's it for me for now. Sorry for the super slow reaction time… ping me when I should take a look again. Thanks @giorgiop!
(force-pushed from c29c00b to ea666c5)
@agramfort I finally had time to resume this one :)
Looks OK at first sight. Maybe @ogrisel has some time to have a look?
@@ -360,6 +425,13 @@ class LinearRegression(LinearModel, RegressorMixin):

    normalize : boolean, optional, default False
        If True, the regressors X will be normalized before regression.
        When the regressors are normalized, the fitted `coef_` are the same
I don't quite understand this. Did you mean that the fitted coef_ are "of the same scale"? How would they be the same?
I agree this may not be the best wording. The discussion is in #5447 with @GaelVaroquaux. Let me know if you guys have a better and concise way to explain this property here.
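One way to illustrate the property being debated is the following pure-NumPy sketch (illustrative only, not scikit-learn code; all names are mine): fitting on centered, column-normalized data and then rescaling the coefficients by the column norms recovers exactly the fit on the raw centered data, so the two parametrisations describe the same model on different scales.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(20, 3)
y = rng.rand(20)

# Center (as with fit_intercept=True), then fit by ordinary least squares.
Xc = X - X.mean(axis=0)
yc = y - y.mean()
coef_raw, *_ = np.linalg.lstsq(Xc, yc, rcond=None)

# Same fit on column-normalized data, then rescale the coefficients back.
norms = np.sqrt((Xc ** 2).sum(axis=0))
coef_norm, *_ = np.linalg.lstsq(Xc / norms, yc, rcond=None)

# The rescaled coefficients coincide with the fit on the raw data.
assert np.allclose(coef_raw, coef_norm / norms)
```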
(force-pushed from effdf97 to c822947)
@@ -360,6 +426,14 @@ class LinearRegression(LinearModel, RegressorMixin):

    normalize : boolean, optional, default False
        If True, the regressors X will be normalized before regression.
        This parameter is ignored when `fit_intercept` is set to `False`.
OK. This is what I infer: when X is scaled down to unit variance, the squared-loss term goes up by a certain factor (I'm not sure we can determine it exactly, because y is not normalized), hence alpha has to be set differently for different numbers of samples to achieve the same effect. Even when X is scaled to unit length there is this effect, but to a much lesser extent.
Should we just write something like "Setting normalize to True scales down the input regressors to unit length. Note that this makes the learnt hyperparameters more robust and almost independent of the number of samples. The same property contd ..."?
@GaelVaroquaux has the last word here!
I am fine with the current formulation.
It's just the "fitted coef_ are the same" wording that I am worried about.
indeed. "same" is not correct.
+1 for "Note that this makes the hyperparameters learnt more robust and almost independent of the number of samples."
thx @MechCoder
OK!
Just left 2 comments.
# inplace_csr_row_normalize_l2 must be changed such that it
# can also return the norms computed internally

# transform variance to norm in-place
We can use csr_row_norms, which already does this, but we can leave that for later.
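As a hedged SciPy-only sketch of the sparse bookkeeping being discussed (no Cython helper involved; `X` here is a small random CSR matrix of my own, and the centering helper needs column norms): norms of a sparse matrix can be computed without densifying by squaring only the stored values.

```python
import numpy as np
from scipy import sparse

rng = np.random.RandomState(0)
X = sparse.random(6, 4, density=0.5, format="csr", random_state=rng)

# Column-wise L2 norms without densifying: only nonzero entries are squared.
col_norms = np.sqrt(np.asarray(X.multiply(X).sum(axis=0)).ravel())

# Cross-check against the dense computation on this small example.
dense_norms = np.sqrt((X.toarray() ** 2).sum(axis=0))
assert np.allclose(col_norms, dense_norms)
```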
(force-pushed from 43d0f2c to d40f1f1)
@giorgiop Can you fix the "fitted coef_ are the same" comment for all estimators? After that we can merge.
(force-pushed from d40f1f1 to ec0a97a)
(force-pushed from ec0a97a to acdd1aa)
Done!
Merged @giorgiop
Thanks heaps @giorgiop! 🍻
Thanks to all reviewers for the combined effort!
Fixes #2601 (no more).

TO DO:
- merge `center_data` and `sparse_center_data` into a new private function, and deprecate them
- `# TODO` and `# XXX`

AND NOT TO DO (see #2601 and discussion below):
- deprecate `normalize` (signature and property) and introduce `standardize` behaviour
- `center_data` internally + related tests
- `sample_weights` with unused variables (a). Done in [MRG + 1] Improve tests for sample_weight in LinearRegression and Ridge #5526
- `center_data` behaviour w.r.t. `fit_intercept`. Currently, input data is not touched if we are not fitting the intercept. See separate issue: BUG: linear model inputs are not normalized when fit_intercept=False #5799
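For reference, here is a rough pure-NumPy sketch (not the actual scikit-learn implementation; the function name and return convention are only illustrative) of the dense centering helper's contract, including the behaviour questioned in #5799: the data is returned untouched when fit_intercept=False.

```python
import numpy as np

def center_data(X, y, fit_intercept, normalize=False):
    """Sketch of a dense centering helper for linear models.

    Centers X and y when fit_intercept is True, optionally scaling the
    columns of X to unit L2 norm (norm, not variance, per this PR).
    When fit_intercept is False the data is returned unchanged.
    """
    X = np.asarray(X, dtype=float).copy()
    y = np.asarray(y, dtype=float).copy()
    if fit_intercept:
        X_offset = X.mean(axis=0)
        y_offset = y.mean(axis=0)
        X -= X_offset
        y -= y_offset
        if normalize:
            X_scale = np.sqrt((X ** 2).sum(axis=0))
            X_scale[X_scale == 0] = 1.0  # guard against constant columns
            X /= X_scale
        else:
            X_scale = np.ones(X.shape[1])
    else:
        X_offset = np.zeros(X.shape[1])
        y_offset = 0.0
        X_scale = np.ones(X.shape[1])
    return X, y, X_offset, y_offset, X_scale
```

With fit_intercept=True and normalize=True the returned X has zero-mean, unit-norm columns; with fit_intercept=False both X and y come back as given, which is exactly the asymmetry tracked in #5799.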