-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
[MRG+2] temporary fix for sparse ridge with intercept fitting #5360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -281,6 +286,12 @@ def ridge_regression(X, y, alpha, sample_weight=None, solver='auto', | |||
----- | |||
This function won't compute the intercept. | |||
""" | |||
if return_intercept: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where do you check X is sparse?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the fit(), here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and solver != "sag"
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, 'sag' currently uses the same centering trick than other solvers to avoid computing the intercept.
The intercept is computed with 'sag' only if the data has not been centered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I got your point, I'll add it
needs a whatsnew, otherwise looks good apart from minor comments. |
|
||
|
||
def test_ridge_fit_intercept_sparse(): | ||
r = np.random.RandomState(42) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or make_regression(...) + 10
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure :) I was tired
1768e44
to
19e4452
Compare
X_csr = sp.csr_matrix(X) | ||
|
||
dense = Ridge(alpha=1., tol=1.e-15, solver='sag', fit_intercept=True) | ||
sparse = Ridge(alpha=1., tol=1.e-15, solver='sag', fit_intercept=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please also add the case where solver != 'sag'
and check for the solver switch warning with assert_warns
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Apart from my last comment and a needed rebase on top of master, +1 for merge as well. |
maybe @mblondel wants to do a quick look-over? |
warnings.warn("In Ridge, only 'sag' solver can currently fit the " | ||
"intercept when X is sparse. Solver has been " | ||
"automatically changed into 'sag'.") | ||
solver = 'sag' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this raise a warning even when the data isn't sparse?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I did not think about the public use of ridge_regression
.
I'll change that
LGTM once my comments are addressed. In #4710, I gave another temporary quick fix which can be used with all solvers. It could be worth investing it if time permits. |
I chose a quicker fix for the release of v0.17, but your suggestion might be better for long term. |
[MRG+2] temporary fix for sparse ridge with intercept fitting
I have noticed that the warning for sparse input is now in contrast with the dosctring. |
PR welcome @giorgiop ;) |
[or open an issue please?] |
Temporary fix for #4710, using 'sag' solver, that can directly fit the intercept.
If
fit_intercept=True
andX
is sparse, then the solver is switch to 'sag'.I am aware this is not what we want for long term, yet it is a temporary fix for release 0.17.