8000 [MRG+2] temporary fix for sparse ridge with intercept fitting by TomDLT · Pull Request #5360 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

[MRG+2] temporary fix for sparse ridge with intercept fitting #5360

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 13, 2015

Conversation

TomDLT
Copy link
Member
@TomDLT TomDLT commented Oct 7, 2015

Temporary fix for #4710, using 'sag' solver, that can directly fit the intercept.

If fit_intercept=True and X is sparse, then the solver is switch to 'sag'.

I am aware this is not what we want for long term, yet it is a temporary fix for release 0.17.

@TomDLT TomDLT added this to the 0.17 milestone Oct 7, 2015
@TomDLT TomDLT changed the title [WIP] temporary fix for sparse ridge with intercept fitting [MRG] temporary fix for sparse ridge with intercept fitting Oct 7, 2015
@@ -281,6 +286,12 @@ def ridge_regression(X, y, alpha, sample_weight=None, solver='auto',
-----
This function won't compute the intercept.
"""
if return_intercept:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where do you check X is sparse?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the fit(), here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and solver != "sag" right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, 'sag' currently uses the same centering trick than other solvers to avoid computing the intercept.
The intercept is computed with 'sag' only if the data has not been centered.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I got your point, I'll add it

@amueller
Copy link
Member
amueller commented Oct 9, 2015

needs a whatsnew, otherwise looks good apart from minor comments.



def test_ridge_fit_intercept_sparse():
r = np.random.RandomState(42)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or make_regression(...) + 10, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure :) I was tired

@TomDLT TomDLT force-pushed the sparseridge branch 2 times, most recently from 1768e44 to 19e4452 Compare October 9, 2015 17:40
X_csr = sp.csr_matrix(X)

dense = Ridge(alpha=1., tol=1.e-15, solver='sag', fit_intercept=True)
sparse = Ridge(alpha=1., tol=1.e-15, solver='sag', fit_intercept=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please also add the case where solver != 'sag' and check for the solver switch warning with assert_warns?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@ogrisel
Copy link
Member
ogrisel commented Oct 12, 2015

Apart from my last comment and a needed rebase on top of master, +1 for merge as well.

@ogrisel ogrisel changed the title [MRG] temporary fix for sparse ridge with intercept fitting [MRG+2] temporary fix for sparse ridge with intercept fitting Oct 12, 2015
@amueller
Copy link
Member

maybe @mblondel wants to do a quick look-over?

warnings.warn("In Ridge, only 'sag' solver can currently fit the "
"intercept when X is sparse. Solver has been "
"automatically changed into 'sag'.")
solver = 'sag'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this raise a warning even when the data isn't sparse?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I did not think about the public use of ridge_regression.
I'll change that

@mblondel
Copy link
Member

LGTM once my comments are addressed.

In #4710, I gave another temporary quick fix which can be used with all solvers. It could be worth investing it if time permits.

@TomDLT
Copy link
Member Author
TomDLT commented Oct 13, 2015

In #4710, I gave another temporary quick fix which can be used with all solvers. It could be worth investing it if time permits.

I chose a quicker fix for the release of v0.17, but your suggestion might be better for long term.

TomDLT pushed a commit that referenced this pull request Oct 13, 2015
[MRG+2] temporary fix for sparse ridge with intercept fitting
@TomDLT TomDLT merged commit 90922ea into scikit-learn:master Oct 13, 2015
@TomDLT TomDLT deleted the sparseridge branch October 16, 2015 12:08
@giorgiop
Copy link
Contributor

I have noticed that the warning for sparse input is now in contrast with the dosctring.

@amueller
Copy link
Member

PR welcome @giorgiop ;)

@amueller
Copy link
Member

[or open an issue please?]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0