8000 [MRG] handle sparse x and intercept in _RidgeGCV by jeromedockes · Pull Request #13350 · scikit-learn/scikit-learn · GitHub

[MRG] handle sparse x and intercept in _RidgeGCV #13350


Merged

Conversation

@jeromedockes (Contributor) commented Mar 1, 2019

closes #2354, closes #13325, closes #5364, closes #13321, closes #4781.

Currently, if the design matrix is sparse, _RidgeGCV fits no intercept,
even if fit_intercept is True.

This is an attempt to handle this case correctly when gcv_mode is "eigen",
which uses an eigendecomposition of the Gram matrix.
(gcv_mode="svd" cannot handle sparse matrices, because the left singular
vectors of X are dense and could use too much memory.)

This is work in progress; for now, to make comparison easier, the suggested
changes are implemented in _WIPNewRidgeGCV, a subclass of _RidgeGCV.
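A minimal sketch of the behavior this PR fixes (illustrative, not taken from the PR itself): with the fix, fitting RidgeCV on a sparse design matrix with fit_intercept=True should give the same coefficients and intercept as the dense fit.

```python
# Sketch: after this fix, RidgeCV on sparse input matches the dense fit,
# including the intercept (previously the intercept was silently dropped).
import numpy as np
from scipy import sparse
from sklearn.linear_model import RidgeCV

rng = np.random.RandomState(0)
X = rng.randn(50, 5)
y = X @ rng.randn(5) + 3.0 + 0.1 * rng.randn(50)  # nonzero intercept

dense = RidgeCV(fit_intercept=True).fit(X, y)
sp = RidgeCV(fit_intercept=True).fit(sparse.csr_matrix(X), y)

# Sparse and dense fits should now agree up to numerical precision.
assert np.allclose(dense.coef_, sp.coef_, atol=1e-4)
assert abs(dense.intercept_ - sp.intercept_) < 1e-4
```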

@jeromedockes (Contributor, Author)

Some figures comparing the current and proposed behaviors, which can be
reproduced with:

https://gist.github.com/jeromedockes/7fca2698f6ab7413903c782becc22c59

[Figures: comparison of coefficients and intercepts under the current and proposed behaviors]

@jeromedockes (Contributor, Author)

Note: the proposed solution passes check_estimator_sparse_dense from #7590, while the current one does not.

@jeromedockes jeromedockes changed the title [WIP] handle sparse x and intercept in _RidgeGCV with gcv_mode='eigen' [WIP] handle sparse x and intercept in _RidgeGCV Apr 7, 2019
@jeromedockes (Contributor, Author)

I haven't merged master yet, because
test_ridge_regression_dtype_stability[sparse_cg] fails on master since
0dac63f (at least on my machine).

@thomasjpfan (Member) left a comment


Another pass over the implementation.

@jeromedockes jeromedockes force-pushed the ridge_gcv_sparse_x_intercept branch from 096b444 to 6e5ea75 Compare May 6, 2019 08:50
@jnothman (Member) commented May 7, 2019

This formally has 3 approvals. I've not looked at it. Is it ready for merge? Are we releasing in 0.21, or 0.22?

@ogrisel (Member) commented May 7, 2019

I am still +1 for merge. It seems that all of @thomasjpfan's comments have been addressed, but it would be great to have his confirmation before merging.

I think this is an important bug fix, so +1 for 0.21, but I am fine with merging to master only for 0.22 if others disagree.

Commit: methods are now named _{decomposition method used}_decompose_{decomposed matrix}
and _solve_{provided decomposition}_{matrix whose decomposition is provided},
e.g. _eigen_decompose_covariance and _solve_eigen_covariance.
@ogrisel (Member) commented May 7, 2019

We had an IRL discussion with @jeromedockes and we decided to stick to float64 for RidgeCV for now, otherwise we risk delaying this PR significantly.

This means that some of the changes from #13302 have been reverted. Only the Ridge class can natively work with float32 data, and even that is problematic with the sparse_cg solver, which seems unreliable.
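The dtype distinction described above can be sketched as follows (an illustrative snippet, not from the PR; it assumes the float32 support for Ridge added in #13302):

```python
# Sketch: Ridge handles float32 input natively (keeping float32 coefficients),
# while RidgeCV sticks to float64 internally per the decision in this PR.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(30, 4).astype(np.float32)
y = rng.randn(30).astype(np.float32)

ridge = Ridge(solver="cholesky").fit(X, y)
assert ridge.coef_.dtype == np.float32  # no upcast to float64
```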

@jnothman (Member) commented May 8, 2019

@thomasjpfan please confirm and merge

@thomasjpfan (Member)

Looking into it. Going over the implementation one last time before merging.

@thomasjpfan thomasjpfan merged commit f23e92e into scikit-learn:master May 8, 2019
@thomasjpfan (Member)

Thank you! @jeromedockes

@agramfort (Member)

Great job @jeromedockes 👏

jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request May 8, 2019
@jeromedockes (Contributor, Author)

Many thanks to @agramfort, @glemaitre, @ogrisel, and @thomasjpfan for the time
you spent on your very instructive reviews!
