8000 ENH `make_sparse_spd_matrix` use sparse memory layout by Charlie-XIAO · Pull Request #27438 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

ENH make_sparse_spd_matrix use sparse memory layout #27438

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Sep 26, 2023

Conversation

Charlie-XIAO
Copy link
Contributor

Reference Issues/PRs

Issue #27433.

What does this implement/fix? Explain your changes.

This PR implements make_sparse_spd_matrix to use sparse memory layout from the start.

Any other comments?

It seems that tests are missing for make_sparse_spd_matrix. I made a test for the following:

  • Check that the output is in sparse format and of correct shape;
  • Check that the output is PSD;
  • Check that when norm_diag=True, leading diagonal elements are 1;

However, I found it hard to check the parameters alpha, smallest_coef, and largest_coef, because these are imposed on A but the output is A.T * A. Due to the permutation (which I'm not sure why it is needed), A is not necessarily upper/lower triangular so Cholesky decomposition does not give A from the result. If the permutation is removed, it might be possible to test these parameters.

@github-actions
Copy link
github-actions bot commented Sep 22, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 7652bde. Link to the linter CI: here

Copy link
Member
@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much @Charlie-XIAO for the quick PR!

Here is a first pass of feedback:

@Charlie-XIAO
Copy link
Contributor Author

Thanks for the review @ogrisel! I'm somehow busy rn but will try resolving them ASAP.

@Charlie-XIAO
Copy link
Contributor Author

@ogrisel Conversations resolved. As for backward compatibility, I added a new keyword sparse_format which is by default None, returning the dense numpy ndarray. Currently I allowed all the existing sparse formats. Since the format may change during computations and supports different operations, I only did .asformat as the very end. Please let me know if this should be improved.

@ogrisel
Copy link
Member
ogrisel commented Sep 22, 2023

I think we should use a sparse array instead of a sparse matrix. It means that to use this function with the non default sparse_format param, the user would need scipy 1.8 or later but i think this is fine for a new feature.

@ogrisel
Copy link
Member
ogrisel commented Sep 25, 2023

I think we should use a sparse array instead of a sparse matrix. It means that to use this function with the non default sparse_format param, the user would need scipy 1.8 or later but i think this is fine for a new feature.

Actually, looking at the code again, I think this PR is good the way it is. Let's wait and see at which speed scipy will deprecate scipy sparse matrices in favor of scipy sparse arrays.

In particular, I expect that at some point top level functions used in this PR such as scipy.sparse.eye and scipy.sparse.random will be updated to return sparse arrays instead of sparse matrices and we will have nothing to do in scikit-learn.

Copy link
Member
@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you very much @Charlie-XIAO.

@ogrisel ogrisel added the Waiting for Second Reviewer First reviewer is done, need a second one! label Sep 25, 2023
@ogrisel ogrisel changed the title FIX make_sparse_spd_matrix use sparse memory layout ENH make_sparse_spd_matrix use sparse memory layout Sep 25, 2023
@ogrisel ogrisel added the Quick Review For PRs that are quick to review label Sep 25, 2023
Copy link
Contributor
@OmarManzoor OmarManzoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, thanks @Charlie-XIAO. I just have one question.

@OmarManzoor OmarManzoor removed the Waiting for Second Reviewer First reviewer is done, need a second one! label Sep 26, 2023
@OmarManzoor OmarManzoor enabled auto-merge (squash) September 26, 2023 12:16
@OmarManzoor OmarManzoor merged commit 011e209 into scikit-learn:main Sep 26, 2023
@Charlie-XIAO Charlie-XIAO deleted the sp_spd_mat branch September 26, 2023 13:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0