10BC0 FIX make 'array-like' reject sparse containers in parameter validation by Charlie-XIAO · Pull Request #28101 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@Charlie-XIAO
Copy link
Contributor
@Charlie-XIAO Charlie-XIAO commented Jan 11, 2024

Fixes #28099. This still needs discussions, see the referenced issue, as for whether to close this PR or how to improve it.

"array-like" is currently a superset of "sparse matrix" which I do not think is the desired behavior, so this PR excludes sparse containers from the "array-like" constraints.

The other modifications are according to the failing tests, so essentially I added the "sparse matrix" option if (1) originally it has only "array-like" and (2) there exist at least some tests that use sparse matrices as input.

https://github.com/scikit-learn/administrative/blob/master/meeting_notes/2024-01-29.md#need-attention-reviews

@github-actions
Copy link
github-actions bot commented Jan 11, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 7987dfa. Link to the linter CI: here

@jeremiedbb
Copy link
Member

My opinion is to let the definition of "array-like" in our glossary (https://scikit-learn.org/stable/glossary.html#term-array-like) be our ground truth and build upon that.

Based on the glossary, an array-like is something that gives a meaningful result when you run np.asarray on it, which is not the case for sparse matrices and sparse arrays. That's why I'm in favor of changing _is_arraylike to reject sparse input. I pushed a commit to mention sparse arrays in the description of array-like in the glossary.

Regarding the implications on #27950, I think for now it's fine to keep "sparse matrix" for both sparse matrices and sparse arrays. After all we left "sparse matrix" in all docstrings for now as well (although at some point we may want to rename it to sparse container or sparse array everywhere). I pushed a commit that I took from #27950 to test param_validation against sparse containers in this PR since it's related and I think we should discard everything else in #27950.

@glemaitre What do you think ?

Copy link
Member
@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @Charlie-XIAO

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG "array-like" in parameter validation treats sparse containers as valid inputs

3 participants

0