-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Input validation refactoring #3440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
+1 |
What we have(only public functions)
|
|
Ok my proposed interface is: check_consistent_length(X, y)
# hopefully not as long in real live
X = check_array(X, allowed_sparse=['csr', 'lil'], convert_sparse_to='csr', dtype='any_float', order='C', copy=copy, check_finite=True)
y = column_or_1d(y) and array_2d_or_csx would basically be a shortcut to allowed_sparse=['csx'], convert_sparse_to='csx' |
I like that. I would also propose the renaming of check_array as |
pr here: #3443 |
We could also add check to raise ValueError on 64 bit sparse matrices if it's not supported. |
There is still the |
@arjoly can you comment on the 64bit sparse matrices? What is the issue there again? Otherwise I'd close this as I think the current state is pretty decent. |
Sparse matrices rows with intp-like types (csr/c/bsr indices, indptr; coo col, row) were constrained to a 32 bit int until very recently. If the user has a supporting scipy version, and we only perform scipy public API things, 64-bit index support should be seamless. If we perform our own cython ops, or if we use explicit dtyping, things will break with unhelpful error messages. So it's best if we validate the index dtype where that's going to happen, or fix our implementations to allow very big sparse matrices. Basically, we need to alter all the sparse matrix tests to test with both index dtypes, and go from there. |
Right. That should actually not be too hard. |
I think I will make that a new issue, though, as that is more about tighter testing. |
Are there reasons to keep as_float_array? |
Actually, probably not. |
There is a lot of logic in that one, and I'm not sure I want to include this all in the |
Maybe that is a good candidate to fix after #4136 has been merged. A problem is that it is a lot used in function interfaces, which are not tested in the common tests :-/ |
closing, I think we're happy with |
I propose to refactor the input validation. The current zoo of methods is kinda confusing.
Related to #3142.
Checks that we want are:
I'll now check if there is anything else that we currently check and see what functions we have.
Remaining Issues
The text was updated successfully, but these errors were encountered: