-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
[MRG] Input validation refactoring #3443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Input validation refactoring #3443
Conversation
Hopefully more readable and consistent input checking. Also fixed some corner cases. |
|
||
Parameters | ||
---------- | ||
spmatrix : scipy sparse matrix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
. is not needed
spmatrix, copy=copy, dtype=dtype) | ||
if force_all_finite: | ||
_assert_all_finite(spmatrix.data) | ||
spmatrix.data = np.array(spmatrix.data, copy=False, order=order) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It assumes that you a .data
which is not the case for each sparse matrix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is true. The tests seem not to be so great ^^ we need to add a test for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
spmatrix = spmatrix.astype(dtype) | ||
else: | ||
# create new | ||
spmatrix = _sparse_matrix_constructor(convert_sparse_to)( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that here I would use the 'asformat' method of sparse matrices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, I have been looking and not finding this method.
Does dense data always allowed? |
Yes, currently dense data is always allowed. Do we have an application where this is not the case? If we do at some point, we could later add a flag? |
thanks for the helpful reviews guys :) |
I see that this has switched to MRG, but the TODO still lists tests. What's the status on that? |
from .class_weight import compute_class_weight | ||
from sklearn.utils.sparsetools import minimum_spanning_tree | ||
|
||
|
||
__all__ = ["murmurhash3_32", "as_float_array", "check_arrays", "safe_asarray", | ||
"assert_all_finite", "array2d", "atleast2d_or_csc", | ||
"assert_all_finite", "array2d", "atleast2d_or_csc", "check_array", | ||
"atleast2d_or_csr", | ||
"warn_if_not_float", | ||
"check_random_state", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove deprecated stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in this PR, but in the next PR which will touch all files.
matrix input will raise an error. If the input is sparse but not in | ||
the allowed format, it will be converted to the first listed format. | ||
|
||
order : 'F', 'C' or None (default) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
order : 'F', 'C' or None (default=None)
Except for the cosmetic comment, 👍 |
@@ -349,7 +349,7 @@ def extract_patches_2d(image, patch_size, max_patches=None, random_state=None): | |||
i_h, i_w = image.shape[:2] | |||
p_h, p_w = patch_size | |||
|
|||
image = array2d(image) | |||
image = check_array(image, allow_nd=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum, that change is really surprising for me: I would read the 2 lines (the one removed and the one added) as doing very different things. It's probably just a question of choice of names on the arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is because the previous behavior was surprising ;)
+1 for merging this PR as it is and deal with further refactoring in a separate PR. |
+1 |
[MRG] Input validation refactoring
Merged from the airport! Awesome work! Go Go team! |
err... I just squashed, rebased and merge... hum.... Let's see what'll happen now. |
Oh I didn't push. all is good. |
Refactor input validation to make it more consistent.
Todo