8000 [MRG] Input validation refactoring by amueller · Pull Request #3443 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

[MRG] Input validation refactoring #3443

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

amueller
Copy link
Member

Refactor input validation to make it more consistent.

Todo

  • Tests
  • docstrings

@amueller
Copy link
Member Author

Hopefully more readable and consistent input checking. Also fixed some corner cases.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.01%) when pulling c3ecfed on amueller:input_validation_refactoring into 0807e19 on scikit-learn:master.


Parameters
----------
spmatrix : scipy sparse matrix.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

. is not needed

@coveralls
Copy link
8000

Coverage Status

Coverage decreased (-0.01%) when pulling f72aa17 on amueller:input_validation_refactoring into 0807e19 on scikit-learn:master.

spmatrix, copy=copy, dtype=dtype)
if force_all_finite:
_assert_all_finite(spmatrix.data)
spmatrix.data = np.array(spmatrix.data, copy=False, order=order)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It assumes that you a .data which is not the case for each sparse matrix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is true. The tests seem not to be so great ^^ we need to add a test for that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.0%) when pulling d03c84e on amueller:input_validation_refactoring into 4ec8630 on scikit-learn:master.

spmatrix = spmatrix.astype(dtype)
else:
# create new
spmatrix = _sparse_matrix_constructor(convert_sparse_to)(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that here I would use the 'asformat' method of sparse matrices.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh, I have been looking and not finding this method.

@arjoly
Copy link
Member
arjoly commented Jul 19, 2014

Does dense data always allowed?

@amueller
Copy link
Member Author

Yes, currently dense data is always allowed. Do we have an application where this is not the case? If we do at some point, we could later add a flag?

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.01%) when pulling 1be9e46 on amueller:input_validation_refactoring into 4ec8630 on scikit-learn:master.

@amueller
Copy link
Member Author

thanks for the helpful reviews guys :)

@amueller amueller changed the title [WIP] Input validation refactoring [MRG] Input validation refactoring Jul 19, 2014
@coveralls
Copy link

Coverage Status

Coverage increased (+0.01%) when pulling 33f20bf on amueller:input_validation_refactoring into 4ec8630 on scikit-learn:master.

@GaelVaroquaux
Copy link
Member

I see that this has switched to MRG, but the TODO still lists tests. What's the status on that?

8000
from .class_weight import compute_class_weight
from sklearn.utils.sparsetools import minimum_spanning_tree


__all__ = ["murmurhash3_32", "as_float_array", "check_arrays", "safe_asarray",
"assert_all_finite", "array2d", "atleast2d_or_csc",
"assert_all_finite", "array2d", "atleast2d_or_csc", "check_array",
"atleast2d_or_csr",
"warn_if_not_float",
"check_random_state",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove deprecated stuff?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in this PR, but in the next PR which will touch all files.

matrix input will raise an error. If the input is sparse but not in
the allowed format, it will be converted to the first listed format.

order : 'F', 'C' or None (default)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

order : 'F', 'C' or None (default=None)

@arjoly
Copy link
Member
arjoly commented Jul 19, 2014

Except for the cosmetic comment, 👍

@@ -349,7 +349,7 @@ def extract_patches_2d(image, patch_size, max_patches=None, random_state=None):
i_h, i_w = image.shape[:2]
p_h, p_w = patch_size

image = array2d(image)
image = check_array(image, allow_nd=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, that change is really surprising for me: I would read the 2 lines (the one removed and the one added) as doing very different things. It's probably just a question of choice of names on the arguments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is because the previous behavior was surprising ;)

@coveralls
Copy link

Coverage Status

Coverage increased (+0.04%) when pulling 95afdc7 on amueller:input_validation_refactoring into 4ec8630 on scikit-learn:master.

@ogrisel
Copy link
Member
ogrisel commented Jul 20, 2014

+1 for merging this PR as it is and deal with further refactoring in a separate PR.

@arjoly
Copy link
Member
arjoly commented Jul 20, 2014

+1

GaelVaroquaux added a commit that referenced this pull request Jul 20, 2014
@GaelVaroquaux GaelVaroquaux merged commit 8dab222 into scikit-learn:master Jul 20, 2014
@coveralls
Copy link

Coverage Status

Coverage increased (+0.01%) when pulling f7549fd on amueller:input_validation_refactoring into 41d02e0 on scikit-learn:master.

@GaelVaroquaux
Copy link
Member

Merged from the airport!

Awesome work! Go Go team!

@amueller
Copy link
Member Author

err... I just squashed, rebased and merge... hum.... Let's see what'll happen now.

@amueller
Copy link
Member Author

Oh I didn't push. all is good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0