8000 [MRG] astype sparse matrix by sseg · Pull Request #4585 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

[MRG] astype sparse matrix #4585

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed

Conversation

sseg
Copy link
Contributor
@sseg sseg commented Apr 13, 2015

Changes astype() in utils/fixes.py which failed on SciPy sparse matrices

@amueller
Copy link
Member

Do you have any idea why it failed on python3 only?

@GaelVaroquaux
Copy link
Member

I don't think that we want to do that, as this function is in only as a temporary fixes (a backport) for old versions of numpy that doesn't have it.

@amueller
Copy link
Member

I agree that we don't want to change the functionality of astype if the numpy version doesn't support sparse matrices (which it shouldn't). We should use check_array instead.

@sseg
Copy link
Contributor Author
sseg commented Apr 13, 2015

I had thought it had been just an issue with the Numpy version (with the Travis Py2 builds using <=1.6.2), as opposed to the Python version. I'm not sure whether that's the case, but given the response from travis I've realized that I've fed keyword arguments for positionals, so that for the current commit I've precluded testing the issue.

This patch was intended to address the failed tests here: https://travis-ci.org/scikit-learn/scikit-learn/jobs/58113604.

@ogrisel
Copy link
Member
ogrisel commented Apr 13, 2015

The goal of this PR is to tackle the limitations of our backport when try to use it more generally as in #4575.

Arguably it's no longer just a numpy backport. Still I think it's a useful utility to help use avoid introducing silent memory copies of the input data throughout the code base.

@amueller
Copy link
Member

why not use check_array instead?

@ogrisel
Copy link
Member
ogrisel commented Apr 13, 2015

Maybe we could move it under sklearn.utils.validation as _astype.

@ogrisel
Copy link
Member
ogrisel commented Apr 13, 2015

why not use check_array instead?

Good suggestion.

@amueller
Copy link
Member

I think that would be a good opportunity to add the "allowed dtype" argument that we were talking about at some point ;).
Not sure if that should supersede dtype="numeric" or be in addition to it.
The kmeans code already has a call do check_array but doesn't specify a dtype.
If should probably specify dtype=[np.float32, np.float64].

@ogrisel
Copy link
Member
ogrisel commented Apr 14, 2015

I agree.

@ogrisel
Copy link
Member
ogrisel commented Apr 14, 2015

Maybe we should close this PR then. Sorry @sseg to have mislead you. I hope you won't hold it against me. If you are interested in implementing support patterns such as a = check_array(a, dtype=[np.float32, np.float64]) please let us know.

@ogrisel ogrisel closed this Apr 14, 2015
@sseg
Copy link
Contributor Author
sseg commented Apr 14, 2015

I'm glad to have been a part of fleshing this out. If an issue is opened to implement dtype validation I'll have a look at it, but I don't think I have enough of a grasp of the use cases for sparse arrays.

@sseg sseg deleted the astype_spmatrix branch April 14, 2015 23:59
@TomDLT TomDLT mentioned this pull request Apr 30, 2015
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0