-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
[MRG+1] Astype fix #4645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG+1] Astype fix #4645
Conversation
thanks @TomDLT Travis is not happy. You have failing tests. |
All reactions
Sorry, something went wrong.
OK maybe I will start testing before pushing |
All reactions
Sorry, something went wrong.
Also it would be great to rebase this branch on top of current master while removing the |
All reactions
Sorry, something went wrong.
Sounds like a plan ;) |
All reactions
Sorry, something went wrong.
meh, you can also just check travis after pushing ;) but that utually takes longer. |
All reactions
Sorry, something went wrong.
0222c8b
to
411b868
Compare
Big thanks to @ogrisel that worked on this with me TODO: I will look at this monday |
All reactions
Sorry, something went wrong.
To fix the test that fail on travis, you might want to setup a conda env with the same versions for numpy and scipy.
|
All reactions
Sorry, something went wrong.
Good catch, travis is now green. I re-wrapped and moved your todo list to put it in the description of the PR. Once all the boxes are ticked please edit the title of the PR to replace |
All reactions
Sorry, something went wrong.
In note : |
All reactions
Sorry, something went wrong.
I'm not sure how great an idea of |
All reactions
Sorry, something went wrong.
This is what we are doing in this PR when you pass |
All reactions
Sorry, something went wrong.
+1 for using |
All reactions
Sorry, something went wrong.
X_train = X_train.astype(np.float64) | ||
X_test = X_test.astype(np.float64) | ||
X_train = astype(X_train, np.float64, copy=False) | ||
X_test = astype(X_test, np.float64, copy=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to set copy=False
for this use case as the original data type is int16
for sure as explained in the previous commit. There is no way to avoid the malloc / dtype conversion in that case. We can keep:
X_train = X_train.astype(np.float64)
X_test = X_test.astype(np.float64)
Sorry, something went wrong.
All reactions
Is there a clean way to revert some old commits without a total rebase? |
All reactions
Sorry, something went wrong.
Run a |
All reactions
Sorry, something went wrong.
# not a data type (e.g. a column named dtype in a pandas DataFrame) | ||
dtype_orig = None | ||
|
||
if dtype_numeric: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure if we want to keep this behavior. In what cases do we want to pass through integer data? Or bytes? But maybe that is for another PR? I'm not sure....
We could also just set dtype = NUMERIC_DTYPES (which would be a long list) to remove this branch at least.
Sorry, something went wrong.
All reactions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also thought about that option but I think it's fine to leave it this way for now.
Sorry, something went wrong.
All reactions
This is awesome, thanks! I'm not sure how good the "numeric" option was, and maybe we should remove it now, and use explicit dtypes instead? But I think we can merge this before that refactoring. |
All reactions
Sorry, something went wrong.
Do you want me to squash all commits in only one? |
All reactions
Sorry, something went wrong.
Maybe we should deprecate |
All reactions
Sorry, something went wrong.
+1 for thinking about this point again once this PR has been merged. |
All reactions
Sorry, something went wrong.
I think so yes. |
All reactions
Sorry, something went wrong.
I think removing |
All reactions
Sorry, something went wrong.
+1 on squashing everything and +1 on merge. |
All reactions
Sorry, something went wrong.
See #4685 for tracking the possible removal of "numeric". |
All reactions
Sorry, something went wrong.
Squashed |
All reactions
Sorry, something went wrong.
Just curiosity, to deprecate |
All reactions
Sorry, something went wrong.
from . import deprecated maybe? |
All reactions
Sorry, something went wrong.
It did not work for me |
All reactions
Sorry, something went wrong.
@ogrisel merge? |
All reactions
Sorry, something went wrong.
Fine with the removal of Can you please rebase on top of the current master once again? There is a small conflict in |
All reactions
Sorry, something went wrong.
ENH improve check_array to warn on dtype conversions ENH make check_array accept several dtypes ENH change validation with improved check_array ENH change astype to avoid copy if possible ENH remove warn_if_not_float
Rebased on top of master |
All reactions
Sorry, something went wrong.
Thanks :) |
All reactions
Sorry, something went wrong.
This is really great btw 🍻 |
All reactions
Sorry, something went wrong.
Thanks :) |
All reactions
Sorry, something went wrong.
🍻 !
|
All reactions
Sorry, 93A6 something went wrong.
I forgot to drink a beer on that PR: 🍻 |
All reactions
Sorry, something went wrong.
Successfully merging this pull request may close these issues.
I extended
check_array
to add support foraccept_sparse=True
and solve the 3 errors by changing
astype
intocheck_array
when sparse matrix are possibleI will check the other uses of
astype
tomorrow.TODO: