-
-
Notifications
You must be signed in to change notification settings - Fork 11k
BUG: Fix concatenation when the output is "S" or "U" #18052
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Is this a regression in 1.20.0? |
Its a bug in a new feature in 1.20. If this feels like to much, we could pull that out in principle (although we should do this anyway probably). The reason for the So yeah, from my perspective backporting is slightly easier, but as the deprecation will only be in 1.21, it probably doesn't matter. |
What deprecation? Number to string? What I'd like to know is if 1.20 will break current usage without a fix. |
@charris sorry. Current usage is unchanged since 1.20 added the keyword argument (and this change does not modify that). If you use that new feature you can run into bugs (possibly including crashes). |
ef084dc
to
cc99c61
Compare
Previously, the dtype was used, this now assumes that we want to cast to a string of (unknown) length. This is a simplified version of what happens in `np.array()` or `arr.astype()` (it does never inspect the values, e.g. for object casts). This is more complex as I would like, and with the refactor of ResultType and similar can be cleaned up a bit more hopefully. Note that currently, object to "S" or "U" casts simply return length 64 strings, but with the new version, this will be an error (although the error message probably needs improvement). This is a behaviour inherited from other places however.
cc99c61
to
50dce51
Compare
NULL); | ||
ret = (PyArrayObject *)PyArray_NewFromDescr_int( | ||
subtype, descr, 1, &shape, &stride, NULL, 0, NULL, | ||
NULL, 0, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actually changes nothing, but eventually, we may want to not do weird things about dtype="S0"
and then it will be necessary.
Let's give it a shot. Thanks Sebastian. |
Thanks, sorry for missing this originally. |
Previously, the dtype was used, this now assumes that we want to
cast to a string of (unknown) length. This is a simplified version
of what happens in
np.array()
orarr.astype()
(it does neverinspect the values, e.g. for object casts).
This is more complex as I would like, and with the refactor of
ResultType and similar can be cleaned up a bit more hopefully.
Note that currently, object to "S" or "U" casts simply return
length 64 strings, but with the new version, this will be an error
(although the error message probably needs improvement).
This is a behaviour inherited from other places however.
The issue here is that NumPy 1.20 is just broken if you pass
dtype="U"
in concatenate. Unfortunately, this is about as "minimal" as I could think of (some things around it should be cleaned up also in the new code paths). Things are simply fairly complicated if you have "flexible dtypes" or DType (type/class) in my way of thinking about it...There is of course more inanity here, e.g.
concatenate
usesresult_type
which uses value-based casting for 0-D objects, but there is not much to do about it right at this instance.