BUG: Fixed an issue wherein certain `nan<x>` functions could fail for object arrays (#19821)
Conversation
```python
np.uint16, np.uint32, np.uint64)
@pytest.mark.parametrize(
    "dtype",
    np.typecodes["AllInteger"] + np.typecodes["AllFloat"] + "O",
)
```
`np.typecodes['AllFloat'][1:]` would remove `'e'`, but is not quite as robust.
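For context, the single-character type codes can be inspected directly. A small sketch (assuming a standard NumPy build, where `'e'`, the code for `np.float16`, is the first entry of `np.typecodes["AllFloat"]`, so the suggested `[1:]` slice drops only the half-precision code):

```python
import numpy as np

# 'AllFloat' lists the floating-point type codes; 'e' (np.float16) comes first
all_floats = np.typecodes["AllFloat"]
without_half = all_floats[1:]  # the slice the reviewer suggests: drops 'e'
print(all_floats, "->", without_half)
```

The caveat about robustness is that the slice silently assumes `'e'` stays in the first position, whereas the eventual fix made the tests pass for `np.float16` as well.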
No worries, as of fffcb6e the tests seem to work fine with `np.float16`.
```python
# Precaution against reduced object arrays
try:
    return a.dtype.type(a / b)
except AttributeError:
```
Is this faster than checking for the attribute as you do below?
During a quick local test it shaves off about 100 ns (~400 vs ~300 ns) for the non-object case. Would you feel this performance gain is worthwhile enough to stick to the current `try`/`except` block?
I can't see it making a difference in the big picture. There might be a small difference in clarity, but with one-liners that is debatable.
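The two variants being weighed here can be benchmarked directly. A minimal sketch (the helper names `divide_try` and `divide_hasattr` are made up for illustration, and the exact timings will vary by machine):

```python
import timeit
import numpy as np

def divide_try(a, b):
    # EAFP: assume the operand has a dtype; reduced object arrays may
    # yield plain Python scalars, which raise AttributeError here
    try:
        return a.dtype.type(a / b)
    except AttributeError:
        return a / b

def divide_hasattr(a, b):
    # LBYL: check for the attribute up front
    if hasattr(a, "dtype"):
        return a.dtype.type(a / b)
    return a / b

x = np.float64(6.0)
t_try = timeit.timeit(lambda: divide_try(x, 3.0), number=100_000)
t_hasattr = timeit.timeit(lambda: divide_hasattr(x, 3.0), number=100_000)
print(f"try/except: {t_try:.4f}s  hasattr: {t_hasattr:.4f}s")
```

The `try`/`except` form is cheaper when the exception is rarely raised (the common non-object case), which matches the ~100 ns difference reported above.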
LGTM, couple of nits. I assume the main fix here is for the case of non-array objects being returned.
… arrays

Use the same approach as in numpy#9013
```diff
@@ -160,8 +160,12 @@ def _remove_nan_1d(arr1d, overwrite_input=False):
     True if `res` can be modified in place, given the constraint on the
     input
     """
     if arr1d.dtype == object:
         # object arrays do not support `isnan` (gh-9009), so make a guess
         c = np.not_equal(arr1d, arr1d, dtype=bool)
```
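The "guess" relies on NaN being the only common value that compares unequal to itself: for object arrays `np.not_equal` falls back to Python's `!=` element-wise, and `dtype=bool` forces a plain boolean mask. A small demonstration of the same call:

```python
import numpy as np

arr = np.array([1.0, float("nan"), "spam", None], dtype=object)
# NaN != NaN is True; every other element here compares equal to itself,
# so the mask flags exactly the NaN entry without calling np.isnan
mask = np.not_equal(arr, arr, dtype=bool)
print(mask)
```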
Clever.
Thanks Bas. The circleci failure may be ignored; it would be fixed by a rebase. It might be worth testing some of these functions with 0-D arrays, but that is for another PR.
Should we add a release note for this? (A brief ping to the mailing list may also be nice, it is an API change after all.) I also wonder if we want to do this for the … I am fine with it, though. Just thought it might be good to add a note and ping the list.
The test needs to be updated, since this PR adds the correct checks for floating point errors to accumulations.
I would argue that this is really just a bug fix; namely, I'd reasonably expect the …
Ah OK, I did not realize we already have this logic in other places.
Previously there were a number of cases wherein the `nan<x>` functions (e.g. `nanmedian`) could fail for object arrays, either due to a call to `isnan` or due to assuming that array reduction would always return a `generic` subclass, the latter being in possession of a `dtype` attribute.
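To illustrate the first failure mode: calling `np.isnan` on an object array raises a `TypeError`, because the ufunc has no loop registered for object dtype (gh-9009), which is why this PR masks NaNs via self-inequality instead:

```python
import numpy as np

obj = np.array([1.0, float("nan"), 3.0], dtype=object)
try:
    np.isnan(obj)
except TypeError as exc:
    # np.isnan has no object-dtype loop, so this path is taken
    print(f"isnan failed: {exc}")
```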