BUG: Fixed an issue wherein certain `nan<x>` functions could fail for object arrays #19821

BvB93 · 2021-09-03T13:48:44Z

Previously there were a number of cases in which the nan<x> functions (e.g. nanmedian) could fail for object arrays,
either because of a call to isnan or because the code assumed that an array reduction would always return a generic subclass,
which is guaranteed to have a dtype attribute.
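
Both failure modes can be reproduced without touching the nan<x> functions at all. The snippet below is only a minimal sketch, not code from this PR; the variable names are made up for illustration.

```python
import numpy as np

obj = np.array([1.0, 2.0, 3.0], dtype=object)

# Failure mode 1: the isnan ufunc has no loop for object arrays.
try:
    np.isnan(obj)
    isnan_failed = False
except TypeError:
    isnan_failed = True

# Failure mode 2: a full reduction over an object array hands back the
# stored Python object (here a plain float), which has no dtype attribute.
total = obj.sum()
has_dtype = hasattr(total, "dtype")

print(isnan_failed, type(total).__name__, has_dtype)
```

Any code path that calls `result.dtype` on such a reduced value therefore raises AttributeError.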

charris · 2021-09-04T20:23:48Z

numpy/lib/tests/test_nanfunctions.py

-                 np.uint16, np.uint32, np.uint64)
+@pytest.mark.parametrize(
+    "dtype",
+    np.typecodes["AllInteger"] + np.typecodes["AllFloat"] + "O",


np.typecodes['AllFloat'][1:] would remove 'e', but is not quite as robust.

No worries, as of fffcb6e the tests seem to work fine with np.float16.
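
For reference, a small sketch of the two ways of dropping np.float16 from the type-code string that the comment above alludes to. This is illustrative only; the variable names are hypothetical, and the slicing variant relies on 'e' being the first character of the string.

```python
import numpy as np

all_float = np.typecodes["AllFloat"]

# Positional: relies on 'e' (np.float16) being the first type code.
without_first = all_float[1:]

# Explicit: removes 'e' wherever it happens to sit in the string.
without_half = all_float.replace("e", "")

print(all_float, without_first, without_half)
```

The explicit variant keeps working if the ordering of the type codes ever changes, which is presumably what "not quite as robust" refers to.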

charris · 2021-09-04T20:51:18Z

numpy/lib/nanfunctions.py

+                # Precaution against reduced object arrays
+                try:
+                    return a.dtype.type(a / b)
+                except AttributeError:


Is this faster than checking for the attribute as you do below?

During a quick local test it shaves off about 100 ns (~400 vs ~300 ns) for the non-object case.
Would you feel this performance gain is worthwhile enough to stick to the current try/except block?

I can't see it making a difference in the big picture. There might be a small difference in clarity, but with one liners that is debatable ...
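
The two styles under discussion can be put side by side as follows. This is a sketch under assumptions: the function names are hypothetical, and the `return a / b` fallback is a guess, since the diff excerpt above is cut off right after the `except` line.

```python
import numpy as np


def divide_try(a, b):
    """EAFP variant: attempt the cast, fall back for plain Python objects."""
    try:
        return a.dtype.type(a / b)
    except AttributeError:
        # `a` came out of an object-array reduction and has no dtype
        return a / b


def divide_hasattr(a, b):
    """LBYL variant: check for the attribute up front."""
    if hasattr(a, "dtype"):
        return a.dtype.type(a / b)
    return a / b


for func in (divide_try, divide_hasattr):
    print(func.__name__, func(np.float64(6.0), 3), func(6.0, 3))
```

Both variants return the same values; they differ only in where the cost lands, with try/except being cheapest when no exception is raised, i.e. in the common non-object case.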

charris · 2021-09-04T20:53:48Z

LGTM, couple of nits. I assume the main fix here is for the case of non-array objects being returned.

charris · 2021-09-05T23:24:32Z

numpy/lib/nanfunctions.py

@@ -160,8 +160,12 @@ def _remove_nan_1d(arr1d, overwrite_input=False):
        True if `res` can be modified in place, given the constraint on the
        input
    """
+    if arr1d.dtype == object:
+        # object arrays do not support `isnan` (gh-9009), so make a guess
+        c = np.not_equal(arr1d, arr1d, dtype=bool)
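
As a standalone illustration of the check added in this hunk (a sketch, not the full function body): a NaN is the only common value that compares unequal to itself, so an element-wise `a != a` comparison marks the NaN entries of an object array.

```python
import numpy as np

arr1d = np.array([1.0, np.nan, 3.0], dtype=object)

# Element-wise `x != x`; dtype=bool forces a boolean result array
# instead of an object array of Python bools.
mask = np.not_equal(arr1d, arr1d, dtype=bool)

# Drop the entries flagged as NaN.
cleaned = arr1d[~mask]

print(mask, cleaned)
```

This is a guess in the sense of the code comment: an arbitrary object with an unusual `__ne__` could be flagged as NaN, or a NaN-like object could slip through.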


charris · 2021-09-05T23:33:03Z

Thanks Bas.

The CircleCI failure may be ignored; it would be fixed by a rebase. It might be worth testing some of these functions with 0-D arrays, but that is for another PR.

seberg · 2021-09-07T16:37:35Z

Should we add a release note for this? (A brief ping to the mailing list may also be nice, it is an API change after all.)

I also wonder, if we want to do this for the nan functions, whether we should actually upgrade isnan to use this assumption. (I am fine with either, though; defining a != a as the definition for NaN in nan<x> can very much be independent of isnan.)

I am fine with it, though. Just thought it might be good to add a note and ping the list.

BvB93 · 2021-09-08T12:34:09Z

> Should we add a release note for this? (A brief ping to the mailing list may also be nice, it is an API change after all.)

I would argue that this is really just a bug fix; namely, I'd reasonably expect the <x> and nan<x> functions to produce the same result if no nan-values are involved. The only thing that might potentially warrant a ping is the a != a approach for finding object-based nans, and even this is based on a similar approach used in a prior PR (#9013).
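
The expectation stated above can be written down as a quick check. This is a sketch that assumes a NumPy release containing this fix; the array and the dictionary name are made up for illustration.

```python
import numpy as np

# An object array without any NaN values.
a = np.array([1.0, 2.0, 3.0, 4.0], dtype=object)

# Each <x> function paired with its nan<x> counterpart.
pairs = {
    "mean": (np.mean(a), np.nanmean(a)),
    "median": (np.median(a), np.nanmedian(a)),
}

for name, (plain, nan_aware) in pairs.items():
    print(name, plain, nan_aware, plain == nan_aware)
```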

seberg · 2021-09-08T13:28:34Z

Ah OK, I did not realize we already have this logic in other places.

BvB93 · 2021-09-09T09:26:02Z

@charris turns out 0d arrays are very much a worthwhile test case here: #19854.

BvB93 added the 00 - Bug label Sep 3, 2021

BvB93 marked this pull request as draft September 3, 2021 14:25

Bas van Beek added 2 commits September 3, 2021 16:47

BUG: Fixed an issue wherein certain nan<x> functions could fail for…

b6d7c46

… object arrays

TST: Expand the old TestNanFunctions_IntTypes test with non-integer…

fffcb6e

… number types

BvB93 force-pushed the nanfunctions branch from df3bf91 to fffcb6e Compare September 3, 2021 14:48

TST: Add more tests for nanmedian, nanquantile and nanpercentile

9ef7783

BvB93 marked this pull request as ready for review September 3, 2021 15:02

MAINT: Copy the _methods._std code-path for handling nanstd objec…

a0ea053

…t-arrays https://github.com/numpy/numpy/blob/410a89ef04a2d3c50dd2dba2ad403c872c3745ac/numpy/core/_methods.py#L265-L270

charris reviewed Sep 4, 2021

charris reviewed Sep 4, 2021

MAINT: Let _remove_nan_1d attempt to identify nan-containing object…

ecba713

… arrays Use the same approach as in numpy#9013

charris reviewed Sep 5, 2021

charris merged commit b3a66e8 into numpy:main Sep 5, 2021

BvB93 deleted the nanfunctions branch September 6, 2021 07:12

seberg added the 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. label Sep 7, 2021

seberg added a commit to seberg/numpy that referenced this pull request Sep 7, 2021

TST: Update test added in numpygh-19821

8cd51e9

The test needs to be updated, since this PR adds the correct checks for floating point errors to accumulations.

BvB93 mentioned this pull request Sep 9, 2021

BUG: Fixed an issue wherein var would raise for 0d object arrays #19854

Merged

BvB93 mentioned this pull request Oct 4, 2021

BUG: Calling nanstd with dtype=object fails with a cryptic error message. #17343

Closed
