8000 BUG,MAINT: Ensure masked elements can be tested against nan and inf. by mhvk · Pull Request #11122 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

BUG,MAINT: Ensure masked elements can be tested against nan and inf. #11122

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 7, 2018
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Next Next commit
BUG,MAINT: Ensure masked elements can be tested against nan and inf.
The removal of nan and inf from arrays that are compared using
test routines like assert_array_equal treated the two arrays
separately, which for masked arrays meant that some elements would
not be removed when they should have been. This PR corrects this.
  • Loading branch information
mhvk committed May 27, 2018
commit 5718b3306303b4899c017641a9282714dcf8c9b8
67 changes: 29 additions & 38 deletions numpy/testing/_private/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -685,7 +685,7 @@ def assert_array_compare(comparison, x, y, err_msg='', verbose=True,
header='', precision=6, equal_nan=True,
equal_inf=True):
__tracebackhide__ = True # Hide traceback for py.test
from numpy.core import array, isnan, isinf, any, inf
from numpy.core import array, isnan, any, inf, ndim
x = array(x, copy=False, subok=True)
y = array(y, copy=False, subok=True)

Expand All @@ -695,9 +695,14 @@ def isnumber(x):
def istime(x):
return x.dtype.char in "Mm"

def chk_same_position(x_id, y_id, hasval='nan'):
8000 """Handling nan/inf: check that x and y have the nan/inf at the same
locations."""
def func_assert_same_pos(x, y, func=isnan, hasval='nan'):
"""Handling nan/inf: combine results of running func on x and y,
checking that they are True at the same locations."""
x_id = func(x)
y_id = func(y)
if not any(x_id) and not any(y_id):
return False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these two lines needed? Aren't they covered by the below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd think not, but numerous recursion failures if I leave them out...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, what happens is that it does recurs 8000 e, but the if statement prevents it from doing it more than once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which is simple to solve - just use any - will push fix up shortly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't work either, but passing in equal_[nan|inf]=False does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it doesn't...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment saying that this is needed to prevent recursion or something?

Good to go either way


try:
assert_array_equal(x_id, y_id)
except AssertionError:
Expand All @@ -706,6 +711,9 @@ def chk_same_position(x_id, y_id, hasval='nan'):
% (hasval), verbose=verbose, header=header,
names=('x', 'y'), precision=precision)
raise AssertionError(msg)
# If there is a scalar, then here we know the array has the same
# flag as it everywhere, so we should return the scalar flag.
return x_id if x_id.ndim == 0 else y_id

try:
cond = (x.shape == () or y.shape == ()) or x.shape == y.shape
Expand All @@ -718,49 +726,32 @@ def chk_same_position(x_id, y_id, hasval='nan'):
names=('x', 'y'), precision=precision)
raise AssertionError(msg)

flagged = False
if isnumber(x) and isnumber(y):
has_nan = has_inf = False
if equal_nan:
x_isnan, y_isnan = isnan(x), isnan(y)
# Validate that NaNs are in the same place
has_nan = any(x_isnan) or any(y_isnan)
if has_nan:
chk_same_position(x_isnan, y_isnan, hasval='nan')
flagged = func_assert_same_pos(x, y, func=isnan, hasval='nan')

if equal_inf:
x_isinf, y_isinf = isinf(x), isinf(y)
# Validate that infinite values are in the same place
has_inf = any(x_isinf) or any(y_isinf)
if has_inf:
# Check +inf and -inf separately, since they are different
chk_same_position(x == +inf, y == +inf, hasval='+inf')
chk_same_position(x == -inf, y == -inf, hasval='-inf')

if has_nan and has_inf:
x = x[~(x_isnan | x_isinf)]
y = y[~(y_isnan | y_isinf)]
elif has_nan:
x = x[~x_isnan]
y = y[~y_isnan]
elif has_inf:
x = x[~x_isinf]
y = y[~y_isinf]

# Only do the comparison if actual values are left
if x.size == 0:
return
flagged |= func_assert_same_pos(x, y,
func=lambda xy: xy == +inf,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.isposinf?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Learn something new every day...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But doesn't work for complex numbers...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does what you have here work for complex numbers?

hasval='+inf')
flagged |= func_assert_same_pos(x, y,
func=lambda xy: xy == -inf,
hasval='-inf')

elif istime(x) and istime(y):
# If one is datetime64 and the other timedelta64 there is no point
if equal_nan and x.dtype.type == y.dtype.type:
x_isnat, y_isnat = isnat(x), isnat(y)
flagged = func_assert_same_pos(x, y, func=isnat, hasval="NaT")

if any(x_isnat) or any(y_isnat):
chk_same_position(x_isnat, y_isnat, hasval="NaT")

if any(x_isnat) or any(y_isnat):
x = x[~x_isnat]
y = y[~y_isnat]
if ndim(flagged):
x, y = x[~flagged], y[~flagged]
# Only do the comparison if actual values are left
if x.size == 0:
return
elif flagged:
# no sense doing comparison if everything is flagged.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I agree with this - what if x and y are quantities of different dimensions? The test should still compare the empty arrays so that the correct error is raised, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the same case as the if x.size == 0 stanza above, but for the case where x and y are both scalar. In principle, I agree the comparison should work for empty elements, but in practice, this is not the case (e.g., matrix changing its shape being one problem...)

Since this was here before, I propose to leave it as is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, i agree this is out of scope for the patch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, it sadly remains a chunk of rather fragile code...

return

val = comparison(x, y)

Expand Down
12 changes: 12 additions & 0 deletions numpy/testing/tests/test_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -151,6 +151,17 @@ def test_recarrays(self):
self._test_not_equal(c, b)
assert_equal(len(l), 1)

def test_masked_nan_inf(self):
# Regression test for gh-11121
a = np.ma.MaskedArray([3., 4., 6.5], mask=[False, True, False])
b = np.array([3., np.nan, 6.5])
self._test_equal(a, b)
self._test_equal(b, a)
a = np.ma.MaskedArray([3., 4., 6.5], mask=[True, False, False])
b = np.array([np.inf, 4., 6.5])
self._test_equal(a, b)
self._test_equal(b, a)


class TestBuildErrorMessage(object):

Expand Down Expand Up @@ -650,6 +661,7 @@ def test_inf_compare_array(self):
assert_raises(AssertionError, lambda: self._assert_func(-ainf, -x))
self._assert_func(-ainf, x)


@pytest.mark.skip(reason="The raises decorator depends on Nose")
class TestRaises(object):

Expand Down
0