8000 Questionable behaviour in MaskedArray.__eq__ on flexible arrays · Issue #8589 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

Questionable behaviour in MaskedArray.__eq__ on flexible arrays #8589

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
eric-wieser opened this issue Feb 9, 2017 · 8 comments
Closed

Questionable behaviour in MaskedArray.__eq__ on flexible arrays #8589

eric-wieser opened this issue Feb 9, 2017 · 8 comments

Comments

@eric-wieser
Copy link
Member

These lines, introduced in a5da87c by @pierregm, appear multiple times in numpy/ma/core.py, such as MaskedArray.__eq__:

try:
    mask = mask.view((bool_, len(self.dtype))).all(axis)
except ValueError:
    mask = np.all([[f[n].all() for n in mask.dtype.names]
      for f in mask], axis=axis)

But where is this ValueError expected to come from? The .view, or the .all, due to an invalid axis.

It seems like there is a bug in the code that prepares axis here, and this bug is being silenced by an overzealous try/catch.

@charris
Copy link
Member
charris commented Feb 9, 2017

@mhvk @ahaldane Thoughts.

@mhvk
Copy link
Contributor
mhvk commented Feb 9, 2017

In the commit pointed too, in another similar change, there is a comment: "# In case we have nested fields..."

So, at least that gives a hint why this was introduced. Indeed, one of the test cases shows that it is at least one thing the ValueError is meant to catch:

ndtype = [('A', int), ('B', [('BA', int), ('BB', int)])]
a = np.ma.array([(1, (1, 1)), (2, (2, 2))], mask=[(0, (1, 0)), (0, (0, 1))], dtype=ndtype)
a.mask
# array([(False, ( True, False)), (False, (False,  True))], 
#       dtype=[('A', '?'), ('B', [('BA', '?'), ('BB', '?')])])
a.mask.view((np.bool_, len(a.dtype)))
# ValueError: new type not compatible with array.

In principle, I guess one should really write

try:
    mask = mask.view((bool_, len(self.dtype)))
except ValueError:
    mask = np.all(....)
else:
    mask = mask.all(axis)

But where does the indexing go wrong? I must say having a default of axis=1 seems very odd; surely, this should be axis=-1?!?!

@mhvk
Copy link
Contributor
mhvk commented Feb 9, 2017

Just checking now: my try/except/else indeed throws the same errors as found in #8584, and those disappear when I (correctly) set axis=-1. PR to come soon.

@eric-wieser
Copy link
Member Author

Does this mean there is a bug with masked 2D record arrays?

@mhvk
Copy link
Contributor
mhvk commented Feb 9, 2017

Does this mean there is a bug with masked 2D record arrays?

Yes. What a code... Anyway, fix almost done.

@mhvk
Copy link
Contributor
mhvk commented Feb 9, 2017

See #8590 for a fix.

mhvk added a commit to mhvk/numpy that referenced this issue Feb 27, 2017
In the process of trying to fix the "questionable behaviour in
`MaskedArray.__eq__`" (numpygh-8589), it became clear that the code was
buggy. E.g., `ma == ma[0]` failed if `ma` held a structured dtype;
multi-d structured dtypes failed generally; and, more worryingly,
a masked scalar comparison could be wrong:
`np.ma.MaskedArray(1, mask=True) == 0` yields True.

This commit solves these problems, adding tests to prevent regression.
In the process, it also ensures that the results for structured arrays
always equals what one would get by logically combining the results
over individual parts of the structure.
@eric-wieser
Copy link
Member Author

@mhvk: We can close this as of #8590, right?

@mhvk
Copy link
Contributor
mhvk commented Apr 5, 2017

Yes!

@mhvk mhvk closed this as completed Apr 5, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants
0