-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
BUG: Fix array_equal for numeric and non-numeric scalar types #27275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
2211bc9
95355b7
b15960f
4ba6b85
8ede522
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
- Loading branch information
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2554,24 +2554,24 @@ def array_equal(a1, a2, equal_nan=False): | |
if a1.shape != a2.shape: | ||
return False | ||
if not equal_nan: | ||
return builtins.bool((a1 == a2).all()) | ||
cannot_have_nan = (_dtype_cannot_hold_nan(a1.dtype) | ||
and _dtype_cannot_hold_nan(a2.dtype)) | ||
if cannot_have_nan: | ||
if a1 is a2: | ||
return True | ||
return builtins.bool((a1 == a2).all()) | ||
return builtins.bool((asarray(a1 == a2)).all()) | ||
|
||
if a1 is a2: | ||
# nan will compare equal so an array will compare equal to itself. | ||
return True | ||
|
||
cannot_have_nan = (_dtype_cannot_hold_nan(a1.dtype) | ||
and _dtype_cannot_hold_nan(a2.dtype)) | ||
if cannot_have_nan: | ||
return builtins.bool(asarray(a1 == a2).all()) | ||
|
||
# Handling NaN values if equal_nan is True | ||
a1nan, a2nan = isnan(a1), isnan(a2) | ||
# NaN's occur at different locations | ||
if not (a1nan == a2nan).all(): | ||
return False | ||
# Shapes of a1, a2 and masks are guaranteed to be consistent by this point | ||
return builtins.bool((a1[~a1nan] == a2[~a1nan]).all()) | ||
return builtins.bool(asarray(a1[~a1nan] == a2[~a1nan]).all()) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I suspect this one might not be needed because we can only end up here with numeric types (the calls There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed, all numeric types can be compared. So I think you can skip it if you like. |
||
|
||
|
||
def _array_equiv_dispatcher(a1, a2): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note this changes behaviour for
MaskedArray
, since it returns a masked array of bool, and here you strip the mask. In principle, not bad conceptually, since the underlying value isFalse
if one array element is masked and the other is not, andTrue
if both are masked. But definitely a change in API that might catch people by surprise: it will no longer be just(a1 == a2).all()
, which I think is what most people would expect.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
p.s. I did not check the reasoning for the
asarray
-asanyarray
would solve the problem withMaskedArray
- I think that may be good enough to solve the problem here too? Alternatively, replace(...).all()
withnp.all(...)
- that will deal with python bools fine.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, maybe
np.all
is best, then. Theasarray
was restoring old behavior, but hand't though about it potentially making a regression for masked (where removing it was maybe an improvement)...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have this vague feeling that I indeed removed to ensure
MaskedArray
worked. But it may have beenassert_array_equal
... Unfortunately,MaskedArray
tests do not really cover the case with regular numpy functions much.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using
asanyarray
preserves the masked array and solves issue #27271, so I updated the PR to use that.@mhvk Does that address your point? As far as I understand it, the problem with
asarray
is that it creates an unnecessary copy of the masked array (and hence uses more memory).I am also looking into updating the scalar comparison to solve #27271 by making
np.array(1) == np.array('s')
return a numpy bool instead of a python bool.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is not the use of memory, but rather than
np.asarray
makes any array subclass into a regularndarray
. So,np.asanyarray
solves that!There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the second line of
np.array_equal
all input arrays are transformed withasarray
, so I am not sure any masked arrays can arrive at the lines we are discussing now. We could change the second line intoa1, a2 = asanyarray(a1), asanyarray(a2)
so that masked arrays are handled.To make it a bit more concrete here is a minimal example showing the behaviour of masked arrays.
On numpy 1.26 and 2.0 the output of
np.array_equal(x, y)
isTrue
and the output ofnp.array_equal(x, w)
isFalse
. But since the input arrays are fully masked, one could argue that the output of bothnp.array_equal(x, y)
andnp.array_equal(x, w)
should be the same (but I would not be sure whether the result should then beTrue
orFalse
).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. But I guess
asanyarray
is just as well, anyway, it just shouldn't make a difference.