-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
ENH: __array_function__
support for np.lib
, part 2/2
#12119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
xref GH12028 np.lib.npyio through np.lib.ufunclike
There's some weird interaction between
|
The problem seems to be that my dummy I see three possible ways to fix this:
|
I decided to go for a mix of 2 and 3:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We perhaps should get rid of the hack sometime, but other than that I'm okay with this.
return (x, out) | ||
|
||
|
||
@array_function_dispatch(_fix_dispatcher, verify=False) | ||
@_deprecate_out_named_y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be added to the dispatcher?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is a better idea. That way the warning will be raised even if the function is dispatching.
Note: This one still needs more work before it's ready to merge (to fix the deprecation warnings). |
OK, inspired by @mhvk's and @hameerabbasi's comments I consolidated a bunch of unnecessarily repeated dispatcher functions here, too. |
This is ready for a final review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor bug I spotted as I was going through this.
@@ -309,6 +323,12 @@ def log10(x): | |||
x = _fix_real_lt_zero(x) | |||
return nx.log10(x) | |||
|
|||
|
|||
def _logn_dispatcher(n, x): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should return both arguments -- Can have logs of different bases as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes -- I think I missed this because I only read the docstring, which says n
is an int.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good but some suggestions for using decorators for a group.
Also, more important, the last comment about the change in testing utils.py
- that was really tricky to get right; I would suggest not to change it unless truly needed.
@@ -712,7 +759,12 @@ def array_split(ary, indices_or_sections, axis=0): | |||
return sub_arys | |||
|
|||
|
|||
def split(ary,indices_or_sections,axis=0): | |||
def _split_dispatcher(ary, indices_or_sections, axis=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you could use a single _split_dispatcher
with the function above as well as below (where the function currently gets redefined, but then it does get reused a few times).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch -- fixed
numpy/lib/twodim_base.py
Outdated
@@ -373,6 +386,11 @@ def tri(N, M=None, k=0, dtype=float): | |||
return m | |||
|
|||
|
|||
def _tril_dispatcher(m, k=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call it triul_dispatcher
, since it is used for both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
numpy/lib/twodim_base.py
Outdated
@@ -812,6 +842,11 @@ def tril_indices(n, k=0, m=None): | |||
return nonzero(tri(n, m, k=k, dtype=bool)) | |||
|
|||
|
|||
def _tril_indices_form_dispatcher(arr, k=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, either _tri
or _triul
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
numpy/lib/type_check.py
Outdated
@@ -183,6 +194,11 @@ def imag(val): | |||
return asanyarray(val).imag | |||
|
|||
|
|||
def _iscomplex_dispatcher(x): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Next few could be common _is_type_dispatcher
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
numpy/testing/_private/utils.py
Outdated
@@ -713,7 +713,7 @@ def func_assert_same_pos(x, y, func=isnan, hasval='nan'): | |||
# such subclasses, but some used to work. | |||
x_id = func(x) | |||
y_id = func(y) | |||
if npall(x_id == y_id) != True: | |||
if (x_id == y_id).all() != True: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the comment above, I think this should not be changed... Alternatively, at least the comment should be adjusted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. The problem is that now np.all()
may likely not be implemented for an ndarray subclass, if it doesn't define np.all
in __array_function__
.
I think we can just delete this comment, because it's no longer true that np.all()
can work when a subclass implements.all()
differently -- we already use getattr()
to pull out a all()
method to call if one exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see, thanks. I think the comment was a little misleading here -- I think the problem was classes that define equality differently (e.g., to return a boolean) rather than classes that don't define an .all()
method.
numpy/testing/_private/utils.py
Outdated
if npall(x_id == y_id) != True: | ||
result = x_id == y_id | ||
all_equal = (result.all() | ||
if isinstance(result, ndarray) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this pass through subclasses? And if so, will that be an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with the worry of @hameerabbasi - also, at least the comment reads a bit weird now: I don't think we have to worry about classed that define __array_function__
but do not override np.all
- they're new and they can override easily as part of their trials; maybe just leave it as it was?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this passes through subclasses -- which is the point.
I don't think we have to worry about classed that define array_function but do not override np.all - they're new and they can override easily as part of their trials; maybe just leave it as it was?
I don't exactly disagree with this, but this did come up with the ndarray subclass I made for unit testing purposes. With the arrival of __array_function__
, it will be easier to depend on built-in methods like all()
rather than np.all()
. I guess we offer no guarantees for subclasses that don't implement the full numpy API, but it still feels like not a great user experience.
Two other ways to do this:
- Cast to a NumPy boolean scalar/array, which guarantees that we have an
all()
method:np.bool_(x_id == y_id).all()
. This looks less hacky and would still work for all the identified use-cases. - Give up on duck-typing for
assert_array_equal
, and instead decorate it witharray_function_dispatch
for explicit dispatching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, if this won't cause any problems, then I agree with this design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Option 1 seems to cover all bases, and is shorter than what you have now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Testing out masked arrays, they no longer seem to return masked arrays from .all()
or np.all()
:
In [16]: x = np.ma.array([1, 2, 3], mask=[False, True, False])
In [17]: x == x
Out[17]:
masked_array(data=[True, --, True],
mask=[False, True, False],
fill_value=True)
In [18]: (x == x).all()
Out[18]: True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should cause a test failure, and if there is no test like that we should add one. Also note there are test failures with datetime64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should cause a test failure, and if there is no test like that we should add one. Also note there are test failures with datetime64
I'm not sure. What would you do with a mask at all on a reduction? I mean, it makes sense to only reduce over the non-masked objects, but what is the output mask? For zero nonmasked elements do you set it to the identity of the ufunc or to False
? Or do you just mask everything where any element is masked? Likely not the right answer.
Wow, that should cover it! Thanks, @shoyer! |
OK, I plan to merge this shortly unless I get more feedback. |
__array_function__
support for np.lib
, part 2/2
…#29317) TestArrayEqual.test_masked_scalar now passes. This case regressed since 7315145 (merged in numpy#12119) due to: - `<masked scalar> == <scalar>` returning np.ma.masked (not a 0-dim masked bool array), followed by - `np.bool(np.ma.masked)` unintentionally converting it to np._False There are a few ways to resolve this; I went with testing the comparison result with `isinstance(bool)` to check if a conversion to array is necessary, which is the same approach already taken in assert_array_compare after evaluating `comparison(x, y)`.
…#29317) TestArrayEqual.test_masked_scalar now passes. This case regressed since 7315145 (merged in numpy#12119) due to: - `<masked scalar> == <scalar>` returning np.ma.masked (not a 0-dim masked bool array), followed by - `np.bool(np.ma.masked)` unintentionally converting it to np._False Note on the modified comment: Confusingly, "isinstance(..., bool) checks" in the previous wording actually incorrectly referred to the ones towards the end of the function, which are not actually related to __eq__'s behavior but to the possibility of `func` returning a bool.
…#29317) TestArrayEqual.test_masked_scalar now passes. This case regressed since 7315145 (merged in numpy#12119) due to: - `<masked scalar> == <scalar>` returning np.ma.masked (not a 0-dim masked bool array), followed by - `np.bool(np.ma.masked)` unintentionally converting it to np._False Note on the modified comment: Confusingly, "isinstance(..., bool) checks" in the previous wording actually incorrectly referred to the ones towards the end of the function, which are not actually related to __eq__'s behavior but to the possibility of `func` returning a bool.
… (#29318) * TST: Add failing test TestArrayEqual.test_masked_scalar The two added test cases fail with: E AssertionError: E Arrays are not equal E E nan location mismatch: E ACTUAL: MaskedArray(3.) E DESIRED: array(3.) and E AssertionError: E Arrays are not equal E E nan location mismatch: E ACTUAL: MaskedArray(3.) E DESIRED: array(nan) * BUG: Fix np.testing utils failing for masked scalar vs. scalar (#29317) TestArrayEqual.test_masked_scalar now passes. This case regressed since 7315145 (merged in #12119) due to: - `<masked scalar> == <scalar>` returning np.ma.masked (not a 0-dim masked bool array), followed by - `np.bool(np.ma.masked)` unintentionally converting it to np._False Note on the modified comment: Confusingly, "isinstance(..., bool) checks" in the previous wording actually incorrectly referred to the ones towards the end of the function, which are not actually related to __eq__'s behavior but to the possibility of `func` returning a bool. * MNT: Improve comments on assert_array_compare nan/inf handling logic - Use same language as elsewhere below to explain `!= True` used to handle np.ma.masked - Clarify committed to support standard MaskedArrays - Restore note lost in 7315145 comment changes about how the np.bool casts towards the end of the function handle np.ma.masked, and expand further. * TST: Expand TestArrayEqual.test_masked_scalar
xref #12028
np.lib.npyio through np.lib.ufunclike