-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
API: Add device
and to_device
to numpy.ndarray
[Array API]
#25233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
device
and to_device
to numpy.ndarray
device
and to_device
to numpy.ndarray
[Array API]
4a36606
to
5037b48
Compare
5037b48
to
ebe3503
Compare
4e99405
to
89184df
Compare
Hi @seberg, |
I think there's enough code out there relying on the identity comparison fast path in There's a real test failure though:
|
Right, I need to adjust the regex to conform to pypy's ndarray name. |
Now it's ready. |
Just for the record - it would be nice to see a little justification here or in the eventual NEP for why we need this, despite the only allowed options being I would also add to all the docstrings for
|
94377a8
to
91b8d6f
Compare
device
and to_device
to numpy.ndarray
[Array API]device
and to_device
to numpy.ndarray
[Array API]
Hmm, looks like those note admonitions don't show up in the docstrings: https://output.circle-artifacts.com/output/job/ce6f8552-de08-47c3-80bb-91eb2ad750a5/artifacts/0/doc/build/html/reference/generated/numpy.empty.html#numpy.empty |
91b8d6f
to
dc7848e
Compare
Ah, I missed one |
be8b328
to
5498f7f
Compare
In accordance with the array API. See numpy#25233.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments mostly from the point of view that we should try to minimize any impact from an argument that for numpy is not useful. So I'd put device
after like
, and would try to minimize extra lines added and impact on code by passing it on as much as possible, so that there is only one place where the error gets raised (in the C code parsing device
).
@@ -912,7 +912,7 @@ | |||
|
|||
add_newdoc('numpy._core.multiarray', 'asarray', | |||
""" | |||
asarray(a, dtype=None, order=None, *, like=None) | |||
asarray(a, dtype=None, order=None, *, device=None, like=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and below, why is device
ahead of like
? Since one cannot use it for anything useful, my tendency would be to put it at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why not just set the default to "cpu"
? That seems clearer. (Still fine to accept None
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now the function dispatch requires like
to be the last argument - here's a check that raises RuntimeError
if a function doesn't comply with it:
numpy/numpy/_core/overrides.py
Lines 149 to 156 in 448e4da
last_arg = co.co_argcount + co.co_kwonlyargcount - 1 | |
last_arg = co.co_varnames[last_arg] | |
if last_arg != "like" or co.co_kwonlyargcount == 0: | |
raise RuntimeError( | |
"__array_function__ expects `like=` to be the last " | |
"argument and a keyword-only argument. " | |
f"{implementation} does not seem to comply.") | |
Can I keep device
before like
or should we update the __array_function__
protocol?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why not just set the default to
"cpu"
? That seems clearer. (Still fine to acceptNone
).
That's true that "cpu"
is the only supported value, but I would prefer to stick to the Array API standard here. It defines the default value for device
as None
: https://data-apis.org/array-api/latest/API_specification/generated/array_api.asarray.html
We're adding these keyword arguments for Array API compatibility only, so I would prefer to fully conform with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it matters from the Array API perspective, it doesn't make NumPy non-compliant, it just makes it explicit about what that default is.
But, I'll give a very slight vote to just keep it None
anyway. Just becuase I don't think the more explicit device="cpu"
makes it clearer, since users should ignore it anyway (you only use it via xp = arr.__array_namespace__()
as an Array API user, never directly as a NumPy user).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order doesn't matter much in principle, since it's keyword-only and like
is only for other non-numpy libraries as well. However, there's a bigger problem here:
>>> import numpy as np
>>> import dask.array as da
>>> d = da.ones((1, 2))
>>> np.asarray([1, 2], device='cpu')
array([1, 2])
>>> np.asarray([1, 2], device='cpu', like=d)
Traceback (most recent call last):
Cell In[24], line 1
np.asarray([1, 2], device='cpu', like=d)
File ~/code/tmp/dask/dask/array/core.py:1760 in __array_function__
return da_func(*args, **kwargs)
File ~/code/tmp/dask/dask/array/core.py:4581 in asarray
return from_array(a, getitem=getter_inline, **kwargs)
TypeError: from_array() got an unexpected keyword argument 'device'
The device
keyword probably should not be added to the dispatchers for any of these creation functions. It looks like the __array_function__
protocol wasn't implemented in a forward-looking way - I haven't looked much deeper yet, but at first sight it may be the case that we can never add a new keyword to dispatched functions anymore.
In this case, device
is here to support writing cross-library-compatible code via the array API standard. The dispatching protocols kinda were earlier attempts to do something similar. So probably they should not interact anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah wait, it's maybe not that extreme - np.asarray([1, 2], like=d)
still works, so new keywords are fine, as long as they are not used (then they don't get forwarded). Of course that's a bit funky, but there's no way to avoid it really.
>>> np.asarray([1, 2], device='cpu')
array([1, 2])
>>> np.asarray([1, 2], like=d)
dask.array<array, shape=(2,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> np.asarray([1, 2], like=d, device='cpu')
...
TypeError: from_array() got an unexpected keyword argument 'device'
This would have been much more problematic if libraries like SciPy and scikit-learn supported non-numpy array inputs via __array_ufunc/function__
. But since they squash such inputs anyway with np.asarray
, there shouldn't be too much of an issue in practice.
And it's fairly easy to add support for new keywords in Dask & co. They just need a new 2.0-compatible release that adds such support, and then device
/like
can be used together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rgommers - I've gotten quite used to updating astropy's __array_function__
wrappers to remain compatible with numpy functions -- and we've got tests to check that our signatures remain consistent. Of course it doesn't happen that often. So, I don't think that's a worry, and overall I'd tend to not special-case any arguments (i.e., pass it on if present).
On the order: if like
has to be last, then just keep it like that, sorry for not noticing myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify: the check is only there to test habits, and probably mainly to make sure it is kwarg only.
It may have to be last in C (not sure), but it doesn't really matter anywhere here. But... it also doesn't matter a lot (especially since if someone cares, they can change it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mhvk, that is useful context. A test that flag mismatches sounds like a great idea. 🤞🏼 most implementers have that.
Co-authored-by: Aaron Meurer <asmeurer@gmail.com>
Merging this one as well since all comments have been addressed. Thanks @mtsokol! |
Hi @rgommers @seberg,
This PR introduces
device
andto_device
members tonumpy.ndarray
.Additionally,
device
keyword-only arguments were added to:numpy.asarray
,numpy.arange
,numpy.empty
,numpy.empty_like
,numpy.eye
,numpy.full
,numpy.full_like
,numpy.linspace
,numpy.ones
,numpy.ones_like
,numpy.zeros
, andnumpy.zeros_like
.