NEP: Adjust NEP-35 to make it more user-accessible by pentschev · Pull Request #17093 · numpy/numpy · GitHub

NEP: Adjust NEP-35 to make it more user-accessible #17093

Merged 21 commits on Sep 7, 2020
Commits
7d7b46c  NEP: Adjust NEP-35 to make it more user-accessible (pentschev, Aug 14, 2020)
9b660e4  NEP: Simplify NEP-35 further with reviewer's suggestions (pentschev, Aug 17, 2020)
68fd054  Update doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst (pentschev, Aug 19, 2020)
61dcb63  Update doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst (pentschev, Aug 19, 2020)
3cf7b6b  Update doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst (pentschev, Aug 19, 2020)
52d9c74  Update doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst (pentschev, Aug 19, 2020)
615f19f  Update doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst (pentschev, Aug 19, 2020)
69e3e71  NEP: Improve NEP-35 abstract per @mattip's suggestion (pentschev, Aug 19, 2020)
cde3543  Update doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst (pentschev, Aug 19, 2020)
1017007  Update doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst (pentschev, Aug 19, 2020)
a82cc4b  NEP: Move NumPy users comment to top of NEP-35 Usage and Impact (pentschev, Aug 19, 2020)
17620c2  Update doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst (pentschev, Aug 19, 2020)
f1d1562  Update doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst (pentschev, Aug 19, 2020)
57d6bab  Update doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst (pentschev, Aug 19, 2020)
67c9733  Update doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst (pentschev, Aug 19, 2020)
974c023  Update doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst (pentschev, Aug 19, 2020)
b5f5577  NEP: Clarify NEP-35 C implementation details. (pentschev, Aug 19, 2020)
2e30534  NEP: Clarify Dask intent with `my_dask_pad` function name (pentschev, Aug 19, 2020)
3d527ea  NEP: Improve grammar on NEP-35 reference to Dask's objects (pentschev, Aug 19, 2020)
b6f2c16  NEP: Fix some grammar and formatting in NEP-35 (pentschev, Aug 19, 2020)
57f78df  ENH: Clarifies meta_from_array function in NEP-35 (pentschev, Aug 19, 2020)
NEP: Simplify NEP-35 further with reviewer's suggestions
pentschev committed Aug 17, 2020
commit 9b660e445bb19331b8d4308223b9da418166ef80
56 changes: 39 additions & 17 deletions doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst
@@ -27,34 +27,49 @@ Motivation and Scope
 Many are the libraries implementing the NumPy API, such as Dask for graph
 computing, CuPy for GPGPU computing, xarray for N-D labeled arrays, etc. All
 the libraries mentioned have yet another thing in common: they have also adopted
-the ``__array_function__`` protocol. The protocol defines a mechanism allowing a
-user to directly use the NumPy API as a dispatcher based on the input array
-type. In essence, dispatching means users are able to pass a downstream array,
-such as a Dask array, directly to one of NumPy's compute functions, and NumPy
-will be able to automatically recognize that and send the work back to Dask's
-implementation of that function, which will define the return value. For
-example:
+the ``__array_function__`` protocol; a protocol that allows NumPy to understand
+and treat downstream objects as if they are the native ``numpy.ndarray`` object.
+Hence the community while using various libraries still benefits from a unified
+NumPy API. This not only brings great convenience for standardization but also
+removes the burden of learning a new API and rewriting code for every new
+object. In more technical terms, this mechanism of the protocol is called a
+"dispatcher", which is the terminology we use from here onwards when referring
+to that.

+
 .. code:: python

     x = dask.array.arange(5) # Creates dask.array
-    np.sum(a) # Returns dask.array
+    np.diff(x) # Returns dask.array

 Note above how we called Dask's implementation of ``sum`` via the NumPy
-namespace by calling ``np.sum``, and the same would apply if we had a CuPy
+namespace by calling ``np.diff``, and the same would apply if we had a CuPy
 array or any other array from a library that adopts ``__array_function__``.
 This allows writing code that is agnostic to the implementation library, thus
 users can write their code once and still be able to use different array
 implementations according to their needs.

-Unfortunately, ``__array_function__`` has limitations, one of them being array
-creation functions. In the example above, NumPy was able to call Dask's
-implementation because the input array was a Dask array. The same is not true
-for array creation functions, in the example the input of ``arange`` is simply
-the integer ``5``, not providing any information of the array type that should
-be the result, that's where a reference array passed by the ``like=`` argument
-proposed here can be of help, as it provides NumPy with the information
-required to create the expected type of array.
+Obviously, having a protocol in-place is useful if the arrays are created
+elsewhere and let NumPy handle them. But still these arrays have to be started
+in their native library and brought back. Instead if it was possible to create
+these objects through NumPy API then there would be an almost complete
+experience, all using NumPy syntax. For example, say we have some CuPy array
Comment on lines +50 to +54
Member

Suggested change
-Obviously, having a protocol in-place is useful if the arrays are created
-elsewhere and let NumPy handle them. But still these arrays have to be started
-in their native library and brought back. Instead if it was possible to create
-these objects through NumPy API then there would be an almost complete
-experience, all using NumPy syntax. For example, say we have some CuPy array
+The mechanism as described above covers cases where the input is already an array.
+These arrays have to be created. NumPy provides `array creation routines` like
+`np.ones` but how to use these to create a CuPy or Dask array? For example,
+say we have some CuPy array

Contributor

This was one of the reasons why I got confused in the beginning. "Why should I even be able to create other arrays if I am already using those libs? I can create via CuPy or Dask whatever I need".

That detail is being lost with this compact narrative.

Contributor Author

This was suggested in #17093 (comment). Given that the intent was to make the NEP clearer to users, I agree with @ilayn that the detail is getting lost.

+``cp_arr`` , and want a similar CuPy array with identity matrix. We could still
+write the following:
+
+.. code:: python
+    x = cupy.identity(3)
+
+Instead, the better way would be using to only use the NumPy API, this could now
+be achieved with:
+
+.. code:: python
+    x = np.identity(3, like=cp_arr)
+
+As if by magic, ``x`` will also be a CuPy array, as NumPy was capable to infer
+that from the type of ``cp_arr``. Note that this last step would not be possible
+without ``like=``, as it would be impossible for the NumPy to know the user
+expects a CuPy array based only on the integer input.

 The new ``like=`` keyword proposed is solely intended to identify the downstream
 library where to dispatch and the object is used only as reference, meaning that
@@ -150,6 +165,13 @@ impossible to ensure ``my_pad`` creates a padding array with a type matching
 that of the input array, which would cause cause a ``TypeError`` exception to
 be raised by CuPy, as discussed above would happen to the CuPy case alone.

+Current NumPy users who don't use other arrays from downstream libraries should
+have no impact in their current usage of the NumPy API. In the event of the
+user passing a NumPy array to ``like=``, that will continue to work as if no
+array was passed via that argument. However, this is advised against, as
+internally there will be additional checks required that will have an impact in
+performance.
+
 Backward Compatibility
 ----------------------

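For readers following the diff, the two dispatch paths that the revised NEP text describes can be sketched with a toy duck array. This is an illustration only and not code from the PR: `TaggedArray` is a hypothetical class standing in for a Dask or CuPy array, and the snippet assumes a NumPy release that ships the `like=` keyword (1.20 or later).

```python
import numpy as np


class TaggedArray:
    """Hypothetical duck array; stands in for a Dask or CuPy array."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # Decline when an unrelated array type takes part in the call.
        if not all(issubclass(t, TaggedArray) for t in types):
            return NotImplemented

        def unwrap(value):
            return value.data if isinstance(value, TaggedArray) else value

        # Run NumPy's own implementation on the wrapped data, then re-wrap.
        result = func(*(unwrap(a) for a in args),
                      **{k: unwrap(v) for k, v in kwargs.items()})
        return TaggedArray(result) if isinstance(result, np.ndarray) else result


x = TaggedArray([0, 1, 2, 3, 4])

d = np.diff(x)                          # dispatch on the input array's type
i = np.identity(3, like=x)              # dispatch on the reference's type
n = np.identity(3)                      # no reference: plain numpy.ndarray
p = np.identity(3, like=np.empty(0))    # NumPy reference: also numpy.ndarray

print(type(d).__name__, type(i).__name__, type(n).__name__, type(p).__name__)
```

The last call mirrors the note in the second hunk: a NumPy array passed as the reference falls through to NumPy's own implementation, so the result is the same as when no reference is given.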