Adding description of __array_finalize__ and __array_wrap__ · numpy/numpy@c55b507 · GitHub
Commit c55b507

Adding description of __array_finalize__ and __array_wrap__
1 parent 7ce32d6 commit c55b507

2 files changed

+78
-28
lines changed


doc/source/user/basics.interoperability.rst

Lines changed: 77 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -5,29 +5,41 @@ Interoperability with NumPy
55

66
NumPy's ndarray objects provide both a high-level API for operations on
77
array-structured data and a concrete implementation of the API based on
8-
:ref:`strided in-RAM storage <arrays>`.
9-
While this API is powerful and fairly general, its concrete implementation has
10-
limitations. As datasets grow and NumPy becomes used in a variety of new
11-
environments and architectures, there are cases where the strided in-RAM storage
12-
strategy is inappropriate, which has caused different libraries to reimplement
13-
this API for their own uses. This includes GPU arrays (CuPy_), Sparse arrays
14-
(`scipy.sparse`, `PyData/Sparse <Sparse_>`_) and parallel arrays (Dask_ arrays)
15-
as well as various NumPy-like implementations in deep learning frameworks, like
16-
TensorFlow_ and PyTorch_. Similarly, there are many projects that build on top
17-
of the NumPy API for labeled and indexed arrays (XArray_), automatic
18-
differentiation (JAX_), masked arrays (`numpy.ma`), physical units
19-
(astropy.units_, pint_, unyt_), among others that add additional functionality
20-
on top of the NumPy API.
8+
:ref:`strided in-RAM storage <arrays>`. While this API is powerful and fairly
9+
general, its concrete implementation has limitations. As datasets grow and NumPy
10+
becomes used in a variety of new environments and architectures, there are cases
11+
where the strided in-RAM storage strategy is inappropriate, which has caused
12+
different libraries to reimplement this API for their own uses. This includes
13+
GPU arrays (CuPy_), Sparse arrays (`scipy.sparse`, `PyData/Sparse <Sparse_>`_)
14+
and parallel arrays (Dask_ arrays) as well as various NumPy-like implementations
15+
in deep learning frameworks, like TensorFlow_ and PyTorch_. Similarly, there are
16+
many projects that build on top of the NumPy API for labeled and indexed arrays
17+
(XArray_), automatic differentiation (JAX_), masked arrays (`numpy.ma`),
18+
physical units (astropy.units_, pint_, unyt_), among others that add additional
19+
functionality on top of the NumPy API.
2120

2221
Yet, users still want to work with these arrays using the familiar NumPy API and
2322
re-use existing code with minimal (ideally zero) porting overhead. With this
2423
goal in mind, various protocols are defined for implementations of
25-
multi-dimensional arrays with high-level APIs matching NumPy.
24+
multi-dimensional arrays with high-level APIs matching NumPy.
2625

27-
Using arbitrary objects in NumPy
28-
--------------------------------
26+
Broadly speaking, there are three groups of features used for interoperability
27+
with NumPy:
2928

30-
When NumPy functions encounter a foreign object, they will try (in order):
29+
1. Methods of turning a foreign object into an ndarray;
30+
2. Methods of deferring execution from a NumPy function to another array
31+
library;
32+
3. Methods that use NumPy functions and return an instance of a foreign object.
33+
34+
We describe these features below.
35+
36+
37+
1. Using arbitrary objects in NumPy
38+
-----------------------------------
39+
40+
The first set of interoperability features from the NumPy API allows foreign
41+
objects to be treated as NumPy arrays whenever possible. When NumPy functions
42+
encounter a foreign object, they will try (in order):
3143

3244
1. The buffer protocol, described :py:doc:`in the Python C-API documentation
3345
<c-api/buffer>`.
@@ -106,18 +118,22 @@ as the original object and any attributes/behavior it may have had, is lost.
106118
To see an example of a custom array implementation including the use of
107119
``__array__()``, see :ref:`basics.dispatch`.
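As a minimal sketch of the ``__array__()`` conversion path mentioned above, the hypothetical ``Diagonal`` class below (not part of NumPy or of this commit) stores only a size and a value and materializes an ndarray on request. The ``dtype`` and ``copy`` parameters are accepted on the assumption that recent NumPy versions may pass them; older versions call the method with fewer arguments.

```python
import numpy as np


class Diagonal:
    """Hypothetical container: an n-by-n diagonal matrix stored compactly."""

    def __init__(self, n, value):
        self.n = n
        self.value = value

    def __array__(self, dtype=None, copy=None):
        # Build a fresh ndarray each time; `copy` is accepted but unused
        # because a new array is always created here.
        arr = self.value * np.eye(self.n)
        if dtype is not None:
            arr = arr.astype(dtype)
        return arr


d = Diagonal(2, 3.0)
as_array = np.asarray(d)      # conversion goes through Diagonal.__array__
doubled = np.multiply(d, 2)   # the result is a plain ndarray, not a Diagonal
```

Note that ``doubled`` no longer carries any trace of ``Diagonal``, which is the loss of type information described above.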
108120

109-
Operating on foreign objects without converting
110-
-----------------------------------------------
121+
122+
2. Operating on foreign objects without converting
123+
--------------------------------------------------
124+
125+
A second set of methods defined by the NumPy API allows us to defer the
126+
execution from a NumPy function to another array library.
111127

112128
Consider the following function.
113129

114130
>>> import numpy as np
115131
>>> def f(x):
116132
... return np.mean(np.exp(x))
117133

118-
Note that `np.exp` is a :ref:`ufunc <ufuncs-basics>`, which means that it
119-
operates on ndarrays in an element-by-element fashion. On the other hand,
120-
`np.mean` operates along one of the array's axes.
134+
Note that `np.exp <numpy.exp>` is a :ref:`ufunc <ufuncs-basics>`, which means
135+
that it operates on ndarrays in an element-by-element fashion. On the other
136+
hand, `np.mean <numpy.mean>` operates along one of the array's axes.
121137

122138
We can apply ``f`` to a NumPy ndarray object directly:
123139

@@ -126,8 +142,7 @@ We can apply ``f`` to a NumPy ndarray object directly:
126142
21.1977562209304
127143

128144
We would like this function to work equally well with any NumPy-like array
129-
object. Some of this is possible today with various protocol mechanisms within
130-
NumPy.
145+
object.
131146

132147
NumPy allows a class to indicate that it would like to handle computations in a
133148
custom-defined way through the following interfaces:
@@ -139,15 +154,15 @@ custom-defined way through the following interfaces:
139154

140155
As long as foreign objects implement the ``__array_ufunc__`` or
141156
``__array_function__`` protocols, it is possible to operate on them without the
142-
need for explicit conversion.
157+
need for explicit conversion.
143158

144159
The ``__array_ufunc__`` protocol
145160
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
146161

147162
A :ref:`universal function (or ufunc for short) <ufuncs-basics>` is a
148163
“vectorized” wrapper for a function that takes a fixed number of specific inputs
149164
and produces a fixed number of specific outputs. The output of the ufunc (and
150-
its methods) is not necessarily an ndarray, if not all input arguments are
165+
its methods) is not necessarily a ndarray, if not all input arguments are
151166
ndarrays. Indeed, if any input defines an ``__array_ufunc__`` method, control
152167
will be passed completely to that function, i.e., the ufunc is overridden. The
153168
``__array_ufunc__`` method defined on that (non-ndarray) object has access to
@@ -173,6 +188,36 @@ The semantics of ``__array_function__`` are very similar to ``__array_ufunc__``,
173188
except the operation is specified by an arbitrary callable object rather than a
174189
ufunc instance and method. For more details, see :ref:`NEP18`.
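The two protocols can be illustrated together with a small sketch. The ``Wrapped`` class below is a hypothetical container invented for this example; it handles only the plain ``__call__`` form of ufuncs and only ``np.mean`` among non-ufunc functions, and it ignores keyword arguments such as ``out`` that could themselves hold ``Wrapped`` instances. A real array library would need to cover far more cases.

```python
import numpy as np


class Wrapped:
    """Hypothetical array-like that keeps its data in a private ndarray."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Only direct calls such as np.exp(x) are supported in this sketch.
        if method != "__call__":
            return NotImplemented
        unwrapped = [x.data if isinstance(x, Wrapped) else x for x in inputs]
        return Wrapped(ufunc(*unwrapped, **kwargs))

    def __array_function__(self, func, types, args, kwargs):
        # Only np.mean is handled; everything else is declined.
        if func is not np.mean:
            return NotImplemented
        unwrapped = [a.data if isinstance(a, Wrapped) else a for a in args]
        return func(*unwrapped, **kwargs)


def f(x):
    return np.mean(np.exp(x))


exp_result = np.exp(Wrapped([0.0, 1.0]))   # dispatched to __array_ufunc__
mean_result = f(Wrapped([0.0, 1.0]))       # np.mean goes to __array_function__
```

Here ``np.exp`` returns another ``Wrapped`` instance, while ``np.mean`` unwraps the data and returns an ordinary NumPy scalar; both are design choices of this sketch rather than requirements of the protocols.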
175190

191+
192+
3. Returning foreign objects
193+
----------------------------
194+
195+
A third type of feature set is meant to use the NumPy function implementation
196+
and then convert the return value back into an instance of the foreign object.
197+
The ``__array_finalize__`` and ``__array_wrap__`` methods act behind the scenes
198+
to ensure that the return type of a NumPy function can be specified as needed.
199+
200+
The ``__array_finalize__`` method is the mechanism that NumPy provides to allow
201+
subclasses to handle the various ways that new instances get created. This
202+
method is called whenever the system internally allocates a new array from an
203+
object which is a subclass (subtype) of the ndarray. It can be used to change
204+
attributes after construction, or to update meta-information from the “parent.”
205+
206+
The ``__array_wrap__`` method “wraps up the action” in the sense of allowing a
207+
subclass to set the type of the return value and update attributes and metadata.
208+
This can be seen as the opposite of the ``__array__`` method. At the end of
209+
every ufunc, this method is called on the input object with the
210+
highest *array priority*, or the output object if one was specified. The
211+
``__array_priority__`` attribute is used to determine what type of object to
212+
return in situations where there is more than one possibility for the Python
213+
type of the returned object. Subclasses may opt to use this method to transform
214+
the output array into an instance of the subclass and update metadata before
215+
returning the array to the user.
216+
217+
For more information on these methods, see :ref:`basics.subclassing` and
218+
:ref:`specific-array-subtyping`.
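As a rough illustration of ``__array_finalize__`` only, the sketch below defines a hypothetical ndarray subclass, ``InfoArray``, that carries an extra ``info`` attribute. The class name and attribute are assumptions made for this example; the pattern follows the one described in :ref:`basics.subclassing`.

```python
import numpy as np


class InfoArray(np.ndarray):
    """Hypothetical ndarray subclass with an extra `info` attribute."""

    def __new__(cls, input_array, info=None):
        # View casting triggers __array_finalize__, then we set info explicitly.
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # obj is None only for explicit construction via ndarray.__new__.
        if obj is None:
            return
        # For views and slices, copy the metadata from the "parent" if present.
        self.info = getattr(obj, "info", None)


a = InfoArray([1.0, 2.0, 3.0], info="metres")
b = a[1:]                              # new-from-template: info is inherited
c = np.arange(3).view(InfoArray)       # view casting: parent has no info
```

Slicing ``a`` creates a new ``InfoArray`` from a template, so ``b.info`` is taken from ``a``; the view-cast array ``c`` has no parent metadata and falls back to ``None``.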
219+
220+
176221
Interoperability examples
177222
-------------------------
178223

@@ -218,6 +263,7 @@ We can even do operations with other ndarrays:
218263
>>> type(result)
219264
numpy.ndarray
220265

266+
221267
Example: PyTorch tensors
222268
~~~~~~~~~~~~~~~~~~~~~~~~
223269

@@ -343,8 +389,11 @@ Further reading
343389
- :ref:`basics.dispatch`
344390
- :ref:`special-attributes-and-methods` (details on the ``__array_ufunc__`` and
345391
``__array_function__`` protocols)
346-
- `NumPy roadmap: interoperability
347-
<https://numpy.org/neps/roadmap.html#interoperability>`__
392+
- :ref:`basics.subclassing` (details on the ``__array_wrap__`` and
393+
``__array_finalize__`` methods)
394+
- :ref:`specific-array-subtyping` (more details on the implementation of
395+
``__array_finalize__``, ``__array_wrap__`` and ``__array_priority__``)
396+
- :doc:`NumPy roadmap: interoperability <neps:roadmap>`
348397
- `PyTorch documentation on the Bridge with NumPy
349398
<https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#bridge-to-np-label>`__
350399

doc/source/user/c-info.beyond-basics.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -450,6 +450,7 @@ type(s). In particular, to create a sub-type in C follow these steps:
450450
More information on creating sub-types in C can be learned by reading
451451
PEP 253 (available at https://www.python.org/dev/peps/pep-0253).
452452
453+
.. _specific-array-subtyping:
453454
454455
Specific features of ndarray sub-typing
455456
---------------------------------------
