3
3
Interoperability with NumPy
4
4
***************************
5
5
6
- NumPy’s ndarray objects provide both a high-level API for operations on
6
+ NumPy's ndarray objects provide both a high-level API for operations on
7
7
array-structured data and a concrete implementation of the API based on
8
- `strided in-RAM storage <https://numpy.org/doc/stable/reference/arrays.html>`__.
8
+ :ref:`strided in-RAM storage <arrays>`.
9
9
While this API is powerful and fairly general, its concrete implementation has
10
10
limitations. As datasets grow and NumPy becomes used in a variety of new
11
11
environments and architectures, there are cases where the strided in-RAM storage
@@ -29,44 +29,39 @@ Using arbitrary objects in NumPy
29
29
30
30
When NumPy functions encounter a foreign object, they will try (in order):
31
31
32
- 1. The buffer protocol, described `in the Python C-API documentation
33
- <https://docs.python.org/3/c-api/buffer.html>`__.
32
+ 1. The buffer protocol, described :py:doc:`in the Python C-API documentation
33
+ <c-api/buffer>`.
34
34
2. The ``__array_interface__`` protocol, described
35
- :ref:`in this page <arrays.interface>`. A precursor to Python’s buffer
35
+ :ref:`in this page <arrays.interface>`. A precursor to Python's buffer
36
36
protocol, it defines a way to access the contents of a NumPy array from other
37
37
C extensions.
38
- 3. The ``__array__`` protocol, which asks an arbitrary object to convert itself
39
- into an array.
38
+ 3. The ``__array__()`` method, which asks an arbitrary object to convert
39
+ itself into an array.
40
40
41
41
For both the buffer and the ``__array_interface__`` protocols, the object
42
42
describes its memory layout and NumPy does everything else (zero-copy if
43
- possible). If that’s not possible, the object itself is responsible for
43
+ possible). If that's not possible, the object itself is responsible for
44
44
returning a ``ndarray`` from ``__array__()``.
45
45
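As a rough illustration of the zero-copy path described above, the sketch below hands NumPy a standard-library ``array.array``, which exports its memory through the buffer protocol. The variable names are illustrative only, and the example assumes a writable buffer; it is not taken from the documentation being changed.

```python
import array

import numpy as np

# A standard-library array exposes its memory through the buffer protocol.
src = array.array("d", [1.0, 2.0, 3.0])

# np.asarray wraps the exported buffer instead of copying it.
view = np.asarray(src)
view[0] = 10.0  # write through the NumPy view

print(view.dtype)  # float64
print(src[0])      # 10.0, the original object sees the change
```

Because the two objects share memory, the write made through ``view`` is visible in ``src``.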
46
- The array interface
47
- ~~~~~~~~~~~~~~~~~~~
46
+ The array interface protocol
47
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
48
48
49
- The :ref:`array interface <arrays.interface>` defines a protocol for array-like
50
- objects to re-use each other’s data buffers. Its implementation relies on the
51
- existence of the following attributes or methods:
49
+ The :ref:`array interface protocol <arrays.interface>` defines a way for
50
+ array-like objects to re-use each other's data buffers. Its implementation
51
+ relies on the existence of the following attributes or methods:
52
52
53
53
- ``__array_interface__``: a Python dictionary containing the shape, the
54
54
element type, and optionally, the data buffer address and the strides of an
55
55
array-like object;
56
56
- ``__array__()``: a method returning the NumPy ndarray view of an array-like
57
57
object;
58
- - ``__array_struct__``: a ``PyCapsule`` containing a pointer to a
59
- ``PyArrayInterface`` C-structure.
60
58
61
- The ``__array_interface__`` and ``__array_struct__`` attributes can be inspected
62
- directly:
59
+ The ``__array_interface__`` attribute can be inspected directly:
63
60
64
61
>>> import numpy as np
65
62
>>> x = np.array([1, 2, 5.0, 8])
66
63
>>> x.__array_interface__
67
64
{'data': (94708397920832, False), 'strides': None, 'descr': [('', '<f8')], 'typestr': '<f8', 'shape': (4,), 'version': 3}
68
- >>> x.__array_struct__
69
- <capsule object NULL at 0x7f798800be40>
70
65
71
66
The ``__array_interface__`` attribute can also be used to manipulate the object
72
67
data in place:
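The example that follows this sentence in the source file falls outside the hunk. As a minimal sketch of the same idea, assuming a hypothetical ``Wrapper`` class that simply re-exports another array's ``__array_interface__`` dictionary, a second array can be built on the same buffer:

```python
import numpy as np


class Wrapper:
    """Hypothetical container that re-exports another array's buffer."""

    def __init__(self, source):
        self._source = source  # keep the buffer owner alive
        self.__array_interface__ = source.__array_interface__


arr = np.array([1, 2, 3, 4])
new_arr = np.asarray(Wrapper(arr))  # built from the interface dictionary

new_arr[0] = 1000  # write through the second array
print(arr)                             # [1000    2    3    4]
print(np.shares_memory(arr, new_arr))  # True
```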
@@ -96,21 +91,20 @@ We can check that ``arr`` and ``new_arr`` share the same data buffer:
96
91
array([1000, 2, 3, 4])
97
92
98
93
99
- The ``__array__`` protocol
94
+ The ``__array__()`` method
100
95
~~~~~~~~~~~~~~~~~~~~~~~~~~
101
96
102
- The ``__array__`` protocol acts as a dispatch mechanism and ensures that any
103
- NumPy-like object (an array, any object exposing the array interface, an object
104
- whose ``__array__`` method returns an array or any nested sequence) that
105
- implements it can be used as a NumPy array. If possible, this will mean using
106
- ``__array__`` to create a NumPy ndarray view of the array-like object.
107
- Otherwise, this copies the data into a new ndarray object. This is not optimal,
108
- as coercing arrays into ndarrays may cause performance problems or create the
109
- need for copies and loss of metadata.
97
+ The ``__array__()`` method ensures that any NumPy-like object (an array, any
98
+ object exposing the array interface, an object whose ``__array__()`` method
99
+ returns an array or any nested sequence) that implements it can be used as a
100
+ NumPy array. If possible, this will mean using ``__array__()`` to create a NumPy
101
+ ndarray view of the array-like object. Otherwise, this copies the data into a
102
+ new ndarray object. This is not optimal, as coercing arrays into ndarrays may
103
+ cause performance problems or create the need for copies and loss of metadata,
104
+ as the original object and any attributes or behavior it may have had are lost.
110
105
111
- To see an example of a custom array implementation including the use of the
112
- ``__array__`` protocol, see `Writing custom array containers
113
- <https://numpy.org/devdocs/user/basics.dispatch.html>`__.
106
+ To see an example of a custom array implementation including the use of
107
+ ``__array__()``, see :ref:`basics.dispatch`.
114
108
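To make the metadata loss mentioned above concrete, here is a minimal sketch. ``Labeled`` and its ``unit`` attribute are hypothetical names invented for this illustration; the ``dtype`` and ``copy`` parameters are given defaults so the method can be called with or without them.

```python
import numpy as np


class Labeled:
    """Hypothetical container: a list of values plus a unit label."""

    def __init__(self, values, unit):
        self.values = list(values)
        self.unit = unit

    def __array__(self, dtype=None, copy=None):
        # Convert to a plain ndarray; the unit label is not carried over.
        return np.array(self.values, dtype=dtype)


obj = Labeled([1.0, 2.0, 3.0], unit="m")
result = np.asarray(obj)

print(type(result).__name__)    # ndarray
print(hasattr(result, "unit"))  # False
```

The conversion succeeds, but the returned ndarray no longer knows about ``unit``.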
115
109
Operating on foreign objects without converting
116
110
-----------------------------------------------
@@ -121,7 +115,11 @@ Consider the following function.
121
115
>>> def f(x):
122
116
...     return np.mean(np.exp(x))
123
117
124
- We can apply it to a NumPy ndarray object directly:
118
+ Note that `np.exp` is a :ref:`ufunc <ufuncs-basics>`, which means that it
119
+ operates on ndarrays in an element-by-element fashion. On the other hand,
120
+ `np.mean` operates along one of the array's axes.
121
+
122
+ We can apply ``f`` to a NumPy ndarray object directly:
125
123
126
124
>>> x = np.array([1, 2, 3, 4])
127
125
>>> f(x)
@@ -149,9 +147,13 @@ The ``__array_ufunc__`` protocol
149
147
A :ref:`universal function (or ufunc for short) <ufuncs-basics>` is a
150
148
“vectorized” wrapper for a function that takes a fixed number of specific inputs
151
149
and produces a fixed number of specific outputs. The output of the ufunc (and
152
- its methods) is not necessarily an ndarray, if all input arguments are not
150
+ its methods) is not necessarily an ndarray, if not all input arguments are
153
151
ndarrays. Indeed, if any input defines an ``__array_ufunc__`` method, control
154
- will be passed completely to that function, i.e., the ufunc is overridden.
152
+ will be passed completely to that function, i.e., the ufunc is overridden. The
153
+ ``__array_ufunc__`` method defined on that (non-ndarray) object has access to
154
+ the NumPy ufunc. Because ufuncs have a well-defined structure, the foreign
155
+ ``__array_ufunc__`` method may rely on ufunc attributes like ``.at()``,
156
+ ``.reduce()``, and others.
155
157
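The override described above can be sketched as follows. ``Tagged`` is a hypothetical wrapper written for this illustration; it unwraps its inputs, looks up the requested ufunc method by name, and records which one ran. For brevity it assumes no ``Tagged`` instance is passed through ``out=``.

```python
import numpy as np


class Tagged:
    """Hypothetical wrapper that records which ufunc was applied to it."""

    def __init__(self, data, history=()):
        self.data = np.asarray(data)
        self.history = tuple(history)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Unwrap Tagged inputs so the ufunc runs on plain ndarrays.
        raw = [x.data if isinstance(x, Tagged) else x for x in inputs]
        # `method` is a string such as "__call__" or "reduce".
        result = getattr(ufunc, method)(*raw, **kwargs)
        entry = f"{ufunc.__name__}.{method}"
        return Tagged(result, self.history + (entry,))


t = np.exp(Tagged([0.0, 1.0]))
print(type(t).__name__)  # Tagged
print(t.history)         # ('exp.__call__',)

r = np.add.reduce(Tagged([1, 2, 3]))
print(r.history)         # ('add.reduce',)
```

Both the plain call and ``np.add.reduce`` are routed through the same method, which is why the string ``method`` argument is needed to tell them apart.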
156
158
A subclass can override what happens when executing NumPy ufuncs on it by
157
159
overriding the default ``ndarray.__array_ufunc__`` method. This method is
@@ -169,9 +171,7 @@ is safe and consistent across projects.
169
171
170
172
The semantics of ``__array_function__`` are very similar to ``__array_ufunc__``,
171
173
except the operation is specified by an arbitrary callable object rather than a
172
- ufunc instance and method. For more details, see `NEP 18
173
- <https://numpy.org/neps/nep-0018-array-function-protocol.html>`__.
174
-
174
+ ufunc instance and method. For more details, see :ref:`NEP18`.
175
175
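A minimal sketch of ``__array_function__``, using a hypothetical ``Diagonal`` class invented for this illustration. It handles ``np.sum`` itself and declines every other function by returning ``NotImplemented``:

```python
import numpy as np


class Diagonal:
    """Hypothetical n-by-n diagonal matrix that stores a single value."""

    def __init__(self, n, value):
        self.n = n
        self.value = value

    def __array_function__(self, func, types, args, kwargs):
        # `func` is the public NumPy function that was called.
        if func is np.sum:
            return self.n * self.value
        return NotImplemented  # NumPy raises TypeError for anything else


d = Diagonal(4, 2.5)
print(np.sum(d))  # 10.0
```

Calling an unsupported function such as ``np.mean(d)`` raises ``TypeError``, because no argument offers an implementation.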
176
176
Interoperability examples
177
177
-------------------------
@@ -223,7 +223,7 @@ Example: PyTorch tensors
223
223
224
224
`PyTorch <https://pytorch.org/>`__ is an optimized tensor library for deep
225
225
learning using GPUs and CPUs. PyTorch arrays are commonly called *tensors*.
226
- Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or
226
+ Tensors are similar to NumPy's ndarrays, except that tensors can run on GPUs or
227
227
other hardware accelerators. In fact, tensors and NumPy arrays can often share
228
228
the same underlying memory, eliminating the need to copy data.
229
229
@@ -251,13 +251,22 @@ explicit conversion:
251
251
Also, note that the return type of this function is compatible with the initial
252
252
data type.
253
253
254
- **Note** PyTorch does not implement ``__array_function__`` or
255
- ``__array_ufunc__``. Under the hood, the ``Tensor.__array__()`` method returns a
256
- NumPy ndarray as a view of the tensor data buffer. See `this issue
257
- <https://github.com/pytorch/pytorch/issues/24015>`__ and the
258
- `__torch_function__ implementation
259
- <https://github.com/pytorch/pytorch/blob/master/torch/overrides.py>`__
260
- for details.
254
+ .. admonition:: Warning
255
+
256
+    While this mixing of ndarrays and tensors may be convenient, it is not
257
+    recommended. It will not work for non-CPU tensors, and will have unexpected
258
+    behavior in corner cases. Users should prefer explicitly converting the
259
+    ndarray to a tensor.
260
+
261
+ .. note::
262
+
263
+    PyTorch does not implement ``__array_function__`` or ``__array_ufunc__``.
264
+    Under the hood, the ``Tensor.__array__()`` method returns a NumPy ndarray as
265
+    a view of the tensor data buffer. See `this issue
266
+    <https://github.com/pytorch/pytorch/issues/24015>`__ and the
267
+    `__torch_function__ implementation
268
+    <https://github.com/pytorch/pytorch/blob/master/torch/overrides.py>`__
269
+    for details.
261
270
262
271
Example: CuPy arrays
263
272
~~~~~~~~~~~~~~~~~~~~
@@ -271,7 +280,8 @@ with Python. CuPy implements a subset of the NumPy interface by implementing
271
280
>>> x_gpu = cp.array([1, 2, 3, 4])
272
281
273
282
The ``cupy.ndarray`` object implements the ``__array_ufunc__`` interface. This
274
- enables NumPy ufuncs to be directly operated on CuPy arrays:
283
+ enables NumPy ufuncs to be applied to CuPy arrays (this will defer operation to
284
+ the matching CuPy CUDA/ROCm implementation of the ufunc):
275
285
276
286
>>> np.mean(np.exp(x_gpu))
277
287
array(21.19775622)
@@ -307,8 +317,7 @@ implements a subset of the NumPy ndarray interface using blocked algorithms,
307
317
cutting up the large array into many small arrays. This allows computations on
308
318
larger-than-memory arrays using multiple cores.
309
319
310
- Dask supports array protocols like ``__array__`` and
311
- ``__array_ufunc__``.
320
+ Dask supports ``__array__()`` and ``__array_ufunc__``.
312
321
313
322
>>> import dask.array as da
314
323
>>> x = da.random.normal(1, 0.1, size=(20, 20), chunks=(10, 10))
@@ -317,8 +326,10 @@ Dask supports array protocols like ``__array__`` and
317
326
>>> np.mean(np.exp(x)).compute()
318
327
5.090097550553843
319
328
320
- **Note** Dask is lazily evaluated, and the result from a computation isn’t
321
- computed until you ask for it by invoking ``compute()``.
329
+ .. note::
330
+
331
+    Dask is lazily evaluated, and the result from a computation isn't computed
332
+    until you ask for it by invoking ``compute()``.
322
333
323
334
See `the Dask array documentation
324
335
<https://docs.dask.org/en/stable/array.html>`__
@@ -328,13 +339,10 @@ and the `scope of Dask arrays interoperability with NumPy arrays
328
339
Further reading
329
340
---------------
330
341
331
- - `The Array interface
332
- <https://numpy.org/doc/stable/reference/arrays.interface.html>`__
333
- - `Writing custom array containers
334
- <https://numpy.org/devdocs/user/basics.dispatch.html>`__.
335
- - `Special array attributes
336
- <https://numpy.org/devdocs/reference/arrays.classes.html#special-attributes-and-methods>`__
337
- (details on the ``__array_ufunc__`` and ``__array_function__`` protocols)
342
+ - :ref:`arrays.interface`
343
+ - :ref:`basics.dispatch`
344
+ - :ref:`special-attributes-and-methods` (details on the ``__array_ufunc__`` and
345
+ ``__array_function__`` protocols)
338
346
- `NumPy roadmap: interoperability
339
347
<https://numpy.org/neps/roadmap.html#interoperability>`__
340
348
- `PyTorch documentation on the Bridge with NumPy