10000 MAINT, DOC: make np._from_dlpack public by tirthasheshpatel · Pull Request #21145 · numpy/numpy · GitHub

Merged · 10 commits · Mar 10, 2022
DOC: add the DLPack protocol to interoperability docs
tirthasheshpatel committed Mar 3, 2022
commit 492489f862473a2f4f26dfd48726f8d6293266b4
3 changes: 2 additions & 1 deletion doc/source/reference/arrays.interface.rst
@@ -247,7 +247,8 @@ flag is present.
.. note::

:obj:`__array_struct__` is considered legacy and should not be used for new
code. Use the :py:doc:`buffer protocol <c-api/buffer>` instead.
code. Use the :py:doc:`buffer protocol <c-api/buffer>` or the DLPack protocol
(via `numpy.from_dlpack`) instead.


Type description examples
97 changes: 97 additions & 0 deletions doc/source/user/basics.interoperability.rst
@@ -55,6 +55,14 @@ describes its memory layout and NumPy does everything else (zero-copy if
possible). If that's not possible, the object itself is responsible for
returning a ``ndarray`` from ``__array__()``.

DLPack_ is yet another protocol for converting foreign objects to NumPy arrays
in a language- and device-agnostic manner. NumPy doesn't implicitly convert
objects to ndarrays using DLPack. Instead, it provides the function
`numpy.from_dlpack`, which
accepts any object implementing the ``__dlpack__`` method and outputs a NumPy
ndarray (which is generally a view of the input object's data buffer). The
`Python specification for DLPack <https://dmlc.github.io/dlpack/latest/python_spec.html>`_
page explains the ``__dlpack__`` protocol in detail.
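As a quick sketch (assuming NumPy >= 1.23, where ``from_dlpack`` is public):
``ndarray`` itself implements ``__dlpack__``, so a NumPy array can be
round-tripped through the protocol, and the result is a zero-copy view:

```python
import numpy as np

# ndarray implements __dlpack__, so from_dlpack can consume it directly.
a = np.arange(5)
b = np.from_dlpack(a)

# The result is a view of the same buffer: mutating the exporter
# is visible through the imported array.
a[0] = 42
print(b[0])  # -> 42
```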

The array interface protocol
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -118,6 +126,25 @@ as the original object and any attributes/behavior it may have had, is lost.
To see an example of a custom array implementation including the use of
``__array__()``, see :ref:`basics.dispatch`.

The DLPack Protocol
~~~~~~~~~~~~~~~~~~~

The DLPack_ protocol defines a memory layout for strided n-dimensional array
objects. It offers the following syntax for data exchange:

1. A ``from_dlpack(x)`` function, which accepts (array) objects with a
``__dlpack__`` method and uses that method to construct a new array
containing the data from ``x``.
2. ``__dlpack__(self, stream=None)`` and ``__dlpack_device__`` methods on the
array object, which will be called from within ``from_dlpack``, to query
what device the array is on (may be needed to pass in the correct
stream, e.g. in the case of multiple GPUs) and to access the data.

Unlike the buffer protocol, DLPack allows exchanging arrays containing data on
devices other than the CPU (e.g. Vulkan or GPU). Since NumPy only supports the
CPU, it can only convert objects whose data exists on the CPU. But other libraries,
like PyTorch_ and CuPy_, may exchange data on GPU using this protocol.
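As a sketch of the producer side, a custom container can participate in this
exchange by delegating both methods to an internal NumPy array (``MyArray`` is
a hypothetical example, not part of any library):

```python
import numpy as np

class MyArray:
    """Hypothetical container exposing its buffer via the DLPack protocol."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __dlpack__(self, stream=None):
        # Return a DLPack capsule describing the underlying buffer.
        # stream is ignored here: the data lives on the CPU.
        return self._data.__dlpack__()

    def __dlpack_device__(self):
        # Report the device type and id (kDLCPU == 1 for NumPy-backed data).
        return self._data.__dlpack_device__()

x = MyArray([1.0, 2.0, 3.0])
y = np.from_dlpack(x)  # consumes __dlpack__; generally a zero-copy view
```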


2. Operating on foreign objects without converting
--------------------------------------------------
@@ -395,6 +422,75 @@ See `the Dask array documentation
and the `scope of Dask arrays interoperability with NumPy arrays
<https://docs.dask.org/en/stable/array.html#scope>`__ for details.

Example: DLPack
~~~~~~~~~~~~~~~

Several Python data science libraries implement the ``__dlpack__`` protocol.
Among them are PyTorch_ and CuPy_. A full list of libraries that implement
this protocol can be found on `this page of the DLPack
documentation <https://dmlc.github.io/dlpack/latest/#purpose>`_.

Convert a PyTorch CPU tensor to a NumPy array:

>>> import torch
>>> x_torch = torch.arange(5)
>>> x_torch
tensor([0, 1, 2, 3, 4])
>>> x_np = np.from_dlpack(x_torch)
>>> x_np
array([0, 1, 2, 3, 4])
>>> # note that x_np is a view of x_torch
>>> x_torch[1] = 100
>>> x_torch
tensor([ 0, 100, 2, 3, 4])
>>> x_np
array([ 0, 100, 2, 3, 4])

The imported arrays are read-only, so writing or operating in-place will fail:

>>> x_np[1] = 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: assignment destination is read-only

A copy must be created in order to operate on the imported arrays in-place:

>>> x_np_copy = x_np.copy()
>>> x_np_copy.sort() # works

.. note::

GPU tensors can't be converted to NumPy arrays, since NumPy doesn't
support GPU devices:

>>> x_torch = torch.arange(5, device='cuda')
>>> np.from_dlpack(x_torch)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: Unsupported device in DLTensor.

But if both libraries support the device the data buffer is on, it is
possible to use the ``__dlpack__`` protocol (e.g. PyTorch_ and CuPy_):

>>> x_torch = torch.arange(5, device='cuda')
>>> x_cupy = cupy.from_dlpack(x_torch)

Similarly, a NumPy array can be converted to a PyTorch tensor:

>>> x_np = np.arange(5)
>>> x_torch = torch.from_dlpack(x_np)

Read-only arrays cannot be exported:

>>> x_np = np.arange(5)
>>> x_np.flags.writeable = False
>>> torch.from_dlpack(x_np)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/tirthasheshpatel/tensor_libs/virtualenvs/numpy-dev/lib/python3.9/site-packages/torch/utils/dlpack.py", line 63, in from_dlpack
dlpack = ext_tensor.__dlpack__()
TypeError: NumPy currently only supports dlpack for writeable arrays

Further reading
---------------

@@ -420,3 +516,4 @@ Further reading
.. _astropy.units: https://docs.astropy.org/en/stable/units/
.. _pint: https://pint.readthedocs.io/
.. _unyt: https://unyt.readthedocs.io/
.. _DLPack: https://dmlc.github.io/dlpack/latest