Improve docs for `from_dlpack` and `to_dlpack` by rgommers · Pull Request #70437 · pytorch/pytorch · GitHub

Improve docs for from_dlpack and to_dlpack #70437


Closed
rgommers wants to merge 3 commits

Conversation

rgommers
Collaborator

This moves the warning to the legacy function where it belongs, improves the phrasing, and adds examples.

There may be more to do to make `from_dlpack` more discoverable as a follow-up, because in multiple issues/PRs we discovered people wanted new things (e.g., a memoryview-like object, or `__array_interface__` support) that `from_dlpack` already provides.

@pytorch-probot
pytorch-probot bot commented Dec 27, 2021
CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/rgommers/pytorch/blob/ec8a29e35a3a59b06cf196da6209cdca55e9d441/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

Workflows Labels (bold enabled) Status
Triggered Workflows
linux-bionic-py3.6-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk, ciflow/xla ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.6-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.6-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.6-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.6-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.6-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-full-jit ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-full-jit ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.1-py3.6-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.1-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped

You can add a comment to the PR and tag @pytorchbot with the following commands:
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and trigger the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

@facebook-github-bot
Contributor
facebook-github-bot commented Dec 27, 2021

🔗 Helpful links

💊 CI failures summary and remediations

As of commit ec8a29e (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@rgommers
Collaborator Author

Hmm, the docs preview job has been removed and https://hud.pytorch.org/pr/70437 wants so many permissions that I can't access it. This should be good to go, but I can't verify (local KaTeX has a problem for me).

@rgommers rgommers requested a review from mruberry December 28, 2021 10:47
@gchanan gchanan added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Dec 28, 2021
@vadimkantorov
Contributor

It mentions "old-style DLPack capsule", but is there a "new-style DLPack capsule"? I think it'd be a frequent question

@rgommers
Collaborator Author

It mentions "old-style DLPack capsule", but is there a "new-style DLPack capsule"? I think it'd be a frequent question

The "new style DLPack" is to just pass a tensor directly. It's still the same capsule; it's now just under the hood, so better UX and no chance of consuming it twice.

Could you propose a rephrase?

@vadimkantorov
Contributor

Well, I still don't understand the meaning of old/new style :( We often need a DLPack tensor to develop an interoperable C interface (not only for exchanging tensors between NumPy / Deep Learning frameworks), like I did here: https://github.com/vadimkantorov/dotnetdlpack

From the phrasing, "`to_dlpack` returns a new-style capsule" is still opaque to me in terms of what a capsule is, what's inside it, or what makes it new/old style.

@rgommers
Collaborator Author

To explain:

  • There is only one kind of "capsule". DLPack has been stable for years; there is no change in format there.
  • At the Python level, the old usage pattern was `capsule = to_dlpack(t)` followed by `t_new = from_dlpack(capsule)`. There is nothing else you can do with `capsule` other than pass it to `from_dlpack` exactly once.
  • The new style is `t_new = from_dlpack(t)`. The exact same "capsule" is produced under the hood (by `t.__dlpack__()`), used, and then discarded safely. This UX is strictly better, and nothing else changes.
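To sketch the two usage patterns above side by side (assuming a PyTorch version recent enough that `from_dlpack` accepts objects implementing `__dlpack__`, roughly 1.10+):

```python
import torch
from torch.utils.dlpack import from_dlpack, to_dlpack

t = torch.arange(4, dtype=torch.float32)

# Old style: produce the capsule explicitly, then consume it exactly once.
capsule = to_dlpack(t)
t_old = from_dlpack(capsule)

# New style: pass the producer directly; the same kind of capsule is
# created via t.__dlpack__() and consumed under the hood.
t_new = from_dlpack(t)

# Both paths are zero-copy: the results alias the original storage.
t_new[0] = 42.0
assert float(t[0]) == 42.0
```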

is still opaque to me in terms of what's a capsule or what's inside

It should be, because it's pretty much useless. The name is derived from PyCapsule, which is a Python C API thing that a user shouldn't have to know about. When the versions of both libraries you are using support __dlpack__ (and hence t_new = from_dlpack(t)), then you can forget all about capsule. If you have older versions, you will still need it.

@vadimkantorov
Contributor
vadimkantorov commented Dec 31, 2021

I think I understand now, but it should be explicitly explained what the new and old styles are, or the reference to these concepts should be removed altogether, just explaining that two kinds of arguments are accepted.

In my case, I needed DLPack not for interop between frameworks, but to create a .NET wrapper consuming the DLPack API using libtorch. So I needed to understand exactly what the capsule is and what the field name is. All of this had better be explained. There was an issue in DLPack complaining that the name of the field inside the capsule is not documented: dmlc/dlpack#51. It would be better if PyTorch documented it as well or referenced some ground-truth documentation. It may also be nice to include a snippet of manually working with these capsules; such an example is very useful if you are not a framework developer. E.g., I had to figure this out in https://github.com/vadimkantorov/pydlpack, and figuring out the proper semantics of the DLTensor deallocator and its interaction with the capsule is quite non-trivial.

@rgommers
Collaborator Author

It would be better if PyTorch documented it as well or referenced some ground-truth documentation.

That belongs in the DLPack docs; it doesn't make sense to explain the kind of details you are mentioning here in the PyTorch Python API docs - they are irrelevant to Python users. You are only running into it because you are using C/C++ APIs.

The minor catch is that DLPack currently has no docs (outside of inline comments). I plan to address that in H1 2022. For now this is probably the best user-facing DLPack documentation: https://data-apis.org/array-api/latest/design_topics/data_interchange.html#dlpack-support. Most of that should move to DLPack itself in the future.

what is the new and old style, or remove the reference to these concepts at all

Then I'd prefer to just tweak the text a bit to remove "old style", and only state that using from_dlpack(tensor) is the preferred usage.

@vadimkantorov
Contributor
vadimkantorov commented Dec 31, 2021

from_dlpack(tensor) is the preferred usage.

Why not! Just please specify that the tensor object should have a `__dlpack__` method. That way, the "old-style" usages where we manually prepare the PyCapsule can be figured out.

The minor catch is that DLPack currently has no docs

Yep! And especially no docs on the PyCapsule format (the "dltensor" field name and deleter semantics). So having at least some notes on what PyTorch assumes of those `__dlpack__` capsules would be helpful. Otherwise I had to read the PyTorch source code to understand how to prepare the capsule and when exactly the deleter is called, to avoid memory leaks, use-after-free, and double-free.
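To make the "dltensor" name concrete: the capsule mechanics can be poked at from plain CPython via `ctypes`, using the real `PyCapsule_*` C-API functions. The pointer payload below is a dummy stand-in for a `DLManagedTensor*`, not a real one:

```python
import ctypes

# Declare the CPython capsule C-API calls we need (CPython only).
capi = ctypes.pythonapi
capi.PyCapsule_New.restype = ctypes.py_object
capi.PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
capi.PyCapsule_GetName.restype = ctypes.c_char_p
capi.PyCapsule_GetName.argtypes = [ctypes.py_object]
capi.PyCapsule_SetName.restype = ctypes.c_int
capi.PyCapsule_SetName.argtypes = [ctypes.py_object, ctypes.c_char_p]

# Keep the name bytes alive: PyCapsule stores the char* pointer, not a copy.
NAME_FRESH = b"dltensor"
NAME_USED = b"used_dltensor"

# A dummy non-NULL pointer standing in for a real DLManagedTensor*.
cap = capi.PyCapsule_New(ctypes.c_void_p(0xDEAD), NAME_FRESH, None)
assert capi.PyCapsule_GetName(cap) == b"dltensor"

# A consumer that takes ownership renames the capsule, so the capsule's
# destructor (and any second consumer) can tell it was already used.
capi.PyCapsule_SetName(cap, NAME_USED)
assert capi.PyCapsule_GetName(cap) == b"used_dltensor"
```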

That belongs in the DLPack docs

Agreed!

it doesn't make sense to explain the kind of details you are mentioning here in the PyTorch Python API docs - they are irrelevant to Python users. You are only running into it because you are using C/C++ APIs.

A bit disagreeing. I'm running into it very much because DLPack is for doing interop with PyTorch's Python API, and interop with custom C code is a big part of it.

@rgommers
Collaborator Author

A bit disagreeing. I'm running into it very much because DLPack is for doing interop with PyTorch's Python API, and interop with custom C code is a big part of it.

Okay, interesting. Not quite agreeing that we should include content that is irrelevant for Python users and only needed if you write custom C code, but I understand better what you're doing at least after I had a look at your repo.

Side note: NumPy 1.22.0 will have support for DLPack, so your custom code for that will become obsolete - it'll just be x = np.from_dlpack(tensor).
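For example, with NumPy >= 1.22 (using a NumPy array as a stand-in producer here, since any object implementing `__dlpack__` is consumed the same way):

```python
import numpy as np  # requires NumPy >= 1.22 for DLPack support

producer = np.arange(6, dtype=np.float32)  # any object with __dlpack__
x = np.from_dlpack(producer)               # consumes producer.__dlpack__() internally

# The exchange is zero-copy: both arrays view the same buffer.
assert np.shares_memory(x, producer)
```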

@vadimkantorov
Contributor
vadimkantorov commented Dec 31, 2021

Yeah, I also needed this when I was doing a "framework-agnostic" pure C module for reading audio with ffmpeg: https://github.com/vadimkantorov/readaudio. Basically you get native DLPack structures from C land, and then somehow you need to wrap them into a PyCapsule to have them ingested by NumPy / PyTorch. It was somewhat non-trivial to understand how to do it correctly.

These "framework-agnostic" C extensions (that can be consumed with NumPy or another framework in the absence of PyTorch Python/native libraries) can be an interesting use case in general, and it would be nice to have a tutorial on something like this.

Leave `to_dlpack` in the `torch.utils.dlpack` namespace here.
It is present in the main namespace as well, but undocumented there
and it's not clear if we want to keep it there - see
pytorch#70543 (comment)
Contributor
@leofang leofang left a comment


LGTM!


.. warning::
Only call ``from_dlpack`` once per capsule produced with ``to_dlpack``.
Behavior when a capsule is consumed multiple times is undefined.
Contributor


Nitpick: technically it's well defined (an error is raised if the capsule has already been consumed, since its name gets changed on first use), and it was so long before we settled on __dlpack__, but I'm fine leaving it as is.
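The consume-once behavior described here can be illustrated with a hypothetical pure-Python stand-in (`FakeDLPackCapsule` is invented for illustration and is not the real PyCapsule; real consumers detect reuse by the rename from "dltensor" to "used_dltensor"):

```python
class FakeDLPackCapsule:
    """Hypothetical stand-in for a PyCapsule named "dltensor".

    Real consumers rename the capsule to "used_dltensor" after taking
    ownership, which is how a second consumption attempt is detected.
    """

    def __init__(self, payload):
        self.name = "dltensor"
        self._payload = payload

    def consume(self):
        if self.name != "dltensor":
            raise RuntimeError("DLPack capsule was already consumed")
        self.name = "used_dltensor"
        return self._payload


cap = FakeDLPackCapsule(payload=[1.0, 2.0])
assert cap.consume() == [1.0, 2.0]   # first consumption succeeds
try:
    cap.consume()                    # second consumption raises
except RuntimeError as exc:
    assert "already consumed" in str(exc)
```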

Collaborator
@mruberry mruberry left a comment


Sorry to keep you waiting, @rgommers. Let's try to get this merged!

@facebook-github-bot
Contributor

@mruberry has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Jan 25, 2022
Summary:
This moves the warning to the legacy function where it belongs, improves the phrasing, and adds examples.

There may be more to do to make `from_dlpack` more discoverable as a follow-up, because in multiple issues/PR we discovered people wanted new things (e.g., a memoryview-like object, or `__array_interface__` support) that `from_dlpack` already provides.

Pull Request resolved: #70437

Reviewed By: albanD

Differential Revision: D33760552

Pulled By: mruberry

fbshipit-source-id: e8a61fa99d42331cc4bf3adfe494cab13ca6d499
pytorchmergebot pushed a commit that referenced this pull request Jan 25, 2022, with the same summary (cherry picked from commit 880ad96).
@rgommers rgommers deleted the dlpack-docs-tweak branch January 25, 2022 21:17
cyyever pushed commits to cyyever/pytorch_private that referenced this pull request on Feb 3 and Feb 9, 2022, with the same summary (cherry picked from commit 880ad96).
Labels: cla signed, module: dlpack, open source, triaged

7 participants