Add NVSHMEM to PYTORCH_EXTRA_INSTALL_REQUIREMENTS #154568
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/154568
Note: Links to docs will display an error until the doc builds have been completed.
✅ No Failures as of commit 51f8222 with merge base 241f8dc.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@Skylion007 @atalman do you mind having a look?
```diff
@@ -53,6 +53,7 @@
 "nvidia-cusolver-cu11==11.4.1.48; platform_system == 'Linux' and platform_machine == 'x86_64' | "
 "nvidia-cusparse-cu11==11.7.5.86; platform_system == 'Linux' and platform_machine == 'x86_64' | "
 "nvidia-nccl-cu11==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | "
+"nvidia-nvshmem-cu11==3.2.5; platform_system == 'Linux' and platform_machine == 'x86_64' | "
```
Thanks, removed it from the CUDA 11.8 requirements.
@Skylion007 thanks! Added the rpath.
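For readers following along, a sketch of how an rpath fix like this can be checked on an installed wheel. The library name and the $ORIGIN-relative path are assumptions modeled on how the other NVIDIA wheel dependencies (cublas, cudnn) are located, not taken verbatim from this PR:

```bash
# Locate the installed torch lib directory and print the RPATH/RUNPATH of the
# CUDA library; expect an $ORIGIN-relative entry that can reach
# site-packages/nvidia/nvshmem/lib (hypothetical exact path).
TORCH_LIB="$(python -c 'import torch, os; print(os.path.dirname(torch.__file__))')/lib"
readelf -d "$TORCH_LIB/libtorch_cuda.so" | grep -Ei 'rpath|runpath'
```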
Hi @kwen2501, I believe it needs to be added to the "Bundling with cudnn and cublas" section in .ci/manywheel/build_cuda.sh as well.
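For reference, a hypothetical sketch of what that bundling step could look like. The DEPS_LIST/DEPS_SONAME pattern mirrors how build_cuda.sh handles cudnn and cublas, but the exact library path and soname below are assumptions:

```bash
# Sketch only, not the actual change: append the NVSHMEM host library to the
# set of shared objects copied into the manywheel, next to the existing
# cudnn/cublas entries in .ci/manywheel/build_cuda.sh.
DEPS_LIST+=("/usr/local/lib/libnvshmem_host.so.3")
DEPS_SONAME+=("libnvshmem_host.so.3")
```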
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
NVSHMEM 3.2.5 (released Mar 2025) has both cu11 and cu12 builds. See: https://pypi.nvidia.com/nvidia-nvshmem-cu12/ https://pypi.nvidia.com/nvidia-nvshmem-cu11/
Pull Request resolved: pytorch#154568
Approved by: https://github.com/atalman
ghstack dependencies: pytorch#154538
Hmm, there are often scenarios where people patch and build NVSHMEM themselves. Would PyTorch bringing in a native NVSHMEM dependency break such usage? For example, vLLM has its own recipe for patching and building NVSHMEM.
@youkaichao thanks for raising the concern. Are those patches improvements/extensions to NVSHMEM? If so, would DeepEP be interested in upstreaming them to NVSHMEM? (It would be easier for DeepEP to maintain their codebase too.)
@kwen2501 I think NVSHMEM 3.3 has integrated these patches. I haven't fully understood yet what would happen if multiple versions/instances of NVSHMEM exist in the same program.
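One way to ground that question empirically is to check how many copies of libnvshmem are present on a machine and which one a live process actually mapped. A rough sketch (the PID is a placeholder):

```bash
# Wheel-provided copies installed via pip, if any.
pip list 2>/dev/null | grep -i nvshmem
# Copies registered with the system dynamic loader.
ldconfig -p | grep -i nvshmem
# Which copy a running process actually loaded; replace 12345 with a real PID.
grep -i nvshmem /proc/12345/maps
```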
We have already incorporated the changes done by DeepEP in NVSHMEM 3.3. There was one change about "receive queue support" that was ABI-breaking, but it was recently confirmed that they are not using that feature anymore and that it can be removed (deepseek-ai/DeepEP#147). We do need to create a patch for DeepEP to get rid of those changes and use upstreamed NVSHMEM directly instead. I am working on that. Once that is done, the PyTorch integration and DeepEP usage should be just fine. I will post a link to the PR I open to keep you updated.

Other than this, DeepEP carries their own version of the device-side ibgda_device.cu file, which uses an internal NVSHMEM IBGDA API to be able to do QP selection. In NVSHMEM 3.4, we are working on exposing a standard NVSHMEM API for doing QP selection. DeepEP will then be free to use the exposed API rather than their internal implementation. But this does not impact PyTorch integration.
FYI - Opened Friday: deepseek-ai/DeepEP#295
Stack from ghstack (oldest at bottom):
NVSHMEM 3.2.5 (released Mar 2025) has both cu11 and cu12 builds.
See:
https://pypi.nvidia.com/nvidia-nvshmem-cu12/
https://pypi.nvidia.com/nvidia-nvshmem-cu11/
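A minimal install sketch for trying these wheels directly from NVIDIA's index (the --extra-index-url approach is the standard way to consume them; the version pin matches this PR):

```bash
# Install the CUDA 12 build of NVSHMEM 3.2.5 from NVIDIA's package index.
pip install nvidia-nvshmem-cu12==3.2.5 --extra-index-url https://pypi.nvidia.com
```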