Enable _lazy_clone between CPU and MPS #148408
base: gh/kurtamohler/32/base
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/148408
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure)
As of commit 3b1a2c7 with merge base 56e1c23: one job is marked as unstable, possibly due to flakiness on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Attention! native_functions.yaml was changed.
If you are adding a new function or defaulted argument to native_functions.yaml, you cannot use it from pre-existing Python frontend code until our FC window passes (two weeks). Split your PR into two PRs: one which adds the new C++ functionality, and one that makes use of it from Python, and land them two weeks apart. See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#forwards-compatibility-fc for more info.
virtual const void* get_cpu_ptr_from_device_ptr(const void* device_ptr) const;
virtual void* get_device_ptr_from_cpu_ptr(void* cpu_ptr) const;
virtual const void* get_device_ptr_from_cpu_ptr(const void* cpu_ptr) const;
virtual bool has_unified_memory() const;
Why do you need to add this new concept here?
I would expect that, in the context of MPS, a CPU tensor is pure CPU, while pinned-CPU and MPS tensors are unified.
When CPU and Metal operators access the same location in the shared memory space, they use different addresses. (I wonder if there's a way to make them use the same address space?) So when we lazy clone an MPS tensor to pinned CPU, we need a way to translate the MPS address into the CPU address space and set the output DataPtr to that. Likewise, we need to translate in the other direction for a pinned-CPU to MPS lazy clone. These functions provide an API for that.
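To make that concrete, here is a minimal, self-contained sketch of the translation idea. The virtual method names mirror the ones added in this diff, but the UnifiedMemoryHooks class, its offset-based implementation, and the surrounding code are hypothetical stand-ins for illustration, not the PR's actual implementation:

```cpp
// Hypothetical model of the address-translation hooks discussed above.
// In the PR these are virtual methods that a backend (e.g. MPS) would override;
// here the "device" address space is simulated as a fixed offset into the
// same buffer, purely to show how the two translations invert each other.
#include <cassert>
#include <cstddef>

struct UnifiedMemoryHooks {
  static constexpr std::ptrdiff_t kOffset = 16;  // stand-in for the real CPU<->device mapping

  virtual bool has_unified_memory() const { return true; }
  virtual const void* get_cpu_ptr_from_device_ptr(const void* device_ptr) const {
    return static_cast<const char*>(device_ptr) - kOffset;
  }
  virtual void* get_device_ptr_from_cpu_ptr(void* cpu_ptr) const {
    return static_cast<char*>(cpu_ptr) + kOffset;
  }
  virtual ~UnifiedMemoryHooks() = default;
};

int main() {
  UnifiedMemoryHooks hooks;
  char shared[64] = {};    // pretend this is the unified (shared) allocation
  void* cpu_ptr = shared;  // address the CPU-side DataPtr would hold

  if (hooks.has_unified_memory()) {
    // Pinned-CPU -> MPS lazy clone: the output DataPtr needs the device-visible alias.
    void* device_ptr = hooks.get_device_ptr_from_cpu_ptr(cpu_ptr);

    // MPS -> pinned-CPU lazy clone: translate back into the CPU address space.
    assert(hooks.get_cpu_ptr_from_device_ptr(device_ptr) == cpu_ptr);
  }
  return 0;
}
```

In the real code path, the translated pointer is what gets wrapped into the output tensor's DataPtr, which is why both directions of translation are needed.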
Adds a device arg to _lazy_clone to enable lazy cloning data from one device to another. At the moment, only certain cases between CPU and MPS are supported. This PR also adds support for pinned CPU tensors on MPS builds, which was not working properly before.
Stack from ghstack (oldest at bottom):
- Tensor.to between CPU and MPS #150569
- _lazy_clone between CPU and MPS #148408

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov