Enable lazy cloning in `Tensor.to` between CPU and MPS #150569
base: gh/kurtamohler/33/base
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/150569
Note: Links to docs will display an error until the docs builds have been completed.
❌ 5 New Failures, 1 Unrelated Failure
As of commit 8fd51f9 with merge base 56e1c23:
NEW FAILURES - The following jobs have failed:
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Because of what I mentioned here, right now there is no guarantee that a copy-on-write (COW) MPS tensor will materialize when it should. I've found some places in the codebase where MPS ops cast the const data pointer to non-const and then mutate the data (for instance, any op that calls …)
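To make the concern above concrete, here is a minimal pure-Python sketch of copy-on-write storage. It is a toy model, not PyTorch's implementation: the names `SharedBuffer`, `CowTensor`, `const_data`, and `mutable_data` are hypothetical and chosen only for illustration. It shows that a write through the mutable accessor materializes a private copy first, while a write through the read-only accessor (the analogue of casting a const data pointer to non-const) skips materialization and corrupts the tensor that shares the buffer.

```python
class SharedBuffer:
    """Raw storage that several COW handles may share (hypothetical)."""

    def __init__(self, data):
        self.data = list(data)
        self.refcount = 1


class CowTensor:
    """Toy stand-in for a tensor backed by copy-on-write storage."""

    def __init__(self, buf):
        self._buf = buf

    @classmethod
    def from_values(cls, values):
        return cls(SharedBuffer(values))

    def lazy_clone(self):
        # No copy yet: both handles point at the same buffer.
        self._buf.refcount += 1
        return CowTensor(self._buf)

    def const_data(self):
        # Read-only access: never materializes.
        return self._buf.data

    def mutable_data(self):
        # Write access: copy the buffer first if it is still shared.
        if self._buf.refcount > 1:
            self._buf.refcount -= 1
            self._buf = SharedBuffer(self._buf.data)
        return self._buf.data


a = CowTensor.from_values([1, 2, 3])

# Correct path: writing through mutable_data() materializes b first.
b = a.lazy_clone()
b.mutable_data()[0] = 99
a_after_good_write = list(a.const_data())

# Buggy path: writing through the read-only accessor bypasses
# materialization, so the write leaks into a.
c = a.lazy_clone()
c.const_data()[0] = -1
a_after_bad_write = list(a.const_data())
```

In this sketch, `a` is unchanged after the write through `mutable_data()` but is silently modified after the write through `const_data()`, which is the kind of missed materialization described above.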
ghstack-source-id: ac5309c Pull Request resolved: pytorch#150569
ghstack-source-id: 902ece7 Pull Request resolved: pytorch#150569
ghstack-source-id: 3cd33fa Pull Request resolved: pytorch#150569
ghstack-source-id: 6f363f6 Pull Request resolved: pytorch#150569
Is it possible to extend the support to other kinds of devices?
In general probably not, but the answer depends on the platform. The reason this can work for CPU-MPS is that M-series Macs have a shared memory space that both the CPU and GPU can access.
ghstack-source-id: 0dd5075 Pull Request resolved: pytorch#150569
Stack from ghstack (oldest at bottom):
- Enable lazy cloning in `Tensor.to` between CPU and MPS #150569
- `_lazy_clone` between CPU and MPS #148408