Generalize poison fork logic for each device backend #144664
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/144664
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 1 Cancelled Job, 4 Unrelated Failures as of commit 034efa9 with merge base c93e4b8:
NEW FAILURES - The following jobs have failed.
CANCELLED JOB - The following job was cancelled. Please retry.
FLAKY - The following jobs failed, but likely due to flakiness present on trunk.
BROKEN TRUNK - The following job failed but was already failing on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot successfully started a revert job. Check the current status here.
This reverts commit 83bd0b6. Reverted #144664 on behalf of https://github.com/atalman due to failing internal tests (see comment on #144664).
@guangyey your PR has been successfully reverted.
Getting the following error: RuntimeError: Only one device type can be registered. But now, we have two device types: mtia and cuda
Sorry about that. I see now that in some scenarios, MTIA can be built together with CUDA in a single binary wheel. I’ll remove the assumption that only one device type is allowed to be registered at fork detection.
#ifndef WIN32
  auto& flag = at_fork_once_flags[static_cast<int>(device_type)];
  c10::call_once(flag, [device_type]() {
    static at::DeviceType at_fork_device_type = device_type;
@albanD I see now that in some scenarios, MTIA can be built together with CUDA in a single binary wheel, so I removed the assumption that only one device type is allowed to be registered at fork detection.
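For illustration, here is a minimal standalone sketch of the per-device-type fork-poisoning idea described above: each backend registers its own once_flag, a single pthread_atfork child handler marks every registered device type as being in a bad fork, and nothing restricts the process to a single device type (so MTIA and CUDA can coexist). The names register_fork_poison, poison_registered_devices_in_child, is_in_bad_fork, and the DeviceType enum below are hypothetical stand-ins, not the PR's actual code; only at_fork_once_flags and the #ifndef WIN32 / call_once pattern come from the snippet above.

// Hypothetical sketch (not the PR's implementation): per-device-type fork poisoning.
#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>
#include <array>
#include <atomic>
#include <cstdio>
#include <mutex>

enum class DeviceType : int { CUDA = 0, MTIA = 1, XPU = 2 };
constexpr int kNumDeviceTypes = 3;

// One registration flag and one "bad fork" flag per device type.
std::array<std::once_flag, kNumDeviceTypes> at_fork_once_flags;
std::array<std::atomic<bool>, kNumDeviceTypes> device_registered{};
std::array<std::atomic<bool>, kNumDeviceTypes> is_in_bad_fork{};

// atfork child handler: poison every device type that registered in the parent,
// since a device runtime generally cannot be reused safely after fork().
void poison_registered_devices_in_child() {
  for (int i = 0; i < kNumDeviceTypes; ++i) {
    if (device_registered[i].load()) {
      is_in_bad_fork[i].store(true);
    }
  }
}

// Each backend calls this during lazy init; registration runs at most once per
// device type, and the process-wide atfork hook is installed on first use.
void register_fork_poison(DeviceType device_type) {
#ifndef WIN32
  auto& flag = at_fork_once_flags[static_cast<int>(device_type)];
  std::call_once(flag, [device_type]() {
    device_registered[static_cast<int>(device_type)].store(true);
    static std::once_flag atfork_installed;
    std::call_once(atfork_installed, []() {
      pthread_atfork(/*prepare=*/nullptr, /*parent=*/nullptr,
                     /*child=*/poison_registered_devices_in_child);
    });
  });
#endif
}

bool in_bad_fork(DeviceType device_type) {
  return is_in_bad_fork[static_cast<int>(device_type)].load();
}

int main() {
  register_fork_poison(DeviceType::CUDA);
  register_fork_poison(DeviceType::MTIA);  // no "only one device type" restriction
  if (fork() == 0) {
    // Child process: both registered backends are now flagged as poisoned.
    std::printf("child: cuda bad fork = %d, mtia bad fork = %d\n",
                (int)in_bad_fork(DeviceType::CUDA), (int)in_bad_fork(DeviceType::MTIA));
    _exit(0);
  }
  wait(nullptr);
  // Parent process: unaffected by the child-side handler.
  std::printf("parent: cuda bad fork = %d\n", (int)in_bad_fork(DeviceType::CUDA));
  return 0;
}

The key design point in this sketch is that the handler installed via pthread_atfork is process-wide and only needs to be installed once, while the per-device once_flags let an arbitrary set of backends opt in independently instead of asserting on a single registered device type.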
Ho ok :(
Thanks for the update!
Sounds good!
Thanks
Merge started. Your change will be merged while ignoring the following 7 checks: rocm / linux-focal-rocm-py3.10 / test (default, 3, 6, linux.rocm.gpu.2), s390x-periodic / linux-manylinux-2_28-py3-cpu-s390x / test (default, 2, 10, linux.s390x), periodic / linux-focal-rocm-py3.10 / test (distributed, 1, 3, linux.rocm.gpu.4, module:rocm, oncall:distributed), xpu / linux-jammy-xpu-2025.0-py3.9 / test (default, 1, 4, linux.idc.xpu), xpu / linux-jammy-xpu-2025.0-py3.9 / test (default, 2, 4, linux.idc.xpu), xpu / linux-jammy-xpu-2025.0-py3.9 / test (default, 3, 4, linux.idc.xpu), xpu / linux-jammy-xpu-2025.0-py3.9 / test (default, 4, 4, linux.idc.xpu). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
# Motivation Generalize the poison_fork code to make it reusable across different devices. Pull Request resolved: pytorch#144664 Approved by: https://github.com/EikanWang, https://github.com/albanD
…#144664)" This reverts commit d86c141. Reverted pytorch#144664 on behalf of https://github.com/atalman due to failing periodic test: python test/test_cpp_extensions_mtia_backend.py TestCppExtensionMTIABackend.test_device_context (see comment on pytorch#144664)
# Motivation Generalize the poison_fork code to make it reusable across different devices. Pull Request resolved: pytorch#144664 Approved by: https://github.com/EikanWang, https://github.com/albanD
…#144664)" This reverts commit 83bd0b6. Reverted pytorch#144664 on behalf of https://github.com/atalman due to failing internal tests (see comment on pytorch#144664)
Stack from ghstack (oldest at bottom):
Motivation
Generalize the poison_fork code to make it reusable across different devices.
cc @albanD @EikanWang