
initialize device when pinning memory on this device, short circuit is_pinned if device is not initialized #145752


Closed
ngimel wants to merge 4 commits

Conversation

@ngimel (Collaborator) commented Jan 27, 2025

Short-circuit is_pinned if the device is not initialized.
Do not land
RFC
Potential fix for #144687

Now `.is_pinned(device="cuda")` does not initialize the device and thus doesn't poison the fork (but it warns about the `device` arg being deprecated). To avoid needing the `device=` arg, we'd need to fix `get_accelerator` to not initialize the device.
cc @malfet, @albanD
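
For context, a minimal sketch of the fork-poisoning scenario this is trying to avoid (not part of the PR itself; it assumes a Linux host with a CUDA build and uses the deprecated device= form mentioned above):

import torch
import torch.multiprocessing as mp

def child_uses_cuda():
    # raises "Cannot re-initialize CUDA in forked subprocess" if the parent
    # process already created a CUDA context before forking
    return torch.ones(1, device="cuda").item()

if __name__ == "__main__":
    x = torch.empty(16)                      # plain CPU tensor
    x.is_pinned(device="cuda")               # with this PR: returns False without touching CUDA
    assert not torch.cuda.is_initialized()   # the parent's context is still uninitialized

    p = mp.get_context("fork").Process(target=child_uses_cuda)
    p.start()
    p.join()                                 # the child is free to initialize CUDA itself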

@pytorch-bot (bot) commented Jan 27, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/145752

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ad816b7 with merge base 30dea84:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@albanD (Collaborator) left a comment

SGTM !

Co-authored-by: albanD <albandes@fb.com>
@ngimel (Collaborator, Author) commented Jan 27, 2025

Hm, no, this actually initializes context on the 0th device (I think it shouldn't have, but I'll have to check):

In [1]: import torch

In [2]: torch.cuda.list_gpu_processes()
Out[2]: 'GPU:0\nno processes are running'

In [3]: x=torch.empty(1, pin_memory=True)

In [4]: torch.cuda.list_gpu_processes()
Out[4]: 'GPU:0\nprocess    1105514 uses     1188.000 MB GPU memory'

so won't work as is

@ngimel added the release notes: cuda label Jan 27, 2025
@ngimel (Collaborator, Author) commented Jan 28, 2025

The behavior above is caused by the pinning itself, not by initializing the context, so this diff is a strict improvement over the existing state.
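
A small sketch of that distinction (again not from the thread; it assumes a CUDA build, and that pinning goes through cudaHostAlloc, which needs a CUDA context):

import torch

assert not torch.cuda.is_initialized()    # no CUDA context in this process yet

x = torch.empty(1)
x.is_pinned(device="cuda")                # short-circuits while the device is uninitialized
print(torch.cuda.is_initialized())        # expected: False

y = torch.empty(1, pin_memory=True)       # pinning allocates page-locked host memory,
print(torch.cuda.is_initialized())        # which creates the context: expected True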

@albanD (Collaborator) left a comment

Sounds good!

@ngimel (Collaborator, Author) commented Jan 29, 2025

@pytorchbot merge

@pytorch-bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Jan 29, 2025
@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator) commented:

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

@ngimel (Collaborator, Author) commented Jan 30, 2025

@pytorchbot merge

@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

mori360 pushed a commit to mori360/pytorch that referenced this pull request Feb 6, 2025
initialize device when pinning memory on this device, short circuit is_pinned if device is not initialized (pytorch#145752)

Pull Request resolved: pytorch#145752
Approved by: https://github.com/albanD
Co-authored-by: albanD <albandes@fb.com>
@github-actions bot deleted the ngimel/pinned_init branch March 2, 2025 02:10
BartlomiejStemborowski added a commit to BartlomiejStemborowski/pytorch that referenced this pull request Mar 12, 2025
PR pytorch#145752 added a check in isPinnedPtr to verify that the device is initialized before checking whether the tensor is pinned. That PR also added a lazy-initialization trigger when at::empty is called with the pinned param set to true. However, when a tensor is created first and then pinned in a separate pin_memory() call, lazy device init is not triggered, so is_pinned always returns false.

With this PR, the lazy initialization is moved to the getPinnedMemoryAllocator function, which ensures the device is initialized before we pin a tensor.
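
Roughly the repro for that regression, as I read the description above (a sketch, assuming a CUDA build):

import torch

x = torch.empty(4)      # CPU tensor; the CUDA device has not been initialized yet
y = x.pin_memory()      # pinning in a separate call, not torch.empty(..., pin_memory=True)
print(y.is_pinned())    # the regression made this return False; with this fix it is True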
pytorchmergebot pushed a commit to BartlomiejStemborowski/pytorch that referenced this pull request Mar 12, 2025
pytorchmergebot pushed a commit that referenced this pull request Mar 13, 2025
[regression] Fix pin_memory() when it is called before device lazy initialization. (#149033)

Fixes #149032

Pull Request resolved: #149033
Approved by: https://github.com/ngimel, https://github.com/albanD
pytorchbot pushed a commit that referenced this pull request Mar 14, 2025
[regression] Fix pin_memory() when it is called before device lazy initialization. (#149033)

(cherry picked from commit 420a9be)
atalman pushed a commit that referenced this pull request Mar 17, 2025
[regression] Fix pin_memory() when it is called before device lazy initialization. (#149183)

(cherry picked from commit 420a9be)

Co-authored-by: Bartlomiej Stemborowski <bstemborowskix@habana.ai>
Labels: ciflow/trunk (Trigger trunk jobs on your pull request), Merged, release notes: cuda (release notes category)
Projects: None yet
3 participants