Refactor CUDAAllocatorConfig to reuse AcceleratorAllocatorConfig by guangyey · Pull Request #150312 · pytorch/pytorch · GitHub

Conversation

@guangyey
Collaborator
@guangyey guangyey commented Mar 31, 2025

Stack from ghstack (oldest at bottom):

Motivation

Refactor CUDAAllocatorConfig to reuse AcceleratorAllocatorConfig and ConfigTokenizer. We will deprecate the options that overlap with AcceleratorAllocatorConfig in a follow-up PR and keep them only for backward compatibility (BC).

cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k
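
To make the tokenizing step in the motivation more concrete, here is a minimal sketch of splitting an allocator settings string into key/value pairs. It assumes the comma-separated `key:value` form used by `PYTORCH_CUDA_ALLOC_CONF`. The function `parse_alloc_conf` is a hypothetical name for illustration; it is not the `ConfigTokenizer` from this PR and it does not handle bracketed list values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (NOT PyTorch's ConfigTokenizer): splits a string such as
// "max_split_size_mb:128,expandable_segments:True" into (key, value) pairs.
// Assumption: items are comma-separated and each item is "key:value".
std::vector<std::pair<std::string, std::string>> parse_alloc_conf(
    const std::string& conf) {
  std::vector<std::pair<std::string, std::string>> out;
  std::size_t pos = 0;
  while (pos < conf.size()) {
    // Find the end of the current item.
    std::size_t comma = conf.find(',', pos);
    if (comma == std::string::npos) {
      comma = conf.size();
    }
    const std::string item = conf.substr(pos, comma - pos);
    // Split the item at the first colon; items without a colon are skipped.
    const std::size_t colon = item.find(':');
    if (colon != std::string::npos) {
      out.emplace_back(item.substr(0, colon), item.substr(colon + 1));
    }
    pos = comma + 1;
  }
  return out;
}
```

A shared parser of this kind is what lets a backend-specific config forward the overlapping options to a common config and only interpret the keys that are unique to it.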

@guangyey guangyey requested review from eqy and syed-ahmed as code owners March 31, 2025 16:12
@pytorch-bot
pytorch-bot bot commented Mar 31, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/150312

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

✅ You can merge normally! (3 Unrelated Failures)

As of commit df9befb with merge base bb67660 (image):

UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@guangyey guangyey changed the title from "Refactor CUDAAllocatorConfig to reuse AllocatorConfig" to "[WIP] Refactor CUDAAllocatorConfig to reuse AllocatorConfig" Mar 31, 2025
guangyey added 21 commits March 31, 2025 23:46
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@guangyey guangyey added the "release notes: cpp" (release notes category) and "topic: not user facing" (topic category) labels Apr 15, 2025
yangw-dev pushed a commit that referenced this pull request Aug 1, 2025
…fig (#150312)"

This reverts commit dfacf11.

Reverted #150312 on behalf of https://github.com/guangyey due to a static initialization order issue that impacts the downstream repo ([comment](#150312 (comment)))
[ghstack-poisoned]
@pytorchmergebot
Collaborator

Starting merge as part of PR stack under #156175

pytorchmergebot pushed a commit that referenced this pull request Aug 5, 2025
# Motivation
As @ScottTodd identified in this [comment](#150312 (comment)), using STL containers like `std::string` and `std::unordered_set` at static init time can cause static initialization order issues. This PR is based on and modified from his original PR: #159607. I’m stacking this PR here to help facilitate the landing and validation process.

Co-authored-by: @ScottTodd
Pull Request resolved: #159629
Approved by: https://github.com/ScottTodd, https://github.com/albanD
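
The static initialization order issue mentioned in the commit message above can be illustrated with a short sketch. A namespace-scope `std::unordered_set<std::string>` is constructed at static init time, so a static initializer in another translation unit may read it before its constructor has run. One common way to avoid this is a function-local static, which is constructed on first use. This is a general C++ idiom shown under assumptions, not necessarily the change made in #159629; the names `known_keys` and `is_known_key` and the key strings are illustrative only.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Risky pattern (left commented out): a namespace-scope container whose
// constructor runs during static initialization. Code in another translation
// unit that also runs at static init time may see it unconstructed.
// static const std::unordered_set<std::string> kKnownKeys = {"..."};

// Construct-on-first-use: the set is built the first time the function is
// called, and C++11 guarantees that this initialization is thread-safe.
// The key names below are illustrative placeholders.
const std::unordered_set<std::string>& known_keys() {
  static const std::unordered_set<std::string> keys = {
      "max_split_size_mb", "garbage_collection_threshold"};
  return keys;
}

// Returns true if `key` is one of the known option names.
bool is_known_key(const std::string& key) {
  return known_keys().count(key) != 0;
}
```

Callers go through `known_keys()` instead of touching a global directly, so the container exists by the time anyone reads it, regardless of the order in which translation units are initialized.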
pytorchmergebot pushed a commit that referenced this pull request Aug 5, 2025
…llocatorConfig instead (#156165)

Pull Request resolved: #156165
Approved by: https://github.com/albanD
ghstack dependencies: #159629, #150312
pytorchmergebot pushed a commit that referenced this pull request Aug 5, 2025
# Motivation
This PR moves the implementation of `torch.cuda.memory._set_allocator_settings` to `torch._C._accelerator_setAllocatorSettings`.
Since the original API was intended as a temporary/internal utility, I am not exposing the new function as a public API.

Pull Request resolved: #156175
Approved by: https://github.com/albanD
ghstack dependencies: #159629, #150312, #156165
joshuuuasu added a commit to joshuuuasu/pytorch that referenced this pull request Aug 19, 2025
…onfig (pytorch#150312)"

Summary: reverting this diff since it caused S551328. Please see D80217492 for details.

Test Plan:
NA

Rollback Plan:

Differential Revision: D80553588
pytorch-bot bot pushed a commit that referenced this pull request Aug 20, 2025
…onfig (#150312)" (#161002)

Summary:

reverting this diff since it caused S551328. Please see D80217492 for details.

Test Plan:
NA

Rollback Plan:

Reviewed By: sayitmemory, jingsh

Differential Revision: D80553588
joshuuuasu added a commit to joshuuuasu/pytorch that referenced this pull request Aug 20, 2025
…onfig (pytorch#150312)" (pytorch#161002)

Summary:
Pull Request resolved: pytorch#161002

reverting this diff since it caused S551328. Please see D80217492 for details.

Test Plan:
NA

Rollback Plan:

Reviewed By: sayitmemory, jingsh

Differential Revision: D80553588
pytorchmergebot pushed a commit that referenced this pull request Aug 26, 2025
…onfig (#150312)" (#161002)

Summary: reverting this diff since it caused S551328. Please see D80217492 for details.

Test Plan:
NA

Rollback Plan:

Differential Revision: D80553588

Pull Request resolved: #161002
Approved by: https://github.com/jingsh, https://github.com/izaitsevfb
pytorchmergebot added a commit that referenced this pull request Aug 27, 2025
…locatorConfig (#150312)" (#161002)"

This reverts commit a03cc53.

Reverted #161002 on behalf of https://github.com/guangyey due to This PR breaks CI TestCudaMallocAsync::test_allocator_settings ([comment](#161002 (comment)))
guangyey added a commit that referenced this pull request Aug 27, 2025
…fig (#150312)"

This reverts commit ae1a706.

ghstack-source-id: 641c5bb
Pull Request resolved: #161628
guangyey added a commit that referenced this pull request Aug 27, 2025
…fig (#150312)"

This reverts commit ae1a706.

ghstack-source-id: 0a5aacc
Pull Request resolved: #161628
pytorchmergebot pushed a commit that referenced this pull request Aug 27, 2025
…fig (#150312)" (#161628)

This reverts commit ae1a706.
Pull Request resolved: #161628
Approved by: https://github.com/atalman
ghstack dependencies: #161625, #161626, #161627
@github-actions github-actions bot deleted the gh/guangyey/133/head branch September 5, 2025 02:09
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
# Motivation
As @ScottTodd identified in this [comment](pytorch#150312 (comment)), using STL containers like `std::string` and `std::unordered_set` at static init time can cause static initialization order issues. This PR is based on and modified from his original PR: pytorch#159607. I’m stacking this PR here to help facilitate the landing and validation process.

Co-authored-by: @ScottTodd
Pull Request resolved: pytorch#159629
Approved by: https://github.com/ScottTodd, https://github.com/albanD
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…orch#150312)

# Motivation
Refactor `CUDAAllocatorConfig` to reuse `AcceleratorAllocatorConfig` and `ConfigTokenizer`. We will deprecate the options that overlap with `AcceleratorAllocatorConfig` in a follow-up PR and keep them only for backward compatibility (BC).

Pull Request resolved: pytorch#150312
Approved by: https://github.com/albanD
ghstack dependencies: pytorch#159629
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…6175)

# Motivation
This PR moves the implementation of `torch.cuda.memory._set_allocator_settings` to `torch._C._accelerator_setAllocatorSettings`.
Since the original API was intended as a temporary/internal utility, I am not exposing the new function as a public API.

Pull Request resolved: pytorch#156175
Approved by: https://github.com/albanD
ghstack dependencies: pytorch#159629, pytorch#150312, pytorch#156165
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…onfig (pytorch#150312)" (pytorch#161002)

Summary: reverting this diff since it caused S551328. Please see D80217492 for details.

Test Plan:
NA

Rollback Plan:

Differential Revision: D80553588

Pull Request resolved: pytorch#161002
Approved by: https://github.com/jingsh, https://github.com/izaitsevfb
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…locatorConfig (pytorch#150312)" (pytorch#161002)"

This reverts commit a03cc53.

Reverted pytorch#161002 on behalf of https://github.com/guangyey due to This PR breaks CI TestCudaMallocAsync::test_allocator_settings ([comment](pytorch#161002 (comment)))
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025

Labels

ci-no-td (Do not run TD on this PR)
ciflow/h100-distributed
ciflow/rocm (Trigger "default" config CI on ROCm)
ciflow/trunk (Trigger trunk jobs on your pull request)
Merged
no-stale
oncall: distributed (Add this issue/PR to distributed oncall triage queue)
open source
release notes: cpp (release notes category)
Reverted
Stale
topic: not user facing (topic category)

9 participants