Adds cudaMallocAsync as an alternative backend for the CUDA allocator by mcarilli · Pull Request #65365 · pytorch/pytorch · GitHub
Closed
wants to merge 73 commits into from
Changes from 1 commit
f5f5cde  torch.cuda.backends.allocator  (mcarilli, Sep 20, 2021)
16ea79b  cudaAllocatorModule  (mcarilli, Sep 20, 2021)
f1f803d  remove backends.cuda binding, want envvar instead  (mcarilli, Sep 21, 2021)
a73877e  use PYTORCH_CUDA_ALLOC_CONF  (mcarilli, Sep 21, 2021)
35b322f  Stashing for vsibility of this idea  (mcarilli, Oct 11, 2021)
6ada155  Taking shape enough to be worth showing  (mcarilli, Oct 31, 2021)
a6f271e  Separated allocator config so functions can be shared  (mcarilli, Nov 8, 2021)
9e2b3b7  docstring  (mcarilli, Nov 8, 2021)
3026470  Fix multiple definition errors, simplify config  (mcarilli, Nov 8, 2021)
430e43c  Merge remote-tracking branch 'upstream/master' into cudaMallocAsync  (mcarilli, Nov 8, 2021)
57aa340  uncovered graph can of worms  (mcarilli, Nov 9, 2021)
44a9c72  Merge remote-tracking branch 'upstream/master' into cudaMallocAsync  (mcarilli, Nov 14, 2021)
bae5608  Should fix dangling streams issue  (mcarilli, Nov 15, 2021)
0b6ffa1  approach to avoid freeing ungraphed pointers during capture  (mcarilli, Nov 16, 2021)
f2038a3  Hash and operator== for UsageStream  (mcarilli, Nov 16, 2021)
6b9d832  stashing work  (mcarilli, Nov 19, 2021)
253851d  Merge remote-tracking branch 'upstream/master' into cudaMallocAsync  (mcarilli, Nov 19, 2021)
cfd624e  Almost!  (mcarilli, Nov 19, 2021)
bfaae65  Merge remote-tracking branch 'upstream/master' into cudaMallocAsync  (mcarilli, Nov 19, 2021)
ddcec61  fix annoying link error  (mcarilli, Nov 24, 2021)
06d58b5  Error messages for compile or runtime CUDA version < 11.4  (mcarilli, Nov 29, 2021)
fdaaa9f  Completely avoid cudaGraphInstantiateFlagAutoFreeOnLaunch if cuda < 11.4  (mcarilli, Nov 30, 2021)
8f94458  Fix CUDA_VERSION usage  (mcarilli, Nov 30, 2021)
daf188f  resolve conflict in CUDACachingAllocator.cpp  (mcarilli, Dec 7, 2021)
eeeac81  Merge remote-tracking branch 'upstream/master' into cudaMallocAsync  (mcarilli, Dec 7, 2021)
4d7388b  Merge remote-tracking branch 'upstream/master' into cudaMallocAsync  (mcarilli, Dec 7, 2021)
89b03b3  fix warning string  (mcarilli, Dec 7, 2021)
eef5b31  reset properly  (mcarilli, Dec 9, 2021)
27a2d68  Merge remote-tracking branch 'upstream/master' into cudaMallocAsync  (mcarilli, Dec 13, 2021)
991d303  halfassed but defensible attempt at stat handling  (mcarilli, Dec 17, 2021)
1815536  For discussion, highlight strange interaction between cacheInfo and c…  (mcarilli, Dec 17, 2021)
4d8dca7  fdsa  (mcarilli, Dec 17, 2021)
363fe3c  Let's see how this makes the docs look  (mcarilli, Dec 20, 2021)
b13f118  before i forget  (mcarilli, Jan 1, 2022)
4089885  Merge remote-tracking branch 'upstream/master' into cudaMallocAsync  (mcarilli, Jan 14, 2022)
368a0de  typo  (mcarilli, Jan 14, 2022)
c80a05f  All graph tests in test_cuda.py pass except for test_graph_cudnn_dropout  (mcarilli, Jan 14, 2022)
db53e41  better skip  (mcarilli, Jan 16, 2022)
2293a94  Merge remote-tracking branch 'upstream/master' into cudaMallocAsync  (mcarilli, Jan 17, 2022)
bdc6d30  test_graph_cudnn_dropout now passes too  (mcarilli, Jan 18, 2022)
7e7c12b  worth a try  (mcarilli, Jan 20, 2022)
c90a3a0  enable p2p transfers for p2p-capable devices  (mcarilli, Jan 25, 2022)
c2d84ea  test_cuda.py passes on my machine  (mcarilli, Jan 26, 2022)
aa2bee7  TEST_CUDAMALLOCASYNC  (mcarilli, Jan 27, 2022)
6fbb8cc  fixes OOM handling and test_record_stream  (mcarilli, Jan 28, 2022)
78eb46e  fix regex for test_set_per_process_memory_fraction  (mcarilli, Jan 29, 2022)
a240ec8  Merge remote-tracking branch 'upstream/master' into cudaMallocAsync  (mcarilli, Jan 31, 2022)
a208bdb  test_out_of_memory_retry  (mcarilli, Feb 1, 2022)
1189ff7  s/THC/Native requested by natalia  (mcarilli, Feb 1, 2022)
05c4554  Resolve conflicts in CUDACachingAllocator.cpp  (mcarilli, Feb 2, 2022)
c33ce86  Resolve conflict in CUDACachingAllocator.cpp  (mcarilli, Feb 9, 2022)
c85cb9c  fix test_set_per_process_memory_fraction failure caused by knock-on e…  (mcarilli, Feb 10, 2022)
7fe0e75  typo  (mcarilli, Feb 10, 2022)
b534129  fix signature for unavailable version  (mcarilli, Feb 10, 2022)
2f7d1b5  comment  (mcarilli, Feb 17, 2022)
bc55994  Merge remote-tracking branch 'upstream/master' into cudaMallocAsync  (mcarilli, Feb 17, 2022)
e0ec118  temporary test disables to unblock local CI run  (mcarilli, Feb 17, 2022)
09709d5  resolve conflict in Copy.cu  (mcarilli, Mar 8, 2022)
4cbde6f  #if CUDA_VERSION >= 11040  (mcarilli, Mar 8, 2022)
6bbf293  use backend enum, use 71668 config parser, de-inline format_size, rem…  (mcarilli, Mar 16, 2022)
e810e0f  avoid compiling pool-specific stuff with cuda < 11.4  (mcarilli, Mar 18, 2022)
95af048  Resolves conflicts with #74213  (mcarilli, Mar 18, 2022)
1cc5d02  Resolving conflicts  (mcarilli, Mar 26, 2022)
a006a53  switches to flat_hash_set for recorded streams, restores original max…  (mcarilli, Mar 28, 2022)
3d23053  Add Python-facing torch.cuda.get_allocator_backend()  (mcarilli, Mar 30, 2022)
49d4332  Pytorch->PyTorch again lmao  (mcarilli, Mar 30, 2022)
3cc7a1f  rename malloc and free, avoid gratuitous exception-catching in cacheInfo  (mcarilli, Apr 5, 2022)
5401784  typos  (mcarilli, Apr 5, 2022)
ec5b6ff  Merge remote-tracking branch 'upstream/master' into cudaMallocAsync  (mcarilli, Apr 5, 2022)
c4a9acf  flake8 and mypy  (mcarilli, Apr 5, 2022)
4c79ed8  let's see if this cleans up some failures  (mcarilli, Apr 5, 2022)
2b1e0b2  un-static allocatorBackend checks in p2p and copy path  (mcarilli, Apr 5, 2022)
9a47eff  Implements backend load time initialization. Builds and import succeeds.  (mcarilli, Apr 22, 2022)
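Several of the commits above (a73877e, 6bbf293) route backend selection through the PYTORCH_CUDA_ALLOC_CONF environment variable rather than through a Python binding. The variable holds comma-separated key:value options, e.g. backend:cudaMallocAsync. Below is a rough Python sketch of that parsing scheme; the real parser lives in C++ inside CUDACachingAllocator.cpp, and parse_alloc_conf is an illustrative helper, not a PyTorch API:

```python
import os


def parse_alloc_conf(conf: str) -> dict:
    """Parse a PYTORCH_CUDA_ALLOC_CONF-style string of comma-separated
    key:value options, e.g. "backend:cudaMallocAsync,max_split_size_mb:512".
    Illustrative sketch only; PyTorch does this parsing in C++."""
    options = {}
    for entry in filter(None, conf.split(",")):
        key, sep, value = entry.partition(":")
        if not sep:
            raise ValueError(f"malformed option: {entry!r}")
        options[key.strip()] = value.strip()
    return options


# Example: select the cudaMallocAsync backend via the environment
# (the fallback string here is just for demonstration).
conf = os.environ.get("PYTORCH_CUDA_ALLOC_CONF", "backend:cudaMallocAsync")
print(parse_alloc_conf(conf))
```

Once the backend is chosen this way, the PR also exposes torch.cuda.get_allocator_backend() (commit 3d23053) so Python code can query which backend is active at runtime.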
cudaAllocatorModule
mcarilli committed Sep 20, 2021
commit 16ea79bc48fc37a99b65d36bb7a93e9da86cc472
2 changes: 1 addition & 1 deletion in torch/backends/cuda/__init__.py

@@ -93,7 +93,7 @@ def __setattr__(self, name, value):
         return torch._C._set_cublas_allow_tf32(value)


-class cuBLASModule:
+class cudaAllocatorModule:
     def __getattr__(self, name):
         assert name == "allocator", "Unknown attribute " + name
         return torch._C._get_cuda_allocator()
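The one-line rename in this commit follows the attribute-proxy pattern already used by cuBLASModule in torch/backends/cuda: a plain class whose __getattr__/__setattr__ forward to C-level getter/setter bindings, with an instance exposed as a module attribute. A self-contained sketch of the pattern, where _get_allocator and _set_allocator are hypothetical Python stand-ins for the torch._C bindings:

```python
# Module-level state that the hypothetical bindings read and write.
_state = {"allocator": "native"}


def _get_allocator():
    # Stand-in for torch._C._get_cuda_allocator()
    return _state["allocator"]


def _set_allocator(value):
    # Stand-in for a corresponding torch._C setter binding
    _state["allocator"] = value


class cudaAllocatorModule:
    # Attribute reads and writes are routed through the binding
    # functions, mirroring the cuBLASModule pattern.
    def __getattr__(self, name):
        assert name == "allocator", "Unknown attribute " + name
        return _get_allocator()

    def __setattr__(self, name, value):
        assert name == "allocator", "Unknown attribute " + name
        _set_allocator(value)


# An instance would be exposed as a module attribute, so user code
# could write e.g. torch.backends.cuda.allocator = "...".
allocator_module = cudaAllocatorModule()
print(allocator_module.allocator)   # prints "native"
allocator_module.allocator = "cudaMallocAsync"
print(allocator_module.allocator)   # prints "cudaMallocAsync"
```

Because __setattr__ never stores anything on the instance itself, every attribute access falls through to __getattr__, which keeps the proxy and the underlying binding state in sync by construction.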