[ROCm] CK Flash Attention Backend by xw285cornell · Pull Request #143695 · pytorch/pytorch · GitHub

[ROCm] CK Flash Attention Backend #143695
Closed
xw285cornell wants to merge 39 commits
39 commits
e8f1db3
[ROCm] CK Flash Attention Backend
alugorey Oct 7, 2024
627912f
Fix test_meta_outplace
alugorey Nov 5, 2024
b852e29
Add Tri Dao to LICENSE file for functionality pulled from flash-atten…
alugorey Nov 7, 2024
cf2b147
Add ck kernel blob list
alugorey Nov 7, 2024
b6777eb
Remove generated kernels from .gitignore
alugorey Nov 8, 2024
8c40287
Add generated files to git
alugorey Nov 8, 2024
9b3d3bf
remove plumbing to generate files
alugorey Nov 8, 2024
e57b9ca
remove examples dir from inclusion
alugorey Nov 15, 2024
8406e5f
Add files from example ck
alugorey Nov 15, 2024
ada5bc9
Remove references to ck examples
alugorey Nov 15, 2024
9b89fd7
Moved .h file to .hpp
alugorey Nov 18, 2024
4e86558
fix broken tests
xw285cornell Nov 19, 2024
2a73c20
Re-add file generation
alugorey Dec 5, 2024
c26e9e8
Fix bad cp/paste in .gitignore
alugorey Dec 5, 2024
d20b910
Fix lint
alugorey Dec 6, 2024
9397a72
fix cmakelists
alugorey Dec 6, 2024
180b36e
Remove old instance files from old location
alugorey Dec 6, 2024
04355d5
Fix fmha_bwd/fwd.hpp
alugorey Dec 6, 2024
4a21916
Remove old blob lists
alugorey Dec 6, 2024
f11e06e
add block_table for varlen_fwd
alugorey Dec 6, 2024
6f337fe
fix sdpUtils
alugorey Dec 9, 2024
ae33599
re-Add generated instances
alugorey Dec 12, 2024
81ed5f2
Remove ck cmakelists
alugorey Dec 12, 2024
1945136
remove calling ck cmakelists
alugorey Dec 12, 2024
f0686d1
Add blob lists
alugorey Dec 12, 2024
08a9b7a
Prevent user from setting ck as backend on unsupported arch
alugorey Dec 12, 2024
5a72c8e
remove generated files from .gitignore
alugorey Dec 12, 2024
e01c1a6
lint
alugorey Dec 13, 2024
9cd72d3
Add guards on USE_CK_FLASH_ATTENTION
alugorey Dec 13, 2024
b7f949c
Add target_compile_definition
alugorey Dec 13, 2024
63c42fe
Removing stale ck_kernel_blob_list.txt
jithunnair-amd Dec 18, 2024
cea43ba
Remove ck/fwd_blob_list.txt and ck/bwd_blob_list.txt because they con…
jithunnair-amd Dec 18, 2024
3761532
Rename CK autogenerated files using sha1sum to address Windows failur…
jithunnair-amd Dec 18, 2024
dbab30f
Remove flash_api.h from list of includes for hipify
jithunnair-amd Dec 18, 2024
fa0a89f
Add informative messages when USE_CK_FLASH_ATTENTION is enabled
jithunnair-amd Dec 18, 2024
cc6ba49
Use Cmake STATUS
jithunnair-amd Dec 18, 2024
17b694d
Add warning to users if building for more than one gfx arch in PYTORC…
jithunnair-amd Dec 19, 2024
223aa3d
makefile linter
xw285cornell Dec 21, 2024
9f5531f
Merge branch 'ROCm:main' into rocm_ck_sdpa
xw285cornell Jan 2, 2025
Add warning to users if building for more than one gfx arch in PYTORCH_ROCM_ARCH
jithunnair-amd authored and pytorchmergebot committed Jan 2, 2025
commit 17b694d7870457fed81bd3b9d053718cb3d2ce55
7 changes: 7 additions & 0 deletions aten/src/ATen/CMakeLists.txt
@@ -175,6 +175,13 @@ if(USE_FLASH_ATTENTION)
   if(DEFINED ENV{USE_CK_FLASH_ATTENTION})
     set(USE_CK_FLASH_ATTENTION $ENV{USE_CK_FLASH_ATTENTION})
     if(USE_CK_FLASH_ATTENTION STREQUAL "1")
+      if(DEFINED ENV{PYTORCH_ROCM_ARCH})
+        list(LENGTH PYTORCH_ROCM_ARCH NUM_ARCHS)
+        if(NUM_ARCHS GREATER 1)
+          message(WARNING "Building CK for multiple archs can increase build time considerably!
+          Consider setting PYTORCH_ROCM_ARCH env var value as the gfx arch you need to build for")
+        endif()
+      endif()
       message(STATUS "USE_CK_FLASH_ATTENTION is set; building PyTorch with CK Flash Attention enabled")
       file(GLOB flash_attention_hip_ck_hip "native/transformers/hip/flash_attn/ck/*.hip")
       list(APPEND native_transformers_hip_hip ${flash_attention_hip_ck_hip})
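
The arch-count check in this hunk relies on CMake's list(LENGTH ...) command. The following is a minimal standalone sketch of that pattern, not code from this PR. It assumes the arch list is held in a CMake variable as a semicolon-separated list; EXAMPLE_ROCM_ARCH and the file name count_archs.cmake are hypothetical names used only for illustration.

```cmake
# count_archs.cmake (hypothetical file name)
# Illustrative sketch only; run in script mode with: cmake -P count_archs.cmake
cmake_minimum_required(VERSION 3.10)

# Assumption: the arch list is semicolon-separated, which is how CMake
# represents lists. EXAMPLE_ROCM_ARCH is a stand-in, not a PyTorch variable.
set(EXAMPLE_ROCM_ARCH "gfx90a;gfx942")

# list(LENGTH <list> <out-var>) stores the number of list elements in <out-var>.
list(LENGTH EXAMPLE_ROCM_ARCH NUM_ARCHS)

if(NUM_ARCHS GREATER 1)
  # WARNING prints a CMake warning and lets processing continue.
  message(WARNING "Building for ${NUM_ARCHS} archs; consider narrowing the list to the gfx arch you need")
else()
  message(STATUS "Building for a single arch: ${EXAMPLE_ROCM_ARCH}")
endif()
```

With a single entry such as "gfx90a" the length is 1 and no warning is issued. Because CMake lists are semicolon-separated, a value delimited by spaces or commas would count as one element under list(LENGTH) unless it is split first.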