[ROCM] Properly disable Flash Attention/Efficient Attention with environment variables by xinyazhang · Pull Request #133866 · pytorch/pytorch

Closed
wants to merge 4 commits

Conversation

xinyazhang
Collaborator
@xinyazhang xinyazhang commented Aug 19, 2024

Now `USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0 python setup.py` can compile correctly.

Fixes #125230

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang
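As background for how the flags above take effect: PyTorch's `setup.py` build forwards `USE_*` environment variables to CMake as `-D` options. The sketch below is a simplified, illustrative model of that translation, not PyTorch's actual build code (the real logic lives in the build helper scripts); the function name `env_flag_to_cmake_option` is invented for illustration.

```python
import os

# Hypothetical sketch (NOT PyTorch's actual build code) of how a setup.py-style
# build forwards USE_* environment variables to CMake as -D defines.
# Values like "0", "off", "no", or "false" disable a feature.

FALSY = {"0", "off", "no", "false", ""}

def env_flag_to_cmake_option(name: str, default: str = "ON") -> str:
    """Translate an environment variable into a CMake -D define string."""
    raw = os.environ.get(name, default)
    value = "OFF" if raw.strip().lower() in FALSY else "ON"
    return f"-D{name}={value}"

if __name__ == "__main__":
    os.environ["USE_FLASH_ATTENTION"] = "0"
    os.environ["USE_MEM_EFF_ATTENTION"] = "0"
    args = [env_flag_to_cmake_option(n)
            for n in ("USE_FLASH_ATTENTION", "USE_MEM_EFF_ATTENTION")]
    print(args)  # ['-DUSE_FLASH_ATTENTION=OFF', '-DUSE_MEM_EFF_ATTENTION=OFF']
```

The PR's point is that these two options must actually reach (and be honored by) the CMake files that decide whether AOTriton-backed kernels get compiled.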

pytorch-bot bot commented Aug 19, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133866

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (3 Unrelated Failures)

As of commit 298fee6 with merge base df68315:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were already present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Collaborator
@jithunnair-amd jithunnair-amd left a comment


LGTM. @xinyazhang Have you done any local testing with a build that has them disabled?

@xinyazhang
Collaborator Author

LGTM. @xinyazhang Have you done any local testing with a build that has them disabled?

I ran a local sdpa.py benchmark script (obtained from Slack, so not suitable to share here) and forced it to use FA. The script aborted as expected after reporting that FA is not supported.
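The abort described above comes from SDPA's runtime capability check: when a backend was compiled out, forcing it leaves no usable kernel and the call errors out. Below is a simplified, illustrative model of that dispatch check, not the actual ATen dispatcher code; `run_sdpa` and `can_use_flash_attention` are invented names for the sketch.

```python
# Hypothetical model (NOT the actual ATen code) of how a forced flash-attention
# backend is rejected when the build disabled it.

def can_use_flash_attention(built_with_flash: bool) -> tuple[bool, str]:
    """Mimic the runtime capability check for the flash-attention backend."""
    if not built_with_flash:
        return False, "Torch was not compiled with flash attention."
    return True, ""

def run_sdpa(backend: str, built_with_flash: bool) -> str:
    if backend == "flash":
        ok, why = can_use_flash_attention(built_with_flash)
        if not ok:
            # Mirrors the abort the benchmark script hit when FA was forced.
            raise RuntimeError(f"No available kernel. {why}")
    return "ran with " + backend

if __name__ == "__main__":
    try:
        run_sdpa("flash", built_with_flash=False)
    except RuntimeError as e:
        print(e)  # No available kernel. Torch was not compiled with flash attention.
```

In real PyTorch the equivalent experiment is forcing the flash backend via the SDPA backend-selection context manager and observing the runtime error on a build with `USE_FLASH_ATTENTION=0`.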

jithunnair-amd
jithunnair-amd approved these changes Aug 21, 2024
@xinyazhang xinyazhang marked this pull request as ready for review August 21, 2024 15:02
@jeffdaily
Collaborator

CI build failure is real.

2024-08-20T06:28:34.5898661Z In file included from /var/lib/jenkins/workspace/aten/src/ATen/native/transformers/hip/attention_backward.hip:49:
2024-08-20T06:28:34.5900419Z /var/lib/jenkins/workspace/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:5:10: fatal error: 'aotriton/dtypes.h' file not found
2024-08-20T06:28:34.5901514Z     5 | #include <aotriton/dtypes.h>
2024-08-20T06:28:34.5901901Z       |          ^~~~~~~~~~~~~~~~~~~
2024-08-20T06:28:34.5902317Z 1 error generated when compiling for host.

@xinyazhang
Collaborator Author
xinyazhang commented Aug 21, 2024

CI build failure is real.

@jeffdaily It is expected if the AOTriton installation doesn't exist (which I believe is the case for the CI image):

2024-08-20T06:20:33.9782632Z --     ROCM_VERSION        : 
2024-08-20T06:20:33.9783026Z --     USE_FLASH_ATTENTION : ON
2024-08-20T06:20:33.9783425Z --     USE_MEM_EFF_ATTENTION : ON

Both should be OFF to disable AOTriton.
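The gating rule the comment describes: AOTriton is only required when at least one of the two attention backends is enabled, so a build summary showing both `ON` implies the build will try to include AOTriton headers. A small hedged sketch (invented helper `parse_flags`, not part of any PyTorch tooling) of checking the quoted summary lines:

```python
# Hypothetical sketch: parse CMake-summary lines like the ones quoted above to
# confirm whether the build will require AOTriton. Not PyTorch tooling.
summary = """\
--     USE_FLASH_ATTENTION : ON
--     USE_MEM_EFF_ATTENTION : ON
"""

def parse_flags(text: str) -> dict[str, bool]:
    """Map 'KEY : ON/OFF' summary lines to booleans."""
    flags = {}
    for line in text.splitlines():
        line = line.lstrip("- ").strip()
        if ":" in line:
            key, _, val = line.partition(":")
            flags[key.strip()] = val.strip() == "ON"
    return flags

if __name__ == "__main__":
    flags = parse_flags(summary)
    # AOTriton is needed iff at least one of the two backends is enabled.
    needs_aotriton = flags["USE_FLASH_ATTENTION"] or flags["USE_MEM_EFF_ATTENTION"]
    print(needs_aotriton)  # True for the summary quoted above
```

With both flags `OFF`, the same check yields `False`, which is the AOTriton-free build this PR makes possible.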

@xinyazhang
Collaborator Author

Okay, I think I found the problem. Within `Dependencies.cmake`, neither `USE_FLASH_ATTENTION` nor `USE_MEM_EFF_ATTENTION` is defined.

@xinyazhang xinyazhang force-pushed the xinyazhang/nofa-2.5main branch from e767d4b to 693d4e3 Compare August 21, 2024 23:08
@pruthvistony pruthvistony added the rocm This tag is for PRs from ROCm team label Aug 22, 2024
@jithunnair-amd jithunnair-amd requested a review from malfet August 22, 2024 22:32
@pruthvistony
Collaborator

@malfet ,
Please help on review of this PR.

@malfet malfet added the ciflow/rocm Trigger "default" config CI on ROCm label Aug 22, 2024
@jithunnair-amd
Collaborator

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Aug 22, 2024
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot
Collaborator

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

@xinyazhang
Collaborator Author

@jithunnair-amd it seems the merge was blocked by failing CI, which is supposed to be fixed by #133884.

@jithunnair-amd
Collaborator

@pytorchbot merge -f "Build issues resolved. This PR is for build scenarios not relevant to CI. Test failures are related to GQA which is addressed in #133884."

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Sep 11, 2024
…tion with environment variables (#1570)

Now `USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0 python setup.py` can
compile correctly.

This is cherry-picked version of
pytorch#133866
pruthvistony added a commit to ROCm/pytorch that referenced this pull request Sep 11, 2024
…tion with environment variables (#1571)

Now `USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0 python setup.py` can
compile correctly.

This is cherry-picked version of
pytorch#133866

---------

Co-authored-by: Pruthvi Madugundu <pruthvigithub@gmail.com>
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Sep 11, 2024
…ronment variables (#1542)

Now `USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0 python setup.py` can
compile correctly.

This is cherry-picked version of
pytorch#133866

Tested with `USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0 python
setup.py develop --user` and `python -c 'import torch'`
pytorch-bot bot pushed a commit that referenced this pull request Sep 13, 2024
…ronment variables (#133866)

Now `USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0 python setup.py` can compile correctly

Fixes #125230

Pull Request resolved: #133866
Approved by: https://github.com/jithunnair-amd, https://github.com/jeffdaily, https://github.com/malfet
Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Sep 20, 2024
…ronment variables (pytorch#133866)

Now `USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0 python setup.py` can compile correctly

Fixes pytorch#125230

Pull Request resolved: pytorch#133866
Approved by: https://github.com/jithunnair-amd, https://github.com/jeffdaily, https://github.com/malfet
jithunnair-amd pushed a commit to ROCm/pytorch that referenced this pull request Mar 17, 2025
…tion with environment variables (#1570)

Now `USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0 python setup.py` can
compile correctly.

This is cherry-picked version of
pytorch#133866
Labels
ciflow/rocm Trigger "default" config CI on ROCm ciflow/trunk Trigger trunk jobs on your pull request Merged module: rocm AMD GPU support for Pytorch open source rocm This tag is for PRs from ROCm team topic: not user facing topic category
Development

Successfully merging this pull request may close these issues.

ROCm: fatal error: aotriton/flash.h: No such file or directory when building with USE_ROCM=1
7 participants