[ROCm] Prevent accidental enablement of efficient attention. by xinyazhang · Pull Request #133331 · pytorch/pytorch · GitHub

[ROCm] Prevent accidental enablement of efficient attention. #133331


Status: Closed · wants to merge 5 commits

Conversation

@xinyazhang (Collaborator) commented on Aug 13, 2024:

Currently, Efficient Attention and Flash Attention share the same set of GPU
kernels on ROCm, and they have common limitations on head sizes.

Fixes #132004

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang
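The coupling described above can be illustrated with a small sketch. This is not PyTorch's actual dispatch code: the function names and the cap value of 256 are assumptions made here purely to show why, when two SDPA backends share one set of GPU kernels, they must also share one eligibility check.

```python
# Hypothetical sketch of backend-eligibility checks. Names and the cap
# value (256) are illustrative assumptions, not PyTorch's real code.
MAX_HEAD_DIM_SHARED = 256  # assumed head-size limit of the shared ROCm kernels

def can_use_flash_attention(head_dim: int) -> bool:
    """Flash Attention eligibility: head size within the shared kernel's cap."""
    return 0 < head_dim <= MAX_HEAD_DIM_SHARED

def can_use_efficient_attention(head_dim: int) -> bool:
    # The bug class this PR guards against: Efficient Attention being
    # accidentally enabled for head sizes the shared kernel cannot handle.
    # Since both backends run the same kernels on ROCm, the fix is to
    # apply the same constraint to both.
    return can_use_flash_attention(head_dim)

# A head size over the cap must be rejected by BOTH backends, so the
# dispatcher can fall back to the reference (math) implementation.
print(can_use_efficient_attention(64))   # within the cap
print(can_use_efficient_attention(512))  # over the cap
```

Based on the PR description, the failure mode is an efficient-attention check that is looser than the flash-attention one, accepting inputs the shared kernel cannot actually handle; aligning the two predicates closes that gap.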

…d_tensors_cuda

ROCm's Efficient Attention (GPU kernel shared with FA) is more
tolerant of its inputs.

pytorch-bot commented on Aug 13, 2024:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133331

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit e094699 with merge base 89795da:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pruthvistony added labels on Aug 13, 2024: module: rocm (AMD GPU support for PyTorch), rocm (this tag is for PRs from the ROCm team), rocm priority (high priority ROCm PRs, from performance or other aspects), ciflow/rocm (trigger "default" config CI on ROCm), topic: bug fixes (topic category)
@pruthvistony pruthvistony added this to the 2.4.1 milestone Aug 13, 2024
@malfet (Contributor) commented on Aug 16, 2024:

@xinyazhang is this ready for review? If so, can you please remove the draft status?

@xinyazhang (Collaborator, Author):

> is this ready for review? If so, can you please remove the draft status?

Yes this is ready. I'll implement your suggestion and move it out of draft status.

@xinyazhang xinyazhang marked this pull request as ready for review August 16, 2024 18:22
@colesbury added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) on Aug 16, 2024
@jithunnair-amd jithunnair-amd requested a review from malfet August 22, 2024 22:55
@jeffdaily jeffdaily changed the title Prevent accidental enablement of efficient attention. [ROCm] Prevent accidental enablement of efficient attention. Aug 26, 2024
@malfet (Contributor) commented on Aug 26, 2024:

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Aug 26, 2024
@pytorchmergebot (Collaborator):

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team: raised by workflow job.

@xinyazhang (Collaborator, Author):

@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Aug 26, 2024
@xinyazhang (Collaborator, Author):

@pytorchbot merge

@pytorchmergebot (Collaborator):

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

@pytorchmergebot (Collaborator):

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

@jithunnair-amd (Collaborator):

@pytorchbot merge -f "Unrelated CI failures. Critical fix needed for 2.4.1"

@pytorchmergebot (Collaborator):

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

@jithunnair-amd (Collaborator):

@pytorchbot cherry-pick --onto release/2.4 -c critical

pytorchbot pushed a commit that referenced this pull request Aug 27, 2024

Pull Request resolved: #133331
Approved by: https://github.com/malfet, https://github.com/jithunnair-amd

(cherry picked from commit 46ecc67)
@pytorchbot (Collaborator):

Cherry picking #133331

The cherry pick PR is at #134531 and it is recommended to link a critical cherry pick PR with an issue. The following tracker issues are updated:


atalman pushed a commit that referenced this pull request Aug 27, 2024
[ROCm] Prevent accidental enablement of efficient attention. (#133331)


Pull Request resolved: #133331
Approved by: https://github.com/malfet, https://github.com/jithunnair-amd

(cherry picked from commit 46ecc67)

Co-authored-by: Xinya Zhang <Xinya.Zhang@amd.com>
xinyazhang added a commit to ROCm/pytorch that referenced this pull request Aug 28, 2024
…#134531)

[ROCm] Prevent accidental enablement of efficient attention. (pytorch#133331)


Pull Request resolved: pytorch#133331
Approved by: https://github.com/malfet, https://github.com/jithunnair-amd

(cherry picked from commit 46ecc67)

Co-authored-by: Xinya Zhang <Xinya.Zhang@amd.com>
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Sep 9, 2024
…#134531) (#1565)

[ROCm] Prevent accidental enablement of efficient attention. (pytorch#133331)

Pull Request resolved: pytorch#133331
Approved by: https://github.com/malfet,
https://github.com/jithunnair-amd

(cherry picked from commit 46ecc67)

Co-authored-by: pytorchbot <soumith+bot@pytorch.org>
Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Sep 20, 2024
…#133331)


Pull Request resolved: pytorch#133331
Approved by: https://github.com/malfet, https://github.com/jithunnair-amd
Labels: ciflow/rocm · ciflow/trunk · Merged · module: rocm · open source · rocm priority · rocm · topic: bug fixes · topic: not user facing · triaged
Projects: none yet

Successfully merging this pull request may close these issues:

Memory Efficient Attention on ROCm results in image corruption on the diffusers SD3 pipeline

7 participants