[ROCm] Prevent accidental enablement of efficient attention. (pytorch#134531)
[ROCm] Prevent accidental enablement of efficient attention. (pytorch#133331)
Currently, Efficient Attention and Flash Attention share the same set of GPU
kernels on ROCm and are therefore subject to the same head-size limitations.
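For context, here is a minimal sketch (not part of this PR) of how a caller can restrict scaled dot-product attention to specific backends via PyTorch's public `torch.nn.attention.sdpa_kernel` API. Because the efficient- and flash-attention kernels are shared on ROCm, a head dimension beyond the shared limit rules out both fused backends at once; `MAX_FUSED_HEAD_DIM` below is an illustrative placeholder, not the actual ROCm limit.

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Illustrative limit only -- not the actual shared kernel limit on ROCm.
MAX_FUSED_HEAD_DIM = 128

q = torch.randn(2, 8, 256, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 8, 256, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 8, 256, 64, device="cuda", dtype=torch.float16)

# Allow the fused backends only while the head dimension is within the
# assumed limit; otherwise fall back to the unfused math backend.
if q.size(-1) <= MAX_FUSED_HEAD_DIM:
    backends = [SDPBackend.FLASH_ATTENTION, SDPBackend.EFFICIENT_ATTENTION]
else:
    backends = [SDPBackend.MATH]

with sdpa_kernel(backends):
    out = F.scaled_dot_product_attention(q, k, v)
```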
Fixes pytorch#132004
Pull Request resolved: pytorch#133331
Approved by: https://github.com/malfet, https://github.com/jithunnair-amd
(cherry picked from commit 46ecc67)
Co-authored-by: Xinya Zhang <Xinya.Zhang@amd.com>