[Draft][WIP] Enable XPU path for FlexAttention by liangan1 · Pull Request #143553 · pytorch/pytorch · GitHub

[Draft][WIP] Enable XPU path for FlexAttention #143553


Draft
wants to merge 61 commits into
base: main
from

Conversation

liangan1
@liangan1 liangan1 commented Dec 19, 2024

Motivation

  1. Attention is a critical performance bottleneck in current LLMs, and FlexAttention is a good choice for covering the broad range of attention variants in transformer-series models. With FlexAttention, it is easy for us to enable paged attention and fused SDPA in the transformers repo on XPU devices. It also provides a candidate for processing attention in LLM ecosystem libraries, e.g., vLLM and SGLang, on XPU devices.
  2. FlexAttention is a good starting point for maturing the Intel Triton-based GEMM kernels. FlexAttention provides both a flexattention kernel and a flexdecoding kernel, covering both compute-bound and memory-bound GEMM computation, and different shapes should also be supported to serve LLM inference, e.g., head_dim=64, 96, 128, 256.
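
To illustrate how a single attention routine can cover many variants, here is a minimal pure-Python sketch of attention with a pluggable score modifier. It is not the PyTorch kernel and does not touch XPU. The names `reference_attention` and `causal` are hypothetical, and the three-argument `score_mod(score, q_idx, kv_idx)` signature is a simplification assumed for this sketch (the real FlexAttention callback also receives batch and head indices).

```python
import math


def reference_attention(q, k, v, score_mod=None):
    """Naive single-head attention over lists of float vectors.

    q: [q_len][d], k: [kv_len][d], v: [kv_len][d_v].
    score_mod is a hypothetical, simplified hook: (score, q_idx, kv_idx) -> score.
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_idx, q_vec in enumerate(q):
        scores = []
        for kv_idx, k_vec in enumerate(k):
            s = scale * sum(a * b for a, b in zip(q_vec, k_vec))
            if score_mod is not None:
                # The attention variant is expressed entirely in this hook.
                s = score_mod(s, q_idx, kv_idx)
            scores.append(s)
        # Numerically stable softmax; exp(-inf) evaluates to 0.0.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v_vec[j] for w, v_vec in zip(weights, v)) / total
            for j in range(len(v[0]))
        ])
    return out


def causal(score, q_idx, kv_idx):
    """Causal variant: a query may only attend to keys at or before its position."""
    return score if kv_idx <= q_idx else float("-inf")


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]

full = reference_attention(q, k, v)
masked = reference_attention(q, k, v, score_mod=causal)
```

With the causal modifier, the first query can only see the first key, so `masked[0]` equals `v[0]`, while `full[0]` mixes both value rows. Swapping in a different `score_mod` changes the variant without touching the attention loop, which is the property the motivation above relies on.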

What does this PR do?

  1. Enable the XPU device type for the FlexAttention kernel and UTs, to ensure all important UTs pass on XPU devices.
  2. For E2E model inference, ensure that LLM inference with FlexAttention is functionally ready.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

pytorch-bot bot commented Dec 19, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/143553

Note: Links to docs will display an error until the docs builds have been completed.

❌ 8 New Failures

As of commit aa4be5a with merge base 129a297:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

linux-foundation-easycla bot commented Dec 19, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@liangan1 liangan1 marked this pull request as draft December 19, 2024 04:39
@EikanWang EikanWang added the labels topic: not user facing (topic category), triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module), and ciflow/xpu (Run XPU CI tasks) Dec 24, 2024
pytorch-bot bot commented Dec 24, 2024

To add the ciflow label ciflow/xpu please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Dec 24, 2024
@EikanWang EikanWang self-requested a review December 24, 2024 02:14
@EikanWang EikanWang added the ciflow/xpu Run XPU CI tasks label Dec 24, 2024
pytorch-bot bot commented Dec 24, 2024

To add the ciflow label ciflow/xpu please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Dec 24, 2024
@liangan1
Author

@pytorchbot rebase

pytorch-bot bot commented Feb 10, 2025

You don't have permissions to rebase this PR since you are a first time contributor. If you think this is a mistake, please contact PyTorch Dev Infra.

@etaf etaf added the ciflow/xpu Run XPU CI tasks label Apr 28, 2025
remove duplicated wa
add format ignore flag
@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Apr 29, 2025
@anmyachev
Collaborator

Hi @hoshibara!

Could you merge the latest changes from main? This will help us update the PyTorch pin in the Triton XPU repo.

@hoshibara
Contributor

Hi @anmyachev, I've synced with the latest codebase.

Contributor
@drisspg drisspg left a comment

This looks pretty nice and clean and extends in the location I would expect, do you have any details on performance, test coverage, compilation time, etc.?

@liangan1
Author

@pytorchbot label ciflow/trunk

pytorch-bot bot commented May 16, 2025

To add these label(s) (ciflow/trunk) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@liangan1
Author

@pytorchbot label ciflow/xpu

pytorch-bot bot commented May 16, 2025

To add these label(s) (ciflow/xpu) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@hoshibara
Contributor

@pytorchbot label ciflow/xpu

@pytorch-bot pytorch-bot bot added the ciflow/xpu Run XPU CI tasks label May 16, 2025
@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label May 16, 2025
@hoshibara
Contributor

@pytorchbot label ciflow/xpu

pytorch-bot bot commented May 16, 2025

To add these label(s) (ciflow/xpu) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@hoshibara
Contributor

@pytorchbot label ciflow/xpu

@pytorch-bot pytorch-bot bot added the ciflow/xpu Run XPU CI tasks label May 16, 2025
Labels
ciflow/xpu (Run XPU CI tasks), module: dynamo, module: inductor, open source, topic: not user facing (topic category), triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)
Projects
Status: In Progress
10 participants