[CPU][Brgemm] add support for int8 brgemm #143384
Conversation
🔗 Helpful Links: See artifacts and rendered test results at hud.pytorch.org/pr/143384. Note: links to docs will display an error until the docs builds have been completed.
❗ 1 active SEV. If your PR is affected, please view it below.
✅ As of commit ea8db8b with merge base 9631d1a, you can merge normally (1 unrelated failure: the failing job was likely due to flakiness present on trunk and has been marked as unstable).
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Can you elaborate on the usage of batch_size? How is it to be used by SDPA?
The
In that case, why not make the
To do the softmax efficiently, which comes before the GEMM of A and V, we create the buffer shape for A as
Can you elaborate on the necessity of
Thanks, Jiong. We assume
Are you referring to the input of the softmax, the output of it, or both?
Both the input and the output of softmax.
Can we make the layouts of the input and output different? The input can be blocked and the output can be 2D; then we don't need batch-reduce semantics.
Thanks, I will try making the output layout different.
As the layout change may impact the kernel perf, I will continue this PR after the perf is confirmed.
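For context on the batch-reduce semantics discussed above, here is a minimal, purely illustrative sketch of what a batch-reduce GEMM (brgemm) computes: C += Σ_b A[b] @ B[b], accumulating a batch of sub-GEMMs into one output tile. The function name and plain-Python matrix representation are hypothetical and are not PyTorch's or oneDNN's actual brgemm API.

```python
def brgemm(a_batch, b_batch, c):
    """Batch-reduce GEMM sketch (illustrative only).

    a_batch: list of MxK blocks; b_batch: list of KxN blocks;
    c: MxN accumulator, updated in place and returned.
    Computes c += sum over b of a_batch[b] @ b_batch[b].
    """
    m, n = len(c), len(c[0])
    for a, b in zip(a_batch, b_batch):
        k = len(b)
        for i in range(m):
            for j in range(n):
                c[i][j] += sum(a[i][p] * b[p][j] for p in range(k))
    return c
```

With a blocked A (as in the softmax-buffer discussion), each block contributes one sub-GEMM to the same 2D output tile, which is why a blocked input would otherwise require batch-reduce semantics.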
Force-pushed from be43b40 to d592cc0 (Compare)
@jgong5 Hi, I have confirmed the perf and removed the batch_size support in brgemm. Please help review again, thanks!
Force-pushed from d592cc0 to ea8db8b (Compare)
@pytorchbot merge
Merge failed. Reason: Approvers from one of the following sets are needed:
@peterbell10 @ezyang Could you help take a look? Thanks!
okey dokey but you sure you don't want tests?
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
For use by the INT8 SDPA kernel, we add support for INT8 Brgemm.
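As a rough illustration of what an int8 GEMM step inside an INT8 SDPA kernel computes, the sketch below multiplies uint8 activations by int8 weights with a wide (int32-range) accumulator, which is why small 8-bit inputs do not overflow during the reduction. The function name and plain-Python representation are hypothetical, not the actual PyTorch/oneDNN brgemm interface.

```python
def int8_gemm(a_u8, b_s8):
    """Illustrative int8 GEMM: MxK uint8 times KxN int8.

    Returns an MxN matrix of accumulated integer products
    (Python ints stand in for an int32 accumulator).
    """
    m, k, n = len(a_u8), len(b_s8), len(b_s8[0])
    return [
        [sum(a_u8[i][p] * b_s8[p][j] for p in range(k)) for j in range(n)]
        for i in range(m)
    ]
```

In a real quantized kernel the int32 accumulator would then be rescaled (requantized) back to 8 bits using the input and output scales; that step is omitted here.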
cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov