support scaled mm on inductor #153602

shiyang-weng · 2025-05-15T08:21:20Z

Support scaled_mm in Inductor.
Fuse the following pattern into scaled_mm:

   #   + - - - - | - - - - - - | - - - - - +
   #   |    dq_per_tensor  dq_per_tensor   |
   #   |         |              |          |
   #   |    OPT(to_bf16)    OPT(to_bf16)   |
   #   |          \             |          |
   #   |                     permute       |
   #   |                     /             |
   #   |             addmm/mm              |
   #   |                |                  |
   #   |      OPT(quant_per_tensor)        |
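
To make the intent of the pattern concrete, here is a minimal plain-Python sketch of why the unfused graph (dequantize both inputs per tensor, permute the weight, then mm/addmm) can be replaced by a single scaled matmul that applies the two scales once to the product. It is reference math only, not the Inductor pass or the real scaled_mm kernel: every helper name (`dq_per_tensor`, `unfused`, `scaled_mm_ref`, and so on) is hypothetical, and the optional to_bf16 cast and the optional output quant_per_tensor are left out.

```python
import math

# Illustrative reference math only. All helper names are hypothetical and
# do not correspond to functions in PyTorch or Inductor.

def dq_per_tensor(q, scale):
    """Dequantize a 2-D list using one scale for the whole tensor."""
    return [[v * scale for v in row] for row in q]

def transpose(m):
    """Stand-in for the permute node applied to the weight."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Naive 2-D matrix multiply: (M x K) @ (K x N) -> (M x N)."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]

def add_bias(out, bias):
    """Add a 1-D bias to every row (the addmm case)."""
    if bias is None:
        return out
    return [[v + b for v, b in zip(row, bias)] for row in out]

def unfused(qx, x_scale, qw, w_scale, bias=None):
    """The pattern as drawn: dq -> dq -> permute(weight) -> mm/addmm."""
    x = dq_per_tensor(qx, x_scale)
    w = dq_per_tensor(qw, w_scale)
    return add_bias(matmul(x, transpose(w)), bias)

def scaled_mm_ref(qx, qw, x_scale, w_scale, bias=None):
    """Fused form: multiply the raw values, then apply both scales once."""
    raw = matmul(qx, transpose(qw))
    combined = x_scale * w_scale
    return add_bias([[v * combined for v in row] for row in raw], bias)

# Small example: activation is 2 x 2, weight is 3 x 2 (out_features x in_features).
qx = [[1.0, 2.0], [3.0, 4.0]]
qw = [[0.5, -1.0], [2.0, 1.5], [0.0, 1.0]]
bias = [1.0, 2.0, 3.0]

a = unfused(qx, 0.5, qw, 0.25, bias)
b = scaled_mm_ref(qx, qw, 0.5, 0.25, bias)
same = all(math.isclose(u, v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))
```

Because per-tensor scales are scalars, they factor out of the matrix product, which is what allows the two dequantize nodes to be folded into the matmul.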

~~cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov~~

pytorch-bot · 2025-05-15T08:21:25Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/153602

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 0c87280 with merge base 4015166:

NEW FAILURE - The following job has failed:

Check Labels / Check labels (gh)
RuntimeError: Error checking labels: PR does not have required labels

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2025-05-15T08:22:11Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

jerryzh168 · 2025-05-16T17:13:55Z

How does the quantization pattern get produced for this?

we have moved the pt2e flow to torchao recently, would it be better for this to be added in torchao: https://github.com/pytorch/ao/tree/main/torchao/quantization/pt2e/inductor_passes?

Xia-Weiwen · 2025-05-17T05:18:18Z

> we have moved the pt2e flow to torchao recently, would it be better for this to be added in torchao: https://github.com/pytorch/ao/tree/main/torchao/quantization/pt2e/inductor_passes?

Yeah I agree. I probably need to move this to Torchao.

scaled_mm draft

a1b80e6

shiyang-weng marked this pull request as draft May 15, 2025 08:21

pytorch-bot bot added the module: inductor label May 15, 2025

pytorchbot added the open source label May 15, 2025

shiyang-weng added 2 commits May 15, 2025 08:57

Merge branch 'main' into wengshiy/scaled_mm

c28256f

remove has_quant because scaled_mm not support output_scale yet

b9e8c91

shiyang-weng marked this pull request as ready for review May 16, 2025 07:14

shiyang-weng added 3 commits May 16, 2025 11:10

remove register embeddingbag; add extra_check but still not know how to check dtype

893b974

add comment

fd6c756

exact check dtype

0c87280

jerryzh168 requested review from leslie-fang-intel, sanchitintel and Xia-Weiwen May 16, 2025 17:12

jerryzh168 added the triaged label May 16, 2025
