[Inductor] Add fused_attention pattern matcher with additional clone #108141
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/108141
✅ No failures as of commit 8eb66b8 with merge base 60bb02a.
Can we add a `remove_extra_clones` call to `joint_graph_passes`, similar to how it is invoked in the post-grad pass? That way we wouldn't need to add extra patterns to match the redundant clones.
@jgong5 The clone removal pass has subtle correctness conditions, so it might be better to do it in post-grad. We could look into doing it in the joint graph, but I don't know if that should block this PR.
Oh, I didn't realize it. Yes, let's get this PR in first then.
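For context on why that ordering matters, below is a minimal sketch of what a clone-removal pass over an FX graph might look like. This is not Inductor's actual `remove_extra_clones` pass; the function name and the mutation check are assumptions, and the point is only to show the kind of condition that makes such a pass subtle: a `clone()` is only safe to drop if neither its input nor its output is mutated afterwards and the memory format is unchanged.

```python
import torch
from torch.fx import GraphModule, Node

aten = torch.ops.aten

def remove_redundant_clones(gm: GraphModule) -> GraphModule:
    """Illustrative only: drop aten.clone nodes that are provably no-ops."""
    for node in list(gm.graph.nodes):
        if node.op != "call_function" or node.target is not aten.clone.default:
            continue
        src = node.args[0]
        if not isinstance(src, Node):
            continue
        # A clone with an explicit memory_format may change strides; keep it.
        if node.kwargs.get("memory_format") is not None:
            continue

        # Conservative mutation check (an assumption, not Inductor's real check):
        # if any user of the source or of the clone is a mutating op, removing
        # the clone could change semantics, so keep it.
        def mutates(user: Node) -> bool:
            schema = getattr(user.target, "_schema", None)
            return schema is not None and any(
                a.alias_info is not None and a.alias_info.is_write
                for a in schema.arguments
            )

        if any(mutates(u) for u in list(src.users) + list(node.users)):
            continue

        node.replace_all_uses_with(src)
        gm.graph.erase_node(node)

    gm.graph.lint()
    gm.recompile()
    return gm
```

The conditions in the real pass are stricter than this sketch; the takeaway is just that the check is non-trivial, which is why deferring it to the post-grad pass is the safer default here.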
@pytorchbot merge
Merge started: your change will be merged once all checks pass (ETA 0-4 hours).
Pull Request resolved: #108141
Approved by: https://github.com/jgong5, https://github.com/eellison
(The same change also landed as #108327.)
Follow-up (#109118): When dropout is traced in inference, it creates a `clone()` instead of the training pattern of `rand()` etc. This was partially addressed manually in #108141, but that did not cover all of the patterns that include dropout, and there is no reason we should have to specify them manually. That PR updates the generated inference patterns to trace with `dropout_p = 0.0`.

Pull Request resolved: #109118
Approved by: https://github.com/drisspg, https://github.com/Valentine233
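A rough repro sketch of the behavior described in these commits, assuming the dropout decomposition from #106274 is registered and reachable via `torch._decomp.get_decompositions` (adjust the table if your build registers it elsewhere). The expected output is indicative, not guaranteed across versions:

```python
import torch
from torch._decomp import get_decompositions
from torch.fx.experimental.proxy_tensor import make_fx

aten = torch.ops.aten

# Assumption: aten.dropout has a registered decomposition (per #106274).
decomps = get_decompositions([aten.dropout])

def train_fn(x):
    return aten.dropout(x, 0.5, True)    # training=True

def eval_fn(x):
    return aten.dropout(x, 0.5, False)   # training=False (inference)

x = torch.randn(2, 8, 16, 16)

# Expected: the training graph contains the rand()/mask/mul sequence, while the
# inference graph degenerates to a clone() of the input -- the extra node the
# #108141 patterns had to account for.
print(make_fx(train_fn, decomposition_table=decomps)(x).graph)
print(make_fx(eval_fn, decomposition_table=decomps)(x).graph)
```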
A previous PR #106274 decomposes `aten.dropout` and would create a `clone()` when `eval()` or `p=0`. This makes many SDPA-related models fail to match the fused_attention pattern matchers.

This PR adds new fused_attention pattern matchers with an additional clone to re-enable SDPA op matching.
cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov
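To make the fix concrete, here is a schematic sketch of the kind of pattern/replacement pair this PR adds. The function names, shapes, and the `inv_scale` parameter are illustrative rather than the exact code from the PR, and the registration with Inductor's pattern matcher (in `torch/_inductor/fx_passes/fuse_attention.py`) is omitted. The only difference from the clone-free variants is the extra `clone()` in the search function, mirroring what the dropout decomposition leaves behind in `eval()` / `p=0`:

```python
import torch

aten = torch.ops.aten

# Hypothetical pattern variant: decomposed attention as it appears after
# aten.dropout(p=0 / eval) has been lowered to a clone() of the softmax output.
def _sfdp_pattern_with_clone(query, key, value, inv_scale):
    scores = torch.matmul(query, key.transpose(-2, -1)) / inv_scale
    attn = scores.softmax(dim=-1)
    attn = attn.clone()  # the extra node that broke the original patterns
    return torch.matmul(attn, value)

# Hypothetical replacement: map the whole subgraph to the fused SDPA op.
def _sfdp_replacement_with_clone(query, key, value, inv_scale):
    return aten.scaled_dot_product_attention(
        query.contiguous(),
        key.contiguous(),
        value.contiguous(),
        attn_mask=None,
        dropout_p=0.0,
        is_causal=False,
        scale=1.0 / inv_scale,
    )
```

If the `clone()` were not part of the search pattern, the matcher would see an extra node between the softmax and the final matmul and fail to rewrite the subgraph, which is exactly the failure mode described above.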