use statically known true instead of guard size oblivious in bmm and mm inductor decompositions . #148893

laithsakka · 2025-03-10T17:42:01Z

Stack from ghstack (oldest at bottom):

-> use statically known true instead of guard size oblivious in bmm and mm inductor decompositions . #148893

This was discussed with @eellison, who recommended using statically_known_true here. The intuition: we already have 0/1 specializations in place, so if we reach these checks with dynamic shapes that are not already specialized,
we do not want the checks to specialize them; "a recompilation here is not justified".
These are all optimizations that do not change semantics.
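
To make the intuition concrete, here is a toy sketch in plain Python. It does not use the real PyTorch APIs: `SymDim`, `ToyShapeEnv`, `guarding_eq`, `statically_known_eq`, and `pick_mm_path` are hypothetical names invented for illustration, and the model is deliberately simplified (a guard is just a recorded string). It only shows the trade-off described above: a guarding check takes the fast path but pins the dynamic dimension to the traced value, while a statically-known check leaves the dimension dynamic and falls back to the generic path.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass(frozen=True)
class SymDim:
    """Hypothetical dynamic dimension: a symbol name plus the size seen while tracing."""
    name: str
    hint: int


Dim = Union[int, SymDim]


@dataclass
class ToyShapeEnv:
    """Hypothetical shape environment that only records the guards it installs."""
    guards: List[str] = field(default_factory=list)

    def guarding_eq(self, dim: Dim, value: int) -> bool:
        # Guard-style check: answer using the traced size and record a guard,
        # so a later call with a different size would need a recompile.
        if isinstance(dim, int):
            return dim == value
        result = dim.hint == value
        op = "==" if result else "!="
        self.guards.append(f"{dim.name} {op} {value}")
        return result

    def statically_known_eq(self, dim: Dim, value: int) -> bool:
        # Static-style check: True only when provable without the traced size.
        # A still-dynamic dimension yields False and no guard is recorded.
        if isinstance(dim, int):
            return dim == value
        return False


def pick_mm_path(k: Dim, is_one: Callable[[Dim, int], bool]) -> str:
    # Both paths compute the same result; the check only selects a faster one.
    return "mul_and_sum" if is_one(k, 1) else "generic_mm"


k = SymDim("k", hint=1)  # dynamic, but happened to be 1 while tracing

env_guarding = ToyShapeEnv()
path_guarding = pick_mm_path(k, env_guarding.guarding_eq)

env_static = ToyShapeEnv()
path_static = pick_mm_path(k, env_static.statically_known_eq)

# A dimension that is already a concrete int (already specialized)
# still takes the fast path under the static-style check.
path_specialized = pick_mm_path(1, env_static.statically_known_eq)
```

In this toy model, the guarding check leaves one recorded guard on `k`, while the static check leaves none; the fast path remains available whenever the dimension was already specialized to a constant before reaching the check.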

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov

[ghstack-poisoned]

pytorch-bot · 2025-03-10T17:42:04Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/148893


✅ You can merge normally! (1 Unrelated Failure)

As of commit b68c0ef with merge base 0f8613b ():

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

inductor / cuda12.6-py3.10-gcc9-sm86 / test (inductor_torchbench, 1, 2, linux.g5.4xlarge.nvidia.gpu) (gh) (trunk failure)
demucs

This comment was automatically generated by Dr. CI and updates every 15 minutes.


eellison

request review when tests are fixed


eellison

I'm a little worried this will regress performance in gpt-fast, but I agree that today this is not the best experience.

Getting these firing will be a nice end result of @bobrenjc93's work

laithsakka · 2025-04-25T04:09:07Z

We can always go back to guard_or_false if this ends up being a regression, @eellison.

laithsakka · 2025-04-25T04:09:47Z

@pytorchbot merge

pytorchmergebot · 2025-04-25T04:11:35Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


pytorchmergebot · 2025-04-25T10:10:12Z

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job waited more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

laithsakka · 2025-04-28T16:42:37Z

@pytorchbot merge -f "GH is broken"

pytorchmergebot · 2025-04-28T16:44:13Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.


use statically known true instead of guard size oblivous

0ff6b44

[ghstack-poisoned]

pytorch-bot bot added ciflow/inductor module: inductor labels Mar 10, 2025

laithsakka added a commit that referenced this pull request Mar 10, 2025

use statically known true instead of guard size oblivous

f8a6d88

ghstack-source-id: 0071a35 Pull Request resolved: #148893

laithsakka mentioned this pull request Mar 10, 2025

[DRAFT] make reshape work for reshapeing 1dim unbacked non-contig to anything #148899

Open


laithsakka changed the title ~~use statically known true instead of guard size oblivous~~ [DRAFT] use statically known true instead of guard size oblivous Mar 10, 2025

laithsakka requested a review from eellison March 10, 2025 18:48

laithsakka mentioned this pull request Mar 16, 2025

[Pushed as separate PR on different stack] cache kernel codegen and loading #149266

Closed

Update on "[DRAFT] use statically known true instead of guard size oblivous"

90f0943

[ghstack-poisoned]

eellison requested changes Mar 18, 2025


Update on "[DRAFT] use statically known true instead of guard size oblivous"

2e26d07

[ghstack-poisoned]

laithsakka added a commit that referenced this pull request Apr 21, 2025

use statically known true instead of guard size oblivous

19dc411

ghstack-source-id: 09e82ca Pull Request resolved: #148893

laithsakka changed the title ~~[DRAFT] use statically known true instead of guard size oblivous~~ use statically known true instead of guard size oblivious Apr 21, 2025

laithsakka changed the title ~~use statically known true instead of guard size oblivious~~ use statically known true instead of guard size oblivious in bmm and mm decompositions . Apr 21, 2025

laithsakka added the topic: not user facing label Apr 21, 2025

laithsakka added a commit that referenced this pull request Apr 21, 2025

use statically known true instead of guard size oblivous

55d0c15

ghstack-source-id: 2a375ed Pull Request resolved: #148893

laithsakka changed the title ~~use statically known true instead of guard size oblivious in bmm and mm decompositions .~~ use statically known true instead of guard size oblivious in bmm and mm inductor decompositions . Apr 21, 2025

laithsakka requested a review from eellison April 21, 2025 22:23

Divigroup-RAP pushed a commit to Divigroup-RAP/PYTORCH that referenced this pull request Apr 22, 2025

use statically known true instead of guard size oblivous

c8723ed

ghstack-source-id: 992156c Pull Request resolved: pytorch/pytorch#148893

laithsakka requested a review from bobrenjc93 April 22, 2025 17:12

eellison approved these changes Apr 24, 2025


pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Apr 25, 2025

pytorchmergebot added the merging label Apr 25, 2025

pytorchmergebot closed this in cbf8e0f Apr 28, 2025

pytorchmergebot added Merged and removed merging labels Apr 28, 2025
