[inductor] Fix block ptr store if input is constant #148679
base: main
Conversation
Commits:
… block ptrs
Remove formatting changes
Lint
Rename vars
Dont reshape float32 scalars
Handle constants separately
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/148679
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 10ced66 with merge base c65ee72.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
```python
assert block_ptr not in advancements, (
    f"duplicate advancement for pointer '{block_ptr}' at type '{symt}'"
)
assert (
```
lintrunner changes
```diff
@@ -3039,9 +3059,9 @@ def sort(
     self.filter_masks(masks)
     masks = sorted(masks)
     assert not self._load_mask, "ops.sort not supported inside ops.masked"
     assert self.persistent_reduction, (
```
lintrunner changes
@pytorchbot label "topic: not user facing"
```python
assert isinstance(value, CSEVariable)
# See `_shaped_constant`. Tensor constants are only created
# if the dtype is not float32.
if value.dtype is not torch.float32:
```
This is a lot of action at a distance. Could we refactor this? And yes, I agree, we should add shape.
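For readers following the thread, here is a minimal sketch of the asymmetry the quoted comment describes, assuming (based on that comment, not on the actual inductor source) that float32 constants are emitted as bare literals while other dtypes are materialized as shaped tensors via tl.full. The helper name and the dtype table below are hypothetical:

```python
import torch

def shaped_constant_sketch(value, dtype, shape="[1]"):
    # float32 is Triton's default float type, so a bare literal suffices
    # and broadcasts implicitly; no tensor constant is created for it.
    if dtype is torch.float32:
        return repr(value)
    # Other dtypes would be truncated or mistyped as bare literals, so
    # they are materialized as shaped tensors; these are the constants
    # that later need reshape + broadcast handling at a block-ptr store.
    triton_types = {torch.int64: "tl.int64", torch.float64: "tl.float64"}
    return f"tl.full({shape}, {value!r}, {triton_types[dtype]})"

# shaped_constant_sketch(0.5, torch.float32) -> "0.5"
# shaped_constant_sketch(42, torch.int64)    -> "tl.full([1], 42, tl.int64)"
```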
```python
value = triton_reshape(
    str(value), [sympy.S.One], [sympy.S.One] * len(indexing.block_shape)
)
value = f"tl.broadcast_to({value}, {V.kernel.index_to_str(indexing.block_shape)})"
```
Nice catch! https://github.com/pytorch/pytorch/pull/151399/files recently fixed a related bug by always broadcasting block ptr stores to indexing.block_shape. In light of that change, would it make sense to reuse indexing.codegen_broadcast_and_reshape for scalars instead of having separate code paths here?
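For context on why the explicit reshape + broadcast is needed at all: tl.store through a block pointer does no implicit broadcasting, so a tensor constant of shape [1] must be rank-matched and broadcast to the block shape by hand. Below is a standalone Triton sketch of that pattern; it is illustrative, not actual inductor output, and the kernel name and tile sizes are made up:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def store_constant_kernel(out_ptr, M, N, BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr):
    # Block pointer covering one (BLOCK_M, BLOCK_N) tile of the output.
    block_ptr = tl.make_block_ptr(
        base=out_ptr, shape=(M, N), strides=(N, 1),
        offsets=(0, 0), block_shape=(BLOCK_M, BLOCK_N), order=(1, 0),
    )
    # A non-float32 constant arrives as a shape-[1] tensor...
    value = tl.full([1], 42, tl.int64)
    # ...so it must be reshaped to the block's rank and explicitly broadcast
    # before the store; block-ptr stores require an exact shape match.
    value = tl.reshape(value, [1, 1])
    value = tl.broadcast_to(value, [BLOCK_M, BLOCK_N])
    tl.store(block_ptr, value, boundary_check=(0, 1))

# Usage sketch:
#   out = torch.empty(64, 64, dtype=torch.int64, device="cuda")
#   store_constant_kernel[(1,)](out, 64, 64, BLOCK_M=64, BLOCK_N=64)
```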
Since block ptr stores require explicit broadcasts, the input to tl.store needs to be reshaped and broadcast. Currently it is assumed that the input being stored is in block form (e.g. XBLOCK), but the input can also be a scalar, so special handling is required to reshape and broadcast the scalar to the output block shape.

Ideally the shape of the input would be an attribute of a TritonCSEVariable via shape propagation, but that is not the case today. The patch in this PR determines whether the input is a constant by checking the arguments to an FX store node, which is not ideal; maybe there is an alternative and simpler method.

Fixes #ISSUE_NUMBER
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov