[Inductor] Fix the lowering of squeeze when input is not contiguous #146746

leslie-fang-intel · 2025-02-08T03:50:25Z

Stack from ghstack (oldest at bottom):

-> [Inductor] Fix the lowering of squeeze when input is not contiguous #146746

Summary
Fix issue #143498. The issue happens when we lower select = torch.ops.aten.select.int(cat, 1, 0).

For example, when cat is contiguous with size [2, 2] and stride [2, 1]:

- in eager mode, select returns a view of size [2] and stride [2]
- in the Inductor lowering, it returns the wrong stride, 1 instead of 2

TensorBox(
  ReinterpretView(
    StorageBox(
      ConcatKernel(name='buf10', layout=FixedLayout('cpu', torch.int64, size=[u0, 2], stride=[2, 1]), inputs=[ComputedBuffer(name='buf8', layout=NonOwningLayout('cpu', torch.int64, size=[u0, 1], stride=[2, 1]), data=Pointwise(device=device(type='cpu'), dtype=torch.int64, inner_fn=<function ReinterpretView.make_loader.<locals>.loader at 0x7f6b856449d0>, ranges=[u0, 1])), ComputedBuffer(name='buf9', layout=NonOwningLayout('cpu', torch.int64, size=[u0, 1], stride=[2, 1]), data=Pointwise(device=device(type='cpu'), dtype=torch.int64, inner_fn=<function ReinterpretView.make_loader.<locals>.loader at 0x7f6b85644790>, ranges=[u0, 1]))])
    ),
    FixedLayout('cpu', torch.int64, size=[u0], stride=[**1**]),
    origins=OrderedSet([select])
  )
)
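As a rough illustration of the expected eager result, the stride arithmetic for a select view can be sketched in plain Python. The helper `select_view` is hypothetical and written only for this example; it is not Inductor's or ATen's implementation, and it uses concrete integers in place of the unbacked symbol u0.

```python
def select_view(size, stride, dim, index, offset=0):
    """Return (size, stride, offset) of the view obtained by selecting
    `index` along `dim`. Hypothetical helper for illustration only."""
    if not 0 <= index < size[dim]:
        raise IndexError(f"index {index} out of range for dim {dim}")
    # Selecting drops the dimension but leaves the other strides untouched.
    new_size = size[:dim] + size[dim + 1:]
    new_stride = stride[:dim] + stride[dim + 1:]
    # The selected index only shifts the storage offset.
    new_offset = offset + index * stride[dim]
    return new_size, new_stride, new_offset


# cat has size [2, 2] and stride [2, 1]; select along dim 1 at index 0.
print(select_view([2, 2], [2, 1], dim=1, index=0))  # ([2], [2], 0)
```

The remaining dimension keeps its original stride of 2, which is the value the ReinterpretView layout above should carry instead of 1.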

To fix this issue, we compute the correct stride when lowering squeeze.
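A minimal sketch of why the stride goes wrong, assuming squeeze is applied to an intermediate of size [u0, 1] and stride [2, 1] (with u0 replaced by the concrete value 3 here). Both helpers are hypothetical and only mirror the idea; they are not the code in torch/_inductor/lowering.py.

```python
def contiguous_strides(size):
    """Strides a freshly allocated contiguous tensor of `size` would have."""
    strides = []
    running = 1
    for extent in reversed(size):
        strides.append(running)
        running *= extent
    return strides[::-1]


def squeeze_view(size, stride, dim):
    """Drop `dim` if it has extent 1, keeping the strides of the other dims."""
    if size[dim] != 1:
        return list(size), list(stride)
    return size[:dim] + size[dim + 1:], stride[:dim] + stride[dim + 1:]


size, stride = [3, 1], [2, 1]  # a slice of a [3, 2] buffer, not contiguous

# Preserving the input strides gives the stride eager reports.
print(squeeze_view(size, stride, dim=1))  # ([3], [2])

# Recomputing strides from the new shape alone assumes contiguity
# and yields the stride of 1 seen in the IR dump.
print(contiguous_strides([3]))  # [1]
```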

Test Plan

python -u -m pytest -s -v test/inductor/test_unbacked_symints.py -k test_issue_143498

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov

[ghstack-poisoned]

pytorch-bot · 2025-02-08T03:50:29Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/146746


❌ 2 New Failures

As of commit 0e955ea with merge base fa0592b:

NEW FAILURES - The following jobs have failed:

inductor / unit-test / linux-jammy-cpu-py3.9-gcc11-inductor / test (inductor_amx, 2, 2, linux.8xlarge.amx) (gh)
'Test'
inductor / unit-test / linux-jammy-cpu-py3.9-gcc11-inductor / test (inductor_avx2, 2, 2, linux.10xlarge.avx2) (gh)
'Test'

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ghstack-source-id: b668dcb Pull Request resolved: #146746

[ghstack-poisoned]

ghstack-source-id: 1680d45 Pull Request resolved: #146746

test/inductor/test_unbacked_symints.py

torch/_inductor/lowering.py

[ghstack-poisoned]

ghstack-source-id: 1c08131 Pull Request resolved: #146746

eellison · 2025-02-12T19:55:56Z

for Inductor lowering, it returns wrong stride 1 instead of 2

What is returning a wrong stride? The general contract is that intermediate Inductor IR can have different strides than eager. If we are to run a fallback operator, we should re-run it with FakeTensorMode in process_kernel, using the correct strides from the Inductor IR.

I think something is going wrong in that process. We should not need to constrain Inductor IR intermediates to exactly match the striding from eager.

eellison · 2025-02-12T21:08:38Z

torch/_inductor/lowering.py

+    return (
+        (
+            as_strided(x, new_shape, new_stride, new_offset)
+            if is_storage_and_layout
+            else view(x, new_shape)
+        )
+        if new_shape != x.get_size()
+        else x


We should not artificially induce as_strided here. The problem is the same one we had in my existing PR: x here is not contiguous:

pytorch/torch/_inductor/ir.py, lines 2776 to 2780 in de964b9:

    # due to the size_hint's inability to process unbacked SymInts
    # TODO: unbacked should not diverge from backed in determining striding
    # Need to require contiguous here instead of realize, see:
    # https://github.com/pytorch/pytorch/issues/145561
    x = ExternKernel.require_contiguous(x)

This is another case of APIs not working well together, but regardless, if you update it to

    x = ExternKernel.require_exact_strides(x, FlexibleLayout.contiguous_strides(x.get_size()))

the test passes.

Hi @eellison, thanks for your comment. As I understand it, ExternKernel.require_contiguous only enforces the stride order, not the exact stride values, which causes this problem. How would you suggest fixing it: should we change x = ExternKernel.require_contiguous(x) to x = ExternKernel.require_exact_strides(x, FlexibleLayout.contiguous_strides(x.get_size())), or change the implementation of require_contiguous to behave like require_exact_strides?

Let's do ExternKernel.require_exact_strides(x, FlexibleLayout.contiguous_strides(x.get_size())). We can separately track making this the default.
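To make the distinction discussed above concrete, here is a small plain-Python sketch of a stride-order check versus an exact-stride check. The three helpers are hypothetical stand-ins written for this example; they only approximate the behaviour described in this thread and are not the implementations of require_contiguous, require_exact_strides, or FlexibleLayout.contiguous_strides. The unbacked size u0 is replaced by the concrete value 3.

```python
def contiguous_strides(size):
    """Strides of a contiguous (row-major) tensor with the given size."""
    strides = []
    running = 1
    for extent in reversed(size):
        strides.append(running)
        running *= extent
    return strides[::-1]


def has_contiguous_stride_order(stride):
    """Order-only check: strides never increase from outer to inner dim."""
    return all(outer >= inner for outer, inner in zip(stride, stride[1:]))


def has_exact_contiguous_strides(size, stride):
    """Exact check: strides equal the contiguous strides for this size."""
    return list(stride) == contiguous_strides(size)


# Layout of buf8/buf9 in the IR dump, with u0 = 3 (an assumption).
size, stride = [3, 1], [2, 1]

print(has_contiguous_stride_order(stride))         # True
print(has_exact_contiguous_strides(size, stride))  # False
print(contiguous_strides(size))                    # [1, 1]
```

A layout like this passes an order-only check while still not being contiguous, which is why requiring exact strides changes the outcome.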

Changed. @eellison please take a look again.

[ghstack-poisoned]

ghstack-source-id: dd5a327 Pull Request resolved: #146746

leslie-fang-intel · 2025-02-15T01:31:29Z

@pytorchbot merge -f "unrelated CI failure"

pytorchmergebot · 2025-02-15T01:32:52Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team


…146746)

**Summary**
Fix issue #143498. The issue happens when we lowering `select = torch.ops.aten.select.int(cat, 1, 0)`.

For example, when `cat` is contiguous with size[2, 2] stride[2,1]

- for eager, it returns a view of size[2,] stride[2,]
- for Inductor lowering, it returns wrong stride 1 instead of 2

```
TensorBox(
  ReinterpretView(
    StorageBox(
      ConcatKernel(name='buf10', layout=FixedLayout('cpu', torch.int64, size=[u0, 2], stride=[2, 1]), inputs=[ComputedBuffer(name='buf8', layout=NonOwningLayout('cpu', torch.int64, size=[u0, 1], stride=[2, 1]), data=Pointwise(device=device(type='cpu'), dtype=torch.int64, inner_fn=<function ReinterpretView.make_loader.<locals>.loader at 0x7f6b856449d0>, ranges=[u0, 1])), ComputedBuffer(name='buf9', layout=NonOwningLayout('cpu', torch.int64, size=[u0, 1], stride=[2, 1]), data=Pointwise(device=device(type='cpu'), dtype=torch.int64, inner_fn=<function ReinterpretView.make_loader.<locals>.loader at 0x7f6b85644790>, ranges=[u0, 1]))])
    ),
    FixedLayout('cpu', torch.int64, size=[u0], stride=[**1**]),
    origins=OrderedSet([select])
  )
)
```

To fix this issue, we give the right stride when lowering of `squeeze`.

**Test Plan**
```
python -u -m pytest -s -v test/inductor/test_unbacked_symints.py -k test_issue_143498
```

Pull Request resolved: #146746
Approved by: https://github.com/jgong5, https://github.com/sanchitintel, https://github.com/eellison


Update

d0ce07a

[ghstack-poisoned]

pytorch-bot bot added ciflow/inductor module: inductor labels Feb 8, 2025

leslie-fang-intel added a commit that referenced this pull request Feb 8, 2025

[Inductor] Fix the lowering of squeeze when input is not contiguous

9f405ce

ghstack-source-id: b668dcb Pull Request resolved: #146746

leslie-fang-intel added topic: not user facing topic category ciflow/trunk Trigger trunk jobs on your pull request labels Feb 8, 2025

pytorchbot added the open source label Feb 8, 2025

leslie-fang-intel marked this pull request as draft February 8, 2025 04:30

Update

8d397aa

[ghstack-poisoned]

leslie-fang-intel added a commit that referenced this pull request Feb 8, 2025

[Inductor] Fix the lowering of squeeze when input is not contiguous

32a83bb

ghstack-source-id: 1680d45 Pull Request resolved: #146746

leslie-fang-intel requested review from eellison and jgong5 February 8, 2025 10:04

leslie-fang-intel marked this pull request as ready for review February 9, 2025 02:56

jgong5 reviewed Feb 9, 2025

test/inductor/test_unbacked_symints.py

torch/_inductor/lowering.py (outdated)

Update

e5f48e5

[ghstack-poisoned]

leslie-fang-intel added a commit that referenced this pull request Feb 10, 2025

[Inductor] Fix the lowering of squeeze when input is not contiguous

4500167

ghstack-source-id: 1c08131 Pull Request resolved: #146746

leslie-fang-intel requested review from jgong5 and sanchitintel February 10, 2025 01:53

jgong5 approved these changes Feb 10, 2025


sanchitintel approved these changes Feb 11, 2025


eellison reviewed Feb 12, 2025


leslie-fang-intel requested a review from eellison February 13, 2025 02:23

Update

0e955ea

[ghstack-poisoned]

leslie-fang-intel added a commit that referenced this pull request Feb 14, 2025

[Inductor] Fix the lowering of squeeze when input is not contiguous

d65c853

ghstack-source-id: dd5a327 Pull Request resolved: #146746

eellison approved these changes Feb 15, 2025


pytorchmergebot added the merging label Feb 15, 2025

pytorchmergebot added the Merged label Feb 15, 2025

pytorchmergebot closed this in c1fcba3 Feb 15, 2025

pytorchmergebot removed the merging label Feb 15, 2025

leslie-fang-intel mentioned this pull request Feb 16, 2025

AOTI_TORCH_CHECK failed in aot_compile-d model #143498

Closed

eellison mentioned this pull request Mar 3, 2025

Make require_contiguous require exact strides instead of stride order #148235

Open

github-actions bot deleted the gh/leslie-fang-intel/180/head branch March 23, 2025 02:16
