-
Notifications
You must be signed in to change notification settings - Fork 24.2k
[Inductor] Fix the lowering of squeeze when input is not contiguous #146746
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Inductor] Fix the lowering of squeeze when input is not contiguous #146746
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/146746
Note: Links to docs will display an error until the docs builds have been completed. ❌ 2 New FailuresAs of commit 0e955ea with merge base fa0592b ( NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This comment was marked as resolved.
This comment was marked as resolved.
What is returning a wrong stride ? The general contract is that intermediary inductor ir can have different strides than eager. if we are to run a fallback operator, we should be re-running it with FakeTensorMode in I think something is going wrong in that process. We should not need to constrain inductor ir intermediates to exactly match the striding from eager. |
torch/_inductor/lowering.py
Outdated
return ( | ||
( | ||
as_strided(x, new_shape, new_stride, new_offset) | ||
if is_storage_and_layout | ||
else view(x, new_shape) | ||
) | ||
if new_shape != x.get_size() | ||
else x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not artificially induced as_strided here. The problem for this is the same problem that we had in my existing pr. Which is that x
here is not contiguous:
Lines 2776 to 2780 in de964b9
# due to the size_hint's inability to process unbacked SymInts | |
# TODO: unbacked should not diverge from backed in determining striding | |
# Need to require contiguous here instead of realize, see: | |
# https://github.com/pytorch/pytorch/issues/145561 | |
x = ExternKernel.require_contiguous(x) |
This is another issue of apis not working well but regardless, if you update it to
x = ExternKernel.require_exact_strides(x, FlexibleLayout.contiguous_strides(x.get_size()))
the test passes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @eellison, thanks for your comment. Understanding that, ExternKernel.require_contiguous
only requires the stride order but not the exact stride number which causes this problem. Looking forward to your suggestions of fixing here: should we change x = ExternKernel.require_contiguous(x)
to x = ExternKernel.require_exact_strides(x, FlexibleLayout.contiguous_strides(x.get_size()))
? or change the implementation of require_contiguous
as require_exact_strides
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do ExternKernel.require_exact_strides(x, FlexibleLayout.contiguous_strides(x.get_size()))
. we can separately track making this the default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed. @eellison please take a look again.
@pytorchbot merge -f "unrelated CI failure" |
Merge startedYour change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
…146746) **Summary** Fix issue #143498. The issue happens when we lowering `select = torch.ops.aten.select.int(cat, 1, 0)`. For example, when `cat` is contiguous with size[2, 2] stride[2,1] - for eager, it returns a view of size[2,] stride[2,] - for Inductor lowering, it returns wrong stride 1 instead of 2 ``` TensorBox( ReinterpretView( StorageBox( ConcatKernel(name='buf10', layout=FixedLayout('cpu', torch.int64, size=[u0, 2], stride=[2, 1]), inputs=[ComputedBuffer(name='buf8', layout=NonOwningLayout('cpu', torch.int64, size=[u0, 1], stride=[2, 1]), data=Pointwise(device=device(type='cpu'), dtype=torch.int64, inner_fn=<function ReinterpretView.make_loader.<locals>.loader at 0x7f6b856449d0>, ranges=[u0, 1])), ComputedBuffer(name='buf9', layout=NonOwningLayout('cpu', torch.int64, size=[u0, 1], stride=[2, 1]), data=Pointwise(device=device(type='cpu'), dtype=torch.int64, inner_fn=<function ReinterpretView.make_loader.<locals>.loader at 0x7f6b85644790>, ranges=[u0, 1]))]) ), FixedLayout('cpu', torch.int64, size=[u0], stride=[**1**]), origins=OrderedSet([select]) ) ) ``` To fix this issue, we give the right stride when lowering of `squeeze`. **Test Plan** ``` python -u -m pytest -s -v test/inductor/test_unbacked_symints.py -k test_issue_143498 ``` Pull Request resolved: #146746 Approved by: https://github.com/jgong5, https://github.com/sanchitintel, https://github.com/eellison
…146746) **Summary** Fix issue #143498. The issue happens when we lowering `select = torch.ops.aten.select.int(cat, 1, 0)`. For example, when `cat` is contiguous with size[2, 2] stride[2,1] - for eager, it returns a view of size[2,] stride[2,] - for Inductor lowering, it returns wrong stride 1 instead of 2 ``` TensorBox( ReinterpretView( StorageBox( ConcatKernel(name='buf10', layout=FixedLayout('cpu', torch.int64, size=[u0, 2], stride=[2, 1]), inputs=[ComputedBuffer(name='buf8', layout=NonOwningLayout('cpu', torch.int64, size=[u0, 1], stride=[2, 1]), data=Pointwise(device=device(type='cpu'), dtype=torch.int64, inner_fn=<function ReinterpretView.make_loader.<locals>.loader at 0x7f6b856449d0>, ranges=[u0, 1])), ComputedBuffer(name='buf9', layout=NonOwningLayout('cpu', torch.int64, size=[u0, 1], stride=[2, 1]), data=Pointwise(device=device(type='cpu'), dtype=torch.int64, inner_fn=<function ReinterpretView.make_loader.<locals>.loader at 0x7f6b85644790>, ranges=[u0, 1]))]) ), FixedLayout('cpu', torch.int64, size=[u0], stride=[**1**]), origins=OrderedSet([select]) ) ) ``` To fix this issue, we give the right stride when lowering of `squeeze`. **Test Plan** ``` python -u -m pytest -s -v test/inductor/test_unbacked_symints.py -k test_issue_143498 ``` Pull Request resolved: #146746 Approved by: https://github.com/jgong5, https://github.com/sanchitintel, https://github.com/eellison
…ytorch#146746) **Summary** Fix issue pytorch#143498. The issue happens when we lowering `select = torch.ops.aten.select.int(cat, 1, 0)`. For example, when `cat` is contiguous with size[2, 2] stride[2,1] - for eager, it returns a view of size[2,] stride[2,] - for Inductor lowering, it returns wrong stride 1 instead of 2 ``` TensorBox( ReinterpretView( StorageBox( ConcatKernel(name='buf10', layout=FixedLayout('cpu', torch.int64, size=[u0, 2], stride=[2, 1]), inputs=[ComputedBuffer(name='buf8', layout=NonOwningLayout('cpu', torch.int64, size=[u0, 1], stride=[2, 1]), data=Pointwise(device=device(type='cpu'), dtype=torch.int64, inner_fn=<function ReinterpretView.make_loader.<locals>.loader at 0x7f6b856449d0>, ranges=[u0, 1])), ComputedBuffer(name='buf9', layout=NonOwningLayout('cpu', torch.int64, size=[u0, 1], stride=[2, 1]), data=Pointwise(device=device(type='cpu'), dtype=torch.int64, inner_fn=<function ReinterpretView.make_loader.<locals>.loader at 0x7f6b85644790>, ranges=[u0, 1]))]) ), FixedLayout('cpu', torch.int64, size=[u0], stride=[**1**]), origins=OrderedSet([select]) ) ) ``` To fix this issue, we give the right stride when lowering of `squeeze`. **Test Plan** ``` python -u -m pytest -s -v test/inductor/test_unbacked_symints.py -k test_issue_143498 ``` Pull Request resolved: pytorch#146746 Approved by: https://github.com/jgong5, https://github.com/sanchitintel, https://github.com/eellison
Stack from ghstack (oldest at bottom):
Summary
Fix issue #143498. The issue happens when we lowering
select = torch.ops.aten.select.int(cat, 1, 0)
.For example, when
cat
is contiguous with size[2, 2] stride[2,1]To fix this issue, we give the right stride when lowering of
squeeze
.Test Plan
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov