8000 [Inductor] short-term fix for needs_fixed_stride_order silent incorrectness by zou3519 · Pull Request #133452 · pytorch/pytorch · GitHub
[go: up one dir, main page]

Skip to content

[Inductor] short-term fix for needs_fixed_stride_order silent incorrectness #133452

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from

Conversation

zou3519
Copy link
Contributor
@zou3519 zou3519 commented Aug 14, 2024

Stack from ghstack (oldest at bottom):

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:

  • new test

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

…ctness

This is a short-term fix for
#128084, for the purposes of a
2.4.1.
needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

[ghstack-poisoned]
Copy link
pytorch-bot bot commented Aug 14, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133452

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 04a6738 with merge base cd565bc (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

zou3519 added a commit that referenced this pull request Aug 14, 2024
…ctness

This is a short-term fix for
#128084, for the purposes of a
2.4.1.
needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

ghstack-source-id: 057926f
Pull Request resolved: #133452
@zou3519 zou3519 added the ci-no-td Do not run TD on this PR label Aug 14, 2024
…ent incorrectness"

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

[ghstack-poisoned]
zou3519 added a commit that referenced this pull request Aug 14, 2024
…ctness

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

ghstack-source-id: 057926f
Pull Request resolved: #133452
…ent incorrectness"

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

[ghstack-poisoned]
zou3519 added a commit that referenced this pull request Aug 14, 2024
…ctness

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

ghstack-source-id: 8d19392
Pull Request resolved: #133452
@zou3519 zou3519 requested a review from eellison August 14, 2024 19:56
Copy link
Contributor
@eellison eellison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one comment

new_args = []
new_kwargs = {}
schema_args, schema_kwargs = torch._library.utils.schema_args_kwargs(schema)
for arg, fx_arg, schema_arg in zip(args, fx_node.args, schema_args):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no contract that the fx node at this point is using fx_node.args instead of fx_node.kwargs. What I would expect to see here is torch.fx.operator_schemas.normalize_function and then iterate over the schema by argname

Copy link
Contributor Author
@zou3519 zou3519 Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I added this to the list of things we might want to tighten restrictions for (#133250) -- it is the case that after AOTDispatcher (args, kwargs) are normalized, but I guess the inductor passes can do whatever they want

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way... the code as previously written assumes that fx_node.{args, kwargs} are zip-able with {args, kwargs}. Are you saying that's incorrect too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the code to handle the case where the schema_args/schema_kwargs can be different from args/kwargs. The way constrain_to_fx_strides is called (and from how the code was written previously) it looks like fx_node.{args, kwargs} being zip-able with {args, kwargs} is a reasonable assumption (but if we think that's a bug I can fix it in a follow-up -- I want this PR to go in without changing too much since it needs to be cherry-picked)

…ent incorrectness"

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

[ghstack-poisoned]
zou3519 added a commit that referenced this pull request Aug 15, 2024
…ctness

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

ghstack-source-id: 06dfc6d
Pull Request resolved: #133452
…ent incorrectness"

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

[ghstack-poisoned]
zou3519 added a commit that referenced this pull request Aug 15, 2024
…ctness

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

ghstack-source-id: 9558a63
Pull Request resolved: #133452
@zou3519 zou3519 requested a review from eellison August 15, 2024 14:55
@zou3519 zou3519 mentioned this pull request Aug 15, 2024
…ent incorrectness"

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

[ghstack-poisoned]
…ent incorrectness"

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

[ghstack-poisoned]
…ent incorrectness"

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

[ghstack-poisoned]
@zou3519 zou3519 added the ciflow/trunk Trigger trunk jobs on your pull request label Aug 16, 2024
…ent incorrectness"

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

[ghstack-poisoned]
@zou3519
Copy link
Contributor Author
zou3519 commented Aug 16, 2024

@pytorchbot merge -f "unrelated failure"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

zou3519 added a commit that referenced this pull request Aug 19, 2024
…ctness (#133452)

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

Pull Request resolved: #133452
Approved by: https://github.com/eellison
zou3519 added a commit that referenced this pull request Aug 19, 2024
…ctness (#133452)

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

Pull Request resolved: #133452
Approved by: https://github.com/eellison
atalman pushed a commit that referenced this pull request Aug 21, 2024
…ctness (#133452) (#133888)

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

Pull Request resolved: #133452
Approved by: https://github.com/eellison
atalman added a commit to atalman/pytorch that referenced this pull request Aug 22, 2024
pytorch-bot bot pushed a commit that referenced this pull request Sep 13, 2024
…ctness (#133452)

This is a low-risk short-term fix for
#128084, for the purposes of
2.4.1. The actual fix for that issue is more risky and we'll target 2.5.

needs_fixed_stride_order is silently incorrect with args that are
mutable because it creates clones of those args, writes into them, and
doesn't update the original args.

This PR makes it so that needs_fixed_stride_order doesn't apply to
inputs that are being mutated.

This PR doesn't completely fix the problem, but it makes it less
incorrect: most of the time the input already has the correct strides
but inductor fails to recognize it, and in those cases writing directly
to the input is fine.

Test Plan:
- new test

Pull Request resolved: #133452
Approved by: https://github.com/eellison
@github-actions github-actions bot deleted the gh/zou3519/1044/head branch September 17, 2024 01:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0