removed zero dim cpu logic from fake_tensor.py by zero000064 · Pull Request #147501 · pytorch/pytorch · GitHub

removed zero dim cpu logic from fake_tensor.py #147501


Open: zero000064 wants to merge 73 commits into main from fake_tensor

Conversation

@zero000064 (Contributor) commented Feb 20, 2025:

Fixes #144748

In #144748, an inconsistency between eager mode and inductor mode is reported as a bug. The root cause is that fake_tensor.py's `_find_common_device` method takes zero-dim CPU tensors into account, while the device check in adaption.h does not.

This fix adds a list of ops that bypass the zero-dim CPU tensor check, to align with eager mode.
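
For context, here is a minimal repro sketch of the reported inconsistency, assuming a CUDA-enabled build (the op comes from #144748; exact error text and behavior may vary by PyTorch version):

```python
# Minimal repro sketch of the inconsistency reported in #144748.
# Assumes a CUDA-enabled build; exact behavior may vary by version.
import torch

a = torch.randn(4, device="cuda")
b = torch.tensor(1.0)  # zero-dim CPU tensor

# Eager mode: the device check (adaption.h) rejects the mixed-device call.
try:
    torch.nextafter(a, b)
except RuntimeError as e:
    print("eager raised:", e)

# Inductor mode: tracing goes through FakeTensor, whose _find_common_device
# special-cases zero-dim CPU tensors, so the same call is accepted.
compiled = torch.compile(torch.nextafter)
print("compiled returned:", compiled(a, b))
```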

cc @eellison @zou3519

pytorch-bot bot commented Feb 20, 2025:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/147501

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 28f7a88 with merge base f5e0806:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@zero000064 (Contributor, Author):

@pytorchbot label "topic: not user facing"

@pytorch-bot bot added the topic: not user facing label Feb 20, 2025
@zero000064 marked this pull request as ready for review February 20, 2025 03:55
@zero000064 requested a review from mruberry as a code owner February 20, 2025 03:55
@zero000064 (Contributor, Author):

@mruberry could you check whether you're the right reviewer for this?

@zero000064 marked this pull request as draft February 25, 2025 03:31
@zero000064 marked this pull request as ready for review February 25, 2025 03:32
@mikaylagawarecki added the triaged label (looked at by a team member, triaged, and prioritized into an appropriate module) Feb 25, 2025
@eellison (Contributor) left a comment:


Will let the folks more familiar with core review this, but this doesn't seem like the right fix.

Also: FakeTensor is meant to represent the semantics of eager; we should not make eager-only changes without a legitimate reason outside of FakeTensor.

@zero000064 (Contributor, Author) commented Mar 2, 2025:

@zou3519 could you review this pull request?

@zero000064 (Contributor, Author):

Checked the failures in https://github.com/pytorch/pytorch/actions/runs/13427521197/job/37951270051. That code seems to be auto-generated by some script, but I was unable to find the script for xpu; it might live outside this repository.

@huydhn (Contributor) commented Mar 14, 2025:

@pytorchbot rebase

@huydhn requested a review from eellison March 14, 2025 16:22
@pytorchmergebot (Collaborator):

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot (Collaborator):

Successfully rebased fake_tensor onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout fake_tensor && git pull --rebase)

@eellison (Contributor) left a comment:


Same comment: we shouldn't change eager to make it align with fake tensor; fake tensor should change instead.

@zero000064 (Contributor, Author):

@eellison Thanks for the comment. I take the suggestion to be removing the zero-dim CPU case on the fake tensor side instead of copying the eager-mode logic into it; a simplified sketch of the logic in question follows.
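
For illustration, a simplified sketch of the zero-dim CPU special case being discussed. This is not the actual `_find_common_device` source; the structure and names are assumptions meant only to show the behavior:

```python
# Simplified sketch (NOT the actual fake_tensor.py source) of the zero-dim
# CPU special case in device propagation.
from typing import Optional

import torch


def find_common_device(tensors: list[torch.Tensor]) -> Optional[torch.device]:
    common: Optional[torch.device] = None
    common_is_cpu_scalar = False
    for t in tensors:
        is_cpu_scalar = t.device.type == "cpu" and t.dim() == 0
        if common is None:
            common, common_is_cpu_scalar = t.device, is_cpu_scalar
        elif t.device == common:
            pass  # devices already agree
        elif is_cpu_scalar:
            pass  # zero-dim CPU tensor defers to the existing common device
        elif common_is_cpu_scalar:
            # the common device came from a CPU scalar; let it be overwritten
            common, common_is_cpu_scalar = t.device, False
        else:
            raise RuntimeError(f"device mismatch: {common} vs {t.device}")
    return common
```

Dropping the two CPU-scalar branches would make this check as strict as eager's adaption.h check for ops like torch.nextafter.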

@zero000064 (Contributor, Author):

@eellison I removed the zero-dim CPU tensor logic from fake_tensor.py; could you take a look again?

@zero000064 requested a review from eellison March 24, 2025 03:47
@zero000064 (Contributor, Author):

Merge conflict; working on resolving it.

@zero000064 (Contributor, Author):

@pytorchbot rebase

@pytorchmergebot (Collaborator):

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot (Collaborator):

Rebase failed due to Command git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/viable/strict pull/147501/head returned non-zero exit code 1

Rebasing (1/12)
Rebasing (2/12)
Rebasing (3/12)
Rebasing (4/12)
dropping 5a0517f2babcb03519373e1ec60d39c31e61344c modified device check logic and added tests -- patch contents already upstream
Rebasing (5/12)
Rebasing (6/12)
Rebasing (7/12)
Rebasing (8/12)
Auto-merging test/test_fake_tensor.py
Auto-merging torch/_subclasses/fake_tensor.py
CONFLICT (content): Merge conflict in torch/_subclasses/fake_tensor.py
error: could not apply 626ae16d046... removed zero dim cpu tensor logic from fake_tensor.py
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Could not apply 626ae16d046... removed zero dim cpu tensor logic from fake_tensor.py

Raised by https://github.com/pytorch/pytorch/actions/runs/14120360541

@pytorchmergebot (Collaborator):

Successfully rebased fake_tensor onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout fake_tensor && git pull --rebase)

@zero000064 (Contributor, Author):

@malfet I've fixed all the failures; could you re-enable CI?

@zero000064 (Contributor, Author):

@malfet Looks like there are no failures in CI; could you review this pull request?

@zero000064 requested a review from eellison May 9, 2025 22:29
@zero000064 (Contributor, Author):

@eellison I added a list to bypass the zero-dim CPU tensor logic; could you review the change again?
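
For reference, a hypothetical sketch of what such a bypass list could look like; the names below are illustrative assumptions, not the PR's actual identifiers:

```python
# Hypothetical sketch of an op bypass list for the zero-dim CPU check.
# Names are illustrative assumptions, not the PR's actual code.
import torch

# Ops whose eager kernels reject a zero-dim CPU operand mixed with another
# device, so FakeTensor should skip the CPU-scalar promotion for them.
ZERO_DIM_CPU_CHECK_BYPASS = {
    torch.ops.aten.nextafter.default,
}


def allows_cpu_scalar_promotion(op) -> bool:
    """Return False when the op must see the strict (eager-style) device check."""
    return op not in ZERO_DIM_CPU_CHECK_BYPASS
```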

@huydhn huydhn requested a review from eellison May 15, 2025 17:26
Labels
module: amp (automated mixed precision)
module: compiled autograd
module: cpu
module: dynamo
module: inductor
NNC
oncall: distributed
open source
release notes: distributed (checkpoint)
release notes: quantization
topic: not user facing
triaged
Development

Successfully merging this pull request may close these issues.

[inductor] [cuda] [fake tensor] torch.nextafter loose the check for different device tensor on inductor
8 participants