
Trace attention inference patterns with p=0, cleanup #109118


Closed

wants to merge 4 commits

Conversation

eellison (Contributor) commented Sep 12, 2023

Stack from ghstack (oldest at bottom):

When dropout is traced in inference, it produces a clone() instead of the training pattern of rand() etc. This was partially addressed manually in #108141; however, that did not cover all of the patterns that include dropout, and there is no reason we should have to specify them manually.

This updates the generated inference patterns to trace with dropout_p = 0.0.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov
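The key observation is that dropout is the identity both in eval mode and at p = 0, so a single pattern traced with dropout_p = 0.0 matches what inference graphs actually contain (where traced dropout lowers to a clone() rather than the rand()/mask subgraph). A minimal sketch of that equivalence, assuming a toy scaled-dot-product-attention helper (sdpa_pattern is hypothetical, not the PR's actual pattern definition):

```python
# Minimal sketch: dropout is the identity both in eval mode and at p=0,
# so tracing attention patterns with dropout_p=0.0 covers inference.
# sdpa_pattern is a hypothetical stand-in, not the PR's pattern code.
import torch
import torch.nn.functional as F

def sdpa_pattern(q, k, v, dropout_p, training):
    # Toy scaled-dot-product attention with optional dropout.
    scores = (q @ k.transpose(-2, -1)) / (q.size(-1) ** 0.5)
    attn = F.dropout(scores.softmax(dim=-1), p=dropout_p, training=training)
    return attn @ v

q, k, v = (torch.randn(2, 4, 8, 16) for _ in range(3))

# Inference with p > 0 and training with p = 0 both reduce to the
# dropout-free computation, so one trace at p = 0.0 stands in for both.
ref = sdpa_pattern(q, k, v, dropout_p=0.0, training=False)
assert torch.equal(sdpa_pattern(q, k, v, dropout_p=0.5, training=False), ref)
assert torch.equal(sdpa_pattern(q, k, v, dropout_p=0.0, training=True), ref)
```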

pytorch-bot bot commented Sep 12, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/109118

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 079cfbe with merge base d4990ad:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

    args2 = self._clone_inputs(args1)

-   for training in [False, True]:
+   for training in [False] + ([True] if check_train else []):
Review comment (Contributor):

nit: [False, True] if check_train else [False]
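Both spellings produce the same list; an illustrative check (not part of the PR):

```python
# Illustrative only: the committed spelling and the reviewer's suggested
# spelling build the same training-mode list for both flag values.
for check_train in (False, True):
    committed = [False] + ([True] if check_train else [])
    suggested = [False, True] if check_train else [False]
    assert committed == suggested
```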

drisspg (Contributor) left a comment


The actual pattern definitions look much cleaner!

eellison (Contributor, Author) commented:

@pytorchbot merge

pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) on Sep 13, 2023
pytorchmergebot (Collaborator) commented:

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team: raised by workflow job

eellison (Contributor, Author) commented:

@pytorchbot merge

pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

pytorchmergebot (Collaborator) commented:

Merge failed

Reason: 1 jobs have failed, first few of them are: trunk / linux-focal-rocm5.6-py3.8 / test (default, 3, 3, linux.rocm.gpu)

Details for Dev Infra team: raised by workflow job

eellison (Contributor, Author) commented:

@pytorchbot merge -f "flakey failure, passing in the next pr"

pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as a last resort and instead consider -i/--ignore-current to continue the merge while ignoring current failures; this allows currently pending tests to finish and report their signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

pytorchmergebot (Collaborator) commented:

Merge failed

Reason: Command git -C /home/runner/work/pytorch/pytorch cherry-pick -x a24980962b62bb868987209cefb6febb34c42dab returned non-zero exit code 1

Auto-merging test/inductor/test_fused_attention.py
CONFLICT (content): Merge conflict in test/inductor/test_fused_attention.py
Auto-merging torch/_inductor/pattern_matcher.py
error: could not apply a24980962b6... Trace attention inference patterns with p=0, cleanup
hint: After resolving the conflicts, mark them with
hint: "git add/rm <pathspec>", then run
hint: "git cherry-pick --continue".
hint: You can instead skip this commit with "git cherry-pick --skip".
hint: To abort and get back to the state before "git cherry-pick",
hint: run "git cherry-pick --abort".
Details for Dev Infra team: raised by workflow job

eellison (Contributor, Author) commented:

@pytorchbot merge

pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

facebook-github-bot deleted the gh/eellison/538/head branch on September 18, 2023 at 14:23