update pointwise cat heuristics by eellison · Pull Request #125772 · pytorch/pytorch

update pointwise cat heuristics #125772


Closed
wants to merge 5 commits

Conversation

@eellison (Contributor) commented May 8, 2024

Stack from ghstack (oldest at bottom):

Fix for #122871. There are two cases where we emit pointwise cat:

  • fusing into a pointwise use
  • horizontally fusing copy_ kernels

The regression I looked into previously was due to being overly aggressive in the latter case. I've updated the logic there so that we only emit the horizontal fusion when there are no reductions.
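For illustration only (this is not the heuristic code itself), here are two repro-style snippets corresponding to the two cases above; whether Inductor actually lowers either cat as a pointwise cat depends on its internal heuristics and may change between versions:

```python
import torch


@torch.compile
def cat_into_pointwise(a, b):
    # The cat output feeds a pointwise op (relu): the first case above,
    # a candidate for fusing the cat into its pointwise use.
    return torch.relu(torch.cat([a, b], dim=1))


@torch.compile
def cat_into_reduction(a, b):
    # The cat output feeds a reduction (sum). After this change, the
    # horizontal copy_ fusion is not emitted for cases like this one.
    return torch.cat([a, b], dim=1).sum(dim=1)


a = torch.randn(128, 64)
b = torch.randn(128, 32)
print(cat_into_pointwise(a, b).shape)  # torch.Size([128, 96])
print(cat_into_reduction(a, b).shape)  # torch.Size([128])
```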

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

pytorch-bot bot commented May 8, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125772

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit fb2a38f with merge base 3ccf107:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@Chillee (Collaborator) left a comment


Yay!

Perhaps some more tests for the pointwise cat?

@eellison (Contributor, Author) commented May 8, 2024

There are already a bunch, but I can add a couple more.

eellison added 2 commits May 9, 2024 11:01
Fix for #122871. There are two cases where we emit pointwise cat:

- fusing into a pointwise use
- horizontally fusing copy_ kernels

The regression I looked into previously was due to being overly aggressive in the latter case. I've updated the logic there so that we only emit the horizontal fusion in the case that we would have to emit separate copy_ kernels anyway. 

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]
@eellison (Contributor, Author) commented May 9, 2024

@pytorchbot merge

@pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) on May 9, 2024
@pytorchmergebot (Collaborator)

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team: raised by workflow job.

@eellison added the topic: not user facing label on May 9, 2024
@eellison (Contributor, Author)

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@izaitsevfb (Contributor)

@pytorchbot revert -m 'Fails numerical stability test for aps model, see D57215900' -c ghfirst

@pytorchmergebot (Collaborator)

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request May 11, 2024
This reverts commit d19d932.

Reverted #125772 on behalf of https://github.com/izaitsevfb due to: Fails numerical stability test for aps model, see D57215900.
@pytorchmergebot (Collaborator)

@eellison your PR has been successfully reverted.

eellison added 2 commits May 13, 2024 12:56
Fix for #122871. There are two cases where we emit pointwise cat:

- fusing into a pointwise use
- horizontally fusing copy_ kernels

The regression I looked into previously was due to being overly aggressive in the latter case. I've updated the logic there so that we only emit the horizontal fusion in the case that we would have to emit separate copy_ kernels anyway. 

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]
@eellison (Contributor, Author)

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

tinglvv pushed a commit to tinglvv/pytorch that referenced this pull request May 14, 2024
This reverts commit d19d932.

Reverted pytorch#125772 on behalf of https://github.com/izaitsevfb due to: Fails numerical stability test for aps model, see D57215900.
tinglvv pushed a commit to tinglvv/pytorch that referenced this pull request May 14, 2024
Fix for pytorch#122871. There are two cases where we emit pointwise cat:

- fusing into a pointwise use
- horizontally fusing copy_ kernels

The regression I looked into previously was due to being overly aggressive in the latter case. I've updated the logic there so that we only emit the horizontal fusion when there are no reductions.

Pull Request resolved: pytorch#125772
Approved by: https://github.com/Chillee
pytorchmergebot pushed a commit that referenced this pull request May 15, 2024
Relanding just the pad-in-a-single-pass portion of [the PR](#118522), not including the transpose logic. This was previously accepted and reviewed.

Pull Request resolved: #125773
Approved by: https://github.com/shunting314
ghstack dependencies: #125772
pytorchmergebot pushed a commit that referenced this pull request May 15, 2024
For mm inputs which are not inputs of the graph, assume that we can memory-plan them into the aten.cat and exclude the padding cost from the benchmarking comparison. Technically we also have to do a small amount of zero-writing, but that should be relatively small and covered by the weighting of the padding time by `1.1` (a rough sketch of this comparison follows this entry).

Pull Request resolved: #125780
Approved by: https://github.com/shunting314
ghstack dependencies: #125772, #125773
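A rough, hypothetical sketch of the comparison described in the commit message above (names and structure are mine; the real logic lives in Inductor's pad-mm heuristics and differs in detail):

```python
def should_pad_mm(ori_mm_time: float,
                  pad_mm_time: float,
                  pad_time: float,
                  input_is_graph_input: bool) -> bool:
    """Decide whether padding an mm input is expected to pay off.

    ori_mm_time: benchmarked matmul time on the original (unpadded) shapes.
    pad_mm_time: benchmarked matmul time on the padded shapes.
    pad_time:    benchmarked cost of materializing the padded input.
    input_is_graph_input: if False, assume the padded input can be
        memory-planned into the producing aten.cat, so its padding cost is
        excluded; the small remaining zero-writing is absorbed by the 1.1
        weighting below.
    """
    if not input_is_graph_input:
        pad_time = 0.0
    # Weight the padding cost by 1.1 to account for zero-writing and noise.
    return ori_mm_time > pad_mm_time + 1.1 * pad_time
```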
pytorchmergebot pushed a commit that referenced this pull request May 17, 2024
Otherwise you get an error in constant_pad_nd.

Pull Request resolved: #126475
Approved by: https://github.com/huydhn
ghstack dependencies: #125772, #125773, #125780
ZelboK pushed a commit to ZelboK/pytorch that referenced this pull request May 19, 2024
Relanding just the pad-in-a-single-pass portion of [the PR](pytorch#118522), not including the transpose logic. This was previously accepted and reviewed.

Pull Request resolved: pytorch#125773
Approved by: https://github.com/shunting314
ghstack dependencies: pytorch#125772
ZelboK pushed a commit to ZelboK/pytorch that referenced this pull request May 19, 2024
For mm inputs which are not inputs of the graph, assume that we can memory-plan them into the aten.cat and exclude the padding cost from the benchmarking comparison. Technically we also have to do a small amount of zero-writing, but that should be relatively small and covered by the weighting of the padding time by `1.1`.

Pull Request resolved: pytorch#125780
Approved by: https://github.com/shunting314
ghstack dependencies: pytorch#125772, pytorch#125773
ZelboK pushed a commit to ZelboK/pytorch that referenced this pull request May 19, 2024
Otherwise you get an error in constant_pad_nd.

Pull Request resolved: pytorch#126475
Approved by: https://github.com/huydhn
ghstack dependencies: pytorch#125772, pytorch#125773, pytorch#125780