[Inductor] Construct subgraph with benchmarking args not example_inputs #153667
Conversation
Differential Revision: D74259569
CI status as of commit d53847f (merge base 7e16cb9): 2 new failures.
This pull request was exported from Phabricator. Differential Revision: D74484747
Force-pushed 0fb6441 to 2a36eba
Force-pushed 2a36eba to 01f423c
[Inductor] Construct subgraph with benchmarking args not example_inputs (pytorch#153667). Differential Revision: D74484747
Force-pushed 01f423c to d53847f
[Inductor] Construct subgraph with benchmarking args not example_inputs (pytorch#153667). Differential Revision: D74900879
Summary: If the inputs to a subgraph have FlexibleLayout, the subgraph does not currently freeze their layouts. As a result, the `example_inputs` generated for the subgraph may not be consistent in layout with the `args` passed in for benchmarking.

Test Plan:

```python
import torch

torch.set_default_device("cuda")

M, N, K = (4, 128, 14240)

@torch.compile(mode='max-autotune-no-cudagraphs')
def foo(x, y):
    return (x + 1) @ y

inps = [
    torch.rand([M, K], device='cuda', dtype=torch.bfloat16),
    torch.rand([K, N], device='cuda', dtype=torch.bfloat16),
]
foo(*inps)
```
This produces a FlexibleLayout with the stride change. A test was added to `test_subgraph_choice`, and `test_max_autotune.py` was also run.
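To make the layout issue concrete, here is a minimal standalone sketch (not the Inductor code path; the shapes and the use of `torch.empty_strided` are illustrative assumptions): two tensors with the same shape can carry different strides, so a benchmarking input generated without fixing the layout may not match the real argument.

```python
# Minimal sketch, not the Inductor code path: two tensors with the same shape can
# carry different layouts (strides), so inputs generated for benchmarking need to
# match the layout of the real args or the timing may not reflect the real kernel.
import torch

real_arg = torch.rand(128, 4).t()   # shape (4, 128), strides (1, 4): transposed view
example  = torch.rand(4, 128)       # shape (4, 128), strides (128, 1): contiguous

assert real_arg.shape == example.shape
assert real_arg.stride() != example.stride()

# One way (for illustration only) to build a benchmarking tensor whose layout
# matches the real argument exactly:
matched = torch.empty_strided(real_arg.shape, real_arg.stride(), dtype=real_arg.dtype)
assert matched.stride() == real_arg.stride()
```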
Differential Revision: D74484747
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov