inductor change needed to update triton pin #107722
Conversation
The failed test:
seems to be related to this upgrade. We trigger an error in triton C++ code:
Cut a triton issue for the failure in mixed mm: triton-lang/triton#2156
with the new triton pin,
starts to fail with error:
I guess it may be due to the test using sparse tensors; triton may have changed its alignment requirements. @cpuhrsch I saw the test was added by #102095. Do you think this is a blocking test failure?
@shunting314 - Given that it used to work and now fails with the moved pin, I'd consider this a blocking failure. It'd be good to figure out why this is breaking with the new version of Triton. cc @amjames @pearu
By any chance does the test pass a view to triton that results in an unaligned address?
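For illustration, here is a minimal sketch of how that kind of misalignment can arise from a view (this is not the failing sparse test itself, just the mechanism being hypothesized):

```python
import torch

# Freshly allocated CUDA storage is aligned to well over 16 bytes, but a
# view with a non-zero storage offset can start at an unaligned address.
base = torch.randn(64, device="cuda")
view = base[1:33]  # storage offset of 1 fp32 element = 4 bytes

print(base.data_ptr() % 16)  # 0 -> aligned
print(view.data_ptr() % 16)  # 4 -> not 16-byte aligned
```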
failing tests?
There are 2 failed tests mentioned above:
Those are the only broken tests in CI.
For the test failure in
If this update breaks a kernel that worked previously, why is it OK to land it?
Regardless, we can create a standalone version of the kernel to reproduce this error more easily.
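For what it's worth, a hedged sketch of what such a standalone repro could look like (the fp16 x int8 pattern and the shapes are my assumptions, not the actual kernel from the failure):

```python
import torch

# Hypothetical standalone repro: an fp16 x int8 matmul compiled with
# inductor, which is the kind of mixed-dtype mm the failing test exercises.
@torch.compile
def mixed_mm(a, b):
    return torch.mm(a, b.to(a.dtype))

a = torch.randn(8, 8, device="cuda", dtype=torch.float16)
b = torch.randint(-128, 128, (8, 8), device="cuda", dtype=torch.int8)
print(mixed_mm(a, b))
```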
There are 2 more test failures because the testing environment is using an old version of triton: https://github.com/pytorch/pytorch/actions/runs/5947833455/job/16130966172 , https://github.com/pytorch/pytorch/actions/runs/5947833455/job/16130966318 . I think if we want, we can still make inductor work with older versions of triton at the cost of slightly more complex code; I'm just not sure whether we should do that or upgrade the triton version in those cases instead. EDIT: I partially fixed the BC issue by checking if CompiledKernel has a num_ctas attribute. To fully fix it, we also need to check if triton expects the new definition of instance_descriptor.
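A minimal sketch of the kind of attribute probe the edit describes (where exactly inductor does this may differ; the point is to branch on the attribute rather than on a version string):

```python
# Hedged sketch: newer triton pins add num_ctas to compiled kernels,
# older ones do not. `binary` is whatever triton compilation returned;
# the launcher names here are illustrative, not the exact inductor code.
def launch_args(binary, grid, stream):
    if hasattr(binary, "num_ctas"):
        # new triton: launch signature includes the extra CTA argument
        return (grid, binary.num_ctas, stream)
    # old triton: fall back to the pre-num_ctas launch signature
    return (grid, stream)
```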
I took a further look; actually the root cause of the issue is not the kernel. To repro: run
with the following breakpoints set:
At the first breakpoint, we are able to
The perf test looks mostly neutral (link), although
I'll rerun the perf tests. Edit:
Split the pin update into a separate PR per @shintaro-iwasaki's request, to make FBCode-side testing easier.
@@ -49,8 +49,25 @@ def is_aligned(x):
        return V.graph.sizevars.statically_known_multiple_of(x.expr, ALIGNMENT)
    raise NotImplementedError(f"unhandled {type(x)}: {x}")

def is_aligned_8(x):
could this share more code with is_aligned?
Yeah, I'll do that in a follow-up PR so this one can be cherry-picked earlier.
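For reference, a sketch of how the two checks could share code (SizeArg and the sizevars call are stubbed here just to make the shape concrete; the real inductor types differ):

```python
from dataclasses import dataclass

ALIGNMENT = 16

# Stub stand-ins for inductor's SizeArg / sizevars machinery.
@dataclass
class SizeArg:
    expr: int  # inductor uses a sympy expression here

def statically_known_multiple_of(expr: int, n: int) -> bool:
    return expr % n == 0

def is_aligned_to(x, alignment: int) -> bool:
    # One parameterized helper replaces the duplicated per-alignment checks.
    if isinstance(x, SizeArg):
        return statically_known_multiple_of(x.expr, alignment)
    raise NotImplementedError(f"unhandled {type(x)}: {x}")

def is_aligned(x) -> bool:
    return is_aligned_to(x, ALIGNMENT)

def is_aligned_8(x) -> bool:
    return is_aligned_to(x, 8)
```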
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).
n = max(next_power_of_2(V.graph.sizevars.size_hint(n)), 16)
k = max(next_power_of_2(V.graph.sizevars.size_hint(k)), 16)
# According to https://github.com/openai/triton/issues/2156#issuecomment-1695897424 |
BTW, it could be 16x32 if you want to try to improve perf a bit :).
So that means m can be 16 but n and k have to be at least 32 for int8?
Since we have tl.tensor with shapes [m, k], [k, n], [m, n] in the triton kernel.
> So that means m can be 16 but n and k have to be at least 32 for int8?

k must be >= 32, but n and m can be >= 16 if both a and b in a*b are not transposed.
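Putting that constraint together with the clamp in the diff above, a hedged sketch of a dtype-dependent minimum (next_power_of_2 is triton's helper; the plain-int size hints stand in for V.graph.sizevars.size_hint, and the function name is illustrative):

```python
import torch
from triton import next_power_of_2

def min_block_sizes(m_hint: int, n_hint: int, k_hint: int, dtype: torch.dtype):
    # Per the discussion above: int8 tensor-core mm needs k >= 32,
    # while m and n can stay at the usual minimum of 16.
    min_k = 32 if dtype == torch.int8 else 16
    m = max(next_power_of_2(m_hint), 16)
    n = max(next_power_of_2(n_hint), 16)
    k = max(next_power_of_2(k_hint), min_k)
    return m, n, k
```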
Resolve comment: #107722 (comment)
Resolve comment: #107722 (comment) Pull Request resolved: #108135 Approved by: https://github.com/jansel ghstack dependencies: #107722
Pull Request resolved: #108104 Approved by: https://github.com/desertfire ghstack dependencies: #107722
Stack from ghstack (oldest at bottom):
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov