[Inductor] Record Triton’s Base32 Cache Key in .best_config for Debugging #148981
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/148981
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit c2ae126 with merge base 86c6f71.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "topic: not user facing"
ed6e4b8
to
0d804de
Compare
@davidberard98 thanks for running the tests again. I've added the cache_hash in the make_launcher function, so hopefully the tests will pass now.
@davidberard98 could you run the tests once more to confirm that everything works? Thank you!
9672c95
to
3836d0c
Compare
@davidberard98 I just rebased with |
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here
Trying to merge so I can trigger the tests: @pytorchbot merge. Never mind, I can't. I'm using the changes from this PR for a tool we are building.
This PR needs to be approved by an authorized maintainer before merge.
f382f36
to
fb709ac
Compare
in order to have a match with the triton cache IRs
This reverts commit d3a66fa.
@davidberard98 I just rebased onto upstream/viable/strict. Let me know if you can run the tests. Thanks.
This is a follow-up to the reverted PR #147019:
Modified TorchInductor’s autotuning flow so that each best_config JSON file also includes the Triton “base32” (or base64) cache key.
Motivation
Debugging & Analysis: With this change, we can quickly identify which compiled binary and IRs belong to a given best config.
The impact is minimal, since the change only adds one extra field to .best_config. It can help with advanced performance tuning or kernel-level debugging.
Also, since Triton already stores the cubin/hsaco in its cache, developers and researchers can avoid setting store_cubin = True: they can take the cubin/hsaco from the Triton cache and, with the key recorded by this PR, easily match a best_config to the Triton cache directory of the "best" kernel.
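The matching step described above could look roughly like the sketch below. It is an illustration under assumptions, not code from this PR: the JSON field name `triton_cache_hash` and the helper names are hypothetical, and the sketch assumes that each Triton cache entry lives in a subdirectory named after the recorded key, under `TRITON_CACHE_DIR` or `~/.triton/cache` by default.

```python
import json
import os
from pathlib import Path
from typing import Optional


def default_triton_cache_root() -> Path:
    # Triton honours TRITON_CACHE_DIR; otherwise it falls back to ~/.triton/cache.
    return Path(os.environ.get("TRITON_CACHE_DIR", Path.home() / ".triton" / "cache"))


def find_triton_cache_dir(
    best_config_path,
    cache_root=None,
    key_field: str = "triton_cache_hash",  # assumed field name, check your .best_config
) -> Optional[Path]:
    """Return the Triton cache directory recorded in a .best_config file, if present."""
    with open(best_config_path) as f:
        best_config = json.load(f)

    cache_key = best_config.get(key_field)
    if not cache_key:
        # Older .best_config files written without the key have nothing to match.
        return None

    root = Path(cache_root) if cache_root is not None else default_triton_cache_root()
    candidate = root / cache_key
    return candidate if candidate.is_dir() else None
```

If a directory is returned, the cubin/hsaco and IR files for the winning config should be inside it, e.g. `sorted(p.name for p in cache_dir.iterdir())`.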
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @davidberard98 @clee2000 @eellison @masnesral