[Inductor] Record Triton’s Base32 Cache Key in .best_config for Debugging #147019
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/147019
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures. As of commit 51edce0 with merge base 6c089f5.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "ciflow/inductor"
To add these label(s) (ciflow/inductor) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page). This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.
@pytorchbot label "topic: not user facing"
@fulvius31 would it be possible to add a test for this?
Hey @davidberard98, do you mean adding a test to verify that the new field is properly recorded in the best config JSON file?
- I tested this with a newer version of Triton (a commit from Jan 28), and for the commit hash recorded in .best_config, I can't seem to find a corresponding Triton hash. Should I expect the corresponding Triton hash to be somewhere in the /tmp/torchinductor_${USER} directory, or is it somewhere else? (And if it is in /tmp/torchinductor... then I think this might be broken for newer versions of Triton.)
- re: adding a test - yes, if it's not too hard to add a test, it would be great if we could check (a) whether the .best_config exists and contains the expected cache key, and (b) whether the cache key corresponds to a real inductor file somewhere.
The corresponding Triton hash should be in ~/.triton/cache/, which is the default.
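To make the cross-reference concrete, here is a minimal sketch of looking up the Triton cache directory recorded in a `.best_config` file. The field name `triton_cache_hash` is hypothetical, chosen for illustration; check the JSON this PR actually emits for the real key name, and note that the cache root may differ depending on how Triton was installed.

```python
import json
import os

def find_triton_cache_dir(best_config_path,
                          cache_root=os.path.expanduser("~/.triton/cache")):
    """Return the Triton cache directory recorded in a .best_config file.

    Assumes the new field is named 'triton_cache_hash' (hypothetical name
    for illustration only). Returns None if the field is missing or the
    directory does not exist under cache_root.
    """
    with open(best_config_path) as f:
        cfg = json.load(f)
    key = cfg.get("triton_cache_hash")
    if key is None:
        return None
    candidate = os.path.join(cache_root, key)
    return candidate if os.path.isdir(candidate) else None
```

With a helper like this, a debugging script can go straight from the winning autotune config to the compiled artifacts in the Triton cache.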
I can work on it, absolutely. Can I open a new PR for that, or do you want me to include it in this one?
@davidberard98 were you able to test it?
I've added a test for this PR as suggested, @davidberard98
letting @davidberard98 take this one
@fulvius31 Here's what my tmp directory looks like when I run your test (I put a breakpoint at the end of your test to look at the directory before the test gets removed). And my
No worries at all! Yes, the default Triton cache directory depends on whether you installed Triton from source or from torch. Btw, have you found the Triton cache directory in the new field I added for best_config?
No - all the triton cache artifacts are shown in the gist I pasted. If you delete your .triton/cache and re-run, do you still find the cache dir matching the .best_config file?
Can you paste the content of best_config using the code with the edits I made? @davidberard98
Yes, I do. Whenever the best_config is found, there is always a corresponding Triton cache dir.
@davidberard98 Based on your gist, on
ah, I see - thanks for pointing me to the right place. Approving for now, but I had a follow-up question: where does the name of the .best_config file come from? (i.e. [best_config_hash].best_config - where does [best_config_hash] come from?)
Thank you for the review, merge it when you can! So, as far as I understood, the
@pytorchbot merge -r
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: Comment with id 2697308151 not found. Details for Dev Infra team: raised by workflow job.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot revert -m "broke inductor test inductor/test_max_autotune.py::TestMaxAutotune::test_cat_max_autotune_extern GH job link HUD commit link on inductor workflow and rocm workflow" -c nosignal
@pytorchbot successfully started a revert job. Check the current status here. |
@fulvius31 your PR has been successfully reverted.
… for Debugging (#147019)" This reverts commit e3e45d9. Reverted #147019 on behalf of https://github.com/clee2000 due to broke inductor test inductor/test_max_autotune.py::TestMaxAutotune::test_cat_max_autotune_extern [GH job link](https://github.com/pytorch/pytorch/actions/runs/13653495421/job/38171259603) [HUD commit link](https://hud.pytorch.org/pytorch/pytorch/commit/e3e45d90d8578083da8b51a3b1d911e9a4523e5b) on inductor workflow and rocm workflow ([comment](#147019 (comment)))
…ugging (pytorch#147019) Modified TorchInductor’s autotuning flow so that each `best_config` JSON file also includes the Triton “base32” (or base64) cache key. **Motivation** Debugging & Analysis: With this change, we can quickly identify which compiled binary and IRs belong to a given best config. The impact is minimal since it is only an extra field in .best_config. It can help advanced performance tuning or kernel-level debugging. Also, since Triton already stores cubin/hsaco in its cache, developers/researchers can avoid setting `store_cubin = True`, since they can get the cubin/hsaco from the Triton cache and, with the code provided in this PR, easily match the best_config with the right Triton cache directory for the "best" kernel. Pull Request resolved: pytorch#147019 Approved by: https://github.com/davidberard98
Hi @clee2000, should I re-open another PR to push the commit with the fix?
I was able to get the error using ROCm: https://termbin.com/vr1i And I was able to fix it: fulvius31@0a63a54 The test passes: https://termbin.com/pt06
…ging (#148981) This is a follow-up PR of the reverted one #147019: Modified TorchInductor’s autotuning flow so that each best_config JSON file also includes the Triton “base32” (or base64) cache key. Motivation Debugging & Analysis: With this change, we can quickly identify which compiled binary and IRs belong to a given best config. The impact is minimal since it is only an extra field in .best_config. It can help advanced performance tuning or kernel-level debugging. Also, since Triton already stores cubin/hsaco in its cache, developers/researchers can avoid setting store_cubin = True, since they can get the cubin/hsaco from the Triton cache and, with the code provided in this PR, easily match the best_config with the right Triton cache directory for the "best" kernel. Pull Request resolved: #148981 Approved by: https://github.com/davidberard98
@pytorchbot revert -c ghfirst -m "Sorry but this is breaking internally. @davidberard98 can you please help get these changes validated? Details in D73628297. To validate the fixes internally, you can follow the instructions here: https://fburl.com/fixing-ghfirst-reverts" This failure is extra weird since it seems to be failing on a test that's on github. Failure is: And the stack is pointing to this line of code that's also on github right here:
@pytorchbot successfully started a revert job. Check the current status here. |
Reverting PR 147019 failed. Reason: Command
Details for Dev Infra team: raised by workflow job.
oh, clicked the wrong link in the diff. The follow up was the one to revert... |
Modified TorchInductor’s autotuning flow so that each best_config JSON file also includes the Triton “base32” (or base64) cache key.

Motivation

Debugging & Analysis: With this change, we can quickly identify which compiled binary and IRs belong to a given best config. The impact is minimal since it is only an extra field in .best_config. It can help advanced performance tuning or kernel-level debugging.

Also, since Triton already stores cubin/hsaco in its cache, developers/researchers can avoid setting store_cubin = True, since they can get the cubin/hsaco from the Triton cache and, with the code provided in this PR, easily match the best_config with the right Triton cache directory for the "best" kernel.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @davidberard98
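As a small illustration of the workflow described above: once the cache key in a .best_config points at a Triton cache directory, the compiled artifacts can be collected from it directly instead of re-running with store_cubin = True. The file extensions below are the ones Triton commonly writes (cubin/hsaco plus IR stages); this is a sketch, not the PR's implementation.

```python
import os

# Artifact extensions commonly found in a Triton cache entry:
# compiled binaries (.cubin for NVIDIA, .hsaco for AMD) and IR stages.
ARTIFACT_EXTS = (".cubin", ".hsaco", ".ttir", ".ttgir", ".llir", ".ptx")

def list_kernel_artifacts(cache_dir):
    """Return sorted paths of compiled-kernel artifacts in one cache entry."""
    return [
        os.path.join(cache_dir, name)
        for name in sorted(os.listdir(cache_dir))
        if name.endswith(ARTIFACT_EXTS)
    ]
```

Pointing this at the directory matching the recorded cache key yields the binary and IR files for the "best" kernel without any extra compilation.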