[XPU][Inductor] Update Intel triton for release 2.7. by etaf · Pull Request #147727 · pytorch/pytorch · GitHub

[XPU][Inductor] Update Intel triton for release 2.7. #147727


Closed
wants to merge 18 commits into from

pytorch-bot bot commented Feb 24, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/147727

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit d9277e9 with merge base e02a2ca (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@etaf etaf marked this pull request as draft February 24, 2025 07:08
@etaf etaf changed the title [XPU][Inductor] Update Intel triton for release 2.7. [WIP][XPU][Inductor] Update Intel triton for release 2.7. Feb 24, 2025
@etaf etaf added the ciflow/xpu Run XPU CI tasks label Feb 24, 2025
@etaf etaf added the keep-going Don't stop on first failure, keep running tests until the end label Feb 24, 2025
etaf added a commit that referenced this pull request Feb 25, 2025
ghstack-source-id: ac9eb19
Pull Request resolved: #147727
etaf added a commit that referenced this pull request Feb 26, 2025
ghstack-source-id: 4334098
Pull Request resolved: #147727
etaf added a commit that referenced this pull request Feb 27, 2025
ghstack-source-id: dc6d278
Pull Request resolved: #147727
etaf added a commit that referenced this pull request Feb 27, 2025
ghstack-source-id: 1b7a25c
Pull Request resolved: #147727
@alexbaden
Collaborator
alexbaden commented Feb 27, 2025

Failures we are tracking:

@etaf
Collaborator Author
etaf commented Feb 28, 2025

> Failures we are tracking:

Thanks @alexbaden.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 amjames chauhang aakhundov

[ghstack-poisoned]
@anmyachev
Collaborator

#148270 is also needed for inductor/test_aot_inductor.py -k test_triton_kernel_weird_param_order_xpu

@anmyachev
Collaborator

> > Failures we are tracking:
>
> Thanks @alexbaden.

@alexbaden hi, there is another Triton regression: inductor/test_torchinductor_opinfo.py::TestInductorOpInfoXPU::test_comprehensive_masked_std_xpu_int64

@etaf are these all the errors observed on the full test suite, or is only part of the test suite still being run? I'm trying to understand how completely we see the current problems at the moment, and whether new ones might appear unexpectedly.

@etaf
Collaborator Author
etaf commented Mar 4, 2025

> > > Failures we are tracking:
> >
> > Thanks @alexbaden.
>
> @alexbaden hi, there is another Triton regression: inductor/test_torchinductor_opinfo.py::TestInductorOpInfoXPU::test_comprehensive_masked_std_xpu_int64
>
> @etaf are these all the errors observed on the full test suite, or is only part of the test suite still being run? I'm trying to understand how completely we see the current problems at the moment, and whether new ones might appear unexpectedly.

Hi, @anmyachev:
By default, a PyTorch CI job stops as soon as it hits the first failed test case. Because I added the keep-going label, each job now runs all scheduled test cases to completion. The interface still displays only one failure per job by default, so to see every failed case you need to check the raw log.

Regarding the specific test case inductor/test_torchinductor_opinfo.py::TestInductorOpInfoXPU::test_comprehensive_masked_std_xpu_int64, it does exist in the log at https://ossci-raw-job-status.s3.amazonaws.com/log/38095537613. It might have been overlooked earlier.
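
For anyone who wants to list every failure from a raw log rather than the single one shown in the UI, here is a minimal sketch. It assumes the log contains pytest-style `FAILED <file>.py::<Class>::<test>` lines, optionally with a bracketed duration after `FAILED`; that format is an assumption and may need adjusting. The function name `failed_tests` is made up for illustration and is not part of PyTorch's CI tooling.

```python
import re

# Assumed line shape (not verified against every CI log):
#   "... FAILED [1.2s] path/to/test_file.py::TestClass::test_name - SomeError"
_FAILED_RE = re.compile(r"\bFAILED\b(?:\s+\[[^\]]*\])?\s+(\S+\.py::\S+)")


def failed_tests(log_text):
    """Return the unique test ids that appear on FAILED lines, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = _FAILED_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Hypothetical log excerpt, for illustration only.
sample_log = """\
2025-03-04T01:02:03Z PASSED [0.1s] inductor/test_foo.py::TestFoo::test_ok
2025-03-04T01:02:04Z FAILED [1.2s] inductor/test_torchinductor_opinfo.py::TestInductorOpInfoXPU::test_comprehensive_masked_std_xpu_int64 - AssertionError
FAILED inductor/test_torchinductor_opinfo.py::TestInductorOpInfoXPU::test_comprehensive_masked_std_xpu_int64
FAILED inductor/test_aot_inductor.py::SomeClass::test_other - RuntimeError
"""

for test_id in failed_tests(sample_log):
    print(test_id)
```

To use it on a real job, download the raw log text first and pass it to `failed_tests`.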

@anmyachev
Collaborator

@etaf @alexbaden the latest crashes ("inductor/test_aot_inductor 3/3 failed!") should be fixed after adding this change: b162b16. I checked that it works on our side.

etaf added 2 commits March 4, 2025 16:19
@etaf etaf requested a review from EikanWang March 7, 2025 03:10
@etaf etaf changed the title [WIP][XPU][Inductor] Update Intel triton for release 2.7. [XPU][Inductor] Update Intel triton for release 2.7. Mar 7, 2025
@pytorchmergebot
Collaborator

Starting merge as part of PR stack under #148323

@etaf etaf marked this pull request as ready for review March 7, 2025 14:45
pytorchmergebot pushed a commit that referenced this pull request Mar 7, 2025
… them on Windows. (#148323)

In `fresh_inductor_cache`, removing pyd files raises a permission error
on Windows because they are still in use by the process.
So we clear the references to the loaded pyd library objects and unload them
from the process.

Pull Request resolved: #148323
Approved by: https://github.com/jansel
ghstack dependencies: #148534, #148538, #147727
pytorchmergebot pushed a commit that referenced this pull request Mar 8, 2025
Depends on #147727, which introduces Triton XPU Windows support
Pull Request resolved: #147637
Approved by: https://github.com/atalman
@github-actions github-actions bot deleted the gh/etaf/101/head branch April 12, 2025 02:10
Labels
ciflow/inductor, ciflow/xpu (Run XPU CI tasks), keep-going (Don't stop on first failure, keep running tests until the end), Merged, module: inductor, open source, topic: not user facing (topic category)
7 participants