[CpuInductor] Enable NEON ISA detection on Linux ARM #129075
Warning: Unknown label
Please add the new label to .github/pytorch-probot.yml
@pytorchbot rebase -b main
@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here
Also, cleanup code a bit to use `x in [y, z]` instead of `x == y or x == z`
Successfully rebased
c5e565d to 35eddf6
__at_align__ float in_out_ptr0[16] = {0.0};
#endif
alignas(64) float in_out_ptr0[16] = {0.0};
@xuhancn Does it work on Windows?
@jgong5 Is there a CI to test it on Windows? But at least godbolt believes it is supported even by a pretty old MSVC: https://godbolt.org/z/Tr6Wa9WE6
@pytorchbot merge -i
Merge failed
Reason: This PR needs a If not, please add the To add a label, you can comment to pytorchbot, for example For more information, see
Details for Dev Infra team
Raised by workflow job
@pytorchbot merge -i
I.e. for every platform we support, it should be anything but invalid: AVX512 on Intel Linux, and NEON on Apple silicon and Linux aarch64.
2.4 aarch64 failures are expected, as pytorch/pytorch#129075 has not been picked for the release yet.
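The expectation in the comment above can be written down as a small check. This is only a sketch: `EXPECTED_ISA` and `isa_report_ok` are hypothetical names, not part of PyTorch, and the exact string that `str(isa)` prints is an assumption (the check only looks for the ISA family name as a case-insensitive substring). The Intel Linux entry follows the comment as written.

```python
from typing import Dict, Tuple

# Hypothetical table: expected ISA family per (system, machine) pair,
# taken from the comment above. Not a PyTorch data structure.
EXPECTED_ISA: Dict[Tuple[str, str], str] = {
    ("Linux", "x86_64"): "AVX512",
    ("Darwin", "arm64"): "NEON",
    ("Linux", "aarch64"): "NEON",
}


def isa_report_ok(system: str, machine: str, reported: str, is_valid: bool) -> bool:
    """Return True if a reported ISA matches the expectation for the platform.

    `reported` stands in for str(isa) and `is_valid` for bool(isa) from the
    test plan; the format of `reported` is assumed, not confirmed.
    """
    expected = EXPECTED_ISA.get((system, machine))
    if expected is None:
        # Platform not listed in the comment: only require a valid ISA.
        return is_valid
    return is_valid and expected.lower() in reported.lower()
```

For example, `isa_report_ok("Linux", "aarch64", "NEON", True)` passes, while any report with `is_valid=False` fails regardless of platform.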
@pytorchbot cherry-pick --onto release/2.4 -c critical
Also, cleanup code a bit to use `x in [y, z]` instead of `x == y or x == z`
And do not redefine `at_align`, but instead use `alignas(64)` as was suggested in https://github.com/pytorch/pytorch/pull/128686/files#r1639365978
Test plan: `python3 -c "import torch._inductor.codecache as cc; isa = cc.valid_vec_isa_list()[0];print(str(isa), bool(isa))"`
Pull Request resolved: #129075
Approved by: https://github.com/jansel
(cherry picked from commit b2a9b8d)
Cherry picking #129075
The cherry pick PR is at #133578 and it is recommended to link a critical cherry pick PR with an issue. The following tracker issues are updated:
Details for Dev Infra team
Raised by workflow job
@pytorchbot cherry-pick --onto release/2.4 -c critical
Cherry picking #129075
Command
Details for Dev Infra team
Raised by workflow job
* [CpuInductor] Enable NEON ISA detection on Linux ARM (#129075)
Also, cleanup code a bit to use `x in [y, z]` instead of `x == y or x == z`
And do not redefine `at_align`, but instead use `alignas(64)` as was suggested in https://github.com/pytorch/pytorch/pull/128686/files#r1639365978
Test plan: `python3 -c "import torch._inductor.codecache as cc; isa = cc.valid_vec_isa_list()[0];print(str(isa), bool(isa))"`
Pull Request resolved: #129075
Approved by: https://github.com/jansel
* Fix merge mistakes
---------
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Checked that it works with 2.4.1:
Also, cleanup code a bit to use `x in [y, z]` instead of `x == y or x == z`
And do not redefine `at_align`, but instead use `alignas(64)` as was suggested in https://github.com/pytorch/pytorch/pull/128686/files#r1639365978
Test plan:
`python3 -c "import torch._inductor.codecache as cc; isa = cc.valid_vec_isa_list()[0];print(str(isa), bool(isa))"`
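To illustrate the cleanup mentioned in the description, here is a minimal sketch of the two comparison styles applied to an architecture-name check. The function names are hypothetical and this is not the code from the PR; it only relies on the fact that `platform.machine()` typically reports `arm64` on Apple silicon macOS and `aarch64` on Linux ARM.

```python
import platform


def is_arm64_chained(machine: str) -> bool:
    # Chained equality comparisons: the style the description moves away from.
    return machine == "arm64" or machine == "aarch64"


def is_arm64(machine: str) -> bool:
    # Membership test: the `x in [y, z]` style the description adopts.
    return machine in ["arm64", "aarch64"]


if __name__ == "__main__":
    # Report the current machine and whether it is a 64-bit ARM name.
    print(platform.machine(), is_arm64(platform.machine()))
```

Both functions return the same result for every input; the membership form is shorter and makes it easier to add another accepted value later.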
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang