ENH: Implement the NumPy C SIMD vectorization interface #16397
6f47e93 to 999d86c
@mattip, should I move the SIMD module and its unit tests into a separate PR?
Yes please.
999d86c to 0f30432
"NPYV", or universal intrinsics as NEP-38 defines them, are types and functions intended to simplify the vectorization of code on different platforms. This patch initializes NPYV for the SIMD extensions SSE, AVX2, AVX512, VSX and NEON on top of the C definitions provided by the new generated header '_cpu_dispatch.h', which is included by 'cpu_dispatch.h'.
- implement the following intrinsics for X86 extensions:
  - load, store
  - zero, setall, set, select, reinterpret
  - boolean conversions
  - arithmetic (add, sub, mul, div, adds, subs)
  - logical
  - comparison
  - left and right shifting
  - combine, zip
- implement the same intrinsics as X86 for NEON
- implement the same intrinsics as X86 for Power/VSX little-endian mode
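To illustrate the idea behind the list above, here is a minimal sketch of a universal-intrinsics layer, not the code from this patch. Every `sketch_`/`SKETCH_` name is hypothetical and chosen to avoid confusion with the real `npyv_*` API; only the underlying SSE and NEON intrinsics are real. It assumes the compiler predefines `__SSE2__` or `__ARM_NEON`, and it falls back to a "SIMD disabled" flag when neither backend is available.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mini layer: each backend maps the same names onto its own
 * native intrinsics. These are NOT NumPy's definitions. */
#if defined(__SSE2__)
    #include <emmintrin.h>
    #define SKETCH_SIMD 128
    typedef __m128 sketch_f32;
    #define sketch_nlanes_f32 4
    #define sketch_load_f32(ptr)     _mm_loadu_ps(ptr)
    #define sketch_store_f32(ptr, v) _mm_storeu_ps(ptr, v)
    #define sketch_add_f32(a, b)     _mm_add_ps(a, b)
#elif defined(__ARM_NEON)
    #include <arm_neon.h>
    #define SKETCH_SIMD 128
    typedef float32x4_t sketch_f32;
    #define sketch_nlanes_f32 4
    #define sketch_load_f32(ptr)     vld1q_f32(ptr)
    #define sketch_store_f32(ptr, v) vst1q_f32(ptr, v)
    #define sketch_add_f32(a, b)     vaddq_f32(a, b)
#endif

/* Fallback if no backend above was selected: disable the SIMD path. */
#ifndef SKETCH_SIMD
    #define SKETCH_SIMD 0
#endif

/* Element-wise dst[i] = a[i] + b[i], written once for every platform. */
static void add_f32(float *dst, const float *a, const float *b, size_t len)
{
    size_t i = 0;
    assert(len == 0 || (dst != NULL && a != NULL && b != NULL));
#if SKETCH_SIMD
    /* Vector body: process full vectors of sketch_nlanes_f32 floats. */
    for (; i + sketch_nlanes_f32 <= len; i += sketch_nlanes_f32) {
        sketch_f32 va = sketch_load_f32(a + i);
        sketch_f32 vb = sketch_load_f32(b + i);
        sketch_store_f32(dst + i, sketch_add_f32(va, vb));
    }
#endif
    /* Scalar tail, which is also the whole loop when SIMD is disabled. */
    for (; i < len; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

Calling `add_f32(out, a, b, n)` gives the same result on every target. Only the macro mappings differ per extension, which is the property a shared intrinsic set is meant to provide.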
0f30432 to 791cae4
There is a lot here, very impressive. Is this all new code or was it adapted from somewhere else?
Is there a script we can run on each of the directories to verify that all the universal intrinsics are defined for all the variations?
#ifdef NPY_HAVE_NEON
#include "neon/neon.h"
#endif
/* Fallback if none of the above have been defined. Disables SIMD features */
Where is NPY_SIMD actually defined?
In the extension header, e.g. avx512/avx512.h.
No, I wrote it from scratch. I tried to put all my experience into it, most of which I gained while working on OpenCV.
Co-authored-by: Matti Picus <matti.picus@gmail.com>
Thanks. LGTM.
This pull request changes
Implement the NumPy C SIMD vectorization interface
"NPYV", or universal intrinsics as NEP-38 defines them, are types and functions
intended to simplify the vectorization of code on different platforms.
The current implementation supports the SIMD extensions SSE, AVX2, AVX512,
VSX and NEON on top of the C definitions provided by the new
generated header '_cpu_dispatch.h', which is included by 'cpu_dispatch.h'.
And covers the following operations: