USIMD: Optimize the performace of np.einsum for all platforms by Qiyu8 · Pull Request #16641 · numpy/numpy · GitHub

USIMD: Optimize the performace of np.einsum for all platforms #16641


Closed
Qiyu8 wants to merge 256 commits
9d5812f
optimize sum_of_products_contig_stride0_outcontig_two using neon.
Qiyu8 Jun 16, 2020
0b6e5b6
optimize sum_of_products_stride0_contig_outstride0_two using neon.
Qiyu8 Jun 16, 2020
5ec4da5
optimize sum_of_products_contig_stride0_outstride0_two using neon.
Qiyu8 Jun 16, 2020
4454cb0
add dtype parameter
Qiyu8 Jun 20, 2020
7e40b1b
rebase
Qiyu8 Jul 3, 2020
fecd458
modified accoriding to new NPY_HAVE_NEON flag.
Qiyu8 Jul 3, 2020
c90ac6c
MAINT: Explicitly disallow object user dtypes
seberg Jul 11, 2020
1cdc9a8
BUG: fix mgrid output for lower precision float inputs
cjblocker Jul 12, 2020
5d1fbf4
TST: fixed dtype check error from code review
cjblocker Jul 12, 2020
2ab7954
rebase
Qiyu8 Jul 13, 2020
2b790d2
Merge branch 'einsum-neon' of github.com:Qiyu8/numpy; branch 'master'…
Qiyu8 Jul 13, 2020
f82c7d7
TST: update mgrid test from code review
cjblocker Jul 13, 2020
7a3962d
MAINT: reference issue in comments for added index_tricks tests
cjblocker Jul 13, 2020
4dcbcc2
recontructing einsum using usimd
Qiyu8 Jul 14, 2020
05cb5b7
using usimd based on current framework
Qiyu8 Jul 14, 2020
689b3ab
add prefetch in memory
Qiyu8 Jul 15, 2020
5aa6515
add reverse usimd
Qiyu8 Jul 15, 2020
d4286b9
initialize the cpu dispatching of einsum
seiko2plus Jul 14, 2020
f93f567
Merge pull request #1 from seiko2plus/einsum-neon-dispatch
Qiyu8 Jul 15, 2020
d3414f5
Merge branch 'einsum-neon' of github.com:Qiyu8/numpy into einsum-neon
Qiyu8 Jul 15, 2020
a03e729
Merge branch 'master' of github.com:numpy/numpy into einsum-neon
Qiyu8 Jul 15, 2020
7b028f4
rewrite using simd api
Qiyu8 Jul 15, 2020
f1329db
Update numpy/core/src/common/simd/avx2/reorder.h
Qiyu8 Jul 16, 2020
dee5064
Update numpy/core/src/common/simd/avx512/reorder.h
Qiyu8 Jul 16, 2020
1ba69b2
Update numpy/core/src/common/simd/vsx/reorder.h
Qiyu8 Jul 16, 2020
ad4fc5b
Merge branch 'master' of github.com:numpy/numpy into improve-usimd
Qiyu8 Jul 16, 2020
e1265b4
add shuffle api
Qiyu8 Jul 16, 2020
6173d1a
remove tabs and offset
Qiyu8 Jul 17, 2020
4c3d283
DOC: Remove links for C codes
takanori-pskq Jul 18, 2020
6bb947c
Fix exception causes in __init__.py
Ashutosh619-sudo Jul 18, 2020
5dd1fe6
Fix exception causes in __init__.py
Ashutosh619-sudo Jul 18, 2020
889a043
Update numpy/core/src/common/simd/neon/reorder.h
Qiyu8 Jul 20, 2020
c4f35ff
Update numpy/core/src/common/simd/neon/reorder.h
Qiyu8 Jul 20, 2020
08954bd
Update numpy/core/src/common/simd/neon/reorder.h
Qiyu8 Jul 20, 2020
ebe08ed
Merge branch 'improve-usimd' of github.com:Qiyu8/numpy into einsum-neon
Qiyu8 Jul 20, 2020
a58cd31
add shuffle api
Qiyu8 Jul 20, 2020
e4c9005
Merge branch 'master' of github.com:numpy/numpy into einsum-neon
Qiyu8 Jul 20, 2020
7050666
remove redundant func
Qiyu8 Jul 20, 2020
eb33fe2
Transform to usimd, SSE/SSE2/AVX2 passed
Qiyu8 Jul 23, 2020
038de24
update
Qiyu8 Jul 23, 2020
3d74fab
Configure hypothesis for np.test()
Zac-HD Jul 16, 2020
668547a
Merge branch 'master' of https://github.com/numpy/numpy into einsum-neon
Qiyu8 Jul 23, 2020
59ba38c
modify neon shuffle
Qiyu8 Jul 23, 2020
8b9e8b2
fix neon shuffle api
Qiyu8 Jul 24, 2020
71b3618
Merge branch 'master' of https://github.com/numpy/numpy into einsum-neon
Qiyu8 Jul 24, 2020
8a4f3e8
Merge pull request #16900 from Ashutosh619-sudo/master
mattip Jul 24, 2020
cfb7a9c
BLD: update OpenBLAS build
mattip Jul 24, 2020
d45b16d
BUG: Allow array-like types to be coerced as object array elements
seberg Jul 24, 2020
b743bcc
Merge pull request #16940 from mattip/issue-16913
charris Jul 24, 2020
ba09393
DEP: Deprecate size-one ragged array coercion
seberg Jul 24, 2020
5920407
Update numpy/core/tests/test_array_coercion.py
seberg Jul 24, 2020
1e031f1
Update numpy/core/src/multiarray/array_coercion.c
seberg Jul 24, 2020
fe70857
DOC: add release note for #16815
cjblocker Jul 24, 2020
c7931f5
DOC: Fix the role of references (var -> macro)
takanori-pskq Jul 25, 2020
1377418
changed the name of the folder icons to logo
Jul 25, 2020
18673c5
DOC: Fixup
takanori-pskq May 26, 2020
e627135
add sum api to usimd
Qiyu8 Jul 25, 2020
8c8c3b7
remove print
Qiyu8 Jul 25, 2020
6c8be6a
Merge pull request #16944 from InessaPawson/master
rgommers Jul 25, 2020
f457a1a
Merge pull request #16815 from cjblocker/mgrid-float
mattip Jul 25, 2020
1ce5457
Merge pull request #16943 from seberg/deprecate-single-element-arrayl…
charris Jul 25, 2020
ce77458
ENH: enable colors for `runtests.py --ipython`
person142 Jul 26, 2020
c8df720
fix neon sum api, use normal for loop
Qiyu8 Jul 27, 2020
cba6d44
remove log
Qiyu8 Jul 27, 2020
745b1af
open maxop option
Qiyu8 Jul 27, 2020
7d04e22
Merge pull request #16949 from person142/runtests-ipython-colors
rgommers Jul 27, 2020
7b8bda5
MAINT: Bump hypothesis from 5.20.2 to 5.23.2
dependabot-preview[bot] Jul 27, 2020
e8d32d8
Merge pull request #16952 from numpy/dependabot/pip/hypothesis-5.23.2
charris Jul 27, 2020
9495f36
MAINT: Use arm64 instead of aarch64 on travis.
charris Jul 27, 2020
b26ef67
Merge pull request #16957 from charris/fix-arm64-warning
charris Jul 27, 2020
7c0c83e
use more efficient instrument.
Qiyu8 Jul 28, 2020
4690248
update numpy/lib/arraypad.py with appropriate chain exception (#16953)
nomanarshad94 Jul 28, 2020
ae008b4
add AVX512DQ compatibility
Qiyu8 Jul 28, 2020
622514a
Merge branch 'einsum-neon' of https://github.com/Qiyu8/numpy into ein…
Qiyu8 Jul 28, 2020
9743409
fix avx512 segment fault problem
Qiyu8 Jul 28, 2020
fb79b9b
ENH: Use f90 compiler specified in command line args for pgi compiler…
Jul 2, 2020
62ca9df
Merge pull request #16941 from seberg/types-are-not-arraylikes
charris Jul 28, 2020
d28ac9a
BLD: add win32 pypy build
mattip Jul 27, 2020
f99c01a
DOC: Fixed typo in lib/recfunctions.py (#16973)
jesseli2002 Jul 29, 2020
6f67399
DOC: Clarify input to irfft/irfft2/irfftn (#16950)
bharatr21 Jul 29, 2020
cf5e766
TST: fix tests for windows + PyPy
mattip Jul 28, 2020
b46e5d3
Merge pull request #16974 from mattip/pypy-win32
charris Jul 30, 2020
5311300
Update numpy/core/src/multiarray/einsum_p.h
Qiyu8 Jul 30, 2020
1493142
re-implment SIMD kernels of einsum
seiko2plus Jul 29, 2020
5b020c7
re-implment SIMD kernels of einsum
seiko2plus Jul 29, 2020
8879319
move to NPYV
Qiyu8 Jul 30, 2020
f4e5816
fix type error
Qiyu8 Jul 30, 2020
aca0ce6
MAINT: Added the `order` parameter to `np.array()` (#16966)
BvB93 Jul 30, 2020
e7c1d01
Merge pull request #16730 from danbeibei/fcompiler
charris Jul 30, 2020
b66f02b
MAINT: Implemented two dtype-related TODO's (#16622)
BvB93 Jul 31, 2020
6f0436d
ENH: Add Neon SIMD implementations for add, sub, mul, and div (#16969)
DumbMice Jul 31, 2020
d67326d
DOC: update val to be scalar or array like optional closes #16901 (#1…
leeyspaul Jul 31, 2020
27cf59d
DOC: Fix the declarations of C fuctions (#16897)
takanori-pskq Jul 31, 2020
210e542
Merge pull request #16896 from takanori-pskq/i13114-5
mattip Jul 31, 2020
2d39e7f
DOC: Remove the links for ``True`` and ``False`` (#16887)
takanori-pskq Jul 31, 2020
186c765
add vsx sum reduce
Qiyu8 Jul 31, 2020
4b83f05
DOC: Fix wrong markups in `arrays.dtypes`
takanori-pskq Jul 18, 2020
800c43b
Merge pull request #16894 from takanori-pskq/fix-doc-dtypes-quote
mattip Aug 1, 2020
7bda953
DOC: Add the new NumPy logo to Sphinx pages
bjnath Aug 1, 2020
f154484
DOC: Styling update for PR #16988
bjnath Aug 1, 2020
c3f7d3e
DOC: Delete old logo; updates PR #16988
bjnath Aug 1, 2020
122330d
Merge pull request #16988 from bjnath/update_logo_on_sphinx_pages
rgommers Aug 1, 2020
0f12338
Merge pull request #16879 from Zac-HD/isolate-hypothesis-config
mattip Aug 2, 2020
77bd10c
Update numpy/core/src/common/simd/neon/arithmetic.h
Qiyu8 Aug 3, 2020
320fa52
Update numpy/core/src/common/simd/avx2/arithmetic.h
Qiyu8 Aug 3, 2020
8c81038
Update numpy/core/src/common/simd/avx512/arithmetic.h
Qiyu8 Aug 3, 2020
778a40e
Update numpy/core/src/common/simd/sse/arithmetic.h
Qiyu8 Aug 3, 2020
8c45114
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
9076d48
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
4cf23d4
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
4ae3394
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
b473e9a
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
723f103
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
cc99242
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
bed1032
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
2e1df6d
BLD: pin setuptools<49.2.0
mattip Aug 3, 2020
60a21a3
Merge pull request #16993 from mattip/setuptools
charris Aug 3, 2020
113ef15
MAINT: Bump hypothesis from 5.23.2 to 5.23.9
dependabot-preview[bot] Aug 3, 2020
5b5a740
DOC: Add correctness vs strictness consideration for np.dtype (#16917)
anirudh2290 Aug 3, 2020
333e08e
BUG: Set readonly flag in array interface instead of warning (gh-16350)
abalkin Aug 3, 2020
5d09976
Merge pull request #16991 from numpy/dependabot/pip/hypothesis-5.23.9
charris Aug 3, 2020
dd3d935
MAINT: Bump pytest from 5.4.3 to 6.0.1
dependabot-preview[bot] Aug 3, 2020
2283e26
Merge pull request #16802 from seberg/user-dtypes-no-objects
mattip Aug 3, 2020
8f60522
Merge pull request #16992 from numpy/dependabot/pip/pytest-6.0.1
charris Aug 3, 2020
593ef5f
ENH: Speed up trim_zeros (#16911)
BvB93 Aug 4, 2020
e3c5213
MAINT: Chain exception in ``distutils/fcompiler/environment.py``. (#1…
nomanarshad94 Aug 4, 2020
e242859
DOC: Add note that allclose and isclose do not accept non-numeric typ…
iamsoto Aug 5, 2020
e1211b8
ENH: Add NumPy declarations to be used by Cython 3.0+ (#16986)
scoder Aug 5, 2020
6bed9a9
DOC: Improve intersect1d docstring (#16420)
dkogan Aug 5, 2020
40e8400
MAINT: Improve error handling in umathmodule setup (#17014)
eric-wieser Aug 6, 2020
3023d06
DOC: Fix non-matching pronoun in format.py documentation. (gh-17022)
phoenix-meadowlark Aug 6, 2020
29e2293
BUG: Raise correct errors in boolean indexing fast path (gh-17010)
asmeurer Aug 6, 2020
eec0aa2
NEP: Updated NEP-35 with keyword-only instruction (#17009)
pentschev Aug 7, 2020
cbd0897
BUG: fix a compile and a test warning
mattip Aug 9, 2020
8a92eb4
DOC: Disclaimer for FFT library
bjnath Aug 7, 2020
dbf3744
Merge pull request #17028 from bjnath/fft-disclaimer
rgommers Aug 9, 2020
961b56f
Merge pull request #17033 from mattip/random-pool_size
charris Aug 9, 2020
00a45b4
DOC: Use a less ambiguous example for array_split (#17039)
yogeshr59 Aug 10, 2020
b1d88e0
optimize sum_of_products_stride0_contig_outcontig_two by using neon i…
Qiyu8 Jun 10, 2020
e27e051
optimize sum_of_products_contig_contig_outstride0_two by using neon i…
Qiyu8 Jun 12, 2020
74c31f1
optimize sum_of_products_contig_outstride0_one by using neon intrinsics
Qiyu8 Jun 12, 2020
5abc3a3
add benchmarks
Qiyu8 Jun 12, 2020
8f14897
optimize sum_of_products_contig_two using neon.
Qiyu8 Jun 16, 2020
5e201dd
optimize sum_of_products_contig_stride0_outcontig_two using neon.
Qiyu8 Jun 16, 2020
cc69acc
optimize sum_of_products_stride0_contig_outstride0_two using neon.
Qiyu8 Jun 16, 2020
86abb98
optimize sum_of_products_contig_stride0_outstride0_two using neon.
Qiyu8 Jun 16, 2020
b82f6c9
add dtype parameter
Qiyu8 Jun 20, 2020
cc81a94
modified accoriding to new NPY_HAVE_NEON flag.
Qiyu8 Jul 3, 2020
d0dee4d
recontructing einsum using usimd
Qiyu8 Jul 14, 2020
23b11a6
using usimd based on current framework
Qiyu8 Jul 14, 2020
8950bbe
initialize the cpu dispatching of einsum
seiko2plus Jul 14, 2020
cdf2c63
rewrite using simd api
Qiyu8 Jul 15, 2020
35ac5bb
add prefetch in memory
Qiyu8 Jul 15, 2020
3982569
add reverse usimd
Qiyu8 Jul 15, 2020
3602bfa
Update numpy/core/src/common/simd/avx2/reorder.h
Qiyu8 Jul 16, 2020
f4f7823
Update numpy/core/src/common/simd/avx512/reorder.h
Qiyu8 Jul 16, 2020
72603fe
Update numpy/core/src/common/simd/vsx/reorder.h
Qiyu8 Jul 16, 2020
078adbf
add shuffle api
Qiyu8 Jul 16, 2020
51345f0
remove tabs and offset
Qiyu8 Jul 17, 2020
0804a32
Update numpy/core/src/common/simd/neon/reorder.h
Qiyu8 Jul 20, 2020
c8029db
Update numpy/core/src/common/simd/neon/reorder.h
Qiyu8 Jul 20, 2020
1ae2518
Update numpy/core/src/common/simd/neon/reorder.h
Qiyu8 Jul 20, 2020
5b4c786
add shuffle api
Qiyu8 Jul 20, 2020
a0c8ac0
remove redundant func
Qiyu8 Jul 20, 2020
e0a951e
Transform to usimd, SSE/SSE2/AVX2 passed
Qiyu8 Jul 23, 2020
1260e95
modify neon shuffle
Qiyu8 Jul 23, 2020
9b38fba
fix neon shuffle api
Qiyu8 Jul 24, 2020
007a82e
add sum api to usimd
Qiyu8 Jul 25, 2020
3522690
remove print
Qiyu8 Jul 25, 2020
865173d
fix neon sum api, use normal for loop
Qiyu8 Jul 27, 2020
3bf0674
remove log
Qiyu8 Jul 27, 2020
d493ed4
open maxop option
Qiyu8 Jul 27, 2020
f846ae1
use more efficient instrument.
Qiyu8 Jul 28, 2020
de25c6c
add AVX512DQ compatibility
Qiyu8 Jul 28, 2020
d837ad3
fix avx512 segment fault problem
Qiyu8 Jul 28, 2020
48f6d51
Update numpy/core/src/multiarray/einsum_p.h
Qiyu8 Jul 30, 2020
ca849ce
re-implment SIMD kernels of einsum
seiko2plus Jul 29, 2020
9fa5b4f
re-implment SIMD kernels of einsum
seiko2plus Jul 29, 2020
82844b7
move to NPYV
Qiyu8 Jul 30, 2020
9905f08
fix type error
Qiyu8 Jul 30, 2020
91e4bba
add vsx sum reduce
Qiyu8 Jul 31, 2020
ebf4933
Update numpy/core/src/common/simd/neon/arithmetic.h
Qiyu8 Aug 3, 2020
753b38d
Update numpy/core/src/common/simd/avx2/arithmetic.h
Qiyu8 Aug 3, 2020
1b9d283
Update numpy/core/src/common/simd/avx512/arithmetic.h
Qiyu8 Aug 3, 2020
ae70f32
Update numpy/core/src/common/simd/sse/arithmetic.h
Qiyu8 Aug 3, 2020
1180231
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
fcdfada
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
43ef288
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
85d10d5
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
17fb2f0
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
a588d04
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
f4025f2
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
cebee98
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
5b0dbd7
Merge branch 'einsum-neon' of github.com:Qiyu8/numpy into einsum-neon
Qiyu8 Aug 11, 2020
b0f7b8c
optimize sum_of_products_stride0_contig_outcontig_two by using neon i…
Qiyu8 Jun 10, 2020
e7a158f
optimize sum_of_products_contig_contig_outstride0_two by using neon i…
Qiyu8 Jun 12, 2020
3224f56
optimize sum_of_products_contig_outstride0_one by using neon intrinsics
Qiyu8 Jun 12, 2020
1796d61
add benchmarks
Qiyu8 Jun 12, 2020
161a645
optimize sum_of_products_contig_two using neon.
Qiyu8 Jun 16, 2020
967476f
optimize sum_of_products_contig_stride0_outcontig_two using neon.
Qiyu8 Jun 16, 2020
7b33da3
optimize sum_of_products_stride0_contig_outstride0_two using neon.
Qiyu8 Jun 16, 2020
8dba9d6
optimize sum_of_products_contig_stride0_outstride0_two using neon.
Qiyu8 Jun 16, 2020
88e8ddb
add dtype parameter
Qiyu8 Jun 20, 2020
65d3260
modified accoriding to new NPY_HAVE_NEON flag.
Qiyu8 Jul 3, 2020
76db405
recontructing einsum using usimd
Qiyu8 Jul 14, 2020
eec1857
using usimd based on current framework
Qiyu8 Jul 14, 2020
21137f8
initialize the cpu dispatching of einsum
seiko2plus Jul 14, 2020
731879c
rewrite using simd api
Qiyu8 Jul 15, 2020
22180b2
add prefetch in memory
Qiyu8 Jul 15, 2020
1de1692
add reverse usimd
Qiyu8 Jul 15, 2020
edab5e0
Update numpy/core/src/common/simd/avx2/reorder.h
Qiyu8 Jul 16, 2020
4abce24
Update numpy/core/src/common/simd/avx512/reorder.h
Qiyu8 Jul 16, 2020
f818fab
Update numpy/core/src/common/simd/vsx/reorder.h
Qiyu8 Jul 16, 2020
0f534e2
add shuffle api
Qiyu8 Jul 16, 2020
09a3ebc
remove tabs and offset
Qiyu8 Jul 17, 2020
1ec8126
Update numpy/core/src/common/simd/neon/reorder.h
Qiyu8 Jul 20, 2020
42aa799
Update numpy/core/src/common/simd/neon/reorder.h
Qiyu8 Jul 20, 2020
f568bb0
Update numpy/core/src/common/simd/neon/reorder.h
Qiyu8 Jul 20, 2020
dbd79cd
add shuffle api
Qiyu8 Jul 20, 2020
a10c3ce
remove redundant func
Qiyu8 Jul 20, 2020
20e5fa8
Transform to usimd, SSE/SSE2/AVX2 passed
Qiyu8 Jul 23, 2020
88d5838
modify neon shuffle
Qiyu8 Jul 23, 2020
b0c526e
fix neon shuffle api
Qiyu8 Jul 24, 2020
6751d48
add sum api to usimd
Qiyu8 Jul 25, 2020
ea9fad0
remove print
Qiyu8 Jul 25, 2020
01e5145
fix neon sum api, use normal for loop
Qiyu8 Jul 27, 2020
ff8ab27
remove log
Qiyu8 Jul 27, 2020
e9d0d61
open maxop option
Qiyu8 Jul 27, 2020
be46956
use more efficient instrument.
Qiyu8 Jul 28, 2020
b85e32a
add AVX512DQ compatibility
Qiyu8 Jul 28, 2020
e66ff74
fix avx512 segment fault problem
Qiyu8 Jul 28, 2020
e208542
Update numpy/core/src/multiarray/einsum_p.h
Qiyu8 Jul 30, 2020
2298ea9
re-implment SIMD kernels of einsum
seiko2plus Jul 29, 2020
adb094c
re-implment SIMD kernels of einsum
seiko2plus Jul 29, 2020
96eb54f
move to NPYV
Qiyu8 Jul 30, 2020
dfaeb14
fix type error
Qiyu8 Jul 30, 2020
aed4e7b
add vsx sum reduce
Qiyu8 Jul 31, 2020
96aa316
Update numpy/core/src/common/simd/neon/arithmetic.h
Qiyu8 Aug 3, 2020
89f7861
Update numpy/core/src/common/simd/avx2/arithmetic.h
Qiyu8 Aug 3, 2020
787cc67
Update numpy/core/src/common/simd/avx512/arithmetic.h
Qiyu8 Aug 3, 2020
a561d94
Update numpy/core/src/common/simd/sse/arithmetic.h
Qiyu8 Aug 3, 2020
458c4ba
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
4e26633
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
f9b9250
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
31aac3a
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
1aee497
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
f46c61f
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
a172e68
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
25f4897
Update numpy/core/src/multiarray/einsum.dispatch.c.src
Qiyu8 Aug 3, 2020
ea7638c
Merge branch 'einsum-neon' of github.com:Qiyu8/numpy into einsum-neon
Qiyu8 Aug 11, 2020
re-implment SIMD kernels of einsum
  - paves the way for integer optimization
  - fix memory overflow/bus errors
  - improve the unrolling
  - activate the unrolling when NPYV isn't available
  - add support for fma3
  - robust/cleanup
  - other minor fixes
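
The unrolling and overflow points listed above can be illustrated with a minimal scalar sketch in C. This is not the code from this commit: the function name `sum_of_products_sketch`, the `double` element type, and the unroll factor of four are assumptions chosen for illustration. The commit list indicates that the actual kernels are built on NumPy's NPYV intrinsics where available, with an unrolled scalar path otherwise.

```c
#include <stddef.h>

/*
 * Hypothetical sketch: sum of a[i] * b[i] over two contiguous inputs,
 * reduced to a single value. Illustrative only, not NumPy's kernel.
 */
static double
sum_of_products_sketch(const double *a, const double *b, size_t n)
{
    /* Four independent accumulators shorten the dependency chain. */
    double acc0 = 0.0, acc1 = 0.0, acc2 = 0.0, acc3 = 0.0;
    size_t i = 0;

    /* Main loop, unrolled by four; runs only while four elements remain. */
    for (; i + 4 <= n; i += 4) {
        acc0 += a[i + 0] * b[i + 0];
        acc1 += a[i + 1] * b[i + 1];
        acc2 += a[i + 2] * b[i + 2];
        acc3 += a[i + 3] * b[i + 3];
    }

    double sum = (acc0 + acc1) + (acc2 + acc3);

    /* Tail loop: the remaining 0..3 elements, so no read goes past n. */
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

The separate tail loop is what keeps the unrolled loop from reading past the end of the buffers when `n` is not a multiple of the unroll factor. Because the partial sums are combined in a different order than in a plain loop, floating-point results can differ in the last bits from a strictly sequential summation.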
seiko2plus authored and Qiyu8 committed Aug 11, 2020
commit adb094ca5a7b9359900468924d34a650304aaabe