Add OpenMP support to argsort by sterrettm2 · Pull Request #195 · intel/x86-simd-sort · GitHub

Add OpenMP support to argsort #195


Merged: 3 commits, Apr 10, 2025

sterrettm2 (Contributor) commented Apr 9, 2025

This patch adds OpenMP support to argsort, in much the same way as it is already done for quicksort and key-value sort.
A small benchmark size (128 elements) is included to show that there is no significant regression for small inputs.
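
For readers unfamiliar with the approach, here is a rough sketch of how a recursive argsort can be parallelized with OpenMP tasks. It is not the code from this patch: `argsort_sketch`, `argsort_range` and `kSerialCutoff` are hypothetical names, the cutoff value is an arbitrary assumption, and `std::nth_element` / `std::sort` stand in for the library's SIMD partition and sort kernels. The serial cutoff illustrates why small inputs need not pay any threading overhead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical cutoff (an assumption, not taken from the patch): ranges at or
// below this size are sorted serially so they never pay task overhead.
constexpr std::size_t kSerialCutoff = 10000;

// Sort the index range idx[lo, hi) by the keys the indices refer to.
template <typename T>
void argsort_range(const T* keys, std::size_t* idx,
                   std::size_t lo, std::size_t hi)
{
    assert(lo <= hi);
    auto by_key = [keys](std::size_t a, std::size_t b) {
        return keys[a] < keys[b];
    };
    if (hi - lo <= kSerialCutoff) {
        std::sort(idx + lo, idx + hi, by_key);
        return;
    }
    // Partition around the median: idx[mid] ends up in its final position,
    // with smaller-or-equal keys to its left and larger-or-equal to its right.
    std::size_t mid = lo + (hi - lo) / 2;
    std::nth_element(idx + lo, idx + mid, idx + hi, by_key);

    // The two halves are independent, so one can run as a separate task
    // while the current thread handles the other.
    #pragma omp task firstprivate(keys, idx, lo, mid)
    argsort_range(keys, idx, lo, mid);

    argsort_range(keys, idx, mid + 1, hi);

    #pragma omp taskwait
}

// Return the permutation of indices that sorts keys[0, n) in ascending order.
// The order of equal keys is unspecified (the sketch is not stable).
template <typename T>
std::vector<std::size_t> argsort_sketch(const T* keys, std::size_t n)
{
    std::vector<std::size_t> idx(n);
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::size_t* p = idx.data();

    // Only start a thread team for inputs large enough to be split;
    // a single thread then seeds the task tree.
    #pragma omp parallel if (n > kSerialCutoff)
    #pragma omp single
    argsort_range(keys, p, std::size_t{0}, n);

    return idx;
}
```

Calling `argsort_sketch(keys.data(), keys.size())` on a `std::vector<double>` returns the index vector. Built without OpenMP enabled (for example without `-fopenmp` on GCC or Clang), the pragmas are ignored and the same code runs serially.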

128/100k/1m/10m/100m Benchmarks
Benchmark                                                                         Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------------------------------------------------
[simdargsort/random_128/* vs. simdargsort/random_128/*]int64_t                 -0.0066         -0.0060           538           534           537           534
[simdargsort/random_128/* vs. simdargsort/random_128/*]uint64_t                +0.0038         +0.0042           538           540           538           540
[simdargsort/random_128/* vs. simdargsort/random_128/*]double                  +0.0068         +0.0072           398           401           398           401
[simdargsort/random_128/* vs. simdargsort/random_128/*]int32_t                 -0.0081         -0.0075           377           374           377           374
[simdargsort/random_128/* vs. simdargsort/random_128/*]uint32_t                -0.0072         -0.0070           378           375           378           375
[simdargsort/random_128/* vs. simdargsort/random_128/*]float                   -0.0427         -0.0425           405           388           405           388
OVERALL_GEOMEAN                                                                -0.0091         -0.0087             0             0             0             0

Benchmark                                                                           Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------------
[simdargsort/random_100k/* vs. simdargsort/random_100k/*]int64_t                 -0.6343         -0.6343       1328931        486040       1328908        485967
[simdargsort/random_100k/* vs. simdargsort/random_100k/*]uint64_t                -0.6199         -0.6200       1332846        506636       1332837        506521
[simdargsort/random_100k/* vs. simdargsort/random_100k/*]double                  -0.6506         -0.6506       1216462        425010       1216343        424938
[simdargsort/random_100k/* vs. simdargsort/random_100k/*]int32_t                 -0.6073         -0.6074       1159372        455239       1159356        455141
[simdargsort/random_100k/* vs. simdargsort/random_100k/*]uint32_t                -0.6264         -0.6265       1136127        424415       1136070        424348
[simdargsort/random_100k/* vs. simdargsort/random_100k/*]float                   -0.6114         -0.6115       1173640        456028       1173632        455971
OVERALL_GEOMEAN                                                                  -0.6253         -0.6253             0             0             0             0

Benchmark                                                                       Time             CPU      Time Old      Time New       CPU Old       CPU New
------------------------------------------------------------------------------------------------------------------------------------------------------------
[simdargsort/random_1m/* vs. simdargsort/random_1m/*]int64_t                 -0.7172         -0.7173      24256168       6858762      24256005       6858362
[simdargsort/random_1m/* vs. simdargsort/random_1m/*]uint64_t                -0.7142         -0.7142      23993105       6856463      23992567       6856165
[simdargsort/random_1m/* vs. simdargsort/random_1m/*]double                  -0.7372         -0.7373      22509499       5914566      22509178       5914256
[simdargsort/random_1m/* vs. simdargsort/random_1m/*]int32_t                 -0.7338         -0.7338      19369104       5156914      19368839       5156653
[simdargsort/random_1m/* vs. simdargsort/random_1m/*]uint32_t                -0.7386         -0.7386      19365743       5062443      19364946       5061877
[simdargsort/random_1m/* vs. simdargsort/random_1m/*]float                   -0.7682         -0.7682      19175104       4445467      19173415       4444931
OVERALL_GEOMEAN                                                              -0.7355         -0.7355             0             0             0             0

Benchmark                                                                         Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------------------------------------------------
[simdargsort/random_10m/* vs. simdargsort/random_10m/*]int64_t                 -0.7188         -0.7206     535672018     150621678     535662039     149682185
[simdargsort/random_10m/* vs. simdargsort/random_10m/*]uint64_t                -0.7123         -0.7137     522665977     150358943     522662888     149658818
[simdargsort/random_10m/* vs. simdargsort/random_10m/*]double                  -0.7063         -0.7067     516422888     151683883     516408490     151469114
[simdargsort/random_10m/* vs. simdargsort/random_10m/*]int32_t                 -0.7343         -0.7345     421554181     112011550     421499972     111900516
[simdargsort/random_10m/* vs. simdargsort/random_10m/*]uint32_t                -0.7383         -0.7384     421089877     110216843     421017197     110153338
[simdargsort/random_10m/* vs. simdargsort/random_10m/*]float                   -0.7343         -0.7343     422777178     112339041     422770468     112336650
OVERALL_GEOMEAN                                                                -0.7243         -0.7249             0             0             0             0

Benchmark                                                                           Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------------
[simdargsort/random_100m/* vs. simdargsort/random_100m/*]int64_t                 -0.6427         -0.6708   11788397461    4211981148   11787561815    3880782867
[simdargsort/random_100m/* vs. simdargsort/random_100m/*]uint64_t                -0.6655         -0.6686   11836971450    3959080747   11835896648    3922945957
[simdargsort/random_100m/* vs. simdargsort/random_100m/*]double                  -0.6655         -0.6728   11628384117    3889522807   11627635685    3804081330
[simdargsort/random_100m/* vs. simdargsort/random_100m/*]int32_t                 -0.6790         -0.6848    9693999063    3111721408    9693378022    3054949141
[simdargsort/random_100m/* vs. simdargsort/random_100m/*]uint32_t                -0.6809         -0.6844    9680041695    3088878433    9679286446    3055234564
[simdargsort/random_100m/* vs. simdargsort/random_100m/*]float                   -0.6852         -0.7007    9624159505    3030112691    9623024906    2879783882
OVERALL_GEOMEAN                                                                  -0.6701         -0.6805            11             4            11             3
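
As a reading aid for the tables above: the Time and CPU columns in Google Benchmark's comparison output are relative changes, (new - old) / old, so negative values mean the new build is faster. The small helpers below (hypothetical names, not part of the benchmark tooling) show how a row converts into a speedup factor.

```cpp
#include <cassert>
#include <cmath>

// Relative change as shown in the "Time" and "CPU" columns:
// (new - old) / old. Negative means the new run is faster.
inline double relative_change(double old_time, double new_time)
{
    assert(old_time > 0.0);
    return (new_time - old_time) / old_time;
}

// Speedup factor old / new, which equals 1 / (1 + relative_change).
inline double speedup(double old_time, double new_time)
{
    assert(new_time > 0.0);
    return old_time / new_time;
}
```

For example, the `random_1m` `int64_t` row (Time Old 24256168, Time New 6858762) corresponds to a speedup of about 3.5x.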

r-devulap (Contributor) left a comment

LGTM. Thanks @sterrettm2!

r-devulap merged commit 14f504c into intel:main on Apr 10, 2025
12 of 13 checks passed
2 participants