BUG: Performance regression in argsort on sorted data · Issue #28714 · numpy/numpy · GitHub

rhshadrach opened this issue Apr 15, 2025 · 4 comments

@rhshadrach
rhshadrach commented Apr 15, 2025

Describe the issue:

From 1.26.4 to 2.0.0, I'm seeing a performance regression in argsort when performed on sorted data. On non-sorted data, I am seeing a performance enhancement when upgrading to 2.0.

Reproduce the code example:

import numpy as np

N = 10**5
arr = np.arange(N)  # already sorted input
%timeit arr.argsort()
# 1.59 ms ± 32.2 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each) <-- 2.0
# 1.6 ms ± 23.3 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)  <-- 2.2.4
# 444 μs ± 6.29 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)  <-- 1.26.4
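
The `%timeit` magic only works inside IPython or Jupyter. For anyone who wants to run the comparison as a plain script, here is a minimal sketch using the standard-library `timeit` module. The helper name `best_time`, the fixed seed, and the `number`/`repeat` values are choices made for this illustration and are not part of the original report; it times both the sorted case and a shuffled case so the two claims in the description can be checked side by side.

```python
import timeit

import numpy as np

N = 10**5
rng = np.random.default_rng(0)  # fixed seed, an arbitrary choice for repeatability

sorted_arr = np.arange(N)          # already sorted, as in the report
shuffled_arr = rng.permutation(N)  # same values in random order


def best_time(arr, number=20, repeat=5):
    """Return the best per-call time in seconds for arr.argsort()."""
    timings = timeit.repeat(arr.argsort, number=number, repeat=repeat)
    return min(timings) / number


results = {
    "sorted": best_time(sorted_arr),
    "shuffled": best_time(shuffled_arr),
}

print(f"NumPy {np.__version__}")
for label, seconds in results.items():
    print(f"{label:>8}: {seconds * 1e6:8.1f} µs per call")
```

Running the same script under each NumPy version of interest (for example in separate virtual environments) gives numbers that can be compared directly.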

Error message:

Python and NumPy Versions:

Python: 3.12.3
NumPy: Versions reported above.

Runtime Environment:

[{'numpy_version': '2.2.4',
'python': '3.12.3 (main, Feb 4 2025, 14:48:35) [GCC 13.3.0]',
'uname': uname_result(system='Linux', node='brokenglass', release='6.11.0-21-generic', version='#21~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Feb 24 16:52:15 UTC 2', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2'],
'not_found': ['AVX512F',
'AVX512CD',
'AVX512_KNL',
'AVX512_KNM',
'AVX512_SKX',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL']}},
{'architecture': 'Haswell',
'filepath': '/home/richard/dev/venvs/pandas/lib/python3.12/site-packages/numpy.libs/libscipy_openblas64_-6bb31eeb.so',
'internal_api': 'openblas',
'num_threads': 32,
'prefix': 'libscipy_openblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.28'}]

Context for the issue:

pandas just recently upgraded our dev pin on NumPy to be 2.0+ and we are working through the identified performance regressions:

pandas-dev/asv-runner#50

@charris
Member
charris commented Apr 15, 2025

Out of curiosity, what platform? I ask because the default integer type on Windows has changed in NumPy 2.

NVM, I see this is on Linux.

Possibly related, #23707, #25610.
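
For readers checking whether the default integer type plays a role on their own machine, a small sketch: it prints the platform and the dtype NumPy chooses when none is given, then builds the benchmark array with an explicit `dtype=np.int64` so that timings compare like with like. Pinning the dtype is a suggestion made here for illustration, not something proposed in the thread.

```python
import platform

import numpy as np

default_int = np.arange(3).dtype  # dtype NumPy picks when none is given

print(f"platform:      {platform.system()} {platform.machine()}")
print(f"NumPy version: {np.__version__}")
print(f"default int:   {default_int} ({default_int.itemsize * 8}-bit)")

# Pinning the dtype keeps a benchmark comparable across platforms and versions.
arr = np.arange(10**5, dtype=np.int64)
order = arr.argsort()
```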

@charris
Member
charris commented Apr 15, 2025

@rdevulap Thoughts?

@r-devulap
Member

@rhshadrach this is because of the AVX2 argsort introduced in #25610, which unfortunately has regressions for ordered arrays, especially on 64-bit dtypes.

| Change   | Before [174ac7bc] <main>   | After [680b6823] <avx2_arg>   |   Ratio | Benchmark (Parameter)                                                                    |
|----------|----------------------------|-------------------------------|---------|------------------------------------------------------------------------------------------|
| +        | 63.8±0.5μs                 | 260±1μs                       |    4.07 | bench_function_base.Sort.time_argsort('quick', 'int64', ('ordered',))                    |

Coincidentally, I was working on a fix for this in intel/x86-simd-sort#197.

I'm happy to port that change to 2.0.x or any other version if needed.
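
To check how ordered input behaves across dtypes on a particular machine and NumPy build, something along these lines could be used. It is only a sketch: the dtype list, the helper `time_argsort`, and the repeat counts are assumptions made for illustration. `kind="stable"` is included purely as a comparison point; the thread does not say whether it takes a different code path, so no particular outcome is implied.

```python
import timeit

import numpy as np

N = 10**5
DTYPES = ("int32", "int64", "float32", "float64")
KINDS = ("quicksort", "stable")


def time_argsort(arr, kind, number=20, repeat=5):
    """Best per-call time in seconds for arr.argsort(kind=kind)."""
    timings = timeit.repeat(
        lambda: arr.argsort(kind=kind), number=number, repeat=repeat
    )
    return min(timings) / number


timings = {}
for dtype in DTYPES:
    ordered = np.arange(N, dtype=dtype)  # ordered input, as in the benchmark row
    for kind in KINDS:
        timings[(dtype, kind)] = time_argsort(ordered, kind)

print(f"NumPy {np.__version__}, ordered input, N={N}")
for (dtype, kind), seconds in timings.items():
    print(f"{dtype:>8} {kind:>10}: {seconds * 1e6:8.1f} µs")
```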

@r-devulap
Member

Fix for this is now part of: #28619
