Describe the issue:
Upgrading from NumPy 1.26.4 to 2.0.0, I'm seeing a performance regression in argsort
on already-sorted data: roughly 3.6x slower in the timings below (1.59 ms vs. 444 μs). On unsorted data, 2.0 is faster than 1.26.4.
Reproduce the code example:
import numpy as np

N = 10**5
arr = np.arange(N)  # already-sorted input
%timeit arr.argsort()
# 1.59 ms ± 32.2 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each) <-- 2.0
# 1.6 ms ± 23.3 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each) <-- 2.2.4
# 444 μs ± 6.29 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each) <-- 1.26.4
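For anyone who wants to reproduce this outside IPython, here is a minimal sketch that times both the sorted and the shuffled case with the standard-library timeit module. The helper name best_time, the seed, and the repeat/number values are my own choices for illustration, not part of the original measurements, and absolute numbers will differ by machine.

```python
import timeit

import numpy as np

N = 10**5
rng = np.random.default_rng(0)  # fixed seed; an arbitrary choice for repeatability

sorted_arr = np.arange(N)          # already-sorted input (the regressing case)
shuffled_arr = rng.permutation(N)  # same values in random order


def best_time(arr, repeat=5, number=20):
    """Return the best observed per-call time of arr.argsort(), in seconds."""
    timer = timeit.Timer(arr.argsort)
    return min(timer.repeat(repeat=repeat, number=number)) / number


print("numpy", np.__version__)
print(f"sorted:   {best_time(sorted_arr) * 1e6:.0f} us per call")
print(f"shuffled: {best_time(shuffled_arr) * 1e6:.0f} us per call")
```

Running the same script in a 1.26.4 environment and a 2.x environment should show whether the sorted case slows down while the shuffled case speeds up, as described above.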
Error message:
Python and NumPy Versions:
Python: 3.12.3
NumPy: Versions reported above.
Runtime Environment:
[{'numpy_version': '2.2.4',
'python': '3.12.3 (main, Feb 4 2025, 14:48:35) [GCC 13.3.0]',
'uname': uname_result(system='Linux', node='brokenglass', release='6.11.0-21-generic', version='#21~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Feb 24 16:52:15 UTC 2', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2'],
'not_found': ['AVX512F',
'AVX512CD',
'AVX512_KNL',
'AVX512_KNM',
'AVX512_SKX',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL']}},
{'architecture': 'Haswell',
'filepath': '/home/richard/dev/venvs/pandas/lib/python3.12/site-packages/numpy.libs/libscipy_openblas64_-6bb31eeb.so',
'internal_api': 'openblas',
'num_threads': 32,
'prefix': 'libscipy_openblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.28'}]
Context for the issue:
pandas recently raised its dev pin on NumPy to 2.0+, and we are working through the performance regressions this surfaced: