Description
Describe the issue:
The assertion in the repro below fails with numpy >= 2.1.0 and passes with numpy <= 2.0.2. Tested with pandas 2.2.3 (the latest release; I did not try earlier pandas versions).
Reproduce the code example:
import pandas as pd
import numpy as np
print(f'{pd.__version__=}')
print(f'{np.__version__=}')
s = pd.Series([1.0,2.0,3.0])
s_r = np.rint(s).astype(np.int32)
# This next line makes a copy in numpy 2.0.2
# and incorrectly takes a reference in numpy 2.1.0.
a = np.array(s_r, dtype=np.int32)
print('before:')
print(f's_r:\n{s_r}')
print(f'a:\n{a}')
a += 1
print('after:')
np.testing.assert_array_equal(s_r, [1.0,2.0,3.0])
print(f's_r:\n{s_r}')
print(f'a:\n{a}')
Error message:
Traceback (most recent call last):
  File "/usr/local/google/home/daiweili/src/nptest/test.py", line 20, in <module>
    np.testing.assert_array_equal(s_r, [1.0,2.0,3.0])
  File "/usr/local/google/home/daiweili/src/nptest/venv/lib/python3.11/site-packages/numpy/_utils/__init__.py", line 85, in wrapper
    return fun(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/google/home/daiweili/src/nptest/venv/lib/python3.11/site-packages/numpy/testing/_private/utils.py", line 1025, in assert_array_equal
    assert_array_compare(operator.__eq__, actual, desired, err_msg=err_msg,
  File "/usr/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/google/home/daiweili/src/nptest/venv/lib/python3.11/site-packages/numpy/testing/_private/utils.py", line 889, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not equal
Mismatched elements: 3 / 3 (100%)
Max absolute difference among violations: 1.
Max relative difference among violations: 1.
ACTUAL: array([2, 3, 4], dtype=int32)
DESIRED: array([1., 2., 3.])
Python and NumPy Versions:
3.11.9 (main, Jun 19 2024, 00:38:48) [GCC 13.2.0]
2.1.0
Runtime Environment:
[{'numpy_version': '2.0.2',
'python': '3.11.9 (main, Jun 19 2024, 00:38:48) [GCC 13.2.0]',
'uname': uname_result(system='Linux', node='', release='6.10.11-1rodete2-amd64', version='#1 SMP PREEMPT_DYNAMIC Debian 6.10.11-1rodete2 (2024-10-16)', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2',
'AVX512F',
'AVX512CD',
'AVX512_SKX'],
'not_found': ['AVX512_KNL',
'AVX512_KNM',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL']}},
{'architecture': 'SkylakeX',
'filepath': '/home/daiweili/src/nptest/venv/lib/python3.11/site-packages/numpy.libs/libscipy_openblas64_-99b71e71.so',
'internal_api': 'openblas',
'num_threads': 64,
'prefix': 'libscipy_openblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.27'}]
Context for the issue:
This caused numerical bugs in our codebase. We worked around it by explicitly calling .copy(), but according to the documentation at https://numpy.org/doc/2.1/reference/generated/numpy.array.html, np.array should already copy the data by default (copy=True), so the extra call should not be needed.
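For anyone hitting the same thing, here is a rough sketch of a defensive version of the workaround that only pays for the extra copy when the result actually aliases the source. It uses numpy alone so it does not depend on a particular pandas version. `BufferHolder` and `independent_array` are hypothetical names made up for this illustration, and `BufferHolder` is only a stand-in that assumes the aliasing comes from an `__array__` method that returns its own buffer; I have not confirmed that this is what happens inside pandas.

```python
import numpy as np


class BufferHolder:
    """Hypothetical stand-in for the Series in the repro (not a pandas class)."""

    def __init__(self, values):
        self._data = np.asarray(values, dtype=np.int32)

    def __array__(self, dtype=None, copy=None):
        # Deliberately ignores `copy` and hands back the internal buffer
        # whenever no dtype conversion is needed.
        if dtype is not None and np.dtype(dtype) != self._data.dtype:
            return self._data.astype(dtype)
        return self._data


def independent_array(obj, dtype):
    """Return an array built from `obj` that is safe to mutate (hypothetical helper)."""
    out = np.array(obj, dtype=dtype)
    # Copy only if the result still aliases the source buffer.
    if np.shares_memory(out, np.asarray(obj)):
        out = out.copy()
    return out


src = BufferHolder([1, 2, 3])
a = independent_array(src, np.int32)
a += 1
print(np.asarray(src).tolist())  # [1, 2, 3]
print(a.tolist())                # [2, 3, 4]
```

With the Series from the repro, the analogous call would be `independent_array(s_r, np.int32)`; I have only run the sketch against the stand-in class above, not against pandas.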