BUG: np.searchsorted segfaults on structured arrays in 2.2.2 #28190
Comments
Hmmm, thanks for the issue. I can't reproduce on Mac or on my Linux machine, even though the Linux machine looks rather similar setup-wise. Maybe someone else will be able to, but could you answer a few things?
Since you can reproduce this (whether only in pytest or not), could you please try running with gdb? That is either:
After the It would be good to just add |
Apologies, I should have tested the reproducer further on my end. You're correct, it doesn't reproduce on its own, but it does reproduce within pytest:

import numpy as np
import pytest
from numpy.testing import assert_array_equal

#@pytest.mark.skip(reason="https://github.com/numpy/numpy/issues/28190")
@pytest.mark.parametrize("na", [7])
def test_lexical_binary_search(na):
    rng = np.random.default_rng(seed=42)

    # Broadcast time against all unique antenna pairs (upper triangle)
    time = np.arange(20.0, dtype=np.float64)[:, None]
    ant1, ant2 = (a.astype(np.int32)[None, :] for a in np.triu_indices(na, 1))
    named_arrays = [("time", time), ("antenna1", ant1), ("antenna2", ant2)]
    names, arrays = zip(*named_arrays)
    arrays = tuple(a.ravel() for a in np.broadcast_arrays(*arrays))

    # Pack the columns into one structured array
    structured_dtype = np.dtype([(n, a.dtype) for n, a in zip(names, arrays)])
    carray = np.zeros(arrays[0].size, structured_dtype)
    for n, a in zip(names, arrays):
        carray[n] = a

    # Pick random records to search for
    choice = rng.choice(np.arange(carray.size), 10)
    sarray = np.zeros(choice.size, structured_dtype)
    sarray["time"] = carray["time"][choice]
    sarray["antenna1"] = carray["antenna1"][choice]
    sarray["antenna2"] = carray["antenna2"][choice]

    idx = np.searchsorted(carray, sarray)
    assert_array_equal(carray[idx], sarray)

gdb trace follows:

$ gdb --args python -m pytest -p no:faulthandler -k test_lexical_binary_search
GNU gdb (Ubuntu 15.0.50.20240403-0ubuntu1) 15.0.50.20240403-git
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) n
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
(No debugging symbols found in python)
(gdb) r
Starting program: /home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/bin/python -m pytest -p no:faulthandler -k test_lexical_binary_search
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffdba006c0 (LWP 1704498)]
[New Thread 0x7fffdb0006c0 (LWP 1704499)]
[New Thread 0x7fffda6006c0 (LWP 1704500)]
[New Thread 0x7fffd9c006c0 (LWP 1704501)]
[New Thread 0x7fffd92006c0 (LWP 1704502)]
[New Thread 0x7fffd88006c0 (LWP 1704503)]
[New Thread 0x7fffd7e006c0 (LWP 1704504)]
[New Thread 0x7fffd74006c0 (LWP 1704505)]
[New Thread 0x7fffd6a006c0 (LWP 1704506)]
[New Thread 0x7fffd60006c0 (LWP 1704507)]
[New Thread 0x7fffd56006c0 (LWP 1704508)]
[New Thread 0x7fffd00006c0 (LWP 1704515)]
===================================================================================== test session starts ======================================================================================
platform linux -- Python 3.11.11, pytest-8.3.3, pluggy-1.5.0
rootdir: /home/simon/code/xarray-ms
configfile: pyproject.toml
collecting ... warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libncursesw.so.6
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libtinfo.so.6
collected 56 items / 55 deselected / 1 selected
tests/test_basic.py
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x000000000052cb28 in ?? ()
(gdb) bt
#0 0x000000000052cb28 in ?? ()
#1 0x0000000000540ad1 in _PyEval_EvalFrameDefault ()
#2 0x0000000000583d88 in ?? ()
#3 0x0000000000583706 in ?? ()
#4 0x000000000056e817 in PyObject_Call ()
#5 0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#6 0x0000000000583613 in ?? ()
#7 0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#8 0x0000000000563eaf in _PyFunction_Vectorcall ()
#9 0x0000000000533bf2 in _PyObject_FastCallDictTstate ()
#10 0x000000000056c361 in _PyObject_Call_Prepend ()
#11 0x0000000000654489 in ?? ()
#12 0x000000000052ebf3 in _PyObject_MakeTpCall ()
#13 0x000000000053c3fd in _PyEval_EvalFrameDefault ()
#14 0x0000000000563eaf in _PyFunction_Vectorcall ()
#15 0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#16 0x0000000000563eaf in _PyFunction_Vectorcall ()
#17 0x0000000000533bf2 in _PyObject_FastCallDictTstate ()
#18 0x000000000056c361 in _PyObject_Call_Prepend ()
#19 0x0000000000654489 in ?? ()
#20 0x000000000052ebf3 in _PyObject_MakeTpCall ()
#21 0x000000000053c3fd in _PyEval_EvalFrameDefault ()
#22 0x0000000000563eaf in _PyFunction_Vectorcall ()
#23 0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#24 0x0000000000563eaf in _PyFunction_Vectorcall ()
#25 0x0000000000533bf2 in _PyObject_FastCallDictTstate ()
#26 0x000000000056c361 in _PyObject_Call_Prepend ()
#27 0x0000000000654489 in ?? ()
#28 0x000000000052ebf3 in _PyObject_MakeTpCall ()
#29 0x000000000053c3fd in _PyEval_EvalFrameDefault ()
#30 0x0000000000563eaf in _PyFunction_Vectorcall ()
#31 0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#32 0x0000000000563eaf in _PyFunction_Vectorcall ()
#33 0x0000000000533bf2 in _PyObject_FastCallDictTstate ()
--Type <RET> for more, q to quit, c to continue without paging--
#34 0x000000000056c361 in _PyObject_Call_Prepend ()
#35 0x0000000000654489 in ?? ()
#36 0x000000000052ebf3 in _PyObject_MakeTpCall ()
#37 0x000000000053c3fd in _PyEval_EvalFrameDefault ()
#38 0x000000000060f1ed in ?? ()
#39 0x000000000060e998 in PyEval_EvalCode ()
#40 0x0000000000627228 in ?? ()
#41 0x000000000054b179 in ?? ()
#42 0x000000000054b051 in PyObject_Vectorcall ()
#43 0x000000000053c3fd in _PyEval_EvalFrameDefault ()
#44 0x0000000000563eaf in _PyFunction_Vectorcall ()
#45 0x0000000000639122 in ?? ()
#46 0x0000000000638a27 in Py_RunMain ()
#47 0x00000000005ffc8d in Py_BytesMain ()
#48 0x00007ffff7c2a1ca in __libc_start_call_main (main=main@entry=0x5ffbe0, argc=argc@entry=7, argv=argv@entry=0x7fffffffd2a8) at ../sysdeps/nptl/libc_start_call_main.h:58
#49 0x00007ffff7c2a28b in __libc_start_main_impl (main=0x5ffbe0, argc=7, argv=0x7fffffffd2a8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffd298)
at ../csu/libc-start.c:360
#50 0x00000000005ffb15 in _start ()
(gdb)

The pytest version is 8.3.3:

$ pip freeze | grep pytest
pytest==8.3.3
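For anyone wanting to sanity-check what the test in this comment expects from `np.searchsorted`, here is a rough pure-Python sketch. It assumes structured records are ordered field by field (time, then antenna1, then antenna2), which is how tuples compare in Python. The helper name `build_records` is made up for illustration, and the random picks will not match those of `np.random.default_rng`; only the lookup logic is mirrored.

```python
import bisect
import random


def build_records(na=7, ntime=20):
    """Hypothetical helper: (time, antenna1, antenna2) tuples in the same
    order the test builds them (time-major, upper-triangle antenna pairs)."""
    return [
        (float(t), a1, a2)
        for t in range(ntime)
        for a1 in range(na)
        for a2 in range(a1 + 1, na)
    ]


records = build_records()

# The binary search is only meaningful if the records are already sorted.
assert records == sorted(records)

# Pick ten records at random (with replacement) to search for.
rng = random.Random(42)
queries = [records[rng.randrange(len(records))] for _ in range(10)]

# bisect_left plays the role of searchsorted with the default side="left".
indices = [bisect.bisect_left(records, q) for q in queries]
found = [records[i] for i in indices]

# Every record is unique, so each lookup should land on the query itself.
assert found == queries
```

If the NumPy version behaves, `carray[idx]` in the test should agree with `sarray` in the same way `found` agrees with `queries` here.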
edited by seberg
With PYTHONMALLOC=malloc_debug (seberg: This is the interesting traceback):

$ PYTHONMALLOC=malloc_debug gdb --args python -m pytest -p no:faulthandler -k test_lexical_binary_search
GNU gdb (Ubuntu 15.0.50.20240403-0ubuntu1) 15.0.50.20240403-git
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n])
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
(No debugging symbols found in python)
(gdb) run
Starting program: /home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/bin/python -m pytest -p no:faulthandler -k test_lexical_binary_search
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffdc6006c0 (LWP 1706199)]
[New Thread 0x7fffdbc006c0 (LWP 1706200)]
[New Thread 0x7fffdb2006c0 (LWP 1706201)]
[New Thread 0x7fffda8006c0 (LWP 1706202)]
[New Thread 0x7fffd9e006c0 (LWP 1706203)]
[New Thread 0x7fffd94006c0 (LWP 1706204)]
[New Thread 0x7fffd8a006c0 (LWP 1706205)]
[New Thread 0x7fffd80006c0 (LWP 1706206)]
[New Thread 0x7fffd76006c0 (LWP 1706207)]
[New Thread 0x7fffd6c006c0 (LWP 1706208)]
[New Thread 0x7fffd62006c0 (LWP 1706209)]
[New Thread 0x7fffd0c006c0 (LWP 1706210)]
===================================================================================== test session starts ======================================================================================
platform linux -- Python 3.11.11, pytest-8.3.3, pluggy-1.5.0
rootdir: /home/simon/code/xarray-ms
configfile: pyproject.toml
collecting ... warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libncursesw.so.6
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libtinfo.so.6
collected 56 items / 55 deselected / 1 selected
tests/test_basic.py
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffff7173174 in PyArray_GetClearFunction ()
from /home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/numpy/_core/_multiarray_umath.cpython-311-x86_64-linux-gnu.so
(gdb) bt
#0 0x00007ffff7173174 in PyArray_GetClearFunction ()
from /home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/numpy/_core/_multiarray_umath.cpython-311-x86_64-linux-gnu.so
#1 0x00007ffff71b38c8 in PyArray_ClearArray ()
from /home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/numpy/_core/_multiarray_umath.cpython-311-x86_64-linux-gnu.so
#2 0x00007ffff7128c68 in array_dealloc ()
from /home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/numpy/_core/_multiarray_umath.cpython-311-x86_64-linux-gnu.so
#3 0x0000000000563f5f in _PyFunction_Vectorcall ()
#4 0x000000000056e817 in PyObject_Call ()
#5 0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#6 0x0000000000563eaf in _PyFunction_Vectorcall ()
#7 0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#8 0x0000000000563eaf in _PyFunction_Vectorcall ()
#9 0x0000000000533bf2 in _PyObject_FastCallDictTstate ()
#10 0x000000000056c361 in _PyObject_Call_Prepend ()
#11 0x0000000000654489 in ?? ()
#12 0x000000000052ebf3 in _PyObject_MakeTpCall ()
#13 0x000000000053c3fd in _PyEval_EvalFrameDefault ()
#14 0x0000000000563eaf in _PyFunction_Vectorcall ()
#15 0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#16 0x0000000000563eaf in _PyFunction_Vectorcall ()
#17 0x0000000000533bf2 in _PyObject_FastCallDictTstate ()
#18 0x000000000056c361 in _PyObject_Call_Prepend ()
#19 0x0000000000654489 in ?? () |
correct
yes, it worked on 2.2.1 and failed on 2.2.2 (just pip installed 2.2.1 in the venv and it succeeded in gdb) |
Thanks, the second backtrace seems to point at a very clear direction. I suspect that there is a reference count bug in |
Just FYI, I tried around a bit and I can reproduce this locally only on linux and only with So the "when/where" it happens is a bit cryptic to me right now (maybe it is just random), but probably best to just dig and find out what is wrong and the rest will become clear. |
@ngoldbaum seems to be related to gh-28154. Adding an incref in the path where This is a very strange code path, if you pass in a A subtle, but unintentionally large, change I missed/forgot, is that making the dtype canonical does two things for structured dtypes (meaning that something happens now):
I doubt this matters here as such (the problem is rather that we always take the path), and I think it is likely an OK change, but an unfortunately large one for a backport :/. EDIT: Still a bit unclear why it isn't showing on main for me. Might just be random... It does die while cleaning/clearing which explains needing |
Here's a shorter reproducer, which crashes for me on a Mac with 2.2.3 as well as a self-built linux build.
|
I bisected this crash to #28160 . |
FWIW, I never added the backport flag... @charris maybe check with me before backporting PRs I didn't explicitly mark as being ready to backport? @seberg what do you think about going with the more minimal change I originally suggested that didn't touch the branch that structured DTypes go into? |
Yeah, I saw the backport coming through, I think, and just thought it wasn't a big deal, but I missed this reason against it. My gut feeling is to just do the original minimal fix for backporting (probably on main). For long-term, I think just adding the |
I'll take this on today so we can get a quick 2.2.3 out with a fix for the crash on the stable release series.
If you could take this on I'd appreciate it :) |
This does trigger on main. Still not totally clear to me why the original script doesn't, but I used this as a regression test. See #28198. |
This closes numpygh-28190 and fixes another issue in the initial code that triggered the regression. Note that we may still want to avoid this, since this does lead to constructing (view compatible) structured dtypes unnecessarily here. It would also compactify the dtype. For building unnecessary dtypes, the better solution may be to just introduce a "canonical" flag to the dtypes (now that we have the space).
* BUG: Fix searchsorted and CheckFromAny byte-swapping logic
  This closes numpygh-28190 and fixes another issue in the initial code that triggered the regression. Note that we may still want to avoid this, since this does lead to constructing (view compatible) structured dtypes unnecessarily here. It would also compactify the dtype. For building unnecessary dtypes, the better solution may be to just introduce a "canonical" flag to the dtypes (now that we have the space).
* STY: Adopt code comment suggestions
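To make the "compactify" and byte-swapping remarks a bit more concrete, here is a small sketch of the dtype properties involved. The field names and offsets are invented for illustration, and this is not the code path touched by the fix; it only shows, as I understand it, how a padded structured dtype differs from its packed equivalent and how a non-native field shows up.

```python
import numpy as np

# Hypothetical layout with explicit padding between the two fields.
padded = np.dtype({
    "names": ["flag", "value"],
    "formats": ["i1", "f8"],
    "offsets": [0, 8],
    "itemsize": 16,
})

# The same fields with no padding: the "compact" layout.
packed = np.dtype([("flag", "i1"), ("value", "f8")])

# Same field names and field types, but different memory layouts.
same_fields = padded.names == packed.names and all(
    padded.fields[n][0] == packed.fields[n][0] for n in padded.names
)
print(same_fields)
print(padded.itemsize, packed.itemsize)
print(padded.fields["value"][1], packed.fields["value"][1])  # byte offsets

# A structured dtype with a big-endian field; on a little-endian machine
# this is expected to report as non-native.
swapped = np.dtype([("flag", "i1"), ("value", ">f8")])
print(swapped.isnative)
```

Replacing `padded` with something like `packed` changes the item size, which is presumably why compacting is flagged above as a larger behavioural change than it first looks.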
Describe the issue:
The reproducer below succeeded on numpy 2.2.1
but fails on numpy 2.2.2
Reproduce the code example:
Edit: updated reproducer to include pytest
Error message:
Python and NumPy Versions:
2.2.2
3.11.11 (main, Dec 4 2024, 08:55:08) [GCC 13.2.0]
Runtime Environment:
[{'numpy_version': '2.2.2',
'python': '3.11.11 (main, Dec 4 2024, 08:55:08) [GCC 13.2.0]',
'uname': uname_result(system='Linux', node='simon-t14', release='6.8.0-51-generic', version='#52-Ubuntu SMP PREEMPT_DYNAMIC Thu Dec 5 13:09:44 UTC 2024', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2'],
'not_found': ['AVX512F',
'AVX512CD',
'AVX512_KNL',
'AVX512_KNM',
'AVX512_SKX',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL']}},
{'architecture': 'Haswell',
'filepath': '/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/numpy.libs/libscipy_openblas64_-6bb31eeb.so',
'internal_api': 'openblas',
'num_threads': 12,
'prefix': 'libscipy_openblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.28'}]
Context for the issue:
This isn't high priority for me as the test case where this failed demonstrates an idea rather than core functionality.