BUG: np.searchsorted segfaults on structured arrays in 2.2.2 · Issue #28190 · numpy/numpy · GitHub

Closed
sjperkins opened this issue Jan 20, 2025 · 13 comments · Fixed by #28418

@sjperkins
sjperkins commented Jan 20, 2025

Describe the issue:

The reproducer below succeeded on numpy 2.2.1

but fails on numpy 2.2.2

Reproduce the code example:

Edit: updated reproducer to include pytest

import numpy as np
import pytest
from numpy.testing import assert_array_equal


@pytest.mark.parametrize("na", [7])
def test_lexical_binary_search(na):
  rng = np.random.default_rng(seed=42)

  time = np.arange(20.0, dtype=np.float64)[:, None]
  ant1, ant2 = (a.astype(np.int32)[None, :] for a in np.triu_indices(na, 1))
  named_arrays = [("time", time), ("antenna1", ant1), ("antenna2", ant2)]
  names, arrays = zip(*named_arrays)
  arrays = tuple(a.ravel() for a in np.broadcast_arrays(*arrays))
  structured_dtype = np.dtype([(n, a.dtype) for n, a in zip(names, arrays)])

  carray = np.zeros(arrays[0].size, structured_dtype)
  for n, a in zip(names, arrays):
    carray[n] = a

  choice = rng.choice(np.arange(carray.size), 10)

  sarray = np.zeros(choice.size, structured_dtype)

  sarray["time"] = carray["time"][choice]
  sarray["antenna1"] = carray["antenna1"][choice]
  sarray["antenna2"] = carray["antenna2"][choice]

  idx = np.searchsorted(carray, sarray)
  assert_array_equal(carray[idx], sarray)

Error message:

Fatal Python error: Segmentation fault

Current thread 0x00007c21e7cad080 (most recent call first):
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/_pytest/terminal.py", line 463 in write_ensure_prefix
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/_pytest/terminal.py", line 633 in pytest_runtest_logreport
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/_pytest/runner.py", line 246 in call_and_report
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/_pytest/runner.py", line 132 in runtestprotocol
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/_pytest/runner.py", line 113 in pytest_runtest_protocol
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/_pytest/main.py", line 362 in pytest_runtestloop
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/_pytest/main.py", line 337 in _main
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/_pytest/main.py", line 283 in wrap_session
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/_pytest/main.py", line 330 in pytest_cmdline_main
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/_pytest/config/__init__.py", line 175 in main
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/_pytest/config/__init__.py", line 201 in console_main
  File "/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/bin/py.test", line 8 in <module>

Extension modules: numpy._core._multiarray_umath, numpy.linalg._umath_linalg, pyarrow.lib, arcae.lib.arrow_tables, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pyarrow._compute, pandas._libs.ops, pandas._libs.hashing, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.internals, pandas._libs.indexing, pandas._libs.index, pandas._libs.writers, pandas._libs.join, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, yaml._yaml, psutil._psutil_linux, psutil._psutil_posix, markupsafe._speedups, tornado.speedups, numcodecs.compat_ext, numcodecs.blosc, numcodecs.zstd, numcodecs.lz4, numcodecs._shuffle, msgpack._cmsgpack, numcodecs.jenkins, numcodecs.vlen, numcodecs.fletcher32 (total: 68)
Segmentation fault (core dumped)

Python and NumPy Versions:

2.2.2
3.11.11 (main, Dec 4 2024, 08:55:08) [GCC 13.2.0]

Runtime Environment:

[{'numpy_version': '2.2.2',
'python': '3.11.11 (main, Dec 4 2024, 08:55:08) [GCC 13.2.0]',
'uname': uname_result(system='Linux', node='simon-t14', release='6.8.0-51-generic', version='#52-Ubuntu SMP PREEMPT_DYNAMIC Thu Dec 5 13:09:44 UTC 2024', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2'],
'not_found': ['AVX512F',
'AVX512CD',
'AVX512_KNL',
'AVX512_KNM',
'AVX512_SKX',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL']}},
{'architecture': 'Haswell',
'filepath': '/home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/numpy.libs/libscipy_openblas64_-6bb31eeb.so',
'internal_api': 'openblas',
'num_threads': 12,
'prefix': 'libscipy_openblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.28'}]

Context for the issue:

This isn't high priority for me as the test case where this failed demonstrates an idea rather than core functionality.

@seberg
Member
seberg commented Jan 20, 2025

Hmmm, thanks for the issue. I can't reproduce on Mac or my Linux machine, even though the Linux machine looks rather similar setup-wise.

Maybe someone else will be able to, but could you answer a few things?

  • I assume this is a pip installed NumPy? (we got show_runtime, so likely nothing new).
  • Please try the example as pasted here (always good)? It seems possible to me that it isn't actually a reproducer and the crash happens only in the test suite. In that case we don't know if it is just random, or the problem is earlier.
  • Am I correct to think that this is new in 2.2.2 and the example succeeded in 2.2.1? Because that makes it more important to look into it.

Since you can reproduce this (whether only in pytest or not), could you please try running with gdb? That is either:

  • gdb --args python, then r or run. Then you can just copy paste the code (or put it in a script to run and include it in the initial argument).
  • If you run it as the full test suite, use: gdb --args python -m pytest -p no:faulthandler <pytest args>, where <pytest args> is whatever you are currently using. Then again r to run.
    (I hope this works, I haven't had to use the -p no:faulthandler before, but I think it may be needed now.)

After the r things should run normally until the crash happens. After the crash run bt to get the backtrace and copy paste it here from the start. (We probably only need the first 10 entries or so, you can stop after a few lines if you are clearly inside Python.)

It would be good to just add PYTHONMALLOC=malloc_debug before the gdb/python call (additionally or on its own). Since that is a quick and easy way to find some types of issues.
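For anyone who wants to try the PYTHONMALLOC=malloc_debug suggestion without a debugger first, here is a minimal sketch that runs a child interpreter with that variable set and reports whether it was killed by a signal. The helper names `run_with_debug_malloc` and `describe_exit` are made up for illustration, and the "negative return code means killed by a signal" convention is the POSIX behaviour of `subprocess`, so this is an assumption-laden sketch rather than a replacement for the gdb backtrace.

```python
import os
import signal
import subprocess
import sys


def run_with_debug_malloc(args):
    """Run `python <args>` with PYTHONMALLOC=malloc_debug set.

    Hypothetical helper: returns (returncode, stdout) of the child process.
    """
    env = dict(os.environ, PYTHONMALLOC="malloc_debug")
    proc = subprocess.run(
        [sys.executable, *args], env=env, capture_output=True, text=True
    )
    return proc.returncode, proc.stdout


def describe_exit(returncode):
    """On POSIX, a negative return code means the child died from that signal."""
    if returncode < 0:
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


# Harmless demo: confirm the variable really reaches the child interpreter.
rc, out = run_with_debug_malloc(["-c", "import os; print(os.environ['PYTHONMALLOC'])"])
print(out.strip(), "->", describe_exit(rc))
```

To point it at the test suite instead, the argument list could be something like `["-m", "pytest", "-p", "no:faulthandler", "-k", "test_lexical_binary_search"]`, mirroring the gdb command above.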

@sjperkins
Author
sjperkins commented Jan 20, 2025

Apologies, I should have tested the reproducer further on my end. You're correct, it doesn't reproduce as a standalone script, but it does reproduce within pytest:

import numpy as np
import pytest
from numpy.testing import assert_array_equal


#@pytest.mark.skip(reason="https://github.com/numpy/numpy/issues/28190")
@pytest.mark.parametrize("na", [7])
def test_lexical_binary_search(na):
  rng = np.random.default_rng(seed=42)

  time = np.arange(20.0, dtype=np.float64)[:, None]
  ant1, ant2 = (a.astype(np.int32)[None, :] for a in np.triu_indices(na, 1))
  named_arrays = [("time", time), ("antenna1", ant1), ("antenna2", ant2)]
  names, arrays = zip(*named_arrays)
  arrays = tuple(a.ravel() for a in np.broadcast_arrays(*arrays))
  structured_dtype = np.dtype([(n, a.dtype) for n, a in zip(names, arrays)])

  carray = np.zeros(arrays[0].size, structured_dtype)
  for n, a in zip(names, arrays):
    carray[n] = a

  choice = rng.choice(np.arange(carray.size), 10)

  sarray = np.zeros(choice.size, structured_dtype)

  sarray["time"] = carray["time"][choice]
  sarray["antenna1"] = carray["antenna1"][choice]
  sarray["antenna2"] = carray["antenna2"][choice]

  idx = np.searchsorted(carray, sarray)
  assert_array_equal(carray[idx], sarray)
gdb trace follows
$ gdb --args python -m pytest -p no:faulthandler -k test_lexical_binary_search
GNU gdb (Ubuntu 15.0.50.20240403-0ubuntu1) 15.0.50.20240403-git
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) n
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
(No debugging symbols found in python)
(gdb) r
Starting program: /home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/bin/python -m pytest -p no:faulthandler -k test_lexical_binary_search
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffdba006c0 (LWP 1704498)]
[New Thread 0x7fffdb0006c0 (LWP 1704499)]
[New Thread 0x7fffda6006c0 (LWP 1704500)]
[New Thread 0x7fffd9c006c0 (LWP 1704501)]
[New Thread 0x7fffd92006c0 (LWP 1704502)]
[New Thread 0x7fffd88006c0 (LWP 1704503)]
[New Thread 0x7fffd7e006c0 (LWP 1704504)]
[New Thread 0x7fffd74006c0 (LWP 1704505)]
[New Thread 0x7fffd6a006c0 (LWP 1704506)]
[New Thread 0x7fffd60006c0 (LWP 1704507)]
[New Thread 0x7fffd56006c0 (LWP 1704508)]
[New Thread 0x7fffd00006c0 (LWP 1704515)]
===================================================================================== test session starts ======================================================================================
platform linux -- Python 3.11.11, pytest-8.3.3, pluggy-1.5.0
rootdir: /home/simon/code/xarray-ms
configfile: pyproject.toml
collecting ... warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libncursesw.so.6
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libtinfo.so.6
collected 56 items / 55 deselected / 1 selected                                                                                                                                                

tests/test_basic.py 
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x000000000052cb28 in ?? ()
(gdb) bt
#0  0x000000000052cb28 in ?? ()
#1  0x0000000000540ad1 in _PyEval_EvalFrameDefault ()
#2  0x0000000000583d88 in ?? ()
#3  0x0000000000583706 in ?? ()
#4  0x000000000056e817 in PyObject_Call ()
#5  0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#6  0x0000000000583613 in ?? ()
#7  0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#8  0x0000000000563eaf in _PyFunction_Vectorcall ()
#9  0x0000000000533bf2 in _PyObject_FastCallDictTstate ()
#10 0x000000000056c361 in _PyObject_Call_Prepend ()
#11 0x0000000000654489 in ?? ()
#12 0x000000000052ebf3 in _PyObject_MakeTpCall ()
#13 0x000000000053c3fd in _PyEval_EvalFrameDefault ()
#14 0x0000000000563eaf in _PyFunction_Vectorcall ()
#15 0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#16 0x0000000000563eaf in _PyFunction_Vectorcall ()
#17 0x0000000000533bf2 in _PyObject_FastCallDictTstate ()
#18 0x000000000056c361 in _PyObject_Call_Prepend ()
#19 0x0000000000654489 in ?? ()
#20 0x000000000052ebf3 in _PyObject_MakeTpCall ()
#21 0x000000000053c3fd in _PyEval_EvalFrameDefault ()
#22 0x0000000000563eaf in _PyFunction_Vectorcall ()
#23 0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#24 0x0000000000563eaf in _PyFunction_Vectorcall ()
#25 0x0000000000533bf2 in _PyObject_FastCallDictTstate ()
#26 0x000000000056c361 in _PyObject_Call_Prepend ()
#27 0x0000000000654489 in ?? ()
#28 0x000000000052ebf3 in _PyObject_MakeTpCall ()
#29 0x000000000053c3fd in _PyEval_EvalFrameDefault ()
#30 0x0000000000563eaf in _PyFunction_Vectorcall ()
#31 0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#32 0x0000000000563eaf in _PyFunction_Vectorcall ()
#33 0x0000000000533bf2 in _PyObject_FastCallDictTstate ()
--Type <RET> for more, q to quit, c to continue without paging--
#34 0x000000000056c361 in _PyObject_Call_Prepend ()
#35 0x0000000000654489 in ?? ()
#36 0x000000000052ebf3 in _PyObject_MakeTpCall ()
#37 0x000000000053c3fd in _PyEval_EvalFrameDefault ()
#38 0x000000000060f1ed in ?? ()
#39 0x000000000060e998 in PyEval_EvalCode ()
#40 0x0000000000627228 in ?? ()
#41 0x000000000054b179 in ?? ()
#42 0x000000000054b051 in PyObject_Vectorcall ()
#43 0x000000000053c3fd in _PyEval_EvalFrameDefault ()
#44 0x0000000000563eaf in _PyFunction_Vectorcall ()
#45 0x0000000000639122 in ?? ()
#46 0x0000000000638a27 in Py_RunMain ()
#47 0x00000000005ffc8d in Py_BytesMain ()
#48 0x00007ffff7c2a1ca in __libc_start_call_main (main=main@entry=0x5ffbe0, argc=argc@entry=7, argv=argv@entry=0x7fffffffd2a8) at ../sysdeps/nptl/libc_start_call_main.h:58
#49 0x00007ffff7c2a28b in __libc_start_main_impl (main=0x5ffbe0, argc=7, argv=0x7fffffffd2a8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffd298)
    at ../csu/libc-start.c:360
#50 0x00000000005ffb15 in _start ()
(gdb) 

The pytest version is 8.3.3

$ pip freeze | grep pytest
pytest==8.3.3

@sjperkins
Author
sjperkins commented Jan 20, 2025

With PYTHONMALLOC=malloc_debug (seberg: This is the interesting traceback)

$ PYTHONMALLOC=malloc_debug gdb --args python -m pytest -p no:faulthandler -k test_lexical_binary_search
GNU gdb (Ubuntu 15.0.50.20240403-0ubuntu1) 15.0.50.20240403-git
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) 
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
(No debugging symbols found in python)
(gdb) run
Starting program: /home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/bin/python -m pytest -p no:faulthandler -k test_lexical_binary_search
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffdc6006c0 (LWP 1706199)]
[New Thread 0x7fffdbc006c0 (LWP 1706200)]
[New Thread 0x7fffdb2006c0 (LWP 1706201)]
[New Thread 0x7fffda8006c0 (LWP 1706202)]
[New Thread 0x7fffd9e006c0 (LWP 1706203)]
[New Thread 0x7fffd94006c0 (LWP 1706204)]
[New Thread 0x7fffd8a006c0 (LWP 1706205)]
[New Thread 0x7fffd80006c0 (LWP 1706206)]
[New Thread 0x7fffd76006c0 (LWP 1706207)]
[New Thread 0x7fffd6c006c0 (LWP 1706208)]
[New Thread 0x7fffd62006c0 (LWP 1706209)]
[New Thread 0x7fffd0c006c0 (LWP 1706210)]
===================================================================================== test session starts ======================================================================================
platform linux -- Python 3.11.11, pytest-8.3.3, pluggy-1.5.0
rootdir: /home/simon/code/xarray-ms
configfile: pyproject.toml
collecting ... warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libncursesw.so.6
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libtinfo.so.6
collected 56 items / 55 deselected / 1 selected                                                                                                                                                

tests/test_basic.py 
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffff7173174 in PyArray_GetClearFunction ()
   from /home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/numpy/_core/_multiarray_umath.cpython-311-x86_64-linux-gnu.so
(gdb) bt
#0  0x00007ffff7173174 in PyArray_GetClearFunction ()
   from /home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/numpy/_core/_multiarray_umath.cpython-311-x86_64-linux-gnu.so
#1  0x00007ffff71b38c8 in PyArray_ClearArray ()
   from /home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/numpy/_core/_multiarray_umath.cpython-311-x86_64-linux-gnu.so
#2  0x00007ffff7128c68 in array_dealloc ()
   from /home/simon/.cache/pypoetry/virtualenvs/xarray-ms-jDhc3Ane-py3.11/lib/python3.11/site-packages/numpy/_core/_multiarray_umath.cpython-311-x86_64-linux-gnu.so
#3  0x0000000000563f5f in _PyFunction_Vectorcall ()
#4  0x000000000056e817 in PyObject_Call ()
#5  0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#6  0x0000000000563eaf in _PyFunction_Vectorcall ()
#7  0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#8  0x0000000000563eaf in _PyFunction_Vectorcall ()
#9  0x0000000000533bf2 in _PyObject_FastCallDictTstate ()
#10 0x000000000056c361 in _PyObject_Call_Prepend ()
#11 0x0000000000654489 in ?? ()
#12 0x000000000052ebf3 in _PyObject_MakeTpCall ()
#13 0x000000000053c3fd in _PyEval_EvalFrameDefault ()
#14 0x0000000000563eaf in _PyFunction_Vectorcall ()
#15 0x0000000000540cdb in _PyEval_EvalFrameDefault ()
#16 0x0000000000563eaf in _PyFunction_Vectorcall ()
#17 0x0000000000533bf2 in _PyObject_FastCallDictTstate ()
#18 0x000000000056c361 in _PyObject_Call_Prepend ()
#19 0x0000000000654489 in ?? ()

@sjperkins
Author
sjperkins commented Jan 20, 2025
  • I assume this is a pip installed NumPy? (we got show_runtime, so likely nothing new).

correct

  • Am I correct to think that this is new in 2.2.2 and the example succeeded in 2.2.1? Because that makes it more important to look into it.

yes, it worked on 2.2.1 and failed on 2.2.2 (just pip installed 2.2.1 in the venv and it succeeded in gdb)

@seberg
Member
seberg commented Jan 20, 2025

Thanks, the second backtrace seems to point in a very clear direction. I suspect that there is a reference count bug in searchsorted on the dtype (i.e. presumably we lose one reference to the structured dtype and at cleanup that bites us).
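As a rough illustration of what "losing a reference" means, the sketch below measures how an object's reference count changes across a call. `refcount_delta` is a hypothetical helper, not something from NumPy, and it only uses a plain Python object, so it shows the bookkeeping rather than the actual bug: a balanced call leaves the count unchanged, while a call that drops a reference it does not own would show up as a negative delta (or, as here, as a crash much later at deallocation time).

```python
import sys


def refcount_delta(obj, action):
    """Return how much sys.getrefcount(obj) changed while running action()."""
    before = sys.getrefcount(obj)
    action()
    return sys.getrefcount(obj) - before


target = object()
holder = []

# A call that does nothing with the object is balanced.
print(refcount_delta(target, lambda: None))

# Storing the object takes a new reference; removing it gives one back.
print(refcount_delta(target, lambda: holder.append(target)))
print(refcount_delta(target, lambda: holder.pop()))
```

On an affected build, applying the same idea to a structured dtype around a searchsorted call is best done in a separate process, since the interpreter may crash before the second measurement.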

@seberg seberg added this to the 2.2.3 release milestone Jan 20, 2025
@seberg
Member
seberg commented Jan 20, 2025

Just FYI, I tried around a bit and I can reproduce this locally only on Linux and only with 2.2.3 or 2.2.x, in particular not on main (indeed, pytest is not required; calling it as a function is sufficient).

So the "when/where" it happens is a bit cryptic to me right now (maybe it is just random), but probably best to just dig and find out what is wrong and the rest will become clear.

@seberg
Member
seberg commented Jan 20, 2025

@ngoldbaum seems to be related to gh-28154. Adding an incref in the path where in_descr is given seems to fix it. I'll come back to it later, I think that is probably correct and the whole code path was just always broken (the comments are confusing me a bit about where and when references do get stolen).

This is a very strange code path: if you pass in a descr, why not make sure it is in native byte-order before passing it in?

A subtle, but unintentionally large, change I missed/forgot, is that making the dtype canonical does two things for structured dtypes (meaning that something happens now):

  • All included dtypes are byte-swapped to be native.
  • The dtype is also made "compact" (fields put into order without gaps).

I doubt this matters here as such (the problem is rather that we always take the path), and I think it is likely an OK change, but an unfortunately large one for a backport :/.

EDIT: Still a bit unclear why it isn't showing on main for me. Might just be random... It does die while cleaning/clearing which explains needing debug allocations because the flags would normally prevent any cleanup work (filling with 0xdddddd probably sets the HASREF flag).
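To make the two effects of canonicalization concrete, here is a small sketch that starts from a structured dtype with a non-native field and padding, and builds a native, packed equivalent by hand. `canonicalize` is a hypothetical helper written for this illustration; it is not the routine NumPy uses internally, and the field layout is made up.

```python
import numpy as np

# A structured dtype with mixed byte orders and a gap between the fields.
padded = np.dtype({
    "names": ["time", "value"],
    "formats": [">i8", "<f8"],
    "offsets": [0, 16],
    "itemsize": 32,
})


def canonicalize(dt):
    """Hypothetical helper: native byte order, fields packed without gaps."""
    return np.dtype(
        [(name, dt.fields[name][0].newbyteorder("=")) for name in dt.names]
    )


canon = canonicalize(padded)

# The canonical version is smaller (no padding) and fully native.
print(padded.itemsize, canon.itemsize)
print(padded.isnative, canon.isnative)
print({name: canon.fields[name][1] for name in canon.names})
```

The field names stay the same, but the offsets and itemsize change, which is why this is a larger behavioural change than a plain byte swap.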

@hawkinsp
Contributor
hawkinsp commented Jan 20, 2025

Here's a shorter reproducer, which crashes for me on a Mac with 2.2.3 as well as a self-built linux build.

In [1]: import numpy as np

In [2]: x = np.array([(0, 1.)], dtype=[('time', '<i8'), ('value', '<f8')])

In [3]: y = np.array((0, 0.), dtype=[('time', '<i8'), ('value', '<f8')])

In [4]: x.searchsorted(y)
Segmentation fault: 11
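Since the session above takes the whole interpreter down, it can be more convenient to run the same four lines in a child process and inspect the exit status. A minimal sketch, assuming a POSIX system where a negative return code means the child was killed by a signal; `run_isolated` and `REPRODUCER` are names invented here:

```python
import subprocess
import sys
import textwrap

# The short reproducer from this comment, as source for a child interpreter.
REPRODUCER = textwrap.dedent("""
    import numpy as np
    dt = [('time', '<i8'), ('value', '<f8')]
    x = np.array([(0, 1.)], dtype=dt)
    y = np.array((0, 0.), dtype=dt)
    print(x.searchsorted(y))
""")


def run_isolated(source):
    """Run `source` in a fresh interpreter; return (returncode, stdout)."""
    proc = subprocess.run(
        [sys.executable, "-c", source], capture_output=True, text=True
    )
    return proc.returncode, proc.stdout.strip()


if __name__ == "__main__":
    rc, out = run_isolated(REPRODUCER)
    if rc < 0:
        print(f"child was killed by signal {-rc}")
    else:
        print(f"child exited with status {rc}, output: {out!r}")
```

The parent process survives either way, so the same helper can be looped over several installed NumPy versions or wrapped in a regression test.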

@hawkinsp
Contributor

I bisected this crash to #28160 .

@ngoldbaum
Copy link
Member

I doubt this matters here as such (the problem is rather that we always take the path), and I think it is likely an OK change, but an unfortunately large one for a backport :/.

FWIW, I never added the backport flag...

@charris maybe check with me before backporting PRs I didn't explicitly mark as being ready to backport?

@seberg what do you think about going with the more minimal change I originally suggested that didn't touch the branch that structured DTypes go into?

@seberg
Member
seberg commented Jan 20, 2025

Yeah, I saw the backport coming through I think and just didn't think it was a big deal, but I missed this reason against it.

My gut-feeling is to just do the original minimal fix for backporting (probably on main). For long-term, I think just adding the INCREF may be fine, but we should maybe re-organize searchsorted anyway to normalize the descriptor first (and not rely on this flag).

@ngoldbaum
Member

My gut-feeling is to just do the original minimal fix for backporting (probably on main).

I'll take this on today so we can get a quick 2.2.3 out with a fix for the crash on the stable release series.

For long-term, I think just adding the INCREF may be fine, but we should maybe re-organize searchsorted anyway to normalize the descriptor first (and not rely on this flag).

If you could take this on I'd appreciate it :)

@ngoldbaum
Member

#28190 (comment)

This does trigger on main. Still not totally clear to me why the original script doesn't, but I used this as a regression test. See #28198.

spxiwh added a commit to spxiwh/pycbc that referenced this issue Feb 4, 2025
ahnitz pushed a commit to gwastro/pycbc that referenced this issue Feb 4, 2025
@charris charris modified the milestones: 2.2.3 release, 2.2.4 release Feb 13, 2025
seberg added a commit to seberg/numpy that referenced this issue Mar 3, 2025
This closes numpygh-28190 and fixes another issue in the initial code
that triggered the regression.

Note that we may still want to avoid this, since this does lead to
constructing (view compatible) structured dtypes unnecessarily here.

It would also compactify the dtype.  For building unnecessary dtypes,
the better solution may be to just introduce a "canonical" flag to
the dtypes (now that we have the space).
@mhvk mhvk closed this as completed in 01e98f1 Mar 5, 2025
charris pushed a commit to charris/numpy that referenced this issue Mar 5, 2025
* BUG: Fix searchsorted and CheckFromAny byte-swapping logic

This closes numpygh-28190 and fixes another issue in the initial code
that triggered the regression.

Note that we may still want to avoid this, since this does lead to
constructing (view compatible) structured dtypes unnecessarily here.

It would also compactify the dtype.  For building unnecessary dtypes,
the better solution may be to just introduce a "canonical" flag to
the dtypes (now that we have the space).

* STY: Adopt code comment suggestions