Segfault when constructing an array from a "bad" class #7264


Closed
anntzer opened this issue Feb 16, 2016 · 26 comments
@anntzer (Contributor) commented Feb 16, 2016
In [1]: class C:
   ...:     def __getitem__(self, i): raise IndexError
   ...:     def __len__(self): return 42
   ...:

In [2]: np.array(C())
Fatal Python error: Segmentation fault

A bit contrived; on the other hand, you can imagine an IndexError raised in some nested function called by __getitem__ accidentally bubbling up.
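For instance, something like the following (a hypothetical sketch; lookup_label and _labels are invented for illustration, not from any real library):

_labels = ["a", "b"]              # accidentally shorter than advertised

def lookup_label(i):
    return _labels[i]             # raises IndexError once i >= 2

class Wrapper:
    def __len__(self):
        return 42                 # claims 42 items

    def __getitem__(self, i):
        # The helper's IndexError escapes here; to anything iterating
        # over us it looks like an ordinary "end of sequence" signal.
        return lookup_label(i)

w = Wrapper()
print(len(w))     # 42
print(list(w))    # ['a', 'b'] -- iteration stopped by the leaked IndexError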

EDITED: gdb backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6132bae in PyArray_DTypeFromObjectHelper (obj=obj@entry=<C at remote 0x7ffff66cf898>, maxdims=maxdims@entry=32, out_dtype=out_dtype@entry=0x7fffffffd478, 
    string_type=string_type@entry=0) at numpy/core/src/multiarray/common.c:538
538             if (Py_TYPE(objects[i]) != common_type) {
(gdb) bt
#0  0x00007ffff6132bae in PyArray_DTypeFromObjectHelper (obj=obj@entry=<C at remote 0x7ffff66cf898>, maxdims=maxdims@entry=32, out_dtype=out_dtype@entry=0x7fffffffd478, 
    string_type=string_type@entry=0) at numpy/core/src/multiarray/common.c:538
#1  0x00007ffff6132e63 in PyArray_DTypeFromObject (obj=obj@entry=<C at remote 0x7ffff66cf898>, maxdims=maxdims@entry=32, out_dtype=out_dtype@entry=0x7fffffffd478)
    at numpy/core/src/multiarray/common.c:184
#2  0x00007ffff613c7a5 in PyArray_GetArrayParamsFromObject (op=<C at remote 0x7ffff66cf898>, requested_dtype=<optimized out>, writeable=<optimized out>, 
    out_dtype=0x7fffffffd478, out_ndim=0x7fffffffd46c, out_dims=0x7fffffffd480, out_arr=0x7fffffffd470, context=0x0) at numpy/core/src/multiarray/ctors.c:1560
#3  0x00007ffff613cb7d in PyArray_FromAny (op=op@entry=<C at remote 0x7ffff66cf898>, newtype=0x0, min_depth=0, max_depth=0, flags=flags@entry=112, context=<optimized out>)
    at numpy/core/src/multiarray/ctors.c:1692
#4  0x00007ffff613ceff in PyArray_CheckFromAny (op=<C at remote 0x7ffff66cf898>, descr=<optimized out>, min_depth=min_depth@entry=0, max_depth=max_depth@entry=0, 
    requires=112, context=context@entry=0x0) at numpy/core/src/multiarray/ctors.c:1870
#5  0x00007ffff61c2d25 in _array_fromobject (__NPY_UNUSED_TAGGEDignored=<optimized out>, args=<optimized out>, kws=0x0) at numpy/core/src/multiarray/multiarraymodule.c:1714
#6  0x00007ffff79bcaa9 in PyCFunction_Call () from /usr/lib/libpython3.5m.so.1.0
#7  0x00007ffff7a34a01 in PyEval_EvalFrameEx () from /usr/lib/libpython3.5m.so.1.0
#8  0x00007ffff7a35df2 in ?? () from /usr/lib/libpython3.5m.so.1.0
#9  0x00007ffff7a35ed3 in PyEval_EvalCodeEx () from /usr/lib/libpython3.5m.so.1.0
#10 0x00007ffff7a35efb in PyEval_EvalCode () from /usr/lib/libpython3.5m.so.1.0
#11 0x00007ffff7a55074 in ?? () from /usr/lib/libpython3.5m.so.1.0
#12 0x00007ffff7a57585 in PyRun_FileExFlags () from /usr/lib/libpython3.5m.so.1.0
#13 0x00007ffff7a576f6 in PyRun_SimpleFileExFlags () from /usr/lib/libpython3.5m.so.1.0
#14 0x00007ffff7a6e504 in Py_Main () from /usr/lib/libpython3.5m.so.1.0
#15 0x0000000000400af7 in main ()
@charris (Member) commented Feb 16, 2016

A backtrace might be interesting here.

@anntzer (Contributor, Author) commented Feb 16, 2016

What should I change in setup.py to get a -Og build? Right now it's not very helpful :-)

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff630a240 in ?? () from /usr/lib/python3.5/site-packages/numpy/core/multiarray.cpython-35m-x86_64-linux-gnu.so
(gdb) bt
#0  0x00007ffff630a240 in ?? () from /usr/lib/python3.5/site-packages/numpy/core/multiarray.cpython-35m-x86_64-linux-gnu.so
#1  0x00007ffff630a493 in ?? () from /usr/lib/python3.5/site-packages/numpy/core/multiarray.cpython-35m-x86_64-linux-gnu.so
#2  0x00007ffff6313c15 in ?? () from /usr/lib/python3.5/site-packages/numpy/core/multiarray.cpython-35m-x86_64-linux-gnu.so
#3  0x00007ffff6313fed in ?? () from /usr/lib/python3.5/site-packages/numpy/core/multiarray.cpython-35m-x86_64-linux-gnu.so
#4  0x00007ffff631436f in ?? () from /usr/lib/python3.5/site-packages/numpy/core/multiarray.cpython-35m-x86_64-linux-gnu.so
#5  0x00007ffff638fdb5 in ?? () from /usr/lib/python3.5/site-packages/numpy/core/multiarray.cpython-35m-x86_64-linux-gnu.so
#6  0x00007ffff79bcaa9 in PyCFunction_Call () from /usr/lib/libpython3.5m.so.1.0
#7  0x00007ffff7a34a01 in PyEval_EvalFrameEx () from /usr/lib/libpython3.5m.so.1.0
#8  0x00007ffff7a35df2 in ?? () from /usr/lib/libpython3.5m.so.1.0
#9  0x00007ffff7a35ed3 in PyEval_EvalCodeEx () from /usr/lib/libpython3.5m.so.1.0
#10 0x00007ffff7a35efb in PyEval_EvalCode () from /usr/lib/libpython3.5m.so.1.0
#11 0x00007ffff7a55074 in ?? () from /usr/lib/libpython3.5m.so.1.0
#12 0x00007ffff7a57585 in PyRun_FileExFlags () from /usr/lib/libpython3.5m.so.1.0
#13 0x00007ffff7a576f6 in PyRun_SimpleFileExFlags () from /usr/lib/libpython3.5m.so.1.0
#14 0x00007ffff7a6e504 in Py_Main () from /usr/lib/libpython3.5m.so.1.0
#15 0x0000000000400af7 in main ()

@pv (Member) commented Feb 16, 2016

rm -rf build
OPT="-ggdb -Og" python setup.py build
(or python runtests.py -g --ipython)

@anntzer (Contributor, Author) commented Feb 16, 2016

Indeed, the backtrace is very revealing. Edited the original post.

@ahaldane (Member):

I also happened to get a stack trace, but mine is a bit different:

PyArray_DTypeFromObjectHelper (obj=obj@entry=0x7fffef215ef0, maxdims=maxdims@entry=32, out_dtype=out_dtype@entry=0x7fffffffcba8, string_type=string_type@entry=0) at numpy/core/src/multiarray/common.c:536
536     common_type = size > 0 ? Py_TYPE(objects[0]) : NULL;
(gdb) bt
#0  PyArray_DTypeFromObjectHelper (obj=obj@entry=0x7fffef215ef0, maxdims=maxdims@entry=32, out_dtype=out_dtype@entry=0x7fffffffcba8, string_type=string_type@entry=0) at numpy/core/src/multiarray/common.c:536
#1  0x00007ffff14d6e53 in PyArray_DTypeFromObject (obj=obj@entry=0x7fffef215ef0, maxdims=maxdims@entry=32, out_dtype=out_dtype@entry=0x7fffffffcba8) at numpy/core/src/multiarray/common.c:184
#2  0x00007ffff14e0795 in PyArray_GetArrayParamsFromObject (op=0x7fffef215ef0, requested_dtype=<optimized out>, writeable=<optimized out>, out_dtype=0x7fffffffcba8, out_ndim=0x7fffffffcb9c, 
    out_dims=0x7fffffffcbb0, out_arr=0x7fffffffcba0, context=0x0) at numpy/core/src/multiarray/ctors.c:1560
#3  0x00007ffff14e0b6d in PyArray_FromAny (op=op@entry=0x7fffef215ef0, newtype=0x0, min_depth=0, max_depth=0, flags=flags@entry=112, context=<optimized out>) at numpy/core/src/multiarray/ctors.c:1692
#4  0x00007ffff14e0eef in PyArray_CheckFromAny (op=0x7fffef215ef0, descr=<optimized out>, min_depth=min_depth@entry=0, max_depth=max_depth@entry=0, requires=112, context=context@entry=0x0)
    at numpy/core/src/multiarray/ctors.c:1870
#5  0x00007ffff1566cf5 in _array_fromobject (__NPY_UNUSED_TAGGEDignored=<optimized out>, args=<optimized out>, kws=0x0) at numpy/core/src/multiarray/multiarraymodule.c:1714
#6  0x00007ffff79bcaa9 in PyCFunction_Call () from /usr/lib/libpython3.5m.so.1.0
#7  0x00007ffff7a34a01 in PyEval_EvalFrameEx () from /usr/lib/libpython3.5m.so.1.0
#8  0x00007ffff7a35df2 in ?? () from /usr/lib/libpython3.5m.so.1.0

(python runtime junk below this point)

I only got the trace because I also wanted to suggest gdb --args python3 runtests.py -g --ipython as my preferred method, but @pv beat me to it! (he suggested it to me another time)

@anntzer (Contributor, Author) commented Feb 16, 2016

Looks like it's a matter of calling PySequence_Fast_GET_SIZE on the result of PySequence_Fast (to get the correct size) rather than getting the length on the object itself.

@gfyoung (Contributor) commented Feb 16, 2016

@anntzer : Maybe it's something with my environment, but I ran your code, and I created an array object successfully. This is with v1.10.4 (conda package) and Python 2.7.11 on Windows.

@gfyoung (Contributor) commented Feb 16, 2016

@charris : This is a regression AFAICT. I can run this code successfully as well using a PyPI installation of numpy with Python 3.5.1 on Windows. However, I can replicate the segmentation fault on the master branch in that same environment.

@charris (Member) commented Feb 16, 2016

If it is a regression it should be possible to bisect.

@charris (Member) commented Feb 16, 2016

Works for me in master. Let's try 1.11.x
EDIT: WFM in 1.11.x
EDIT: WFM in 1.10.x

Might be a compiler/python version issue. I'm running Python 2.7.10, gcc 5.3.1 on Fedora 23 x86_64.

@ahaldane (Member):

I get the segfault in master, but only in python3, not python2.

I haven't looked at the code at all, but here's one hint:

Program received signal SIGSEGV, Segmentation fault.
PyArray_DTypeFromObjectHelper (obj=obj@entry=0x7fffef213c88, maxdims=maxdims@entry=32, out_dtype=out_dtype@entry=0x7fffffffcba8, string_type=string_type@entry=0) at numpy/core/src/multiarray/common.c:536
536     common_type = size > 0 ? Py_TYPE(objects[0]) : NULL;
(gdb) p objects[0]
$1 = (PyObject *) 0x2000

@gfyoung (Contributor) commented Feb 16, 2016

In the spirit of what @charris did, I too will summarize my results:

  1. Windows 7, Python 2.7.11, v1.10.4 (conda package): no error
  2. Windows 7, Python 3.5.1, v1.10.4 (PyPI package), MKL: no error
  3. Windows 7, Python 3.5.1, master, MKL: segfault
  4. Cygwin 64, Python 3.4.3, master, gcc 4.9.3: segfault
  5. Cygwin 64, Python 2.7.10, master, gcc 4.9.3: no error

Thus, it doesn't seem to be a regression, so I am inclined to agree with @charris's diagnosis.

@ahaldane (Member):

I suspect some kind of memory corruption/double decref, because now, without having changed anything, I get a different stack trace.

#0  0x00007ffff79cff60 in PyType_IsSubtype () from /usr/lib/libpython3.5m.so.1.0
#1  0x00007ffff14d63bc in PyArray_DTypeFromObjectHelper (obj=0x7ffff76eaf58 <main_arena+1048>, maxdims=maxdims@entry=31, out_dtype=out_dtype@entry=0x7fffffffcba8, string_type=string_type@entry=0)
    at numpy/core/src/multiarray/common.c:214
#2  0x00007ffff14d6c03 in PyArray_DTypeFromObjectHelper (obj=obj@entry=0x7fffef213eb8, maxdims=maxdims@entry=32, out_dtype=out_dtype@entry=0x7fffffffcba8, string_type=string_type@entry=0)
    at numpy/core/src/multiarray/common.c:558
#3  0x00007ffff14d6e53 in PyArray_DTypeFromObject (obj=obj@entry=0x7fffef213eb8, maxdims=maxdims@entry=32, out_dtype=out_dtype@entry=0x7fffffffcba8) at numpy/core/src/multiarray/common.c:184
#4  0x00007ffff14e0795 in PyArray_GetArrayParamsFromObject (op=0x7fffef213eb8, requested_dtype=<optimized out>, writeable=<optimized out>, out_dtype=0x7fffffffcba8, out_ndim=0x7fffffffcb9c, 
    out_dims=0x7fffffffcbb0, out_arr=0x7fffffffcba0, context=0x0) at numpy/core/src/multiarray/ctors.c:1560
#5  0x00007ffff14e0b6d in PyArray_FromAny (op=op@entry=0x7fffef213eb8, newtype=0x0, min_depth=0, max_depth=0, flags=flags@entry=112, context=<optimized out>) at numpy/core/src/multiarray/ctors.c:1692
#6  0x00007ffff14e0eef in PyArray_CheckFromAny (op=0x7fffef213eb8, descr=<optimized out>, min_depth=min_depth@entry=0, max_depth=max_depth@entry=0, requires=112, context=context@entry=0x0)
    at numpy/core/src/multiarray/ctors.c:1870
#7  0x00007ffff1566cf5 in _array_fromobject (__NPY_UNUSED_TAGGEDignored=<optimized out>, args=<optimized out>, kws=0x0) at numpy/core/src/multiarray/multiarraymodule.c:1714
#8  0x00007ffff79bcaa9 in PyCFunction_Call () from /usr/lib/libpython3.5m.so.1.0
#9  0x00007ffff7a34a01 in PyEval_EvalFrameEx () from /usr/lib/libpython3.5m.so.1.0

@gfyoung (Contributor) commented Feb 16, 2016

@ahaldane : I wonder if it's an issue with Python 3.5 (or just Python 3), since the issue can't seem to be reproduced on Python 2.x. However, @anntzer, you, and I all got segfaults using Python 3.5 under otherwise completely different circumstances on master. It could actually be a regression in the sense that something broke on Python 3.5 between v1.10.4 and now.

@charris (Member) commented Feb 16, 2016

Also segfaults on Python 3.4.

@ahaldane (Member):

Well, ignoring my last comment for a second, I think I see the bug:

From common.c, line 530, where obj is the instance of C from the example.

    /*
     * fails if convertable to list but no len is defined which some libraries
     * require to get object arrays
     */
    size = PySequence_Size(obj);
    if (size < 0) {
        goto fail;
    }

    /* Recursive case, first check the sequence contains only one type */
    seq = PySequence_Fast(obj, "Could not convert object to sequence");
    if (seq == NULL) {
        goto fail;
    }
    objects = PySequence_Fast_ITEMS(seq);
    common_type = size > 0 ? Py_TYPE(objects[0]) : NULL;
    for (i = 1; i < size; ++i) {
        if (Py_TYPE(objects[i]) != common_type) {
            common_type = NULL;
            break;
        }
    }

Note that size is computed from C itself, returning 42. Then PySequence_Fast converts it to a list, but the list's length is 0! So we end up iterating through a list of length 0 as a C array while thinking it has length 42, which will clearly cause a segfault.
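The list comes out empty because C defines no __iter__, so PySequence_Fast falls back on the old __getitem__ iteration protocol, and the immediate IndexError reads as "sequence exhausted". A quick check with the class from the original report:

class C:
    def __getitem__(self, i): raise IndexError
    def __len__(self): return 42

print(len(C()))    # 42, straight from __len__
print(list(C()))   # [], since __getitem__(0) raises IndexError right away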

Fix is to compute size as PyList_Size(seq), I believe.

@anntzer (Contributor, Author) commented Feb 16, 2016

PySequence_Fast can return a tuple too (it returns the input unchanged if it is already a list or tuple), so you should use PySequence_Fast_GET_SIZE instead.

@gfyoung (Contributor) commented Feb 16, 2016

@ahaldane : Hmmm... any reason why that didn't happen for me, then, in v1.10.4 (see here) under the same conditions with Python 3.5.1?

@charris (Member) commented Feb 16, 2016

I get a segfault with python 3.4 and numpy 1.9. So not a new problem.

@ahaldane (Member):

Not sure... since we are essentially reading memory we don't own, all sorts of things can happen depending on what happens to sit next to us in memory. In my case the next value was 0x2000, and address 0x2000 is unmapped; in your case the next value may have been something like 0x1476981234, which happened to be mapped. That's probably also why I got segfaults in different places.

@gfyoung (Contributor) commented Feb 16, 2016

Perhaps, though no one has been able to reproduce it with v1.10 (maintenance or release) OR with a Python 2.x distribution. How odd...

@ahaldane (Member):

Aha, see #5106 where the change happened.

So on the one hand, if the object doesn't provide __len__, we want to create an object array. On the other hand, we need to protect against the object giving us an incorrect __len__!

So I think to satisfy both cases we need to do both checks... a little weird.

@gfyoung (Contributor) commented Feb 16, 2016

@charris was able to reproduce this bug with v1.9, though. That PR wouldn't have been in the codebase at the time (it seems to have landed just after the last v1.9 release), unless it had been backported?

EDIT: it seems it was backported into maintenance/1.9.x, but still, the question remains in my mind: why not in v1.10 or with Python 2.x?

@anntzer (Contributor, Author) commented Feb 16, 2016

Then call PySequence_Size and bail out if it returns -1, but if it succeeds, use PySequence_Fast_GET_SIZE for the loop?
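Roughly, as a Python analogue of that control flow (a sketch only; the real change belongs in the C of common.c, and common_item_type is an invented stand-in for the common-type scan in PyArray_DTypeFromObjectHelper):

def common_item_type(obj):
    try:
        len(obj)          # mirrors PySequence_Size: bail out on failure
    except TypeError:
        return None       # take the fail / object-array path
    seq = list(obj)       # mirrors PySequence_Fast
    size = len(seq)       # mirrors PySequence_Fast_GET_SIZE: trust *this* size
    if size == 0:
        return None
    common = type(seq[0])
    for item in seq[1:]:
        if type(item) is not common:
            return None   # no single common type
    return common

print(common_item_type(C()))         # None instead of a segfault
print(common_item_type([1, 2, 3]))   # <class 'int'>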

@ahaldane (Member):

Yeah, I was just trying to understand the result of that... we end up with

ValueError: cannot copy sequence with size 42 to array axis with dimension 0

for your example. We get the same error if we bail when size != PySequence_Fast_GET_SIZE(seq). In fact, bailing out of that function at any point leads to creation of an object array, with the same error.

A somewhat obscure error message, but it's an obscure case! And it's better than a segfault.

@anntzer (Contributor, Author) commented Feb 17, 2016

I'd say raising an exception is perfectly fine in a case like this.

ahaldane added a commit to ahaldane/numpy that referenced this issue Feb 19, 2016
jaimefrio pushed a commit to jaimefrio/numpy that referenced this issue Mar 22, 2016
seberg pushed a commit that referenced this issue May 21, 2020
When __getitem__ fails, assignment falls back on __iter__, which may not
have the same length as __len__, resulting in a segfault.
See gh-7264.

* BUG: Don't rely on __len__ for safe iteration.

Skip __len__ checking entirely, since this has no guarantees of being
correct. Instead, first convert to PySequence_Fast and use that length.
Also fixes a refleak when creating a tmp array fails.
See gh-7264.

* TST: Update test for related edge-case.

The previous fix more gracefully handles this edge case by skipping the
__len__ check. Rather than raising with an unhelpful error message, just
create the array with whatever elements C() actually yields.
See gh-7264.
charris pushed a commit to charris/numpy that referenced this issue May 22, 2020
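With the fix in these commits (NumPy 1.19+), the original reproducer should build an empty array rather than crash, since the array is now built from whatever C() actually yields (expected behavior per the commit message above, not output captured from this thread):

import numpy as np

class C:
    def __getitem__(self, i): raise IndexError
    def __len__(self): return 42

# C() yields no elements under iteration, so nothing is copied and the
# bogus __len__ of 42 is never trusted.
print(np.array(C()))   # an empty array, no segfault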