Segfault when constructing an array from a "bad" class #7264
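The reported class definition did not survive extraction, so here is a hedged, stdlib-only sketch of the kind of "bad" class the title refers to (all names illustrative): `__len__` reports one length, but an `IndexError` raised inside a helper called by `__getitem__` bubbles up and ends iteration early.

```python
def lookup(table, i):
    # Raises IndexError past the end of table; the caller does not catch it.
    return table[i]

class C:
    def __len__(self):
        return 5                      # claims 5 items

    def __getitem__(self, i):
        return lookup([10, 20], i)    # IndexError bubbles up at i == 2

assert len(C()) == 5
assert list(C()) == [10, 20]          # iteration silently stops early
```

Any consumer that trusts `__len__` as a loop bound while reading items produced by iteration can walk off the end of its buffer, which is the segfault diagnosed below.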
Comments
A backtrace might be interesting here.
What should I change in setup.py to get a -Og build? Right now it's not very helpful :-)
`rm -rf build`
Indeed, the segfault is very revealing. Edited the original post.
I also happened to get a stack trace, but I get a different trace:
I only got the trace because I also wanted to suggest
Looks like it's a matter of calling
@anntzer: Maybe it's something with my environment, but I ran your code, and I created an array object successfully. This is with
@charris: This is a regression AFAICT. I can run this code successfully as well using a PyPI installation of
If it is a regression it should be possible to bisect.
Works for me in master. Let's try 1.11.x. Might be a compiler/Python version issue. I'm running Python 2.7.10, gcc 5.3.1 on Fedora 23 x86_64.
I get the segfault in master, but only in python3, not python2. I haven't looked at the code at all, but here's one hint:
In the spirit of what @charris did, I too will summarize my results:
Thus, it doesn't seem to be a regression, so I am inclined to agree with @charris's diagnosis.
I suspect some kind of memory corruption/double decref, because now, without having changed anything, I get a different stack trace.
@ahaldane: I wonder if it's an issue with Python 3.5 (or just Python 3) since the issue can't seem to be reproduced in Python 2.x. However, @anntzer, you, and I all got segfaults using Python 3.5 but under otherwise completely different circumstances on
Also segfaults on Python 3.4.
Well, ignoring my last comment for a second, I think I see the bug. From common.c, line 530:

```c
/*
 * fails if convertable to list but no len is defined which some libraries
 * require to get object arrays
 */
size = PySequence_Size(obj);
if (size < 0) {
    goto fail;
}
/* Recursive case, first check the sequence contains only one type */
seq = PySequence_Fast(obj, "Could not convert object to sequence");
if (seq == NULL) {
    goto fail;
}
objects = PySequence_Fast_ITEMS(seq);
common_type = size > 0 ? Py_TYPE(objects[0]) : NULL;
for (i = 1; i < size; ++i) {
    if (Py_TYPE(objects[i]) != common_type) {
        common_type = NULL;
        break;
    }
}
```

Note that `size` is computed from `obj` (via `PySequence_Size`, i.e. `__len__`), while `objects` comes from `seq`, which may actually contain fewer items. Fix is to compute `size` from `seq` instead.
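To make the mismatch concrete, here is a pure-Python sketch (the class is hypothetical, not from the original report) of an object whose `__len__` disagrees with the number of items iteration actually produces:

```python
# Hypothetical class: __len__ reports 10 items, but __getitem__ stops
# yielding after 3, so any materialized copy has only 3 elements.
class Lying:
    def __len__(self):
        return 10                 # what PySequence_Size would report

    def __getitem__(self, i):
        if i >= 3:
            raise IndexError      # iteration really ends here
        return i

assert len(Lying()) == 10         # the loop bound taken from obj
assert len(list(Lying())) == 3    # the real length of the ITEMS array
# The C loop above then reads objects[3..9], memory it doesn't own.
```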
`PySequence_Fast` can return a tuple too, so you should use `PySequence_Fast_GET_SIZE`.
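In Python terms, the suggested fix amounts to: materialize first, then take the length of the materialized object, which (like `PySequence_Fast`'s result) is a real list or tuple. The helper below is illustrative only, not NumPy API:

```python
def safe_length(obj):
    # Mirror of the fix: trust only the materialized sequence's length,
    # never obj.__len__. Lists and tuples pass through unchanged, just
    # as PySequence_Fast leaves them alone.
    seq = obj if isinstance(obj, (list, tuple)) else list(obj)
    return len(seq)

class Lying:
    def __len__(self):
        return 10
    def __getitem__(self, i):
        if i >= 3:
            raise IndexError
        return i

assert safe_length(Lying()) == 3   # safe loop bound, not 10
assert safe_length((1, 2)) == 2    # tuples handled too
```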
I get a segfault with python 3.4 and numpy 1.9. So not a new problem.
Not sure... since we are essentially reading memory we don't own, really all sorts of things could happen depending on what happens to be next to us in memory. It could be that in my case, the next value was 0x2000, and memory at location 0x2000 is outside mapped memory, but maybe in your case the next value was 0x1476981234, which happened to be in memory. That's probably why I got segfaults at different places.
Perhaps, though no one has been able to reproduce it with
Aha, see #5106 where the change happened. So on the one hand, if the object doesn't provide `__len__`, the conversion is supposed to fail (that's what #5106 introduced); on the other hand, the loop bound has to match what the sequence actually yields. So I think to satisfy both cases we need to do both checks... a little weird.
@charris was able to reproduce this bug with
EDIT: it seems like it was backported into
Then call
Yeah, I was just trying to understand the result of that... we end up with
for your example. We get the same error if I bail if
A somewhat obscure error message, but it's an obscure case! And it's better than a segfault.
I'd say raising an exception is perfectly enough in a case like this. |
When `__getitem__` fails, assignment falls back on `__iter__`, which may not have the same length as `__len__`, resulting in a segfault. See gh-7264.

* BUG: Don't rely on `__len__` for safe iteration. Skip `__len__` checking entirely, since this has no guarantees of being correct. Instead, first convert to `PySequence_Fast` and use that length. Also fixes a refleak when creating a tmp array fails. See gh-7264.
* TST: Update test for related edge-case. The previous fix more gracefully handles this edge case by skipping the `__len__` check. Rather than raising with an unhelpful error message, just create the array with whatever elements C() actually yields. See gh-7264.
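In pure Python terms, "whatever elements C() actually yields" comes from the old-style sequence iteration protocol, which stops at the first `IndexError` regardless of `__len__`. The class below is a stand-in, since the thread does not show `C`'s actual definition:

```python
class C:
    def __len__(self):
        return 42                 # deliberately wrong length

    def __getitem__(self, i):
        if i >= 2:
            raise IndexError      # iteration stops here
        return i

# iter() falls back on __getitem__ when no __iter__ is defined,
# and stops cleanly at the first IndexError; __len__ is ignored.
assert list(C()) == [0, 1]
```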
A bit contrived; on the other hand you can imagine an IndexError raised in some nested function called by `__getitem__` accidentally bubbling up.

EDITED: gdb backtrace: