Valgrind error in lowlevel_strided_loops.c.src · Issue #8922 · numpy/numpy

Closed
rainwoodman opened this issue Apr 10, 2017 · 19 comments
@rainwoodman (Contributor)

The line seems to be a 'swap', but why would it access beyond the limit?

It may be a false positive, but I'd raise the alarm here out of caution.

==22767== Warning: set address range perms: large range [0x3f319040, 0xd4348940) (defined)
==22767== Warning: set address range perms: large range [0x3f319028, 0xd4348958) (noaccess)
==22767== Invalid read of size 8
==22767==    at 0xFF702D3: _aligned_strided_to_strided_size8_srcstride0 (lowlevel_strided_loops.c.src:219)
==22767==    by 0xFF2352B: raw_array_assign_array (array_assign_array.c:96)
==22767==    by 0xFF23D51: PyArray_AssignArray (array_assign_array.c:351)
==22767==    by 0xFFBAE9F: PyArray_ConcatenateArrays (multiarraymodule.c:449)
==22767==    by 0xFFBB339: PyArray_Concatenate (multiarraymodule.c:618)
==22767==    by 0xFFBB460: array_concatenate (multiarraymodule.c:2127)
==22767==    by 0x4F1B33B: call_function (ceval.c:4427)
==22767==    by 0x4F1B33B: PyEval_EvalFrameEx (ceval.c:3061)
==22767==    by 0x4F1E02B: PyEval_EvalCodeEx (ceval.c:3659)
==22767==    by 0x4F1AF4D: fast_function (ceval.c:4522)
==22767==    by 0x4F1AF4D: call_function (ceval.c:4447)
==22767==    by 0x4F1AF4D: PyEval_EvalFrameEx (ceval.c:3061)
==22767==    by 0x4F1E02B: PyEval_EvalCodeEx (ceval.c:3659)
==22767==    by 0x4EA6D3B: function_call (funcobject.c:523)
==22767==    by 0x4E81FD2: PyObject_Call (abstract.c:2546)
==22767==  Address 0x3bd2faf8 is 0 bytes after a block of size 8 alloc'd
==22767==    at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==22767==    by 0xFEFAEFB: PyDataMem_NEW (alloc.c:173)
==22767==    by 0xFF341E4: PyArray_NewFromDescr_int (ctors.c:1058)
==22767==    by 0xFF34255: PyArray_NewFromDescr (ctors.c:1142)
==22767==    by 0xFFD3F6B: PyArray_FromScalar (scalarapi.c:313)
==22767==    by 0xFF36101: PyArray_FromAny (ctors.c:1730)
==22767==    by 0x11AFA042: get_ufunc_arguments (ufunc_object.c:824)
==22767==    by 0x11AFC734: PyUFunc_GenericFunction (ufunc_object.c:2566)
==22767==    by 0x11AFDD84: ufunc_generic_call (ufunc_object.c:4340)
==22767==    by 0x4E81FD2: PyObject_Call (abstract.c:2546)
==22767==    by 0x4E828B1: PyObject_CallFunctionObjArgs (abstract.c:2773)
==22767==    by 0x4EBCF2B: try_rich_compare (object.c:622)
@juliantaylor (Contributor)

how did you trigger this error?

@rainwoodman (Contributor, Author)

I ran valgrind on the scipy test suite with a numpy version of
1.13.0.dev0+cce86d6 / (a7d244c + fix for 8264)

@rainwoodman (Contributor, Author)

Just from looking at the stack trace, it seems like something like concatenate([[numpy.int64(3)], [2, 3, 4]]) may trigger this?
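For reference, the suspected trigger spelled out as code. This is a guess reconstructed from the stack trace above (the 0-d scalar goes through PyArray_FromScalar before concatenation), not a confirmed reproducer:

```python
import numpy as np

# Hypothetical trigger: a list wrapping a numpy scalar, concatenated
# with an ordinary list. Both are converted to arrays before the copy.
result = np.concatenate([[np.int64(3)], [2, 3, 4]])
print(result.tolist())  # [3, 2, 3, 4]
```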

@seberg (Member) commented Apr 11, 2017

Without further information, this could also be caused by the scipy test suite/code using uninitialized arrays; it would be good to get a working reproducing example.

@jaimefrio (Member)

There may be something fishy in concatenate... I recently tried to rebase #7450 and started getting some weird errors, where importing numpy would crash, and adding a couple of printf statements made everything work. I will have more time the next couple of weeks, will take a closer look at it.

@rainwoodman (Contributor, Author)

Address 0x3bd2faf8 is 0 bytes after a block of size 8 alloc'd. So it is not an uninitialized array but an access past the end of a buffer. I am fiddling to find a case on master.

@rainwoodman (Contributor, Author)

This will reproduce the error. Should I file a test case for this?

import numpy as np
x = np.zeros((1, 1))
ibad = np.vstack(np.where(x == 99.))

If the arrays being stacked are not created from where, valgrind doesn't complain.

==11210== Invalid read of size 8
==11210==    at 0xD916CA3: _aligned_strided_to_strided_size8_srcstride0 (lowlevel_strided_loops.c.src:219)
==11210==    by 0xD8C990B: raw_array_assign_array (array_assign_array.c:96)
==11210==    by 0xD8CA131: PyArray_AssignArray (array_assign_array.c:351)
==11210==    by 0xD962504: PyArray_ConcatenateArrays (multiarraymodule.c:441)
==11210==    by 0xD9629F9: PyArray_Concatenate (multiarraymodule.c:609)
==11210==    by 0xD962B20: array_concatenate (multiarraymodule.c:2163)
==11210==    by 0x4F1B33B: call_function (ceval.c:4427)
==11210==    by 0x4F1B33B: PyEval_EvalFrameEx (ceval.c:3061)
==11210==    by 0x4F1E02B: PyEval_EvalCodeEx (ceval.c:3659)
==11210==    by 0x4F1AF4D: fast_function (ceval.c:4522)
==11210==    by 0x4F1AF4D: call_function (ceval.c:4447)
==11210==    by 0x4F1AF4D: PyEval_EvalFrameEx (ceval.c:3061)
==11210==    by 0x4F1B021: fast_function (ceval.c:4512)
==11210==    by 0x4F1B021: call_function (ceval.c:4447)
==11210==    by 0x4F1B021: PyEval_EvalFrameEx (ceval.c:3061)
==11210==    by 0x4F1E02B: PyEval_EvalCodeEx (ceval.c:3659)
==11210==    by 0x4F1E118: PyEval_EvalCode (ceval.c:691)
==11210==  Address 0x6446ac8 is 0 bytes after a block of size 8 alloc'd
==11210==    at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==11210==    by 0xD8A126B: PyDataMem_NEW (alloc.c:185)
==11210==    by 0xD8DA7D4: PyArray_NewFromDescr_int (ctors.c:1057)
==11210==    by 0xD8DC764: PyArray_NewFromDescr (ctors.c:1141)
==11210==    by 0xD8DC764: PyArray_FromAny (ctors.c:1762)
==11210==    by 0xF47B7BE: get_ufunc_arguments (ufunc_object.c:825)
==11210==    by 0xF480206: PyUFunc_GenericFunction (ufunc_object.c:2615)
==11210==    by 0xF481B64: ufunc_generic_call (ufunc_object.c:4385)
==11210==    by 0x4E81FD2: PyObject_Call (abstract.c:2546)
==11210==    by 0x4E828B1: PyObject_CallFunctionObjArgs (abstract.c:2773)
==11210==    by 0xD8A59EC: array_richcompare (arrayobject.c:1409)
==11210==    by 0x4EBCF2B: try_rich_compare (object.c:622)
==11210==    by 0x4EBEF45: do_richcmp (object.c:930)
==11210==    by 0x4EBEF45: PyObject_RichCompare (object.c:982)
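The reproducer, with the intermediate shapes spelled out (no element matches, so np.where returns a tuple of size-0 index arrays, which vstack then has to copy):

```python
import numpy as np

x = np.zeros((1, 1))
idx = np.where(x == 99.)       # no matches: a tuple of size-0 index arrays
print([a.shape for a in idx])  # [(0,), (0,)]
ibad = np.vstack(idx)          # copies the empty arrays via concatenate
print(ibad.shape)              # (2, 0)
```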

@seberg (Member) commented Apr 20, 2017

Thanks for finding an example!

@juliantaylor (Contributor)

hm, PyArray_ConcatenateArrays calls PyArray_AssignArray with a size = 0 array due to the atleast_2d in vstack. That function does not deal with that case.
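A minimal sketch of that path at the Python level (assuming vstack's atleast_2d promotion): the destination slice has zero elements, so the strided copy has nothing to write and should not need to read from the source at all.

```python
import numpy as np

src = np.atleast_2d(np.empty(0))  # shape (1, 0): zero elements
dst = np.empty((2, 0))
dst[0:1] = src  # a size-0 assignment; nothing should be read from src
print(src.shape, dst.shape)       # (1, 0) (2, 0)
```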

@juliantaylor (Contributor)

The underlying issue is actually in np.where: when it produces empty arrays, the views can point to invalid memory.
So it is probably a good thing that our assignment function tries to read one element even for empty arrays.

juliantaylor added a commit to juliantaylor/numpy that referenced this issue Apr 20, 2017
When creating the tuple view of np.where, make sure the arrays point to
valid memory when they are empty.
Numpy assumes that even empty arrays have valid memory, e.g. in the
source stride == 0 assignments.
Closes numpygh-8922.
@rainwoodman (Contributor, Author)

There is no way to guarantee that non-numpy functions do not produce empty arrays with invalid pointers. I suggest numpy should not read any element from size=0 arrays.

Maybe it's time to add a valgrind test suite? For now we can write tests to assert that numpy functions

  • never return an invalid data pointer for size-0 arrays and
  • can deal with invalid data pointers in size-0 arrays;

with no valgrind errors.
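A regression test along those lines could look like the following (a sketch; the test name is made up). Note that the assertion on the shape is almost incidental: the real value of the test only shows up when the suite is run under valgrind.

```python
import numpy as np

def test_vstack_empty_where_result():
    # gh-8922: stacking the empty tuple view from np.where must not
    # read past the (size-0) source buffers.
    x = np.zeros((1, 1))
    out = np.vstack(np.where(x == 99.))
    assert out.shape == (2, 0)
```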

@juliantaylor (Contributor)

valgrind is too slow for an automated test; even the compiler-based address sanitizers are somewhat on the slow side for numpy.
But usually someone runs valgrind and the address sanitizers over the test suite before a release.

@matthew-brett (Contributor)

Could we do the valgrind test on Circle-CI, for example? Or a buildbot somewhere?

@rainwoodman (Contributor, Author)

Collecting memory-related tests into a single file for regression testing under valgrind -- it doesn't have to be run automatically on every push / PR.

@juliantaylor (Contributor)

pretty much every test is memory related; there is not much point trying to separate them from the rest.
Or at least: you could run valgrind on the full suite several dozen times in the time it takes you to separate them.

@rainwoodman (Contributor, Author)

That's a fair point (every test is memory related).

but valgrind-targeted tests are slightly different -- normal python test runs won't be able to trigger them (they don't even contain an assertion). They only trigger valgrind errors. The current policy seems to be to only record memory-related test cases in the issue tracker, which means regressions will be hard to detect.

What about extending runtests.py to add a --valgrind mode, which runs a suite of special tests meant for valgrind? The initial suite could be a simple collection of old segfaults from the issue tracker; the suite could then be gradually built up.

@juliantaylor (Contributor)

It indeed could be useful to mark issues related to segfaults and uninitialized memory in the test suite; this could be done via decorators that could then be run via nosetests test labels.
But I'm not sure how often people will think of this when adding tests, and we don't really gain very much on top of just following the release procedure.
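One way to sketch such a marker without depending on a particular test framework (the decorator name is hypothetical; a real implementation would likely use the framework's own label mechanism):

```python
# Hypothetical marker: tag tests whose failures only show up under
# valgrind, so a runner could select them by attribute or test label.
def valgrind_only(func):
    func.valgrind = True
    return func

@valgrind_only
def test_concatenate_empty_where_result():
    import numpy as np
    assert np.vstack(np.where(np.zeros((1, 1)) == 99.)).shape == (2, 0)
```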

@njsmith (Member) commented Apr 20, 2017

I think it'd be fine to include tests whose primary purpose is to exercise otherwise untested code paths and catch regressions when run under valgrind. I don't know that they need any special ceremony beyond a comment explaining the problem that motivated them.

Running valgrind or other sanitizers in an automated way is also an interesting idea, but isn't trivial and should probably be split off into its own issue.

@seberg (Member) commented Apr 21, 2017

We could try to do the automated thing on tagged tests, but even so, I think that before a release someone will still run the full test suite under valgrind anyway, since a huge number of tests could in principle trigger valgrind errors and there are few tests that are obvious candidates (maybe iterator tests, etc., and regressions).

I rather think that if you have something that likely causes an error, we will typically check it with valgrind anyway, and in any case will check with valgrind before a release; so as long as everything is covered in the tests, there is probably little point in trying to decide which parts are "valgrind" tests.

mherkazandjian pushed a commit to mherkazandjian/numpy that referenced this issue May 30, 2017
When creating the tuple view of np.where, make sure the arrays point to
valid memory when they are empty.
Numpy assumes that even empty arrays have valid memory, e.g. in the
source stride == 0 assignments.
Closes numpygh-8922.