8000 "import numpy" leaks memory (again) · Issue #12615 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

"import numpy" leaks memory (again) #12615

Closed
skrah opened this issue Dec 26, 2018 · 16 comments

Comments

@skrah
Contributor
skrah commented Dec 26, 2018

The leaks in #10157 were fixed, but it looks as if there are new ones. The second one (or possibly both) could also be a Cython issue. The leaks show up just by importing numpy, without doing anything else.

>>> import numpy as np
>>> np.version.full_version
'1.15.4'

==14718== 56 bytes in 1 blocks are definitely lost in loss record 1,566 of 4,881
==14718==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==14718==    by 0x472C70: _PyMem_RawMalloc (obmalloc.c:75)
==14718==    by 0x473942: PyObject_Malloc (obmalloc.c:616)
==14718==    by 0x565A00: _PyObject_GC_Alloc (gcmodule.c:1696)
==14718==    by 0x565ACA: _PyObject_GC_Malloc (gcmodule.c:1718)
==14718==    by 0x565B8C: _PyObject_GC_NewVar (gcmodule.c:1747)
==14718==    by 0x4809E1: PyTuple_New (tupleobject.c:117)
==14718==    by 0x434A02: _PyStack_AsTuple (call.c:1280)
==14718==    by 0x431955: _PyObject_FastCallDict (call.c:115)
==14718==    by 0x4345E6: object_vacall (call.c:1198)
==14718==    by 0x4349BF: PyObject_CallFunctionObjArgs (call.c:1263)
==14718==    by 0x515347: _PyErr_CreateException (errors.c:78)
==14718== 56 bytes in 1 blocks are definitely lost in loss record 1,568 of 4,881
==14718==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==14718==    by 0x472C70: _PyMem_RawMalloc (obmalloc.c:75)
==14718==    by 0x473942: PyObject_Malloc (obmalloc.c:616)
==14718==    by 0x565A00: _PyObject_GC_Alloc (gcmodule.c:1696)
==14718==    by 0x565ACA: _PyObject_GC_Malloc (gcmodule.c:1718)
==14718==    by 0x565B8C: _PyObject_GC_NewVar (gcmodule.c:1747)
==14718==    by 0x4809E1: PyTuple_New (tupleobject.c:117)
==14718==    by 0x480F3F: PyTuple_Pack (tupleobject.c:218)
==14718==    by 0xC26BB73: __Pyx_InitCachedConstants (mtrand.c:38492)
==14718==    by 0xC2717D2: __pyx_pymod_exec_mtrand (mtrand.c:40605)
==14718==    by 0x46E0CC: PyModule_ExecDef (moduleobject.c:414)
==14718==    by 0x524B3F: exec_builtin_or_dynamic (import.c:2107)
@skrah
Contributor Author
skrah commented Dec 26, 2018

For comparison, numpy==1.15.1 is fine here.

@eric-wieser
Member

Does master show the same problem?

1.15.1 - 1.15.4 is already a pretty narrow range to bisect - thanks!

@seberg
Member
seberg commented Dec 26, 2018

Hmm, I see the second one (and many like it), but it is probably in mtrand, so it could be cython as well (did we change the cython version between those releases for some reason?).

Running on master, I further see a couple to do with ufunc adding, at least for the linalg ones:

==15823== 248 bytes in 20 blocks are definitely lost in loss record 5,467 of 6,507
==15823==    at 0x4839D7B: realloc (vg_replace_malloc.c:826)
==15823==    by 0xA25FC22: _parse_signature (ufunc_object.c:785)
==15823==    by 0xA26C3B4: PyUFunc_FromFuncAndDataAndSignatureAndIdentity (ufunc_object.c:4937)
==15823==    by 0xA26C159: PyUFunc_FromFuncAndDataAndSignature (ufunc_object.c:4872)
==15823==    by 0x12C2E12D: addUfuncs (umath_linalg.c.src:3604)
==15823==    by 0x12C2E2A9: PyInit__umath_linalg (umath_linalg.c.src:3679)
<snip>

EDIT: This is master with cython 0.29

@charris
Member
charris commented Dec 26, 2018

Cython should be fixed at 0.28.5 for Linux in the 1.15 series of wheels, although it is hard to be sure. Windows looks like it uses the most recent. I don't recall what the source releases used, but would suspect they are the same. Would the Cython version be part of the generated code?

@skrah Where did you get numpy? If you built it yourself, what Cython version?

@eric-wieser
Member

It would be great if we could catch this in CI
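
A minimal sketch of what such a CI check could look like, assuming valgrind is available on the runner; the script name, the PYTHONMALLOC trick, and the flag choices are illustrative rather than numpy's actual setup, and in practice a suppressions file for CPython's own allocations would also be needed:

# check_import_leaks.py -- hypothetical CI helper (not an existing numpy script).
# Runs "import numpy" under valgrind and fails if any blocks are definitely lost.
import os
import subprocess
import sys

env = dict(os.environ, PYTHONMALLOC="malloc")  # route CPython allocations through malloc so valgrind can see them
cmd = [
    "valgrind",
    "--leak-check=full",
    "--errors-for-leak-kinds=definite",  # count "definitely lost" blocks as errors ...
    "--error-exitcode=1",                # ... so valgrind's exit code reflects them
    sys.executable, "-c", "import numpy",
]
result = subprocess.run(cmd, env=env)
sys.exit(result.returncode)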

@seberg
Member
seberg commented Dec 26, 2018

There is this, linked in the old issue: twmr@ec7b73f, although maybe it would be better to do it outside our test suite.

@seberg
Member
seberg commented Dec 26, 2018

I will post a patch for the ufunc leak, but I did not check the original one further.

@seberg
Member
seberg commented Dec 26, 2018

So for the other mtrand.c leaks, cython generates code that contains the same block twice:

  __pyx_tuple__29 = PyTuple_Pack(1, __pyx_kp_u_b_0); if (unlikely(!__pyx_tuple__29)) __PYX_ERR(0, 1712, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_tuple__29);
  __Pyx_GIVEREF(__pyx_tuple__29);

/* A few lines later: */

  __pyx_tuple__29 = PyTuple_Pack(1, __pyx_kp_u_b_0); if (unlikely(!__pyx_tuple__29)) __PYX_ERR(0, 1719, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_tuple__29);
  __Pyx_GIVEREF(__pyx_tuple__29);

The code works fine, but it generates the same thing twice: the second PyTuple_Pack overwrites __pyx_tuple__29, so the first tuple is never decref'd and leaks.

@seberg
Member
seberg commented Dec 26, 2018

Ahh, this occurs when identical errors are raised in multiple places (so the error string and type are identical). I will open a cython issue, let's see if that is it.

EDIT: The cython bug I saw was already known and fixed in 0.29.2.

@skrah
Contributor Author
skrah commented Dec 26, 2018

I checked out numpy master and built with cython master (Cython version 3.0a0); there is no leak.

So the future looks bright. :)

The leaks I had were in the 1.15.4 release tarball and, I think, also in the 1.16rc, both built from source.

@charris
Member
charris commented Dec 26, 2018

Cython version 3.0a0

One more thing to wait on for 1.16.0 :(

@charris
Member
charris commented Dec 26, 2018

Looks like Cython 0.29.2 also fixes the problem. cython/cython#2768.

@seberg
Member
seberg commented Dec 27, 2018

Should we add this to the test suite, or as a special test? I guess we can add it to the test suite proper if we mark it as slow. It would be good to write it in a way that makes it easy to check a bit more code than only the import (although running a lot of code in valgrind is slow); one possible shape is sketched below.

I also ran the whole test suite through the lost-memory checker (I have used valgrind often, but not with that); I will sift through the results, and I think it exposed at least one bug.
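
One possible shape for the slow test described above; this is a sketch only, and the helper and the parametrized snippets are invented for illustration rather than existing numpy test code:

# Hypothetical pytest test: run a small code snippet under valgrind in a
# subprocess and assert that nothing is "definitely lost".  Parametrizing
# over the snippet keeps it easy to check more than just the bare import.
import os
import re
import subprocess
import sys

import pytest

def bytes_definitely_lost(snippet):
    """Run `python -c snippet` under valgrind and return the bytes reported as definitely lost."""
    env = dict(os.environ, PYTHONMALLOC="malloc")
    cmd = ["valgrind", "--leak-check=full", sys.executable, "-c", snippet]
    stderr = subprocess.run(cmd, capture_output=True, text=True, env=env).stderr
    match = re.search(r"definitely lost: ([\d,]+) bytes", stderr)
    return int(match.group(1).replace(",", "")) if match else 0

@pytest.mark.slow
@pytest.mark.parametrize("snippet", [
    "import numpy",
    "import numpy; numpy.linalg.inv(numpy.eye(3))",
])
def test_no_definite_leaks(snippet):
    assert bytes_definitely_lost(snippet) == 0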

@charris
Member
charris commented Dec 27, 2018

It would be useful to have an easy way to run the check. Perhaps an option added to runtests?

@seberg
Member
seberg commented Dec 27, 2018

Yeah, maybe that is the best option. I see quite a few lost dtype/descriptor objects in valgrind. I think running a refcount checker may be a good idea (and would give a better idea of where things go wrong); a sketch of one follows at the end of this comment.

EDIT: Found the issue (probably, at least; I didn't rerun the full test suite).
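
A refcount growth check along the lines mentioned above could look roughly like this; it only works on a --with-pydebug CPython build, where sys.gettotalrefcount() exists, and the helper is a sketch rather than an existing numpy utility:

import gc
import sys

def total_refcount_growth(func, warmup=5, trials=100):
    """Return the growth of the interpreter-wide refcount across `trials` calls of func.

    Requires a debug (--with-pydebug) build of CPython; on a regular build
    sys.gettotalrefcount does not exist.
    """
    for _ in range(warmup):        # let caches and interned objects settle first
        func()
    gc.collect()
    before = sys.gettotalrefcount()
    for _ in range(trials):
        func()
    gc.collect()
    after = sys.gettotalrefcount()
    return after - before

# A reference leaked on every call shows up as growth roughly proportional to `trials`:
# import numpy as np
# print(total_refcount_growth(lambda: np.dtype("float64") == "float64"))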

@seberg
Member
seberg commented Jan 11, 2019

@skrah thanks for the heads up. I think the numpy test suite now runs cleanly, with the exception of f2py, for which there is a PR (although that needs some work); most of that will probably be backported.

There are still one or two points that are flaky or that I do not quite understand, but I think they are not real leaks anyway. Using a debug python build and a refcount tester may be a good next step in the future.

@seberg seberg closed this as completed Jan 11, 2019