"import numpy" leaks memory #10157


Closed
skrah opened this issue Dec 4, 2017 · 17 comments

@skrah
Contributor
skrah commented Dec 4, 2017

Just executing import numpy as np leaks memory:

Python version: 3.6.3
Numpy version: 0d749ad

This could also be a CPython issue, but other extensions I have tested don't leak
on 3.6.3. I have seen similar leak patterns with Valgrind if an extension did not use
the GC alloc functions or if there was a cyclic GC problem.

Example (there are more):

==8063== 1,709 bytes in 18 blocks are definitely lost in loss record 4,430 of 4,646
==8063==    at 0x4C2A9A1: malloc (vg_replace_malloc.c:299)
==8063==    by 0x41D06D: _PyMem_RawMalloc (obmalloc.c:73)
==8063==    by 0x41D84C: PyObject_Malloc (obmalloc.c:479)
==8063==    by 0x4F61AC: PyUnicode_New (unicodeobject.c:1281)
==8063==    by 0x4FBC95: _PyUnicode_FromUCS1 (unicodeobject.c:2173)
==8063==    by 0x4FC8A9: PyUnicode_FromKindAndData (unicodeobject.c:2244)
==8063==    by 0x5C8392: r_object (marshal.c:1156)
==8063==    by 0x5C863F: r_object (marshal.c:1218)
==8063==    by 0x5C9136: r_object (marshal.c:1389)
==8063==    by 0x5C994A: read_object (marshal.c:1487)
==8063==    by 0x5CA29E: marshal_loads (marshal.c:1787)
==8063==    by 0x4BC911: _PyCFunction_FastCallDict (methodobject.c:234)
==8063==
@skrah skrah changed the title "import numpy" leaks memory (Python 3.6.3) "import numpy" leaks memory Dec 4, 2017
@skrah
Contributor Author
skrah commented Dec 4, 2017

The build with Python 3.4 also leaks.

@seberg
Member
seberg commented Dec 4, 2017

Yes, it does, and I doubt it will be fixed soon. I am not sure how much is leaked, so it might be possible to make sure that it leaks once but does not leak more every time. (I.e. numpy explicitly caches some stuff, and not all of that is cleaned up. There might be worse leaking bugs around, though; interned strings, for example, should not matter much overall.)

@seberg
Member
seberg commented Dec 4, 2017

In other words, do you know how badly it leaks? It is a bug, but until now it never seemed like a big priority, I guess. Of course, any work on improving it is always welcome.

@skrah
Contributor Author
skrah commented Dec 4, 2017

The leak isn't big, just a couple of KB. I think ideally cached values should not show up under "definitely lost", i.e. it's fine if they aren't cleaned up as long as they are still reachable at the end of the program.

"definitely lost" is a minor inconvenience if numpy is imported in another test suite (example: test_buffer in the Python stdlib) and you run Valgrind on that.

I understand if it isn't a big priority, perhaps I'll dig around a bit myself some day.

==13071==    definitely lost: 164,417 bytes in 87 blocks
==13071==    indirectly lost: 0 bytes in 0 blocks
==13071==      possibly lost: 166,667 bytes in 96 blocks
==13071==    still reachable: 1,799,758 bytes in 3,522 blocks
==13071==         suppressed: 0 bytes in 0 blocks
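
A leak summary like the one above can be checked automatically, for example to fail a test run only when "definitely lost" is non-zero. The following is a minimal sketch, assuming the summary lines keep the exact layout shown here; `parse_leak_summary` and `SAMPLE` are hypothetical names introduced for illustration and are not part of Valgrind or numpy.

```python
import re

# Matches lines such as:
#   ==13071==    definitely lost: 164,417 bytes in 87 blocks
# Assumption: the layout is exactly as in the summary quoted in this thread.
_SUMMARY_LINE = re.compile(
    r"^==\d+==\s+(?P<kind>[a-z ]+?):\s+"
    r"(?P<bytes>[\d,]+) bytes in (?P<blocks>[\d,]+) blocks\s*$"
)


def parse_leak_summary(text):
    """Return {kind: (bytes, blocks)} for every leak-summary line in text."""
    summary = {}
    for line in text.splitlines():
        match = _SUMMARY_LINE.match(line)
        if match:
            summary[match.group("kind")] = (
                int(match.group("bytes").replace(",", "")),
                int(match.group("blocks").replace(",", "")),
            )
    return summary


# The summary quoted above, copied verbatim.
SAMPLE = """\
==13071==    definitely lost: 164,417 bytes in 87 blocks
==13071==    indirectly lost: 0 bytes in 0 blocks
==13071==      possibly lost: 166,667 bytes in 96 blocks
==13071==    still reachable: 1,799,758 bytes in 3,522 blocks
==13071==         suppressed: 0 bytes in 0 blocks
"""

if __name__ == "__main__":
    for kind, (nbytes, nblocks) in parse_leak_summary(SAMPLE).items():
        print(f"{kind}: {nbytes} bytes in {nblocks} blocks")
```

A check built on this would look only at the "definitely lost" entry, since cached values that are "still reachable" at exit are considered acceptable here.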

@seberg
Member
seberg commented Dec 4, 2017

A couple of KB does sound like it might be more than a few cached strings and dtype objects though, hmmm.

@skrah
Contributor Author
skrah commented Dec 4, 2017

This is a minimal __init__.py reproducer:

from ._globals import ModuleDeprecationWarning, VisibleDeprecationWarning
from ._globals import _NoValue
from . import add_newdocs

My guess is that something added by _ADDDOC in compiled_base.c isn't freed
by CPython.

@mhvk
Contributor
mhvk commented Dec 4, 2017

That sounds like nice narrowing down! More out of curiosity (but also to help someone else if they want to take this up): what do you run to produce the above output? And does this mean memory is not even recovered when you exit python?

@seberg
Member
seberg commented Dec 4, 2017

It's based on valgrind. In principle you should compile a debug python with valgrind support; other than that, there are some suppression files you can use (I can search for mine, or for what Julian once uploaded). Then you can run things like:

valgrind-py --track-origins=yes --show-leak-kinds=definite --leak-check=full python3 runtests.py --ipython

(valgrind-py is just my silly alias to load the python suppression file as well. Always run it first without valgrind, otherwise compilation will never finish ;))

@skrah yeah, I guess ADDDOC is the biggest part, and my guess is we can keep the string objects around and make sure to dealloc them on module destruction. Doesn't sound all that bad.

@seberg
Member
seberg commented Dec 4, 2017

@mhvk of course it is recovered on python exit, the OS will make sure of that; it's just the point in time where you notice this kind of leak.

@pv
Member
pv commented Dec 4, 2017 via email

@pv
Member
pv commented Dec 4, 2017 via email

@seberg
Member
seberg commented Dec 4, 2017

@pv thanks, really good to know about that PYTHONMALLOC env variable!
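
For readers who have not met it: `PYTHONMALLOC` is a CPython environment variable available since Python 3.6, and setting it to `malloc` makes the interpreter use the C library allocator instead of pymalloc, so Valgrind sees each allocation individually. The emailed comment it came from is not shown in this thread, so the sketch below is only an assumption about how it might be combined with the Valgrind flags quoted earlier. `valgrind_command` is a hypothetical helper, and the suppression file path is whatever you have locally.

```python
import os
import sys


def valgrind_command(script_args, suppressions=None):
    """Build (argv, env) for running this Python under Valgrind.

    Assumption: PYTHONMALLOC=malloc is wanted so that CPython bypasses
    pymalloc. The Valgrind flags are the ones quoted in this thread.
    """
    env = dict(os.environ)
    env["PYTHONMALLOC"] = "malloc"

    argv = [
        "valgrind",
        "--leak-check=full",
        "--show-leak-kinds=definite",
        "--track-origins=yes",
    ]
    if suppressions is not None:
        # Path to a local Python suppression file, if you have one.
        argv.append("--suppressions=" + suppressions)
    argv += [sys.executable] + list(script_args)
    return argv, env


if __name__ == "__main__":
    argv, env = valgrind_command(["-c", "import numpy"])
    print(" ".join(argv))
```

Passing the pair to `subprocess.run(argv, env=env)` would start the run; that step needs Valgrind installed, and the helper itself only builds the command.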

@mattip
Member
mattip commented Dec 6, 2017

Adding the docstrings differently is on my TODO list. The way tp_doc is hacked after PyType_Ready not only leaks the char* string yanked out of the PyStringObject, it also does not work on PyPy.

@mhvk
Contributor
mhvk commented Dec 6, 2017

@mattip - if you have a minute, and won't fix it in the next day, maybe raise a separate issue just for this concrete item? (We even have the 24- pypy label to attach to it ;-)

@skrah
Contributor Author
skrah commented Dec 27, 2017

#10286 seems to fix the "definitely lost" issue here.

charris added a commit that referenced this issue Dec 30, 2017
twmr added a commit to twmr/numpy that referenced this issue Dec 30, 2017
hanjohn pushed a commit to hanjohn/numpy that referenced this issue Feb 15, 2018
@jakirkham
Contributor

Is this resolved? Looks like PR ( #10286 ) went in. Or are there other leaks from importing?

@charris
Member
charris commented Mar 9, 2018

I think it was intended to close this.


7 participants