8000 BUG: Fix leak of void scalar buffer info by seberg · Pull Request #12696 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

BUG: Fix leak of void scalar buffer info #12696

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 10, 2019

Conversation

seberg
Copy link
Member
@seberg seberg commented Jan 8, 2019

The void scalar was missing the clearing logic for the buffer info
cache (or storage).

It should maybe be checked whether we can replace all of this logic
when not supporting Python 2 anymore.


This is a followup on gh-12624. After this, the test suit should be clean in the sense of valgrind definitely lost leaks (except for the outstanding gh-12650 and more tricky gh-12633 for f2py) [1]. I think if we drop python 2, there may be a reorganizing happening here anyway. Also I am not sure if Matti's changes touch similar things (but doubt it). There is a similar path for object, but I do not think those really exists?!

[1] Note that there is still a "leak" (detected, but not true) of one or two buffer info's, because the singletons np.bool_(True) and np.bool_(False) get never cleared, and so their cached buffer_info also does not get cleared.

@eric-wieser
Copy link
Member

This looks correct to me - gentype_dealloc already calls this, so it looks like it was just forgotten.

Perhaps void_dealloc should call gentype_dealloc directly? I'm not sure if that's good practice for C-level inheritance.

@seberg
Copy link
Member Author
seberg commented Jan 8, 2019

Hmmm, that would work too. Can't say I have a preference (on the one hand, the C-level code could be more complex likely. On the other hand, it would have found this issue and probably would just crash hard if there was any reason for not doing it). I somewhat hope this code gets a do-over anyway soon ;).

My valgrind run is still going on, but I don't expect anything new from it (although slightly annoying, my cython is again creating leaks although it seems to be 0.29.2, not sure what is going on there).

@seberg
Copy link
Member Author
seberg commented Jan 8, 2019

Sorry, this is still a work in progress for the minute. I just noticed there seems to be one more similar leak, so maybe some other scalar type is missing this after all.

@seberg seberg added the 25 - WIP label Jan 8, 2019
@seberg seberg added this to the 1.16.1 release milestone Jan 8, 2019
@seberg seberg added the 09 - Backport-Candidate PRs tagged should be backported label Jan 8, 2019
@seberg seberg force-pushed the bufferinfo_cache_dealloc_void branch 2 times, most recently from 95bbbd6 to 996d562 Compare January 8, 2019 17:55
@seberg seberg removed the 25 - WIP label Jan 8, 2019
@seberg
Copy link
Member Author
seberg commented Jan 8, 2019

Should have looked at the error closer, second one was a bit unrelated after all...

@seberg seberg force-pushed the bufferinfo_cache_dealloc_void branch from 996d562 to b49a90c Compare January 8, 2019 17:58
seberg added 2 commits January 8, 2019 20:11
The vois scalar was missing the clearing logic for the buffer info
cache (or storage).

It should maybe be checked whether we can replace all of this logic
when not supporting Python 2 anymore.
When the buffer was created and an error was found during the format
string construction, the buffer was not freed correctly.
@seberg seberg force-pushed the bufferinfo_cache_dealloc_void branch from b49a90c to ae8a979 Compare January 8, 2019 19:11
@seberg
Copy link
Member Author
seberg commented Jan 8, 2019

By the way, I narrowed down the (I think) last definitely lost leak reported by valgrind (except one additional that always shows up, probably due to my python version) to the temporary elide code. I do not think it is a real leak, just if someone ever runs things and wonders what it is:

==23384== 40 bytes in 1 blocks are definitely lost in loss record 569 of 5,989
==23384==    at 0x483777F: malloc (vg_replace_malloc.c:299)
==23384==    by 0x400E665: _dl_map_object_deps (in /usr/lib/ld-2.28.so)
==23384==    by 0x40142A3: dl_open_worker (in /usr/lib/ld-2.28.so)
==23384==    by 0x49E2F56: _dl_catch_exception (in /usr/lib/libc-2.28.so)
==23384==    by 0x4013DFE: _dl_open (in /usr/lib/ld-2.28.so)
==23384==    by 0x4DE7159: ??? (in /usr/lib/libdl-2.28.so)
==23384==    by 0x49E2F56: _dl_catch_exception (in /usr/lib/libc-2.28.so)
==23384==    by 0x49E2FF2: _dl_catch_error (in /usr/lib/libc-2.28.so)
==23384==    by 0x4DE78BE: ??? (in /usr/lib/libdl-2.28.so)
==23384==    by 0x4DE71F9: dlopen (in /usr/lib/libdl-2.28.so)
==23384==    by 0x4C88D70: _PyImport_FindSharedFuncptr (in /usr/lib/libpython3.7m.so.1.0)
==23384==    by 0x4CA11C2: _PyImport_LoadDynamicModuleWithSpec (in /usr/lib/libpython3.7m.so.1.0)

@seberg
Copy link
Member Author
seberg commented Jan 9, 2019

Mixing up things a bit now, can split up again. Rerunning the test, I had apparently missed two losses earlier.

After valgrind, next step might be to try that pytest refcount checker.

@charris
Copy link
Member
charris commented Jan 9, 2019

Close/reopen.

@charris charris closed this Jan 9, 2019
@charris charris reopened this Jan 9, 2019
@seberg
Copy link
Member Author
seberg commented Jan 9, 2019

Btw. after realizing that this manual bisecting was getting seriously annoying. I wrote a hacky pytest plugin: https://github.com/seberg/pytest-valgrind didn't really test it yet, but if anyone ever wants to track down leaks, it should allow to run the full test suit and actually get errors on leaks. It will then tell you which test failed and you can inspect the valgrind log file.

In principle we could even expand it a bit and install this on a CI I guess, but valgrind is pretty slow, so I am not sure it should run too often.

@seberg
Copy link
Member Author
seberg commented Jan 10, 2019

Side remarks about that pytest plugin only:

Btw. apologies if you someone actually cloned that pytest thing already, yesterday evening, I made some silly changes and then forced pushed :(, will avoid that now. It is a bit annoying, but now you can (and should) pass in a valgrind output log file not just to valgrind but also as an additional pytest argument. In that case it will show the actual valgrind errors in the test results.

Plus added markers pytest.mark.valgrind_known_error and pytest.mark.valgrind_known_leak.

I am sure all the pytest stuff is a complete hack, but it works and the docs are not super nice (although I could maybe compare with more existing plugins).

@mattip mattip merged commit 1b144ad into numpy:master Jan 10, 2019
@mattip
Copy link
Member
mattip commented Jan 10, 2019

Thanks @seberg

@seberg seberg deleted the bufferinfo_cache_dealloc_void branch January 10, 2019 21:03
@charris charris added component: numpy._core and removed 09 - Backport-Candidate PRs tagged should be backported labels Jan 17, 2019
@charris charris removed this from the 1.16.1 release milestone Jan 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0