Creating and argsorting an array with a structured dtype leaks memory #7860
Comments
Can you debug the cause of the leak? I.e. I'm betting on either the dtype creation or the argsort itself, but it could be the array creation. So if you could move the dtype and array creation in and out of the loop and report in which of the four possible cases the leak persists it would be very helpful. Thanks! |
Removing the dtype creation from the loop appears to stop the leak (as does also removing the array creation). I don't think it is possible to remove the array creation from the loop while keeping the dtype creation inside it. So the only case in which it leaks is when both are in the loop and I argsort the array.
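For readers following along, the cases discussed here can be sketched roughly as below. This is not the reporter's original snippet: the field names (`key`, `value`), the array size, and the `run` helper are all hypothetical, chosen only to illustrate toggling whether the structured dtype is created inside the loop.

```python
import gc

import numpy as np

# Hypothetical field layout; the original report's fields are not shown.
FIELDS = [("key", np.int64), ("value", np.float64)]


def run(iterations, dtype_in_loop):
    """Create and argsort a structured array repeatedly.

    If dtype_in_loop is True, a fresh dtype object is built on every
    iteration (the case reported to leak); otherwise one dtype is shared.
    """
    shared = np.dtype(FIELDS)
    last = None
    for i in range(iterations):
        dt = np.dtype(FIELDS) if dtype_in_loop else shared
        arr = np.zeros(100, dtype=dt)
        arr["key"] = np.arange(100)[::-1]  # keys in descending order
        last = arr.argsort(order="key")    # sort on the named field
        if i % 5000 == 0:
            gc.collect()                   # mirrors the periodic collect in the report
    return last
```

Running `run(n, True)` and `run(n, False)` for a large `n` while watching process memory is one way to compare the two cases; whether a leak is visible depends on the NumPy version in use.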
That makes sense. argsort probably has a reference leak on the array's dtype. In the cases you described, each call would only increase the reference count of the same dtype by one, so no actual memory is leaked: only that one dtype survives.
So do you think it could be that argsort is leaking a reference to the copy of the dtype each time? It also doesn't appear to happen if I use a simple dtype (i.e. one without named fields to argsort on). |
Yes, I believe a simple dtype will just be looked up in a table of existing dtypes and returned.
To confirm that it's the dtype being leaked, you could put the dtype outside the loop and then call |
No, it doesn't appear to; it stays constant at 3. I tried moving the creation of |
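One way to run this kind of check is with `sys.getrefcount`; that this is the call meant above is an assumption, since the original text is cut off. The sketch below keeps a single dtype outside the loop and reports how much its reference count changes across repeated argsort calls. The field names and the `refcount_delta` helper are hypothetical.

```python
import sys

import numpy as np

# Hypothetical structured dtype, created once outside the loop.
dt = np.dtype([("key", np.int64), ("value", np.float64)])
arr = np.zeros(10, dtype=dt)


def refcount_delta(array, dtype, calls=1000):
    """Return the change in dtype's reference count after repeated argsorts."""
    before = sys.getrefcount(dtype)
    for _ in range(calls):
        array.argsort(order="key")
    return sys.getrefcount(dtype) - before


delta = refcount_delta(arr, dt)
print("refcount change after 1000 argsort calls:", delta)
```

A delta that grows with `calls` would point to a leaked reference on the dtype itself; a constant count, as reported in this comment, suggests the leaked object is something else.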
Yes, right. What is leaking is not the dtype itself but the |
Okay, now I see what you mean. Even declaring |
This bug has been fixed in NumPy master as part of gh-12624. EDIT: Actually, for argsort it may have been fixed even earlier; I would have to check.
The following snippet of code consumes steadily increasing amounts of memory when run (even after the calls to gc.collect every 5000th iteration), as observed from the task manager. I have confirmed that changing the sort kind does not affect the issue. I am running numpy 1.11.0 with Python 2.7.11 on 64-bit Windows 7.