-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
BUG: reference cycle in np.vectorize #11977
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
My guess would be that the problem is |
df298f9
to
7a57924
Compare
ping |
def test_leaks(self): | ||
# gh-11867 | ||
# np.frompyfunc(<class method>) creates a reference cycle | ||
# that requires a gc collection cycle to break (on CPython 3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we worked out why this is the case? Am I right in thinking that this patch changes it from being unbreakable to just requiring a gc to break?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. I do not really understand the root cause and why only in this specific case
7a57924
to
f055831
Compare
Squashed to a single commit. |
I think the commit message is wrong - you didn't break the reference cycle, you simply allowed the GC to find it (and break it on the python side) |
I wonder if you can use |
a358d0d
to
8f75986
Compare
fixed first commit message, reworked test to focus on |
8f75986
to
0794da8
Compare
Would be good to finish this up |
Fix will be as simple as:
|
merging with master to resolve conflicts |
i will wait for #8955 and then rework this PR |
d1279ce
to
b2c7901
Compare
b2c7901
to
b91a047
Compare
b91a047
to
f73d96d
Compare
rebased at |
return np.int(0) | ||
|
||
def npy_bound(self, a): | ||
return types.MethodType(np.frompyfunc(self.bound, 1, 1), a) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm very confused by what exactly we're trying to test here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.bound
is part of a reference cycle that requires a gc collection cycle to break. I added more clarification in the test docstring. Improvements welcome
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR looks good to me, if nobody complains will merge in a bit probably. The test couldbe a bit more obvious to read, or maybe add a trivial one:
def func(a):
return 0
func.vectorized = np.frompyfunc(func)
giving a minimal cycle. The other part would probably be simpler by making it a realistic use case:
class A(object):
def __init__(self):
# Wrap bound_method into a ufunc with `np.frompyfunc`. This creates
# a reference cycle:
# self.method -> self # since it is a bound method (knows "self")
# self -> self.wrapped_method # Set here
# self.wrapped_method -> self.method # Stores self.method to call it
self.wrapped_method = np.frompyfunc(self.method)
def method(self):
return 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I guess that breaks the parameterization. Although, I do not think you need the types.MethodType
if the just do the self.bound_vectorized = np.frompyfunc(self.bound, 1, 1)
before returning it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not following. Where would I add that line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I think what confuses me is that I believe you do not need this at all. You can simply remove the npy_bound
and npy_unbound
logic and just make it a.f = np.frompyfunc(func, 1, 1)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adopted
should be squash-merged or I can squash and force-push |
If you want it in 1.16, use the 1.16.1 milestone. |
@eric-wieser ping |
return np.int(0) | ||
|
||
def npy_bound(self, a): | ||
return types.MethodType(np.frompyfunc(self.bound, 1, 1), a) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I guess that breaks the parameterization. Although, I do not think you need the types.MethodType
if the just do the self.bound_vectorized = np.frompyfunc(self.bound, 1, 1)
before returning it?
Thanks @mattip lets put it in before we find something else to nitpick about ;). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test now makes sense to me too - thanks for persisting
Cool, mac OS job is expected to fail. Next thing will be the object arrays ;). |
Issue #11867 demonstrated a leak when using
np.vectorize
on a bound class method in CPython 3. The test demonstrates the issue.Still WIP since I have not found the root cause of the leak.Edit: adding GC_Track to ufunc allows the garbage collector to break the cycle