BUG difference in behaviour for subclass output in ufuncs #4753

mhvk · 2015-06-18T21:11:01Z

If one wants to have the output of a ufunc passed to a ndarray subclass, but the input is a regular array, the behaviour differs depending on whether one passes the output as a positional or as a keyword argument: if passed as positional argument, __numpy_ufunc__ is called, while if passed as a keyword argument, __array_prepare__ and __array_wrap__ are called. I would have expected for __numpy_ufunc__ to be called in both cases.

p.s. If the input is a subclass, it always works as expected.

Example script:

import numpy as np


class MyArray(np.ndarray):
    def __array_prepare__(self, obj, context=None):
        print('prepare:', self, obj, context)
        return super(MyArray, self).__array_prepare__(obj, context)

    def __array_wrap__(self, obj, context):
        print('wrap:', self, obj, context)
        return super(MyArray, self).__array_wrap__(obj, context)

    def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
        print('ufunc:', self, ufunc, method, i, inputs, kwargs)
        values = [np.asarray(arg) for arg in inputs]
        if 'out' in kwargs:
            kwargs['out'] = kwargs['out'].view(np.ndarray)
        result = getattr(ufunc, method)(*values, **kwargs) / 2.
        return result.view(MyArray)

a = np.array([-1., -1.]).view(MyArray)
print(repr(np.sin(np.arange(2.), a)))
a = np.array([-1., -1.]).view(MyArray)
print(repr(np.sin(np.arange(2.), out=a)))

pv · 2014-06-02T10:58:26Z

Is this something that was not addressed in gh-4626?

EDIT: apparently yes. this is an oversight

pv · 2014-06-02T11:00:31Z

@cowlicks: overlooked case. Also, what should i be when self is one of the output arrays (probably past the inputs, which is likely ok...)?

mhvk · 2015-01-19T16:06:57Z

Checking 1.10 blockers brought up by my __numpy_ufunc__ trials: In numpy 1.10.0.dev-256d256, this is still an issue.

cowlicks · 2015-01-23T23:09:05Z

I'll take a look this weekend.

cowlicks · 2015-01-26T04:00:06Z

Okay so what currently happens is PyUFunc_CheckOverride is called at every ufunc call. It sets result as the result of the override if one exists, if no override is found result is NULL. Inside PyUFunc_CheckOverride we check positional arguments for __numpy_ufunc__ attribute. If it is not there, set result = NULL and return 0. Clearly we need to check out here, but it will add a little overhead to every ufunc call since we are now checking a dictionary too. I think we need to insert this logic here

njsmith · 2015-01-26T04:09:47Z

Surely there is somewhere in the ufunc logic where the arguments are
normalized. (Or if not there ought to be ;-).) Can't we put this checking
after that happens?

...okay reading the code you linked to I see that the numpy_ufunc code
is apparently already has its own separate reimplementation of ufunc
argument handling. I can see how we might not want to run two separate
copies of the same argument normalization on every ufunc call :-(

On Mon, Jan 26, 2015 at 4:00 AM, Blake Griffith notifications@github.com
wrote:

Okay so what currently happens is PyUFunc_CheckOverride is called at
every ufunc call. It sets result as the result of the override if one
exists, if no override is found result is NULL. Inside
PyUFunc_CheckOverride we check positional arguments for numpy_ufunc
attribute. If it is not there, set result = NULL and return 0. Clearly we
need to check out here, but it will add a little overhead to every ufunc
call since we are now checking a dictionary too. I think we need to insert
this logic here
https://github.com/numpy/numpy/blob/master/numpy/core/src/private/ufunc_override.h#L224

—
Reply to this email directly or view it on GitHub
#4753 (comment).

Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

cowlicks · 2015-01-26T04:22:13Z

They are only normalized (to what __numpy_ufunc__ needs) If there is a __numpy_ufunc__ attribute present. I was reluctant to normalize first because that would add even more overhead to every ufunc call. I'm testing my prototype (based on my last comment) now. I can try both approaches and see if the timing vs code complexity tradeoff is significant.

njsmith · 2015-01-26T04:24:48Z

I know it's more work and may spill over into cleaning up other people's
messes, but I think it's worth looking to see if you can move the checks
for the numpy_ufunc attribute to somewhere after the regular ufunc
argument processing, which already has to deal with positional versus
kwargs.
On 26 Jan 2015 04:22, "Blake Griffith" notifications@github.com wrote:

They are only normalized (to what numpy_ufunc needs) If there is a
numpy_ufunc attribute present. I was reluctant to normalize first
because that would add even more overhead to every ufunc call. I'm testing
my prototype (based on my last comment) now. I can try both approaches and
see if the timing vs code complexity tradeoff is significant.

—
Reply to this email directly or view it on GitHub
#4753 (comment).

cowlicks · 2015-01-26T05:01:49Z

Yeah, that is definitely the right way to do this. I'm not sure why I did not do it that way originally. I'll try to find where that happens and check it out... After I get this working.

cowlicks · 2015-01-26T05:46:58Z

@njsmith I found some arg parsing stuff here, but it is just for multiarray stuff? I don't think it is used for ufuncs. I need to look more into this.

My prototype is here (based on my first comment). But it is breaking np.dot and idk why.

mhvk · 2015-01-26T15:07:03Z

Had a look at the prototype and while it looks good generally, I worry it would still break for ufuncs with two outputs (where out is a tuple; see your work in #4171)

cowlicks · 2015-02-01T04:20:38Z

Little update: I rebased on current master and the segfault went away... So it wasn't my fault I guess? That had me stumped for a while. Working on a fix taking @mhvk's comments into consideration.

cowlicks · 2015-02-01T22:04:04Z

This seems to fix it. But I have not refactored to use the general arg processing as suggested by @njsmith. I have not assessed the difficulty of that yet. If it seems hard I'll make an issue with a plan of action.

I think there is still cleaning up to do. And I have not checked that I did not create any memory leaks. I'm not confident in my C coding. So I'll double check this then make a PR.

mhvk · 2015-02-02T14:09:20Z

looks good to me, but am definitely even less of an expert.

cowlicks · 2015-02-02T18:31:49Z

There is a bit of a problem. The __numpy_ufunc__ protocol specifies it has the signature:

def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs)

Where "i is the index of self in inputs" but output arguments get put into kwargs. So what should the i index be in this case? (See the NEP here).

Basically __numpy_ufunc__ does not account for the output arguments being able to override the ufunc. I'm still thinking about the best way to approach this. Suggestions? @pv

mhvk · 2015-02-02T18:44:04Z

FWIW: I think the approach you took, of just continuing to add to i, is the most logical one. It is important for an output to be able to have __numpy_ufunc__ and thus to tell that it cannot handle the result.

cowlicks · 2015-02-02T19:02:51Z

@mhvk But it is breaking this test.

Basically the test uses the "defective" behavior.

mhvk · 2015-02-02T19:09:28Z

Hmm, fortunately, __numpy_ufunc__ has not been part of a release, so still in time to get it right! It seems obvious the implementer will have to be able to deal with either i is None or i >= len(inputs). The latter would carry more information, so seems preferred.

ewmoore · 2015-02-02T19:47:55Z

If it hasn't been released perhaps the signature should be changed. Maybe add an o parameter or make i be input_index and add an output_index and require that one or the other always be None. This seems much clearer than indexing into a list of output arrays if i >= len(inputs).

Is it documented anywhere that passing an output array can effect the ufunc dispatch? Seems kind of surprising that it work this way.

mhvk · 2015-02-02T20:02:12Z

@ewmoore - but the output should not just be treated as an arbitrary ndarray either (indeed, that is what the current __array_prepare__ and __array_wrap__ do well).

pv · 2015-02-02T21:43:00Z

Only the option i >= len(inputs) is compatible with code in Scipy.
That it's out there is somewhat non-ideal situation, but what is done is
done and needs to be considered now. If a different interface is
desired, then __numpy_ufunc__ needs to be renamed to
__numpy_override__ or something similar.

mhvk · 2015-02-02T21:47:04Z

I'd also think i>=len(inputs) is most logical conceptually in the sense that it now effectively indexes *args in the original call for the case that no out keyword was present.

cowlicks · 2015-02-05T17:56:23Z

Okay. I'll correct the test that is failing to check for i >= len(inputs), do some cleanup, and make a PR.

mhvk · 2015-06-11T16:14:04Z

@cowlicks - did you ever get around to fixing this? You had #5566, but you closed that intending to make a different solution...

cowlicks · 2015-06-11T18:15:54Z

No I never got around it. There is a lot of discussion around __numpy_ufunc__ and I have not had the bandwidth to follow it. It seemed like a rewrite was being proposed in #5844? But I don't really have time to read that whole thread.

8000

mhvk · 2015-06-18T21:14:46Z

@cowlicks - I think this bug has little to do with the (long) discussion about what to do with ndarray binops, but rather affects __numpy_ufunc__ more generally. Since we're close to release, I went ahead and reworked your earlier pull request, in a way that I think will address all previous comments. Please have a look. (My own sense: OK, but needs a few tests that i is indeed what it should be.) cc @pv, @njsmith, @jaimefrio, @charris, since they all looked at your original PR (#5566).

…tely for the case where self is among outputs but not among inputs. Ensure it works both out passed on as an argument and with out in a keyword argument.

mhvk · 2015-06-19T17:38:37Z

OK, I added further tests that the index passed on to __numpy_ufunc__ is in fact correct in all cases. I also rebased and merged all commits into one. I note that there was some discussion that it would have been more logical to do this test after normalising the arguments, but that doesn't really matter for the functionality, and at this late stage it seems best to just fix the bug rather than try to overhaul the order.

njsmith · 2015-06-19T20:31:03Z

OK, going to say something unpopular. (Certainly I'm annoyed at myself for saying it, but nonetheless...) I've been debating this with myself for a while, and I've come to the conclusion that we should bite the bullet and drop the index argument from __numpy_ufunc__'s signature. This will require some care to avoid breaking already released versions of scipy.sparse (lesson learned: we should never have released scipy.sparse with __numpy_ufunc__ before it came out in numpy), but that's better than leaving it forever.

Rationale:

the index argument is never useful. The only time it's useful at all is for classes that want to cast self to ndarray and recurse, and even in those cases they can just as well do a loop and handle all objects of their type at once. It'll be faster too. None of the example classes we've written during the binop discussion have used it at all.
the impossibility of having a useful value for output arguments makes there index even more useless. (Sure we can set it to
n > nin, but what's the use of this? Writing code to check for this and do something sensible is more complicated than just finding the array by hand.)
there are almost certainly other arguments we should dispatch on, like where=. What do we set the index to in this case?

Sorry. I'm the one who suggested adding it in the first place, but it was a bad idea :-/

Also while we're at it, we should make the out= argument unconditionally a tuple, rather than sometimes a single ndarray and sometimes a tuple of them. This would simplify argument processing (see: all the times we wrote if not isinstance(out, tuple): out = (out,) on the example binop implementations).

mhvk · 2015-06-19T20:54:26Z

@njsmith - I made your point a new issue #5986, as I think it deserves better discussion than it might get here. Also, this PR is really not so much about what to do with the index as about ensuring that out arguments are able to override __numpy_ufunc__, whether given as an argument or as a keyword argument.

charris · 2015-06-27T23:23:11Z

@mhvk, @njsmith So where does this sit now?

njsmith · 2015-06-28T05:45:02Z

I think the status is: this is an uncontroversial bug, and @mhvk has a plausible patch here, which hasn't been converted into a PR yet.

This isn't necessarily a 1.10 blocker (it would be annoying to release with a bug in __numpy_ufunc__, but it would just be a bug; we could fix it in 1.10.1 if we had to).

As a practical matter it would probably be best to get #6001 and #5986 sorted out first before addressing this, to avoid collisions.

mhvk · 2015-06-28T13:32:45Z

@njsmith - note that my patch did get converted to a PR -- this issue itself has been converted to a PR [1]. I actually don't think it impacts any other the other PR too much. If we remove index, it would effectively be removed again, but that's fine.

[1] See http://docs.astropy.org/en/latest/development/workflow/additional_git_topics.html#converting-a-github-issue-to-a-pull-request

charris · 2015-08-03T02:25:43Z

Moved this to the 1.11.0 release with the rest of the __numpy_ufunc__ stuff.

charris · 2015-10-03T23:57:43Z

@mhvk What need to be done to bring this to completion. If we need to change the interface, now would be a good time to get it decided. @pv IIRC, we are also going to need a rename for scipy sparse.

mhvk · 2015-10-05T15:19:51Z

@charris - I think this PR is all OK for the present __numpy_ufunc__. If we do decide to change that, including removing the index from the call signature, then most of the actual code will likely get removed, but some of the tests will guard against this problem recurring (specifically, https://github.com/numpy/numpy/pull/4753/files#diff-bc1e8d40b8b7758facc5e143be016f99R2515).

So, my feeling is to merge this, in order to let the test case rather than this open PR guard against the problem recurring.

BUG difference in behaviour for subclass output in ufuncs

charris · 2015-10-05T15:53:58Z

@mhvk OK, merging. Thanks.

pv added the Defect label Jun 2, 2014

juliantaylor added this to the 1.9 blockers milestone Jun 2, 2014

mhvk mentioned this pull request Jul 9, 2014

Quantity.__array_ufunc__ astropy/astropy#2583

Merged

juliantaylor modified the milestones: 1.9 blockers, 1.10 blockers Aug 26, 2014

cowlicks mentioned this pull request Feb 13, 2015

BUG: fix for #4753 (out keyword with __numpy_ufunc__) #5566

Closed

rgommers added the component: numpy._core label Mar 8, 2015

mhvk mentioned this pull request May 28, 2015

Decide on what the resolution rules for __op__/__rop__/__numpy_ufunc__ actually are #5844

Closed

Check out kwarg for __nump_ufunc__ override and set index appropria…

288239d

…tely for the case where self is among outputs but not among inputs. Ensure it works both out passed on as an argument and with out in a keyword argument.

mhvk force-pushed the bug-4753 branch from 629e02e to 288239d Compare June 19, 2015 17:36

mhvk mentioned this pull request Jun 19, 2015

Drop index from __numpy_ufunc__ signature? out always a tuple? #5986

Closed

charris modified the milestones: 1.10.0 release, 1.10 blockers Jun 21, 2015

charris modified the milestones: 1.10.0 release, 1.11.0 release Aug 3, 2015

charris added a commit that referenced this pull request Oct 5, 2015

Merge pull request #4753 from mhvk/bug-4753

68e61c2

BUG difference in behaviour for subclass output in ufuncs

charris merged commit 68e61c2 into numpy:master Oct 5, 2015

mhvk deleted the bug-4753 branch October 5, 2015 15:58

Uh oh!

BUG difference in behaviour for subclass output in ufuncs #4753

BUG difference in behaviour for subclass output in ufuncs #4753

Uh oh!

Conversation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!