BUG/ENH: Switch to simplified __numpy_ufunc__/binop interaction #6001

njsmith · 2015-06-22T10:39:19Z

[Posting this now for any feedback, though it still needs some work to sort out scalar handling -- CC @pv @mhvk @shoyer @charris]

As per the discussion at gh-5844, and in particular
#5844 (comment)
this commit switches binop dispatch to mostly defer to ufuncs, except
in some specific cases elaborated in a long comment in number.c.

This currently fails on the following code from test_ufunc.py, which
creates an infinite loop and eats memory until killed:

class MyThing(object):
    __array_priority__ = 1000

    rmul_count = 0
    getitem_count = 0

    def __init__(self, shape):
        self.shape = shape

    def __len__(self):
        return self.shape[0]

    def __getitem__(self, i):
        MyThing.getitem_count += 1
        if not isinstance(i, tuple):
            i = (i,)
        if len(i) > len(self.shape):
            raise IndexError("boo")

        return MyThing(self.shape[len(i):])

    def __rmul__(self, other):
        MyThing.rmul_count += 1
        return self

np.float64(5) * MyThing((3, 3))

# Alternative, with same result:

np.float64(5) + MyThing((3, 3))

I don't yet really understand what is going on here. As far as I've
tracked it down:

np.float64.add is scalarmath.c.src:@Oper@_@name@

@Oper@_@name@ calls _@name@_convert2_to_ctypes

_@name@_convert2_to_ctypes just calls _@name@_convert_to_ctype twice

_@name@_convert_to_ctype returns -1 or -2 on various error
  conditions that I don't fully understanding, including in
  particular a -2 if other.__array_priority__ > self.__array_priority__

On -1 or -2 returns, @Oper@@name@ calls one of
PyArray_Type.tp_as_number->nb@Oper@(a, b)
PyGenericArrType_Type.tp_as_number->nb_@oper@(a, b)

The latter is gentype_@name@, which for "add" just calls back to
PyArray_Type.tp_as_number->nb_@name@
so it should be equivalent. For some other cases, things are more
complicated -- in particular gentype_multiply implements sequence
multiplication, which looks suspicious given the definition of
MyThing. Except we crash on float64 + MyThing as well, so that can't
be it.

Actually, I guess the problem must be exactly that we are deferring
to the ufunc; np.add(5, MyThing((3, 3))) also blows up. We shouldn't
be, though, because of the array_priority.

More after I wake up again...

mhvk · 2015-06-22T15:19:44Z

@njsmith - the comments in scalarmath.c.src (https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/scalarmath.c.src#L541)

/* The general strategy for commutative binary operators is to
 *
 * 1) Convert the types to the common type if both are scalars (0 return)
 * 2) If both are not scalars use ufunc machinery (-2 return)
 * 3) If both are scalars but cannot be cast to the right type
 * return NotImplmented (-1 return)
 *
 * 4) Perform the function on the C-type.
 * 5) If an error condition occurred, check to see
 * what the current error-handling is and handle the error.
 *
 * 6) Construct and return the output scalar.
 */

So, you hit (2) here, though I think what we'd really want is:

 * 2) If either is not a scalar use ufunc machinery (-2 return)
 * 3) If both are scalars but cannot be cast to the right type
 *    return NotImplemented (-1 return)
 * 4) If either has __array_priority > NUMPY_PRIORITY, 
 *    return NotImplemented (-3 return)

I.e., my sense is that the priority case should be separated out from the unknown object type, returning NotImplemented more quickly. It looks like all this would need is a change in the return value, from -2 to -3... ... ... Indeed, a quick trial (PR to your branch on its way) solves the problem for the above case (although there are Polynomial failures; not sure if those are related).

mhvk · 2015-06-22T15:49:55Z

FWIW: all polynomial failures seem to be due to division by a np.float16....

And after a bit more investigation, the problem seems to be that in polyutils.as_series, the following happens:

np.common_type(np.array(5., np.float16))
# TypeError: can't get common type for non-numeric array

Which would seem a proper bug on its own...

mhvk · 2015-06-22T16:21:09Z

For the np.common_type bug, see #6004. With that as well as my scalar fix in place, this PR passes all tests!

mhvk · 2015-06-22T17:50:09Z

@njsmith - #6004 is in, so by merging my PR to your branch and rebasing on master, you'll have something that passes all tests. Note that comments in scalarmatch.c.src would need updating, though, to make clear that __array_priority__ > NPY_PRIORITY returns NotImplemented.

pv · 2015-06-23T07:37:55Z

The scipy.sparse checks are done wrong --- you need to check the exact class names, otherwise you are blocking using numpy_ufunc in the future.

njsmith · 2015-06-23T07:43:40Z

Up to you what you prefer. We could have a list of the sparse classes as
well. It doesn't block anything though -- it just gives them first priority
on binops, just like you want to do for all classes :-). They can still
call back to ufuncs or whatever if they want.
On Jun 23, 2015 12:37 AM, "Pauli Virtanen" notifications@github.com wrote:

The scipy.sparse checks are done wrong --- you need to check the exact
class names, otherwise you are blocking using numpy_ufunc in the future.

—
Reply to this email directly or view it on GitHub
#6001 (comment).

pv · 2015-06-23T07:53:00Z

Ok, indeed, I don't know what I was thinking... Still, best to keep the special cases to minimum.

…attr_string.h This is an ugly kluge, but until we merge multiarray.so and umath.so moving stuff into private/*.h serves as a reasonable workaround.

njsmith · 2015-06-24T07:56:33Z

Okay, largely redid this and it now passes tests (though could probably use a few more).

I'm pretty happy with the logic flow now: we now explicitly check whether we are in __binop__ or __rbinop__, and have nice well-defined logic paths in each, and we re-use these for both ndarray methods and scalar methods. (Which is as it should be, since scalars are essentially duck-arrays, and part of the goal here is to get a standard logic that can be used by all duck-arrays.) And I believe that all the other "options" from #5844 could now be implemented just by modifying binop_override_forward_binop_should_defer.

Unfortunately to do this I had to put a bunch of code into header files in private/ (similar to ufunc_override.h), which is a kluge but I think will serve okay as a temporary measure until we merge multiarray.so and umath.so (and much much better than adding any of this stuff to the public C API).

njsmith · 2015-06-24T08:58:51Z

Spoke too soon. Aside from some annoying segfault on 3.2 (doubtless a shallow bug in the 3.2-only unicode handling branch in is_scipy_sparse), there is some very mysterious behaviour that I found after adding more cases to the test suite:

class EmptyObj(object):
    pass

class EmptyObjNegPriority(object):
    __array_priority__ = -1

# These three cases throw TypeError
np.arange(3).__lshift__(EmptyObj())
np.arange(3).__lshift__(EmptyObjNegPriority())
np.arange(3)[0].__lshift__(EmptyObj())
# This case returns NotImplemented
np.arange(3)[0].__lshift__(EmptyObjNegPriority())

This is baffling to me, because (a) as far as I can tell, all the NotImplemented return paths are now shared between arrays and scalars, so it shouldn't be possible for one to return Not 8000 Implemented and the other not to, and (b) a negative __array_priority__ should be the same as not having any __array_priority__ at all!

I mean, it's not not even necessarily wrong -- arguably array scalars should return NotImplemented in more cases than arrays do (though NB that on 1.9, np.int64(2).__lshift__(EmptyObj()) also raises, so it hasn't worked historically either). But I don't know what's going on, so I'll sleep on it and see if anything comes to me.

For some reason, np.generic.__pow__ felt the need to reimplement Python's full binop dispatch logic. The code has been there since the beginning of (git) history, isn't required to pass any tests, and AFAICT doesn't actually do anything interesting. I suspect it was just split apart from the (much more trivial) implementation of all the other np.generic binop methods because nb_power has a slightly different signature than all the others (it takes an extra argument, which we ignore, but still has to be accepted and passed as appropriate), and then got forgotten when the other implementations were simplified long ago. This commit throws away all that code and makes gentype_power a 2-liner, just like all the other gentype_@name@ methods.

As per the discussion at numpygh-5844, and in particular numpy#5844 (comment) this commit switches binop dispatch to mostly defer to ufuncs, except in some specific cases elaborated in a long comment in number.c. The basic strategy is to define a single piece of C code that knows how to handle forward binop overrides, and we put it into private/binop_override.h so that it can be accessed by both the array code in multiarray.so and the scalar code in umath.so.

njsmith · 2015-06-25T00:03:15Z

Okay, I think this is clean and solid. On my machine it now passes both array and scalar tests on at least 2.7, 3.2, and 3.4 -- we'll see what travis says beyond that :-).

I think this is worth merging regardless of the details of how #5844 is finally resolved -- aside from being a complete implementation of "option 1", it also provides all the ground-clearing needed to make it easy to implement "option 2" or "option 3", and in the alternative where we decide to temporarily disable __numpy_ufunc__ in 1.10 then there's nothing here that interferes with that either.

shoyer · 2015-06-25T00:05:04Z

numpy/core/src/private/binop_override.h

+ * dispatch rules where the ndarray binops sometimes return NotImplemented
+ * rather than always dispatching to ufuncs.
+ *
+ * Note that these rules are also implemented by ABCArray, so any changes here


@njsmith This will be your next PR? :)

Oh heh right. I should probably remove that comment and then let the ABCArray PR re-add it.

Any chance I could interest you in putting that one together...? I'm looking at my schedule and beginning to ponder whether it was really wise to plan 3 different major projects for scipy... (the dev meeting + my talk + another short talk for a bof)

Hmm... no commitments, but given that I've already signed up to write BackwardCompatibleABCArray (outside of numpy) I suppose I can give this a try.

njsmith · 2015-06-25T00:07:00Z

@pv: Oh, but I haven't actually narrowed down the scipy.sparse special case any further (mostly because I'm a bit nervous about making an explicit list of scipy.sparse classes, in case I miss one by accident or if any get added in the future). But I could do that if you prefer.

mhvk · 2015-06-26T01:22:25Z

The code looks good, but this does seem to break backwards compatibility for in-place operations where other has higher __array_priority__ and does not define __numpy_ufunc__. I've been convinced that letting a += b becomes a = a+b is not good generally, but wouldn't it be better to preserve strict backwards compatibility for now?

njsmith · 2015-06-26T01:51:00Z

@mhvk: Not sure what you mean -- both before and after this patch, inplace operations look like

static PyObject *
array_inplace_add(PyArrayObject *m1, PyObject *m2)
{
    return PyArray_GenericInplaceBinaryFunction(m1, m2, n_ops.add);
}

and PyArray_GenericInplaceBinaryFunction is

static PyObject *
PyArray_GenericInplaceBinaryFunction(PyArrayObject *m1,
                                     PyObject *m2, PyObject *op)
{
    if (op == NULL) {
        Py_INCREF(Py_NotImplemented);
        return Py_NotImplemented;
    }
    return PyObject_CallFunctionObjArgs(op, m1, m2, m1, NULL);
}

so I didn't touch any such checks here. And in current master:

In [1]: class Foo(object):
   ...:     def __radd__(self, other):
   ...:         print("__radd__!")
   ...:     __array_priority__ = 1e9
   ...:     

In [2]: a = np.arange(3)

In [3]: a + Foo()
__radd__!

In [4]: a += Foo()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-a459d874cb9d> in <module>()
----> 1 a += Foo()

TypeError: ufunc 'add' output (typecode 'O') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''

shoyer · 2015-06-26T02:17:29Z

@njsmith apparently the behavior on master was already broken. On NumPy 1.9.2 this works:

In [1]: import numpy as np

In [2]: class Foo(object):
   ...:     def __radd__(self, other):
   ...:         print("__radd__!")
   ...:     __array_priority__ = 1e9
   ...:

In [3]: a = np.arange(3)

In [4]: a + Foo()
__radd__!

In [5]: a += Foo()
__radd__!

In [6]: np.__version__
Out[6]: '1.9.2'

(there may be a lesson here about the hard to understand operator overrides....)

njsmith · 2015-06-26T02:20:09Z

......though, ughh, it looks like #5964 did accidentally change how in-place operations handle array priority, sorry. That's totally orthogonal to this PR, though, so: #6020

njsmith · 2015-06-26T02:22:06Z

@shoyer: jinx

mhvk · 2015-06-26T12:03:50Z

OK, thanks for looking into this; as I noted in #6020, I think it makes most sense not to change the __array_priority__ behaviour until we start to deprecate that.

njsmith · 2015-06-28T02:33:29Z

I'm fine with not deprecating __array_priority__ in 1.10, so I think this PR satisfies everything that's been requested upthread.

mhvk · 2015-06-28T13:21:30Z

Agreed that really the issues are separate. So then the idea is to follow it up with a PR that fully restores the behaviour of __array_priority__ (and thus resolves #6020), as well as one that changes the name and call signatude of __numpy_ufunc__.

njsmith · 2015-07-01T02:10:44Z

@mhvk: Exactly. (The latter being #5986.)

homu · 2015-07-25T19:59:19Z

☔ The latest upstream changes (presumably #6047) made this pull request unmergeable. Please resolve the merge conflicts.

charris · 2017-04-14T16:33:40Z

Closing. I believe this is now part of #8247.

mhvk mentioned this pull request Jun 22, 2015

BUG: np.float16 not recognized in np.common_type #6004

Merged

MAINT: move PyArray_GetAttrString_SuppressException into private/get_…

65cae2b

…attr_string.h This is an ugly kluge, but until we merge multiarray.so and umath.so moving stuff into private/*.h serves as a reasonable workaround.

njsmith force-pushed the numpy-ufunc-binop-provisional branch from 7e4c14c to 7ffa5dd Compare June 24, 2015 07:51

njsmith force-pushed the numpy-ufunc-binop-provisional branch 2 times, most recently from de6d3bb to 878f7e9 Compare June 24, 2015 08:50

njsmith force-pushed the numpy-ufunc-binop-provisional branch from 878f7e9 to 35471fc Compare June 24, 2015 22:42

njsmith changed the title ~~[WIP] BUG/ENH: Switch to simplified __numpy_ufunc__/binop interaction~~ BUG/ENH: Switch to simplified __numpy_ufunc__/binop interaction Jun 24, 2015

njsmith force-pushed the numpy-ufunc-binop-provisional branch from 35471fc to 336421e Compare June 24, 2015 23:24

njsmith force-pushed the numpy-ufunc-binop-provisional branch from 336421e to c16c003 Compare June 24, 2015 23:58

shoyer reviewed Jun 25, 2015
View reviewed changes

mhvk mentioned this pull request Jun 26, 2015

In-place operators in 1.9 "respected" __array_priority__; in current master they don't #6020

Closed

njsmith mentioned this pull request Jun 28, 2015

Drop index from __numpy_ufunc__ signature? out always a tuple? #5986

Closed

This was referenced Jun 28, 2015

Decide on what the resolution rules for __op__/__rop__/__numpy_ufunc__ actually are #5844

Closed

BUG difference in behaviour for subclass output in ufuncs #4753

Merged

charris added this to the 1.10.0 release milestone Jun 30, 2015

charris removed this from the 1.10.0 release milestone Jul 25, 2015

rgommers added 01 - Enhancement component: numpy._core 03 - Maintenance labels Jan 16, 2016

charris modified the milestones: 1.12.0 release, 1.13.0 release Nov 3, 2016

mhvk mentioned this pull request Mar 11, 2017

ENH: Add __array_ufunc__ #8247

Merged

charris closed this Apr 14, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

BUG/ENH: Switch to simplified __numpy_ufunc__/binop interaction #6001

BUG/ENH: Switch to simplified __numpy_ufunc__/binop interaction #6001

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

BUG/ENH: Switch to simplified __numpy_ufunc__/binop interaction #6001

BUG/ENH: Switch to simplified __numpy_ufunc__/binop interaction #6001

Uh oh!

Conversation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!