8000 BUG/ENH: Switch to simplified __numpy_ufunc__/binop interaction by njsmith · Pull Request #6001 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

BUG/ENH: Switch to simplified __numpy_ufunc__/binop interaction #6001

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

njsmith
Copy link
Member
@njsmith njsmith commented Jun 22, 2015

[Posting this now for any feedback, though it still needs some work to sort out scalar handling -- CC @pv @mhvk @shoyer @charris]

As per the discussion at gh-5844, and in particular
#5844 (comment)
this commit switches binop dispatch to mostly defer to ufuncs, except
in some specific cases elaborated in a long comment in number.c.

This currently fails on the following code from test_ufunc.py, which
creates an infinite loop and eats memory until killed:

class MyThing(object):
    __array_priority__ = 1000

    rmul_count = 0
    getitem_count = 0

    def __init__(self, shape):
        self.shape = shape

    def __len__(self):
        return self.shape[0]

    def __getitem__(self, i):
        MyThing.getitem_count += 1
        if not isinstance(i, tuple):
            i = (i,)
        if len(i) > len(self.shape):
            raise IndexError("boo")

        return MyThing(self.shape[len(i):])

    def __rmul__(self, other):
        MyThing.rmul_count += 1
        return self

np.float64(5) * MyThing((3, 3))

# Alternative, with same result:

np.float64(5) + MyThing((3, 3))

I don't yet really understand what is going on here. As far as I've
tracked it down:

np.float64.add is scalarmath.c.src:@Oper@_@name@

@Oper@_@name@ calls _@name@_convert2_to_ctypes

_@name@_convert2_to_ctypes just calls _@name@_convert_to_ctype twice

_@name@_convert_to_ctype returns -1 or -2 on various error
  conditions that I don't fully understanding, including in
  particular a -2 if other.__array_priority__ > self.__array_priority__

On -1 or -2 returns, @Oper@@name@ calls one of
PyArray_Type.tp_as_number->nb
@Oper@(a, b)
PyGenericArrType_Type.tp_as_number->nb_@oper@(a, b)

The latter is gentype_@name@, which for "add" just calls back to
PyArray_Type.tp_as_number->nb_@name@
so it should be equivalent. For some other cases, things are more
complicated -- in particular gentype_multiply implements sequence
multiplication, which looks suspicious given the definition of
MyThing. Except we crash on float64 + MyThing as well, so that can't
be it.

Actually, I guess the problem must be exactly that we are deferring
to the ufunc; np.add(5, MyThing((3, 3))) also blows up. We shouldn't
be, though, because of the array_priority.

More after I wake up again...

@mhvk
Copy link
Contributor
mhvk commented Jun 22, 2015

@njsmith - the comments in scalarmath.c.src (https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/scalarmath.c.src#L541)

/* The general strategy for commutative binary operators is to
 *
 * 1) Convert the types to the common type if both are scalars (0 return)
 * 2) If both are not scalars use ufunc machinery (-2 return)
 * 3) If both are scalars but cannot be cast to the right type
 * return NotImplmented (-1 return)
 *
 * 4) Perform the function on the C-type.
 * 5) If an error condition occurred, check to see
 * what the current error-handling is and handle the error.
 *
 * 6) Construct and return the output scalar.
 */

So, you hit (2) here, though I think what we'd really want is:

 * 2) If either is not a scalar use ufunc machinery (-2 return)
 * 3) If both are scalars but cannot be cast to the right type
 *    return NotImplemented (-1 return)
 * 4) If either has __array_priority > NUMPY_PRIORITY, 
 *    return NotImplemented (-3 return)

I.e., my sense is that the priority case should be separated out from the unknown object type, returning NotImplemented more quickly. It looks like all this would need is a change in the return value, from -2 to -3... ... ... Indeed, a quick trial (PR to your branch on its way) solves the problem for the above case (although there are Polynomial failures; not sure if those are related).

@mhvk
Copy link
Contributor
mhvk commented Jun 22, 2015

FWIW: all polynomial failures seem to be due to division by a np.float16....

And after a bit more investigation, the problem seems to be that in polyutils.as_series, the following happens:

np.common_type(np.array(5., np.float16))
# TypeError: can't get common type for non-numeric array

Which would seem a proper bug on its own...

@mhvk
Copy link
Contributor
mhvk commented Jun 22, 2015

For the np.common_type bug, see #6004. With that as well as my scalar fix in place, this PR passes all tests!

@mhvk
Copy link
Contributor
mhvk commented Jun 22, 2015

@njsmith - #6004 is in, so by merging my PR to your branch and rebasing on master, you'll have something that passes all tests. Note that comments in scalarmatch.c.src would need updating, though, to make clear that __array_priority__ > NPY_PRIORITY returns NotImplemented.

@pv
Copy link
Member
pv commented Jun 23, 2015 via email

@njsmith
Copy link
Member Author
njsmith commented Jun 23, 2015

Up to you what you prefer. We could have a list of the sparse classes as
well. It doesn't block anything though -- it just gives them first priority
on binops, just like you want to do for all classes :-). They can still
call back to ufuncs or whatever if they want.
On Jun 23, 2015 12:37 AM, "Pauli Virtanen" notifications@github.com wrote:

The scipy.sparse checks are done wrong --- you need to check the exact
class names, otherwise you are blocking using numpy_ufunc in the future.


Reply to this email directly or view it on GitHub
#6001 (comment).

@pv
Copy link
Member
pv commented Jun 23, 2015 via email

…attr_string.h

This is an ugly kluge, but until we merge multiarray.so and umath.so
moving stuff into private/*.h serves as a reasonable workaround.
@njsmith njsmith force-pushed the numpy-ufunc-binop-provisional branch from 7e4c14c to 7ffa5dd Compare June 24, 2015 07:51
@njsmith
Copy link
Member Author
njsmith commented Jun 24, 2015

Okay, largely redid this and it now passes tests (though could probably use a few more).

I'm pretty happy with the logic flow now: we now explicitly check whether we are in __binop__ or __rbinop__, and have nice well-defined logic paths in each, and we re-use these for both ndarray methods and scalar methods. (Which is as it should be, since scalars are essentially duck-arrays, and part of the goal here is to get a standard logic that can be used by all duck-arrays.) And I believe that all the other "options" from #5844 could now be implemented just by modifying binop_override_forward_binop_should_defer.

Unfortunately to do this I had to put a bunch of code into header files in private/ (similar to ufunc_override.h), which is a kluge but I think will serve okay as a temporary measure until we merge multiarray.so and umath.so (and much much better than adding any of this stuff to the public C API).

@njsmith njsmith force-pushed the numpy-ufunc-binop-provisional branch 2 times, most recently from de6d3bb to 878f7e9 Compare June 24, 2015 08:50
@njsmith
Copy link
Member Author
njsmith commented Jun 24, 2015

Spoke too soon. Aside from some annoying segfault on 3.2 (doubtless a shallow bug in the 3.2-only unicode handling branch in is_scipy_sparse), there is some very mysterious behaviour that I found after adding more cases to the test suite:

class EmptyObj(object):
    pass

class EmptyObjNegPriority(object):
    __array_priority__ = -1

# These three cases throw TypeError
np.arange(3).__lshift__(EmptyObj())
np.arange(3).__lshift__(EmptyObjNegPriority())
np.arange(3)[0].__lshift__(EmptyObj())
# This case returns NotImplemented
np.arange(3)[0].__lshift__(EmptyObjNegPriority())

This is baffling to me, because (a) as far as I can tell, all the NotImplemented return paths are now shared between arrays and scalars, so it shouldn't be possible for one to return NotImplemented and the other not to, and (b) a negative __array_priority__ should be the same as not having any __array_priority__ at all!

I mean, it's not not even necessarily wrong -- arguably array scalars should return NotImplemented in more cases than arrays do (though NB that on 1.9, np.int64(2).__lshift__(EmptyObj()) also raises, so it hasn't worked historically either). But I don't know what's going on, so I'll sleep on it and see if anything comes to me.

For some reason, np.generic.__pow__ felt the need to reimplement
Python's full binop dispatch logic. The code has been there since the
beginning of (git) history, isn't required to pass any tests, and
AFAICT doesn't actually do anything interesting. I suspect it was just
split apart from the (much more trivial) implementation of all the
other np.generic binop methods because nb_power has a slightly
different signature than all the others (it takes an extra argument,
which we ignore, but still has to be accepted and passed as
appropriate), and then got forgotten when the other implementations
were simplified long ago.

This commit throws away all that code and makes gentype_power a
2-liner, just like all the other gentype_@name@ methods.
@njsmith njsmith force-pushed the numpy-ufunc-binop-provisional branch from 878f7e9 to 35471fc Compare June 24, 2015 22:42
@njsmith njsmith changed the title [WIP] BUG/ENH: Switch to simplified __numpy_ufunc__/binop interaction BUG/ENH: Switch to simplified __numpy_ufunc__/binop interaction Jun 24, 2015
@njsmith njsmith force-pushed the numpy-ufunc-binop-provisional branch from 35471fc to 336421e Compare June 24, 2015 23:24
As per the discussion at numpygh-5844, and in particular
   numpy#5844 (comment)
this commit switches binop dispatch to mostly defer to ufuncs, except
in some specific cases elaborated in a long comment in number.c.

The basic strategy is to define a single piece of C code that knows
how to handle forward binop overrides, and we put it into
private/binop_override.h so that it can be accessed by both the array
code in multiarray.so and the scalar code in umath.so.
@njsmith njsmith force-pushed the numpy-ufunc-binop-provisional branch from 336421e to c16c003 Compare June 24, 2015 23:58
@njsmith
Copy link
Member Author
njsmith commented Jun 25, 2015

Okay, I think this is clean and solid. On my machine it now passes both array and scalar tests on at least 2.7, 3.2, and 3.4 -- we'll see what travis says beyond that :-).

I think this is worth merging regardless of the details of how #5844 is finally resolved -- aside from being a complete implementation of "option 1", it also provides all the ground-clearing needed to make it easy to implement "option 2" or "option 3", and in the alternative where we decide to temporarily disable __numpy_ufunc__ in 1.10 then there's nothing here that interferes with that either.

* dispatch rules where the ndarray binops sometimes return NotImplemented
* rather than always dispatching to ufuncs.
*
* Note that these rules are also implemented by ABCArray, so any changes here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@njsmith This will be your next PR? :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh heh right. I should probably remove that comment and then let the ABCArray PR re-add it.

Any chance I could interest you in putting that one together...? I'm looking at my schedule and beginning to ponder whether it was really wise to plan 3 different major projects for scipy... (the dev meeting + my talk + another short talk for a bof)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... no commitments, but given that I've already signed up to write BackwardCompatibleABCArray (outside of numpy) I suppose I can give this a try.

@njsmith
Copy link
Member Author
njsmith commented Jun 25, 2015

@pv: Oh, but I haven't actually narrowed down the scipy.sparse special case any further (mostly because I'm a bit nervous about making an explicit list of scipy.sparse classes, in case I miss one by accident or if any get added in the future). But I could do that if you prefer.

@mhvk
Copy link
Contributor
mhvk commented Jun 26, 2015

The code looks good, but this does seem to break backwards compatibility for in-place operations where other has higher __array_priority__ and does not define __numpy_ufunc__. I've been convinced that letting a += b becomes a = a+b is not good generally, but wouldn't it be better to preserve strict backwards compatibility for now?

@njsmith
Copy link
Member Author
njsmith commented Jun 26, 2015

@mhvk: Not sure what you mean -- both before and after this patch, inplace operations look like

static PyObject *
array_inplace_add(PyArrayObject *m1, PyObject *m2)
{
    return PyArray_GenericInplaceBinaryFunction(m1, m2, n_ops.add);
}

and PyArray_GenericInplaceBinaryFunction is

static PyObject *
PyArray_GenericInplaceBinaryFunction(PyArrayObject *m1,
                                     PyObject *m2, PyObject *op)
{
    if (op == NULL) {
        Py_INCREF(Py_NotImplemented);
        return Py_NotImplemented;
    }
    return PyObject_CallFunctionObjArgs(op, m1, m2, m1, NULL);
}

so I didn't touch any such checks here. And in current master:

In [1]: class Foo(object):
   ...:     def __radd__(self, other):
   ...:         print("__radd__!")
   ...:     __array_priority__ = 1e9
   ...:     

In [2]: a = np.arange(3)

In [3]: a + Foo()
__radd__!

In [4]: a += Foo()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-a459d874cb9d> in <module>()
----> 1 a += Foo()

TypeError: ufunc 'add' output (typecode 'O') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''

@shoyer
Copy link
Member
shoyer commented Jun 26, 2015

@njsmith apparently the behavior on master was already broken. On NumPy 1.9.2 this works:

In [1]: import numpy as np

In [2]: class Foo(object):
   ...:     def __radd__(self, other):
   ...:         print("__radd__!")
   ...:     __array_priority__ = 1e9
   ...:

In [3]: a = np.arange(3)

In [4]: a + Foo()
__radd__!

In [5]: a += Foo()
__radd__!

In [6]: np.__version__
Out[6]: '1.9.2'

(there may be a lesson here about the hard to understand operator overrides....)

@njsmith
Copy link
Member Author
njsmith commented Jun 26, 2015

......though, ughh, it looks like #5964 did accidentally change how in-place operations handle array priority, sorry. That's totally orthogonal to this PR, though, so: #6020

@njsmith
Copy link
Member Author
njsmith commented Jun 26, 2015

@shoyer: jinx

@mhvk
Copy link
Contributor
mhvk commented Jun 26, 2015

OK, thanks for looking into this; as I noted in #6020, I think it makes most sense not to change the __array_priority__ behaviour until we start to deprecate that.

@njsmith
Copy link
Member Author
njsmith commented Jun 28, 2015

I'm fine with not deprecating __array_priority__ in 1.10, so I think this PR satisfies everything that's been requested upthread.

@mhvk
Copy link
Contributor
mhvk commented Jun 28, 2015

Agreed that really the issues are separate. So then the idea is to follow it up with a PR that fully restores the behaviour of __array_priority__ (and thus resolves #6020), as well as one that changes the name and call signatude of __numpy_ufunc__.

@charris charris added this to the 1.10.0 release milestone Jun 30, 2015
@njsmith
Copy link
Member Author
njsmith commented Jul 1, 2015

@mhvk: Exactly. (The latter being #5986.)

@charris charris removed this from the 1.10.0 release milestone Jul 25, 2015
@homu
Copy link
Contributor
homu commented Jul 25, 2015

☔ The latest upstream changes (presumably #6047) made this pull request unmergeable. Please resolve the merge conflicts.

@charris
Copy link
Member
charris commented Apr 14, 2017

Closing. I believe this is now part of #8247.

@charris charris closed this Apr 14, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants
0