8000 ENH: Add `__array_ufunc__` by charris · Pull Request #8247 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

ENH: Add __array_ufunc__ #8247

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 43 commits into from
Apr 27, 2017
Merged
Changes from 1 commit
Commits
Show all changes
43 commits
Select commit Hold shift + click to select a range
4fd7e84
ENH: Revert "Temporarily disable __numpy_ufunc__"
charris Nov 6, 2016
fcd11d2
ENH: Rename __numpy_ufunc__ to __array_ufunc__.
charris Nov 9, 2016
c7b25e2
ENH: Remove position arg from __array_ufunc__.
charris Nov 12, 2016
8a9e790
MAINT: Put PyArray_GetAttrString_SuppressException in get_attr_string.h
njsmith Jun 24, 2015
4dd5380
MAINT: dike out a bunch of weird old code implementing scalar power
njsmith Jun 24, 2015
7d9bc2f
BUG/ENH: Switch to simplified __array_ufunc__/binop interaction
njsmith Jun 22, 2015
e4b5163
MAINT: allow __array_ufunc__ = None to force binops to defer.
mhvk Mar 12, 2017
2e6d8c0
MAINT: Split out C code in ufunc_override.h to .c file.
mhvk Mar 15, 2017
d5c5ac1
MAINT: Add NPY_NO_EXPORT modifier to PyUFunc_CheckOverride.
charris Mar 16, 2017
3124e96
MAINT: for __array_ufunc__ pass inputs as *args, ensure out is tuple.
mhvk Mar 14, 2017
6a3ca31
DOC: describe current implementation of __array_ufunc__.
mhvk Mar 14, 2017
79bb733
DOC: Style and sphinx fixes for arrays.classes.rst.
charris Mar 23, 2017
7c3dc5a
TST: test that gufuncs are also overridden by __array_ufunc__.
mhvk Mar 25, 2017
71201d2
DOC: Describe __array_func__ in subclassing
mhvk Mar 15, 2017
3041710
ENH: implement ndarray.__array_ufunc__
mhvk Mar 13, 2017
5fe6fc6
DOC Update NEP to reflect actual implementation.
mhvk Mar 31, 2017 8000
e092823
MAINT: let ndarray.__array_ufunc__ bail if any overrides are in place.
mhvk Apr 2, 2017
1147894
MAINT: Update array_ufunc NEP.
pv Apr 1, 2017
e325a10
DOC: Document behavior of ufuncs with default ndarray.__array_ufunc__
pv Apr 1, 2017
39c2273
DOC: Update ndarray.__array_ufunc__ documentation vs. review comments
pv Apr 2, 2017
6b41d11
DOC: clarify use of super and getattr
mhvk Apr 2, 2017
0ede0e9
DOC: update NEP again.
mhvk Apr 2, 2017
5f9252c
DOC: implement many smaller and bigger changes suggested in review.
mhvk Apr 4, 2017
8cc2f71
BUG,MAINT: ensure out=None is never passed on to __array_ufunc__.
mhvk Apr 4, 2017
856da73
DOC: remove left-over piece discussing binops
mhvk Apr 5, 2017
2b6c7fd
REVERT: remove __array_ufunc__ override for np.dot and ndarray.dot.
mhvk Apr 6, 2017
36e8494
REVERT: remove __array_ufunc__ override for np.matmul.
mhvk Apr 6, 2017
55500b9
MAINT: simplify now that __array_ufunc__ overrides ufuncs only.
mhvk Apr 6, 2017
25e973d
MAINT: split out umath-specific part of ufunc_override.
mhvk Apr 6, 2017
b1fa10a
BUG: ensure subclass of override class doesn't segfault.
mhvk Apr 8, 2017
1de8f5a
DOC: Mention `__array_ufunc__` in the 1.13.0 release notes.
charris Apr 8, 2017
a460015
DOC: ufunc-overrides: sync the discussion vs. current implementation
pv Apr 9, 2017
cd2e42c
DOC: ufunc-overrides: revise hierarchy discussion
pv Apr 9, 2017
ff628f1
BUG: Add back removed elision code.
charris Apr 9, 2017
1fc6e63
DOC,TST: clarify example of ndarray subclass using __array_ufunc__
mhvk Apr 10, 2017
a431743
BUG: Support nout == 0 and at method
eric-wieser Apr 12, 2017
1e460b7
DOC,MAINT: small corrections to NEP following Stephan's comments.
mhvk Apr 12, 2017
02600d3
ENH: Add NDArrayOperatorsMixin mixin class.
shoyer Apr 21, 2017
d3ff023
DOC: clarify recommendations for subclasses, deprecations.
mhvk Apr 21, 2017
b9359f1
MAINT: remove unnecessary checks, wrong code for 'outer', cleanup.
mhvk Apr 21, 2017
256a8ae
BUG: Fix ArrayLike(NDArrayOperatorsMixin) operations with object()
shoyer Apr 23, 2017
3272a86
ENH: Better error message for __array_ufunc__ not implemented
shoyer Apr 24, 2017
32221df
ENH: NDArrayOperatorsMixin calls ufuncs directly, like ndarray
shoyer Apr 27, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
DOC Update NEP to reflect actual implementation.
  • Loading branch information
mhvk authored and charris committed Apr 27, 2017
commit 5fe6fc640d752fe9e4a9a51635bf070b503aa85e
262 changes: 151 additions & 111 deletions doc/neps/ufunc-overrides.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@
A Mechanism for Overriding Ufuncs
=================================

.. currentmodule:: numpy

:Author: Blake Griffith
:Contact: blake.g@utexas.edu
:Date: 2013-07-10
Expand All @@ -10,25 +12,32 @@ A Mechanism for Overriding Ufuncs

:Author: Nathaniel Smith

:Author: Marten van Kerkwijk
:Date: 2017-03-31

Executive summary
=================

NumPy's universal functions (ufuncs) currently have some limited
functionality for operating on user defined subclasses of ndarray using
``__array_prepare__`` and ``__array_wrap__`` [1]_, and there is little
to no support for arbitrary objects. e.g. SciPy's sparse matrices [2]_
[3]_.
functionality for operating on user defined subclasses of
:class:`ndarray` using ``__array_prepare__`` and ``__array_wrap__``
[1]_, and there is little to no support for arbitrary
objects. e.g. SciPy's sparse matrices [2]_ [3]_.

Here we propose adding a mechanism to override ufuncs based on the ufunc
checking each of it's arguments for a ``__numpy_ufunc__`` method.
On discovery of ``__numpy_ufunc__`` the ufunc will hand off the
checking each of it's arguments for a ``__array_ufunc__`` method.
On discovery of ``__array_ufunc__`` the ufunc will hand off the
operation to the method.

This covers some of the same ground as Travis Oliphant's proposal to
retro-fit NumPy with multi-methods [4]_, which would solve the same
problem. The mechanism here follows more closely the way Python enables
classes to override ``__mul__`` and other binary operations.
classes to override ``__mul__`` and other binary operations. It also
specifically addresses how binary operators and ufuncs should interact.

.. note:: In earlier iterations, the override was called
``__numpy_ufunc__``. An implementation was made, but had not
quite the right behaviour, hence the change in name.

.. [1] http://docs.python.org/doc/numpy/user/basics.subclassing.html
.. [2] https://github.com/scipy/scipy/issues/2123
Expand All @@ -41,13 +50,14 @@ Motivation

The current machinery for dispatching Ufuncs is generally agreed to be
insufficient. There have been lengthy discussions and other proposed
solutions [5]_.
solutions [5]_, [6]_.

Using ufuncs with subclasses of ndarray is limited to ``__array_prepare__`` and
``__array_wrap__`` to prepare the arguments, but these don't allow you to for
example change the shape or the data of the arguments. Trying to ufunc things
that don't subclass ndarray is even more difficult, as the input arguments tend
to be cast to object arrays, which ends up producing surprising results.
Using ufuncs with subclasses of :class:`ndarray` is limited to
``__array_prepare__`` and ``__array_wrap__`` to prepare the arguments,
but these don't allow you to for example change the shape or the data of
the arguments. Trying to ufunc things that don't subclass
:class:`ndarray` is even more difficult, as the input arguments tend to
be cast to object arrays, which ends up producing surprising results.

Take this example of ufuncs interoperability with sparse matrices.::

Expand Down Expand Up @@ -81,7 +91,7 @@ Take this example of ufuncs interoperability with sparse matrices.::
In [5]: np.multiply(a, bsp) # Returns NotImplemented to user, bad!
Out[5]: NotImplemted

Returning ``NotImplemented`` to user should not happen. Moreover::
Returning :obj:`NotImplemented` to user should not happen. Moreover::

In [6]: np.multiply(asp, b)
Out[6]: array([[ <3x3 sparse matrix of type '<class 'numpy.int64'>'
Expand All @@ -106,163 +116,193 @@ Returning ``NotImplemented`` to user should not happen. Moreover::
Here, it appears that the sparse matrix was converted to an object array
scalar, which was then multiplied with all elements of the ``b`` array.
However, this behavior is more confusing than useful, and having a
``TypeError`` would be preferable.
:exc:`TypeError` would be preferable.

Adding the ``__numpy_ufunc__`` functionality fixes this and would
Adding the ``__array_ufunc__`` functionality fixes this and would
deprecate the other ufunc modifying functions.

.. [5] http://mail.python.org/pipermail/numpy-discussion/2011-June/056945.html

.. [6] https://github.com/numpy/numpy/issues/5844

Proposed interface
==================

Objects that want to override Ufuncs can define a ``__numpy_ufunc__`` method.
The method signature is::
The standard array class :class:`ndarray` gains an ``__array_ufunc__``
method and objects can override Ufuncs by overriding this method (if
they are :class:`ndarray` subclasses) or defining their own. The method
signature is::

def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs)
def __array_ufunc__(self, ufunc, method, *inputs, **kwargs)

Here:

- *ufunc* is the ufunc object that was called.
- *method* is a string indicating which Ufunc method was called
(one of ``"__call__"``, ``"reduce"``, ``"reduceat"``,
``"accumulate"``, ``"outer"``, ``"inner"``).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a ufunc method "inner"? I didn't see it here. Also what about the "at" method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inner in this context would just be the __call__ method itself. And, yes, at is missing, good catch!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, did not read right: there is no inner that I know of, i.e., another good catch!

- *i* is the index of *self* in *inputs*.
- *inputs* is a tuple of the input arguments to the ``ufunc``
- *kwargs* are the keyword arguments passed to the function. The ``out``
arguments are always contained in *kwargs*, how positional variables
are passed is discussed below.

The ufunc's arguments are first normalized into a tuple of input data
(``inputs``), and dict of keyword arguments. If there are output
arguments they are handled as follows:

- One positional output variable x is passed in the kwargs dict as ``out :
x``.
- Multiple positional output variables ``x0, x1, ...`` are passed as a tuple
in the kwargs dict as ``out : (x0, x1, ...)``.
- Keyword output variables like ``out = x`` and ``out = (x0, x1, ...)`` are
passed unchanged to the kwargs dict like ``out : x`` and ``out : (x0, x1,
...)`` respectively.
- Combinations of positional and keyword output variables are not
supported.
arguments are always contained as a tuple in *kwargs*.

Hence, the arguments are normalized: only the input data (``inputs``)
are passed on as positional arguments, all the others are passed on as a
dict of keyword arguments (``kwargs``). In particular, if there are
output arguments, positional are otherwise, they are passed on as a
tuple in the ``out`` keyword argument.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are not output arguments, does kwargs['out'] = ()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out=() would only happen for a ufunc that doesn't even produce outputs, which is possible in theory but very rare in practice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if out is not passed to the ufunc, then a base numpy output array is created of the (assumed" right shape & dtype and passed in kwargs['out']? It seems like one of the common reasons array_ufunc exists is to create an non obvious default output array, so that might be wasted effort.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that is not the case -- either out=None is passed or out is not provided at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the tests and language in the subclassing docs make it quite clear: out is not passed at all if it is not set explicitly.

Copy link
Member
@eric-wieser eric-wieser Apr 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you'd have to get out out of kwargs anyway

If out is always in kwargs (which I'm suggesting it should be), then you can put it straight in the parameter definition if you so choose, and then forward it with out=out

There is an advantage to knowing that "all is default" by having an empty kwargs

I don't know if I agree with that - there's the disadvantage that the user now has to remember what the default is for a missing argument, if they want to try and be compatible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are arguments in favour of both cases, and as I don't feel like having to change all the documentation and tests again, I propose sticking with the current signature unless there is a swell of support for setting everything to its default value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd actually lean towards preserving the current behavior. out=(None,)*ufunc.nout would make it much harder to defensively check for a default out value in an implementation that doesn't even try to handle the strange cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern would be that we shouldn't encourage such a partial implementation to exist, and that the consumer should be forced to think about exactly which cases they want to handle.

I propose sticking with the current signature unless there is a swell of support for setting everything to its default value.

Right now, does out=None get normalized to removing the out argument entirely?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, does out=None get normalized to removing the out argument entirely?

Yes. As does out=(None,). You need to pass at least one non-None argument for it to go through.


The function dispatch proceeds as follows:

- If one of the input arguments implements ``__numpy_ufunc__`` it is
- If one of the input arguments implements ``__array_ufunc__`` it is
executed instead of the Ufunc.

- If more than one of the input arguments implements ``__numpy_ufunc__``,
- If more than one of the input arguments implements ``__array_ufunc__``,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

input arguments -> input or output arguments

they are tried in the following order: subclasses before superclasses,
otherwise left to right. The first ``__numpy_ufunc__`` method returning
something else than ``NotImplemented`` determines the return value of
otherwise left to right. The first ``__array_ufunc__`` method returning
something else than :obj:`NotImplemented` determines the return value of
the Ufunc.

- If all ``__numpy_ufunc__`` methods of the input arguments return
``NotImplemented``, a ``TypeError`` is raised.
- If all ``__array_ufunc__`` methods of the input arguments return
:obj:`NotImplemented`, a :exc:`TypeError` is raised.

- If a ``__numpy_ufunc__`` method raises an error, the error is propagated
- If a ``__array_ufunc__`` method raises an error, the error is propagated
immediately.

If none of the input arguments has a ``__numpy_ufunc__`` method, the
If none of the input arguments has an ``__array_ufunc__`` method, the
execution falls back on the default ufunc behaviour.

Subclass hierarchies
--------------------

Hierarchies of such containers (say, a masked quantity), are most easily
constructed if methods consistently use :func:`super` to pass through
the class hierarchy [7]_. To support this, :class:`ndarray` has its own
``__array_ufunc__`` method (which is equivalent to ``getattr(ufunc,
method)(*inputs, **kwargs)``, i.e., if any of the (adjusted) inputs
still defines ``__array_ufunc__`` that will be called in turn). This
should be particularly useful for container-like subclasses of
:class:`ndarray`, which add an attribute like a unit or mask to a
regular :class:`ndarray`. Such classes can do possible adjustment of the
arguments relevant to their own class, pass on to another class in the
hierarchy using :func:`super` until the Ufunc is actually done, and then
do possible adjustments of the outputs.

Turning Ufuncs off
------------------

For some classes, Ufuncs make no sense, and, like for other special
methods [8]_, one can indicate Ufuncs are not available by setting
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggests that this works similarly to special methods like __add__, but that isn't the case:

In [5]: class LHS:
   ...:     __add__ = None
   ...:

In [6]: LHS() + 1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-f23b802d7362> in <module>()
----> 1 LHS() + 1

TypeError: 'NoneType' object is not callable

(Instead, you're not supposed to write __add__ at all if you class doesn't know how to be added.)

So let's qualify "for other special methods" as "for some other special method" and add an explanation of why it's necessary to define __array_ufunc__ = None instead of simply not writing a __array_ufunc__ method You need to define __array_ufunc__ if you also define arithmetic methods like __add__ and want to stop NumPy from treating your class as a scalar and automatically vectorizing arithmetic operations over each element of the array.

``__array_ufunc__`` to :obj:`None`. Inside a Ufunc, this is
equivalent to unconditionally return :obj:`NotImplemented`, and thus
will lead to a :exc:`TypeError` (unless another operand implements
``__array_ufunc__`` and knows how to deal with the class).

.. [7] https://rhettinger.wordpress.com/2011/05/26/super-considered-super/

.. [8] https://docs.python.org/3/reference/datamodel.html#specialnames

In combination with Python's binary operations
----------------------------------------------

The ``__numpy_ufunc__`` mechanism is fully independent of Python's
The ``__array_ufunc__`` mechanism is fully independent of Python's
standard operator override mechanism, and the two do not interact
directly.

They however have indirect interactions, because NumPy's ``ndarray``
type implements its binary operations via Ufuncs. Effectively, we have::

class ndarray(object):
They have indirect interactions, however, because NumPy's
:class:`ndarray` type implements its binary operations via Ufuncs. For
most numerical classes, the easiest way to override binary operations is
thus to define ``__array_ufunc__`` and override the corresponding
Ufunc. The class can then, like :class:`ndarray` itself, define the
binary operators in terms of Ufuncs. Here, one has to take some care.
E.g., the simplest implementation would be::

class ArrayLike(object):
def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
...
return result
...
def __mul__(self, other):
return np.multiply(self, other)
return self.__array_ufunc__(np.multiply, '__call__', self, other)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also mention the other obvious implementation, which calls the ufunc directly instead of __array_ufunc__:

class ArrayLike(object):
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        ...
    def __mul__(self, other):
        return np.multiply(self, other)

This has similar issues handling "opt-out" classes, but there's also a fix:

# this lookup table should probably be distributed with numpy
OPERATOR_LOOKUP = {np.multiply: operator.multiply, ...}

class ArrayLike(object):
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        func = getattr(ufunc, method)
        if method == '__call__' and 'out' not in kwargs:
            func = OPERATOR_LOOKUP.get(ufunc, func)
        # avoid infinite recursion
        inputs = [np.asarray(x) if x is self else x for x in inputs]
        return func(*inputs)

I like this approach better because it doesn't involve calling private methods, which entails the need to reimplement some of the rules of __array_ufunc__ in your own methods. But I think both can be made to work.

From an API perspective, the difference is that np.multiply(array_like, opt_out) is well defined (equivalent to array_like * opt_out) instead of raising a TypeError, even though opt_out.__array_ufunc__ is None. But I'm actually OK with either option (even for numpy.ndarray): things only get terribly complex if it's possible for np.multiply and * to do different things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point it's useful to remember Sage matrices: not only * but also + works differently, as scalars are promoted to diagonal matrices. Presumably, np.add and np.multiply should retain the Numpy definition or not be defined at all if operating on them.

Copy link
Member
@shoyer shoyer Apr 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So maybe def __mul__(self, other): return np.multiply(self, other) is not the right solution. That's OK, but we should mention it anyways (and why it's wrong).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling self.__array_ufunc__ doesn't necessarily seem correct to me, because out is not populated. Is the expectation that out is always available in __array_ufunc__ even if it was implicit to the ufunc call?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should clarify whether numpy will always call __array_ufunc__ with out or not (with a value of None to indicate no pre-allocated output).

Even if we do always provide out, whether we need to include out in __array_ufunc__ depends on how we wrote our __array_ufunc__ method, e.g., if it looks like

def __array_ufunc__(self, ufunc, method, *inputs, out, **kwargs):   # python 3 only
     ...

Copy link
Member
@pv pv Apr 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, return self.__array_ufunc__(...) vs. if getattr(other, '__array_ufunc__', False) is None: return NotImplemented; else return np.multiply(self, other) in __mul__ correspond to different opt-out approaches in the previous discussion. The former is the laissez-faire "option 3" that allows defining __array_ufunc__ while doing opt-out in binops, whereas latter does not allow opt-out and __array_ufunc__ simultaneously ("option 2", IIRC).

What to recommend for subclasses in the NEP should follow what ndarray itself does --- IIUC, this PR decided to do "option 2" since there's discussion of the None special value.

The other options should be discussed in a separate section "Other proposals considered", and only give one way to do it as an example...


Suppose now we have a second class::
Suppose now, however, that ``other`` is class that does not know how to
deal with arrays and ufuncs, but does know how to do multiplication::

class MyObject(object):
def __numpy_ufunc__(self, *a, **kw):
return "ufunc"
__array_ufunc__ = None
def __mul__(self, other):
return 1234
def __rmul__(self, other):
return 4321

In this case, standard Python override rules combined with the above
discussion imply::
discussion would imply::

a = MyObject()
b = np.array([0])
mine = MyObject()
arr = ArrayLike([0])

a * b # == 1234 OK
b * a # == "ufunc" surprising
mine * arr # == 1234 OK
arr * mine # TypeError surprising

This is not what would be naively expected, and is therefore somewhat
undesirable behavior.
The reason why this would occur is: because ``MyObject`` is not an
``ArrayLike`` subclass, Python resolves the expression ``arr * mine`` by
calling first ``arr.__mul__``. In the above implementation, this would
just call the Ufunc, which would see that ``mine.__array_ufunc__`` is
:obj:`None` and raise a :exc:`TypeError`. (Note that if ``MyObject``
is a subclass of :class:`ndarray`, Python calls ``mine.__rmul__`` first.)

The reason why this occurs is: because ``MyObject`` is not an ndarray
subclass, Python resolves the expression ``b * a`` by calling first
``b.__mul__``. Since NumPy implements this via an Ufunc, the call is
forwarded to ``__numpy_ufunc__`` and not to ``__rmul__``. Note that if
``MyObject`` is a subclass of ``ndarray``, Python calls ``a.__rmul__``
first. The issue is therefore that ``__numpy_ufunc__`` implements
"virtual subclassing" of ndarray behavior, without actual subclassing.
So, a better implementation of the binary operators would check whether
the other class can be dealt with in ``__array_ufunc__`` and, if not,
return :obj:`NotImplemented`::

This issue can be resolved by a modification of the binary operation
methods in NumPy::

class ndarray(object):
class ArrayLike(object):
...
def __mul__(self, other):
if (not isinstance(other, self.__class__)
and hasattr(other, '__numpy_ufunc__')
and hasattr(other, '__rmul__')):
return NotImplemented
return np.multiply(self, other)

def __imul__(self, other):
if (other.__class__ is not self.__class__
and hasattr(other, '__numpy_ufunc__')
and hasattr(other, '__rmul__')):
if getattr(other, '__array_ufunc__', False) is None:
Copy link
Member
@shoyer shoyer Apr 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we recommend this approach (and I'm not sure we should), we need to include either an exclusion for type(self) is type(other) or show that __rmul__ excludes this check. Otherwise, we end up an infinite loop with ArrayLike() * ArrayLike().

(nevermind, this is wrong!)

return NotImplemented
return np.multiply(self, other, out=self)

b * a # == 4321 OK

The rationale here is the following: since the user class explicitly
defines both ``__numpy_ufunc__`` and ``__rmul__``, the implementor has
very likely made sure that the ``__rmul__`` method can process ndarrays.
If not, the special case is simple to deal with (just call
``np.multiply``).

The exclusion of subclasses of self can be made because Python itself
calls the right-hand method first in this case. Moreover, it is
desirable that ndarray subclasses are able to inherit the right-hand
binary operation methods from ndarray.

The same priority shuffling needs to be done also for the in-place
operations, so that ``MyObject.__rmul__`` is prioritized over
``ndarray.__imul__``.

return self.__array_ufunc__(np.multiply, '__call__', self, other)

arr = ArrayLike([0])

arr * mine # == 4321 OK

Indeed, after long discussion about whether it might make more sense to
ask classes like ``ArrayLike`` to implement a full ``__array_ufunc__``
[6]_, the same design as the above was agreed on for :class:`ndarray`
itself.

.. note:: The above holds for regular operators. For in-place
operators, :class:`ndarray` never returns
:obj:`NotImplemented`, i.e., ``ndarr *= mine`` would always
lead to a :exc:`TypeError`. This is because for arrays
in-place operations cannot generically be replaced by a simple
reverse operation. For instance, sticking to the above
example, what would ``ndarr[:] *= mine`` imply? Assuming it
means ``ndarr[:] = ndarr[:] * mine``, as python does by
default, is likely to be wrong.

Extension to other numpy functions
----------------------------------

The ``__array_ufunc__`` method is used to override :func:`~numpy.dot`
and :func:`~numpy.matmul` as well, since while these functions are not
(yet) implemented as (generalized) Ufuncs, they are very similar. For
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

matmul is a potential gufunc, but dot is not: the result gets outer-product of axes not involved in the operation, instead of broadcasting them. So I would actually be happier if we could drop np.dot from dispatching to __array_ufunc__.

other functions, such as :func:`~numpy.median`, :func:`~numpy.min`,
etc., implementations as (generalized) Ufuncs may well be possible and
logical as well, in which case it will become possible to override these
as well.

Demo
====

A pull request[6]_ has been made including the changes proposed in this NEP.
Here is a demo highlighting the functionality.::
A pull request [8]_ has been made including the changes and revisions
proposed in this NEP. Here is a demo highlighting the functionality.::

In [1]: import numpy as np;

In [2]: a = np.array([1])

In [3]: class B():
...: def __numpy_ufunc__(self, func, method, pos, inputs, **kwargs):
...: def __array_ufunc__(self, func, method, pos, inputs, **kwargs):
...: return "B"
...:

Expand All @@ -274,24 +314,24 @@ Here is a demo highlighting the functionality.::
In [6]: np.multiply(a, b)
Out[6]: 'B'

A simple ``__numpy_ufunc__`` has been added to SciPy's sparse matrices
Currently this only handles ``np.dot`` and ``np.multiply`` because it was the
two most common cases where users would attempt to use sparse matrices with ufuncs.
The method is defined below::
As a simple example, one could add the following ``__array_ufunc__`` to
SciPy's sparse matrices (just for ``np.dot`` and ``np.multiply`` as
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we update the example to use np.matmul instead of np.dot?

Actually, I'm no longer sure this example makes sense, because it would be a backwards incompatible change for scipy.sparse, which already defines * as matrix multiplication like @, even for numpy arrays, e.g.,

In [20]: np.ones((3, 3)) @ scipy.sparse.csc_matrix(np.ones((3, 3)))
Out[20]:
array([[ 3.,  3.,  3.],
       [ 3.,  3.,  3.],
       [ 3.,  3.,  3.]])

Also, @ (which I think post-dates this NEP) already provides a generic interface for matrix multiplication, so the example is a good deal less compelling.

I guess the best we could do is show how np.multiply could do element-wise multiplication on sparse matrices? But even that would break backwards compatibility:

In [29]: x = scipy.sparse.csc_matrix(np.ones((3, 3)))

In [30]: np.multiply(x, x).A
Out[30]:
array([[ 3.,  3.,  3.],
       [ 3.,  3.,  3.],
       [ 3.,  3.,  3.]])

these are the two most common cases where users would attempt to use
sparse matrices with ufuncs)::

def __numpy_ufunc__(self, func, method, pos, inputs, **kwargs):
def __array_ufunc__(self, func, method, pos, inputs, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this implementation is correct. What if the caller provided the output array? That looks like it gets dropped. A bigger complication I think might be lurking is if out was provided and is some funky, nonstandard class it could potentially break anything. That would be another thing this method would need to check for, and would apply to almost any implementation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That documentation is incorrect. The signature is

    def __array_ufunc__(self, func, method, *inputs, **kwargs)

Out(s) always ends up in kwargs in a tuple. Currently, if no out arrays supplied then no out keyword. As for funky classes, I think we should require them to clean up their act ;)

Copy link
Member Author
@charris charris Apr 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note also that the arguments are "normalized" before the call to __array_ufunc__ so that only the input arrays are in inputs, everything else is put into kwargs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. However my point is that kwargs is never used. It is only part of the function definition. That's what I mean by "it gets dropped".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattharrigan - for this part, best to look at the new version of this NEP... charris#11 (I think I got all your other commens; thanks!).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mhvk I don't think you got my comment, sorry it wasn't clear enough. In your example implementation kwargs is part of the method signature but it is never used. I think kwargs must be passed on to np.multiply and np.dot. For instance, doubt this will work as desired: "np.multiply(asp, a, out=b)"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not looking carefully enough -- somehow I had not quite seen that you were commenting on the demo example -- I'm afraid this is left-over from the original PR, which is also why there still is this pos argument; another thing that should change here is that the example is with matmul, and that the whole setup follows "best practice" (which we still need to agree on...).

"""Method for compatibility with NumPy's ufuncs and dot
functions.
"""

without_self = list(inputs)
del without_self[pos]
without_self.pop(self)
without_self = tuple(without_self)

if func == np.multiply:
if func is np.multiply:
return self.multiply(*without_self)

elif func == np.dot:
elif func is np.dot:
if pos == 0:
return self.__mul__(inputs[1])
if pos == 1:
Expand Down
0