ENH: Add `__array_ufunc__` by charris · Pull Request #8247 · numpy/numpy · GitHub

ENH: Add __array_ufunc__ #8247

Merged
merged 43 commits into from
Apr 27, 2017
4fd7e84
ENH: Revert "Temporarily disable __numpy_ufunc__"
charris Nov 6, 2016
fcd11d2
ENH: Rename __numpy_ufunc__ to __array_ufunc__.
charris Nov 9, 2016
c7b25e2
ENH: Remove position arg from __array_ufunc__.
charris Nov 12, 2016
8a9e790
MAINT: Put PyArray_GetAttrString_SuppressException in get_attr_string.h
njsmith Jun 24, 2015
4dd5380
MAINT: dike out a bunch of weird old code implementing scalar power
njsmith Jun 24, 2015
7d9bc2f
BUG/ENH: Switch to simplified __array_ufunc__/binop interaction
njsmith Jun 22, 2015
e4b5163
MAINT: allow __array_ufunc__ = None to force binops to defer.
mhvk Mar 12, 2017
2e6d8c0
MAINT: Split out C code in ufunc_override.h to .c file.
mhvk Mar 15, 2017
d5c5ac1
MAINT: Add NPY_NO_EXPORT modifier to PyUFunc_CheckOverride.
charris Mar 16, 2017
3124e96
MAINT: for __array_ufunc__ pass inputs as *args, ensure out is tuple.
mhvk Mar 14, 2017
6a3ca31
DOC: describe current implementation of __array_ufunc__.
mhvk Mar 14, 2017
79bb733
DOC: Style and sphinx fixes for arrays.classes.rst.
charris Mar 23, 2017
7c3dc5a
TST: test that gufuncs are also overridden by __array_ufunc__.
mhvk Mar 25, 2017
71201d2
DOC: Describe __array_func__ in subclassing
mhvk Mar 15, 2017
3041710
ENH: implement ndarray.__array_ufunc__
mhvk Mar 13, 2017
5fe6fc6
DOC Update NEP to reflect actual implementation.
mhvk Mar 31, 2017
e092823
MAINT: let ndarray.__array_ufunc__ bail if any overrides are in place.
mhvk Apr 2, 2017
1147894
MAINT: Update array_ufunc NEP.
pv Apr 1, 2017
e325a10
DOC: Document behavior of ufuncs with default ndarray.__array_ufunc__
pv Apr 1, 2017
39c2273
DOC: Update ndarray.__array_ufunc__ documentation vs. review comments
pv Apr 2, 2017
6b41d11
DOC: clarify use of super and getattr
mhvk Apr 2, 2017
0ede0e9
DOC: update NEP again.
mhvk Apr 2, 2017
5f9252c
DOC: implement many smaller and bigger changes suggested in review.
mhvk Apr 4, 2017
8cc2f71
BUG,MAINT: ensure out=None is never passed on to __array_ufunc__.
mhvk Apr 4, 2017
856da73
DOC: remove left-over piece discussing binops
mhvk Apr 5, 2017
2b6c7fd
REVERT: remove __array_ufunc__ override for np.dot and ndarray.dot.
mhvk Apr 6, 2017
36e8494
REVERT: remove __array_ufunc__ override for np.matmul.
mhvk Apr 6, 2017
55500b9
MAINT: simplify now that __array_ufunc__ overrides ufuncs only.
mhvk Apr 6, 2017
25e973d
MAINT: split out umath-specific part of ufunc_override.
mhvk Apr 6, 2017
b1fa10a
BUG: ensure subclass of override class doesn't segfault.
mhvk Apr 8, 2017
1de8f5a
DOC: Mention `__array_ufunc__` in the 1.13.0 release notes.
charris Apr 8, 2017
a460015
DOC: ufunc-overrides: sync the discussion vs. current implementation
pv Apr 9, 2017
cd2e42c
DOC: ufunc-overrides: revise hierarchy discussion
pv Apr 9, 2017
ff628f1
BUG: Add back removed elision code.
charris Apr 9, 2017
1fc6e63
DOC,TST: clarify example of ndarray subclass using __array_ufunc__
mhvk Apr 10, 2017
a431743
BUG: Support nout == 0 and at method
eric-wieser Apr 12, 2017
1e460b7
DOC,MAINT: small corrections to NEP following Stephan's comments.
mhvk Apr 12, 2017
02600d3
ENH: Add NDArrayOperatorsMixin mixin class.
shoyer Apr 21, 2017
d3ff023
DOC: clarify recommendations for subclasses, deprecations.
mhvk Apr 21, 2017
b9359f1
MAINT: remove unnecessary checks, wrong code for 'outer', cleanup.
mhvk Apr 21, 2017
256a8ae
BUG: Fix ArrayLike(NDArrayOperatorsMixin) operations with object()
shoyer Apr 23, 2017
3272a86
ENH: Better error message for __array_ufunc__ not implemented
shoyer Apr 24, 2017
32221df
ENH: NDArrayOperatorsMixin calls ufuncs directly, like ndarray
shoyer Apr 27, 2017
MAINT: Update array_ufunc NEP.
Bring into compliance with current ndarray.__array_ufunc__
implementation and type casting hierarchy.
pv authored and charris committed Apr 27, 2017
commit 114789495535bf9243d2935977aed60fdc842203
177 changes: 158 additions & 19 deletions doc/neps/ufunc-overrides.rst
@@ -171,31 +171,161 @@ The function dispatch proceeds as follows:
If none of the input arguments has an ``__array_ufunc__`` method, the
execution falls back on the default ufunc behaviour.


Type casting hierarchy
----------------------

Similarly to the Python operator dispatch mechanism, writing ufunc
dispatch methods requires some discipline in order to achieve
predictable results.

In particular, it is useful to maintain a clear idea of what types can
be upcast to others, possibly indirectly (i.e. A->B->C is implemented
but direct A->C not). Moreover, one should make sure the implementations of
``__array_ufunc__``, which implicitly define the type casting hierarchy,
don't contradict this.

The following rules should be followed:

1. The ``__array_ufunc__`` for type A should either return
`NotImplemented`, or return an output of type A (unless an
``out=`` argument was given, in which case ``out`` is returned).

2. For any two different types *A*, *B*, the relation "A can handle B"
defined as::

    a.__array_ufunc__(..., b, ...) is not NotImplemented

for instances *a* and *b* of *A* and *B*, defines the
edges B->A of a graph.

This graph must be a directed acyclic graph.

Under these conditions, the transitive closure of the "can handle"
relation defines a strict partial ordering of the types -- that is, the
type casting hierarchy.

In other words, for any given class A, all other classes that define
``__array_ufunc__`` must belong to exactly one of the groups:

- *Above A*: their ``__array_ufunc__`` can handle class A or some
member of the "above A" classes. In other words, these are the types
that A can be (indirectly) upcast to in ufuncs.

- *Below A*: they can be handled by the ``__array_ufunc__`` of class A
or the ``__array_ufunc__`` of some member of the "below A" classes. In
other words, these are the types that can be (indirectly) upcast to A
in ufuncs.

- *Incompatible*: neither above nor below A; types for which no
(indirect) upcasting is possible.

This guarantees that expressions involving ufuncs either raise a
`TypeError`, or the result type is independent of what ufuncs were
called, what order they were called in, and what order their arguments
were in. Moreover, which ``__array_ufunc__`` payload code runs at each
step is independent of the order of arguments of the ufuncs.

Note also that while converting inputs that don't have
``__array_ufunc__`` to `ndarray` via `np.asarray` is consistent with the
type casting hierarchy, also returning `NotImplemented` is
consistent. However, the numpy ufunc (legacy) behavior is to try to
convert unknown objects to ndarrays.


.. admonition:: Example

Type casting hierarchy

.. graphviz::

digraph array_ufuncs {
rankdir=BT;
A -> C;
B -> C;
D -> B;
ndarray -> A;
ndarray -> B;
}

The ``__array_ufunc__`` of type A can handle ndarrays, B can handle ndarray and D,
and C can handle A and B but not ndarrays or D. The resulting graph is a DAG,
and defines a type casting hierarchy, with relations ``C > A >
ndarray``, ``C > B > ndarray``, ``C > B > D``. The type B is incompatible
relative to A and vice versa, and A and ndarray are incompatible relative to D.
Ufunc expressions involving these classes produce results of the highest type
involved or raise a TypeError.
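
The example graph can be checked mechanically. The following is a minimal sketch, assuming Python 3.9+ for the standard-library `graphlib`; the dictionary `handled_by` and the helpers `is_dag`, `above`, and `compatible` are illustrative names and not part of NumPy. It encodes the edges of the graph above, verifies that they form a DAG, and derives the "above" sets by transitive closure.

```python
from graphlib import CycleError, TopologicalSorter

# Edge X -> Y means "the __array_ufunc__ of Y can handle X"
# (edges copied from the example graph above).
handled_by = {
    "ndarray": {"A", "B"},
    "D": {"B"},
    "A": {"C"},
    "B": {"C"},
    "C": set(),
}


def is_dag(edges):
    """Return True if the 'can handle' graph has no cycles."""
    try:
        # Edge direction does not matter for cycle detection.
        tuple(TopologicalSorter(edges).static_order())
    except CycleError:
        return False
    return True


def above(edges, start):
    """Types that `start` can be (indirectly) upcast to."""
    seen = set()
    stack = list(edges.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, ()))
    return seen


def compatible(edges, x, y):
    """True if x and y are the same type or one is above the other."""
    return x == y or y in above(edges, x) or x in above(edges, y)
```

Adding a back edge such as `C -> ndarray` makes `is_dag` return `False`, which is the situation rule 2 is meant to exclude.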


Subclass hierarchies
--------------------

Hierarchies of such containers (say, a masked quantity), are most easily
constructed if methods consistently use :func:`super` to pass through
the class hierarchy [7]_. To support this, :class:`ndarray` has its own
``__array_ufunc__`` method (which is equivalent to ``getattr(ufunc,
method)(*inputs, **kwargs)``, i.e., if any of the (adjusted) inputs
still defines ``__array_ufunc__`` that will be called in turn). This
should be particularly useful for container-like subclasses of
:class:`ndarray`, which add an attribute like a unit or mask to a
regular :class:`ndarray`. Such classes can do possible adjustment of the
arguments relevant to their own class, pass on to another class in the
hierarchy using :func:`super` until the Ufunc is actually done, and then
do possible adjustments of the outputs.
Generally, it is desirable to mirror the class hierarchy in the ufunc
type casting hierarchy. The recommendation is that an
``__array_ufunc__`` implementation of a class should generally return
`NotImplemented` unless the inputs are instances of the same class or
superclasses. This guarantees that in the type casting hierarchy,
superclasses are below, subclasses above, and other classes are
incompatible. Exceptions to this need to check they respect the
implicit type casting hierarchy.

Subclasses can be easily constructed if methods consistently use
:func:`super` to pass through the class hierarchy [7]_. To support
this, :class:`ndarray` has its own ``__array_ufunc__`` method,
equivalent to::

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        out = kwargs.pop('out', None)
        out_tuple = out if out is not None else ()

        # Handle items of type(self), superclasses, and items
        # without __array_ufunc__. Bail out in other cases.
        items = []
        for item in inputs + out_tuple:
            if isinstance(self, type(item)) or not hasattr(item, '__array_ufunc__'):
                # Cast to plain ndarrays
                items.append(np.asarray(item))
            else:
                return NotImplemented

        # Perform ufunc on the underlying ndarrays (no __array_ufunc__ dispatch)
        result = getattr(ufunc, method)(*items, **kwargs)

        # Cast output to type(self), unless `out` specified
        if out is not None:
            return result

        if isinstance(result, tuple):
            return tuple(x.view(type(self)) for x in result)
        else:
            return result.view(type(self))

Note that, as a special case, the ufunc dispatch mechanism does not call
the `__array_ufunc__` method for inputs of `ndarray` type. As a
consequence, calling `ndarray.__array_ufunc__` will not result in a
nested ufunc dispatch cycle. Custom implementations of
`__array_ufunc__` should generally avoid nested dispatch cycles.

This should be particularly useful for subclasses of :class:`ndarray`,
which only add an attribute like a unit or mask to a regular
:class:`ndarray`. In their `__array_ufunc__` implementation, such
classes can do possible adjustment of the arguments relevant to their
own class, and pass on to superclass implementation using :func:`super`
until the ufunc is actually done, and then do possible adjustments of
the outputs.
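
The adjust, delegate, re-wrap pattern described above might look roughly as follows. This is a sketch only: the class `Tagged` and its `tag` attribute are hypothetical, and it assumes a NumPy release in which `ndarray.__array_ufunc__` exists (1.13 or later) and simply runs the ufunc on inputs that no longer carry an override.

```python
import numpy as np


class Tagged(np.ndarray):
    """Hypothetical ndarray subclass that only adds a `tag` attribute."""

    def __new__(cls, data, tag=None):
        obj = np.asarray(data).view(cls)
        obj.tag = tag
        return obj

    def __array_finalize__(self, obj):
        self.tag = getattr(obj, "tag", None)

    def __array_ufunc__(self, ufunc, method, *inputs, out=None, **kwargs):
        # Adjust arguments: hand plain ndarray views to the superclass.
        args = [x.view(np.ndarray) if isinstance(x, Tagged) else x for x in inputs]
        if out is not None:
            kwargs["out"] = tuple(
                o.view(np.ndarray) if isinstance(o, Tagged) else o for o in out
            )

        result = super().__array_ufunc__(ufunc, method, *args, **kwargs)
        if result is NotImplemented:
            return NotImplemented

        # Adjust outputs: re-wrap arrays and restore the tag.
        # Simplification: with `out=`, this returns a new view of the output
        # buffer rather than the original `out` object.
        def wrap(x):
            if isinstance(x, np.ndarray):
                x = x.view(Tagged)
                x.tag = self.tag
            return x

        if isinstance(result, tuple):
            return tuple(wrap(x) for x in result)
        return wrap(result)


a = Tagged([1.5, 2.5], tag="metres")
total = np.add(a, 1)        # single output
frac, whole = np.modf(a)    # two outputs, both re-wrapped
```

Because the inputs are viewed as plain `ndarray` before calling `super()`, the superclass call does not dispatch back into `Tagged.__array_ufunc__`.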

Turning Ufuncs off
------------------

For some classes, Ufuncs make no sense, and, like for other special
methods [8]_, one can indicate Ufuncs are not available by setting
Member

This suggests that this works similarly to special methods like __add__, but that isn't the case:

In [5]: class LHS:
   ...:     __add__ = None
   ...:

In [6]: LHS() + 1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-f23b802d7362> in <module>()
----> 1 LHS() + 1

TypeError: 'NoneType' object is not callable

(Instead, you're not supposed to write __add__ at all if your class doesn't know how to be added.)

So let's qualify "for other special methods" as "for some other special methods" and add an explanation of why it's necessary to define __array_ufunc__ = None instead of simply not writing an __array_ufunc__ method. You need to define __array_ufunc__ = None if you also define arithmetic methods like __add__ and want to stop NumPy from treating your class as a scalar and automatically vectorizing arithmetic operations over each element of the array.
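
To make the difference concrete, here is a small sketch of the two cases, assuming the binop behaviour this PR settles on (as shipped in NumPy 1.13 and later). The class names `TreatedAsScalar` and `OptsOut` are made up for illustration.

```python
import numpy as np


class TreatedAsScalar:
    """Hypothetical class: defines __rmul__ but no __array_ufunc__."""

    def __rmul__(self, other):
        return "rmul({})".format(other)


class OptsOut:
    """Hypothetical class: same idea, but opts out of ufuncs."""

    __array_ufunc__ = None

    def __rmul__(self, other):
        return "rmul({})".format(type(other).__name__)


arr = np.arange(3)

# No __array_ufunc__: ndarray wraps the object as a 0-d object array and
# multiplies element by element, so __rmul__ runs once per element.
elementwise = arr * TreatedAsScalar()

# __array_ufunc__ = None: ndarray.__mul__ returns NotImplemented, so
# Python calls __rmul__ once with the whole array.
whole_array = arr * OptsOut()
```

In the first case the result is an object array of three strings; in the second it is whatever `__rmul__` returns for the array as a whole.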

``__array_ufunc__`` to :obj:`None`. Inside a Ufunc, this is
equivalent to unconditionally return :obj:`NotImplemented`, and thus
equivalent to unconditionally returning :obj:`NotImplemented`, and thus
will lead to a :exc:`TypeError` (unless another operand implements
``__array_ufunc__`` and knows how to deal with the class).
``__array_ufunc__`` and specifically knows how to deal with the class).

In the type casting hierarchy, this makes the type incompatible relative
to `ndarray`.
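
A minimal sketch of the opt-out, assuming NumPy 1.13 or later; `NoUfuncs` is a hypothetical class name. Calling a ufunc directly with such an object and a plain array is expected to raise :exc:`TypeError`.

```python
import numpy as np


class NoUfuncs:
    """Hypothetical class for which ufuncs make no sense."""

    __array_ufunc__ = None


try:
    np.multiply(np.arange(3), NoUfuncs())
except TypeError:
    outcome = "TypeError"
else:
    outcome = "no error"
```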

.. [7] https://rhettinger.wordpress.com/2011/05/26/super-considered-super/

@@ -217,10 +347,11 @@ binary operators in terms of Ufuncs. Here, one has to take some care.
E.g., the simplest implementation would be::

    class ArrayLike(object):
        ...
        def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
            ...
            return result
        ...

        def __mul__(self, other):
            return self.__array_ufunc__(np.multiply, '__call__', self, other)
Member

We should also mention the other obvious implementation, which calls the ufunc directly instead of __array_ufunc__:

class ArrayLike(object):
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        ...
    def __mul__(self, other):
        return np.multiply(self, other)

This has similar issues handling "opt-out" classes, but there's also a fix:

# this lookup table should probably be distributed with numpy
OPERATOR_LOOKUP = {np.multiply: operator.mul, ...}

class ArrayLike(object):
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        func = getattr(ufunc, method)
        if method == '__call__' and 'out' not in kwargs:
            func = OPERATOR_LOOKUP.get(ufunc, func)
        # avoid infinite recursion
        inputs = [np.asarray(x) if x is self else x for x in inputs]
        return func(*inputs)

I like this approach better because it doesn't involve calling private methods, which entails the need to reimplement some of the rules of __array_ufunc__ in your own methods. But I think both can be made to work.

From an API perspective, the difference is that np.multiply(array_like, opt_out) is well defined (equivalent to array_like * opt_out) instead of raising a TypeError, even though opt_out.__array_ufunc__ is None. But I'm actually OK with either option (even for numpy.ndarray): things only get terribly complex if it's possible for np.multiply and * to do different things.
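
For reference, the "call the ufunc directly" style is what the `NDArrayOperatorsMixin` class added later in this PR (commit 02600d3) provides. Below is a sketch of how a wrapper might use it, assuming the mixin is importable from `numpy.lib.mixins`; the `value` attribute and `_HANDLED_TYPES` tuple are illustrative choices, not requirements.

```python
import numbers

import numpy as np
from numpy.lib.mixins import NDArrayOperatorsMixin


class ArrayLike(NDArrayOperatorsMixin):
    """Illustrative wrapper; `value` and `_HANDLED_TYPES` are assumed names."""

    _HANDLED_TYPES = (np.ndarray, numbers.Number)

    def __init__(self, value):
        self.value = np.asarray(value)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        out = kwargs.get("out", ())
        for x in inputs + out:
            # Only handle types we know about; defer otherwise.
            if not isinstance(x, self._HANDLED_TYPES + (ArrayLike,)):
                return NotImplemented
        # Unwrap ArrayLike operands so the ufunc runs on plain arrays.
        inputs = tuple(x.value if isinstance(x, ArrayLike) else x for x in inputs)
        if out:
            kwargs["out"] = tuple(
                x.value if isinstance(x, ArrayLike) else x for x in out
            )
        result = getattr(ufunc, method)(*inputs, **kwargs)
        if type(result) is tuple:
            return tuple(type(self)(x) for x in result)
        if method == "at":
            return None
        return type(self)(result)


a = ArrayLike([1, 2, 3])
doubled = a * 2                # __mul__ from the mixin calls np.multiply(a, 2)
reflected = 2 * a              # __rmul__ from the mixin calls np.multiply(2, a)
via_ufunc = np.multiply(a, 2)  # same code path as the operator
```

With this arrangement `*` and `np.multiply` go through the same `__array_ufunc__`, so the two cannot drift apart for operands the class handles.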

Member

At this point it's useful to remember Sage matrices: not only * but also + works differently, as scalars are promoted to diagonal matrices. Presumably, np.add and np.multiply should retain the Numpy definition or not be defined at all if operating on them.

Member
@shoyer shoyer Apr 2, 2017

So maybe def __mul__(self, other): return np.multiply(self, other) is not the right solution. That's OK, but we should mention it anyways (and why it's wrong).

Member

Calling self.__array_ufunc__ doesn't necessarily seem correct to me, because out is not populated. Is the expectation that out is always available in __array_ufunc__ even if it was implicit to the ufunc call?

Member

We should clarify whether numpy will always call __array_ufunc__ with out or not (with a value of None to indicate no pre-allocated output).

Even if we do always provide out, whether we need to include out in __array_ufunc__ depends on how we wrote our __array_ufunc__ method, e.g., if it looks like

def __array_ufunc__(self, ufunc, method, *inputs, out, **kwargs):   # python 3 only
     ...
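
One way to see what actually gets passed is to record the arguments. This sketch assumes the behaviour the PR ends up with (commits 3124e96 and 8cc2f71: inputs passed as `*args`, `out` always a tuple and never `None`); `Recorder` is a hypothetical helper class.

```python
import numpy as np


class Recorder:
    """Hypothetical class that logs how __array_ufunc__ is called."""

    def __init__(self):
        self.calls = []

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        self.calls.append((inputs, kwargs))
        return "handled"


rec = Recorder()
buf = np.empty(3)

np.add(rec, 1)                            # no out given
np.add(rec, np.arange(3.0), out=buf)      # out as keyword
np.add(rec, np.arange(3.0), buf)          # out as third positional argument
```

If that holds, a keyword-only `out` parameter needs a default (for example `out=None`), since the key is absent when no output was supplied.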

Member
@pv pv Apr 2, 2017

Also, return self.__array_ufunc__(...) vs. if getattr(other, '__array_ufunc__', False) is None: return NotImplemented; else return np.multiply(self, other) in __mul__ correspond to different opt-out approaches in the previous discussion. The former is the laissez-faire "option 3" that allows defining __array_ufunc__ while doing opt-out in binops, whereas the latter does not allow opt-out and __array_ufunc__ simultaneously ("option 2", IIRC).

What to recommend for subclasses in the NEP should follow what ndarray itself does --- IIUC, this PR decided to do "option 2" since there's discussion of the None special value.

The other options should be discussed in a separate section "Other proposals considered", and only give one way to do it as an example...
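
Spelled out by hand, the second variant might look like the sketch below. `Wrapped` and `OptOut` are made-up classes, and the `__array_ufunc__` body is deliberately restricted to plain calls without keyword arguments to keep it short.

```python
import numpy as np


class Wrapped:
    """Hypothetical array wrapper with an explicit deferring __mul__."""

    def __init__(self, value):
        self.value = np.asarray(value)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Keep the sketch small: plain calls without keyword arguments only.
        if method != "__call__" or kwargs:
            return NotImplemented
        args = [x.value if isinstance(x, Wrapped) else x for x in inputs]
        return Wrapped(ufunc(*args))

    def __mul__(self, other):
        # Defer to operands that have opted out of ufuncs.
        if getattr(other, "__array_ufunc__", False) is None:
            return NotImplemented
        return np.multiply(self, other)


class OptOut:
    """Hypothetical class that opts out of ufuncs but supports `*`."""

    __array_ufunc__ = None

    def __rmul__(self, other):
        return "OptOut.__rmul__"


product = Wrapped([1, 2]) * 3          # goes through np.multiply
deferred = Wrapped([1, 2]) * OptOut()  # falls through to OptOut.__rmul__
```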


@@ -229,20 +360,28 @@ deal with arrays and ufuncs, but does know how to do multiplication::

class MyObject(object):
__array_ufunc__ = None
def __init__(self, value):
self.value = value
def __repr__(self):
return "MyObject({!r})".format(self.value)
def __mul__(self, other):
return 1234
return MyObject(1234)
def __rmul__(self, other):
return 4321
return MyObject(4321)

In this case, standard Python override rules combined with the above
discussion would imply::

mine = MyObject()
mine = MyObject(0)
arr = ArrayLike([0])

mine * arr # == 1234 OK
mine * arr # == MyObject(1234) OK
arr * mine # TypeError surprising

XXX: but it doesn't raise a TypeError, because `__mul__` calls
directly `__array_ufunc__`, which sees the `__array_ufunc__ == None`, and
bails out with `NotImplemented`?

The reason why this would occur is: because ``MyObject`` is not an
``ArrayLike`` subclass, Python resolves the expression ``arr * mine`` by
calling first ``arr.__mul__``. In the above implementation, this would
1 change: 1 addition & 0 deletions doc/source/conf.py
@@ -22,6 +22,7 @@
extensions = ['sphinx.ext.autodoc', 'sphinx.ext.pngmath', 'numpydoc',
'sphinx.ext.intersphinx', 'sphinx.ext.coverage',
'sphinx.ext.doctest', 'sphinx.ext.autosummary',
'sphinx.ext.graphviz',
Member

What do we use this for within this patch?

Contributor

The graphs depicting type hierarchy in ufunc-overrides.rst; e.g., https://github.com/numpy/numpy/pull/8247/files#diff-6aa7a114acf37c97c40cba1b3fe76900R261

'matplotlib.sphinxext.plot_directive']

# Add any paths that contain templates here, relative to this directory.