Merge pull request #13624 from shoyer/nep-18-revert · numpy/numpy@d222a70 · GitHub

Commit d222a70

Merge pull request #13624 from shoyer/nep-18-revert
DOC: revert __skip_array_function__ from NEP-18
2 parents 95aacf5 + d4214b9 commit d222a70

1 file changed

+88
-172
lines changed

doc/neps/nep-0018-array-function-protocol.rst

Lines changed: 88 additions & 172 deletions
@@ -10,7 +10,7 @@ NEP 18 — A dispatch mechanism for NumPy's high level array functions
1010
:Status: Provisional
1111
:Type: Standards Track
1212
:Created: 2018-05-29
13-
:Updated: 2019-04-11
13+
:Updated: 2019-05-25
1414
:Resolution: https://mail.python.org/pipermail/numpy-discussion/2018-August/078493.html
1515

1616
Abstract
@@ -98,12 +98,15 @@ A prototype implementation can be found in
9898

9999
.. note::
100100

101-
Dispatch with the ``__array_function__`` protocol has been implemented on
102-
NumPy's master branch but is not yet enabled by default. In NumPy 1.16,
103-
you will need to set the environment variable
104-
``NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1`` before importing NumPy to test
105-
NumPy function overrides. We anticipate the protocol will be enabled by
106-
default in NumPy 1.17.
101+
Dispatch with the ``__array_function__`` protocol has been implemented but is
102+
not yet enabled by default:
103+
104+
- In NumPy 1.16, you need to set the environment variable
105+
``NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1`` before importing NumPy to test
106+
NumPy function overrides.
107+
- In NumPy 1.17, the protocol will be enabled by default, but can be disabled
108+
with ``NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=0``.
109+
- Eventually, expect ``__array_function__`` to always be enabled.
107110

108111
The interface
109112
~~~~~~~~~~~~~
@@ -208,75 +211,6 @@ were explicitly used in the NumPy function call.
208211
be impossible to correctly override NumPy functions from another object
209212
if the operation also includes one of your objects.
210213

211-
Avoiding nested ``__array_function__`` overrides
212-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
213-
214-
The special ``__skip_array_function__`` attribute found on NumPy functions that
215-
support overrides with ``__array_function__`` allows for calling these
216-
functions without any override checks.
217-
218-
``__skip_array_function__`` always points back to the original NumPy-array
219-
specific implementation of a function. These functions do not check for
220-
``__array_function__`` overrides, and instead usually coerce all of their
221-
array-like arguments to NumPy arrays.
222-
223-
.. note::
224-
225-
``__skip_array_function__`` was not included as part of the initial
226-
opt-in-only preview of ``__array_function__`` in NumPy 1.16.
227-
228-
Defaulting to NumPy's coercive implementations
229-
''''''''''''''''''''''''''''''''''''''''''''''
230-
231-
Some projects may prefer to default to NumPy's implementation, rather than
232-
explicitly implementing a supported API. This allows for incrementally
233-
overriding NumPy's API in projects that already support it implicitly by
234-
allowing their objects to be converted into NumPy arrays (e.g., because they
235-
implemented special methods such as ``__array__``). We don't recommend this
236-
for most new projects ("Explicit is better than implicit"), but in some cases
237-
it is the most expedient option.
238-
239-
Adapting the previous example:
240-
241-
.. code:: python
242-
243-
class MyArray:
244-
def __array_function__(self, func, types, args, kwargs):
245-
# It is still best practice to defer to unrecognized types
246-
if not all(issubclass(t, (MyArray, np.ndarray)) for t in types):
247-
return NotImplemented
248-
249-
my_func = HANDLED_FUNCTIONS.get(func)
250-
if my_func is None:
251-
return func.__skip_array_function__(*args, **kwargs)
252-
return my_func(*args, **kwargs)
253-
254-
def __array__(self, dtype):
255-
# convert this object into a NumPy array
256-
257-
Now, if a NumPy function that isn't explicitly handled is called on
258-
``MyArray`` object, the operation will act (almost) as if MyArray's
259-
``__array_function__`` method never existed.
260-
261-
Explicitly reusing NumPy's implementation
262-
'''''''''''''''''''''''''''''''''''''''''
263-
264-
``__skip_array_function__`` is also convenient for cases where an explicit
265-
set of NumPy functions should still use NumPy's implementation, by
266-
calling ``func.__skip_array_function__(*args, **kwargs)`` inside
267-
``__array_function__`` instead of ``func(*args, **kwargs)`` (which would
268-
lead to infinite recursion). For example, to explicitly reuse NumPy's
269-
``array_repr()`` function on a custom array type:
270-
271-
.. code:: python
272-
273-
class MyArray:
274-
def __array_function__(self, func, types, args, kwargs):
275-
...
276-
if func is np.array_repr:
277-
return np.array_repr.__skip_array_function__(*args, **kwargs)
278-
...
279-
280214
Necessary changes within the NumPy codebase itself
281215
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
282216

@@ -400,20 +334,18 @@ The ``__array_function__`` method on ``numpy.ndarray``
400334

401335
The use cases for subclasses with ``__array_function__`` are the same as those
402336
with ``__array_ufunc__``, so ``numpy.ndarray`` also defines a
403-
``__array_function__`` method.
404-
405-
``ndarray.__array_function__`` is a trivial case of the "Defaulting to NumPy's
406-
implementation" strategy described above: *every* NumPy function on NumPy
407-
arrays is defined by calling NumPy's own implementation if there are no other
408-
overrides:
337+
``__array_function__`` method:
409338

410339
.. code:: python
411340
412341
    def __array_function__(self, func, types, args, kwargs):
413342
        if not all(issubclass(t, ndarray) for t in types):
414343
            # Defer to any non-subclasses that implement __array_function__
415344
            return NotImplemented
416-
        return func.__skip_array_function__(*args, **kwargs)
345+
346+
        # Use NumPy's private implementation without __array_function__
347+
        # dispatching
348+
        return func._implementation(*args, **kwargs)
417349
418350
This method matches NumPy's dispatching rules, so for the most part it is
419351
possible to pretend that ``ndarray.__array_function__`` does not exist.
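
Reviewer note on the hunk above: the following is a minimal, pure-Python sketch of how a private `_implementation` attribute lets an `__array_function__` method reach the non-dispatching code path without recursing. It is not NumPy's actual code. The decorator here is a simplified stand-in (it ignores subclass ordering and NumPy's real error messages), and `total`, `_total_dispatcher`, and `Wrapped` are hypothetical names invented for illustration.

```python
import functools


def array_function_dispatch(dispatcher, module=None):
    """Toy stand-in for NumPy's decorator; NOT NumPy's actual code."""
    def decorator(implementation):
        @functools.wraps(implementation)
        def public_api(*args, **kwargs):
            # Keep only the arguments whose type defines __array_function__.
            overloaded = [arg for arg in dispatcher(*args, **kwargs)
                          if hasattr(type(arg), "__array_function__")]
            if not overloaded:
                # Nothing to dispatch to: run the plain implementation.
                return implementation(*args, **kwargs)
            # Unique types, in first-seen order.
            types = tuple(dict.fromkeys(type(arg) for arg in overloaded))
            for arg in overloaded:
                result = arg.__array_function__(public_api, types, args, kwargs)
                if result is not NotImplemented:
                    return result
            raise TypeError(
                f"no implementation found for {public_api.__name__!r}")
        if module is not None:
            public_api.__module__ = module
        # Private attribute, named as in the diff, used to bypass dispatch.
        public_api._implementation = implementation
        return public_api
    return decorator


def _total_dispatcher(values, scale=None):
    # Only ``values`` is checked for overrides.
    return (values,)


@array_function_dispatch(_total_dispatcher, module="toy")
def total(values, scale=1):
    """Sum ``values`` and multiply by ``scale`` (hypothetical function)."""
    return scale * sum(values)


class Wrapped:
    """Hypothetical array-like that defers to the non-dispatching path."""
    def __init__(self, data):
        self.data = list(data)

    def __iter__(self):
        return iter(self.data)

    def __array_function__(self, func, types, args, kwargs):
        if not all(issubclass(t, Wrapped) for t in types):
            return NotImplemented
        # Calling func(*args, **kwargs) here would re-enter dispatch forever.
        return func._implementation(*args, **kwargs)


print(total([1, 2, 3]))
print(total(Wrapped([1, 2, 3]), scale=2))
```

Because `_implementation` is private, code outside the defining library should probably not rely on it; the sketch only shows why the attribute makes a trivial `__array_function__` possible.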
@@ -427,9 +359,9 @@ returns ``NotImplemented``, NumPy's implementation of the function will be
427359
called instead of raising an exception. This is appropriate since subclasses
428360
are `expected to be substitutable <https://en.wikipedia.org/wiki/Liskov_substitution_principle>`_.
429361

430-
Notice that the ``__skip_array_function__`` function attribute allows us
431-
to avoid the special cases for NumPy arrays that were needed in the
432-
``__array_ufunc__`` protocol.
362+
Note that the private ``_implementation`` attribute, defined below in the
363+
``array_function_dispatch`` decorator, allows us to avoid the special cases for
364+
NumPy arrays that were needed in the ``__array_ufunc__`` protocol.
433365

434366
Changes within NumPy functions
435367
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -441,9 +373,8 @@ but of fairly simple and innocuous code that should complete quickly and
441373
without effect if no arguments implement the ``__array_function__``
442374
protocol.
443375

444-
In most cases, these functions should be written using the
445-
``array_function_dispatch`` decorator. Error checking aside, here's what the
446-
core implementation looks like:
376+
To achieve this, we define an ``array_function_dispatch`` decorator to rewrite
377+
NumPy functions. The basic implementation is as follows:
447378

448379
.. code:: python
449380
@@ -457,25 +388,27 @@ core implementation looks like:
457388
                    implementation, public_api, relevant_args, args, kwargs)
458389
            if module is not None:
459390
                public_api.__module__ = module
460-
            public_api.__skip_array_function__ = implementation
391+
            # for ndarray.__array_function__
392+
            public_api._implementation = implementation
461393
            return public_api
462394
        return decorator
463395
464396
    # example usage
465-
    def broadcast_to(array, shape, subok=None):
397+
    def _broadcast_to_dispatcher(array, shape, subok=None):
466398
        return (array,)
467399
468-
    @array_function_dispatch(broadcast_to, module='numpy')
400+
    @array_function_dispatch(_broadcast_to_dispatcher, module='numpy')
469401
    def broadcast_to(array, shape, subok=False):
470402
        ...  # existing definition of np.broadcast_to
471403
472404
Using a decorator is great! We don't need to change the definitions of
473405
existing NumPy functions, and only need to write a few additional lines
474-
to define dispatcher function. We originally thought that we might want to
475-
implement dispatching for some NumPy functions without the decorator, but
476-
so far it seems to cover every case.
406+
for the dispatcher function. We could even reuse a single dispatcher for
407+
families of functions with the same signature (e.g., ``sum`` and ``prod``).
408+
For such functions, the largest change could be adding a few lines to the
409+
docstring to note which arguments are checked for overloads.
477410

478-
Within NumPy's implementation, it's worth calling out the decorator's use of
411+
It's particularly worth calling out the decorator's use of
479412
``functools.wraps``:
480413

481414
- This ensures that the wrapped function has the same name and docstring as
@@ -489,14 +422,6 @@ Within NumPy's implementation, it's worth calling out the decorator's use of
489422
The example usage illustrates several best practices for writing dispatchers
490423
relevant to NumPy contributors:
491424

492-
- We gave the "dispatcher" function ``broadcast_to`` the exact same name and
493-
arguments as the "implementation" function. The matching arguments are
494-
required, because the function generated by ``array_function_dispatch`` will
495-
call the dispatcher in *exactly* the same way as it was called. The matching
496-
function name isn't strictly necessary, but ensures that Python reports the
497-
original function name in error messages if invalid arguments are used, e.g.,
498-
``TypeError: broadcast_to() got an unexpected keyword argument``.
499-
500425
- We passed the ``module`` argument, which in turn sets the ``__module__``
501426
attribute on the generated function. This is for the benefit of better error
502427
messages, here for errors raised internally by NumPy when no implementation
@@ -600,36 +525,6 @@ concerned about performance differences measured in microsecond(s) on NumPy
600525
functions, because it's difficult to do *anything* in Python in less than a
601526
microsecond.
602527

603-
For rare cases where NumPy functions are called in performance critical inner
604-
loops on small arrays or scalars, it is possible to avoid the overhead of
605-
dispatching by calling the versions of NumPy functions skipping
606-
``__array_function__`` checks available in the ``__skip_array_function__``
607-
attribute. For example:
608-
609-
.. code:: python
610-
611-
dot = getattr(np.dot, '__skip_array_function__', np.dot)
612-
613-
def naive_matrix_power(x, n):
614-
x = np.array(x)
615-
for _ in range(n):
616-
dot(x, x, out=x)
617-
return x
618-
619-
NumPy will use this internally to minimize overhead for NumPy functions
620-
defined in terms of other NumPy functions, but
621-
**we do not recommend it for most users**:
622-
623-
- The specific implementation of overrides is still provisional, so the
624-
``__skip_array_function__`` attribute on particular functions could be
625-
removed in any NumPy release without warning.
626-
For this reason, access to ``__skip_array_function__`` attribute outside of
627-
``__array_function__`` methods should *always* be guarded by using
628-
``getattr()`` with a default value.
629-
- In cases where this makes a difference, you will get far greater speed-ups
630-
rewriting your inner loops in a compiled language, e.g., with Cython or
631-
Numba.
632-
633528
Use outside of NumPy
634529
~~~~~~~~~~~~~~~~~~~~
635530

@@ -809,48 +704,60 @@ nearly every public function in NumPy's API. This does not preclude the future
809704
possibility of rewriting NumPy functions in terms of simplified core
810705
functionality with ``__array_function__`` and a protocol and/or base class for
811706
ensuring that arrays expose methods and properties like ``numpy.ndarray``.
707+
However, to work well this would require the possibility of implementing
708+
*some* but not all functions with ``__array_function__``, e.g., as described
709+
in the next section.
812710

813-
Coercion to a NumPy array as a catch-all fallback
814-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
711+
Partial implementation of NumPy's API
712+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
815713

816714
With the current design, classes that implement ``__array_function__``
817-
to overload at least one function can opt-out of overriding other functions
818-
by using the ``__skip_array_function__`` function, as described above under
819-
"Defaulting to NumPy's implementation."
820-
821-
However, this still results in different behavior than not implementing
822-
``__array_function__`` in at least one edge case. If multiple objects implement
823-
``__array_function__`` but don't know about each other NumPy will raise
824-
``TypeError`` if all methods return ``NotImplemented``, whereas if no arguments
825-
defined ``__array_function__`` methods it would attempt to coerce all of them
826-
to NumPy arrays.
827-
828-
Alternatively, this could be "fixed" by writing a ``__array_function__``
829-
method that always calls ``__skip_array_function__()`` instead of returning
830-
``NotImplemented`` for some functions, but that would result in a type
831-
whose implementation cannot be overridden by other arguments -- like NumPy
832-
arrays themselves prior to the introduction of this protocol.
833-
834-
Either way, it is not possible to *exactly* maintain the current behavior of
835-
all NumPy functions if at least one more function is overridden. If preserving
836-
this behavior is important, we could potentially solve it by changing the
837-
handling of return values in ``__array_function__`` in either of two ways:
838-
839-
1. Change the meaning of all arguments returning ``NotImplemented`` to indicate
840-
that all arguments should be coerced to NumPy arrays and the operation
841-
should be retried. However, many array libraries (e.g., scipy.sparse) really
842-
don't want implicit conversions to NumPy arrays, and often avoid implementing
843-
``__array__`` for exactly this reason. Implicit conversions can result in
844-
silent bugs and performance degradation.
715+
to overload at least one function implicitly declare an intent to
716+
implement the entire NumPy API. It's not possible to implement *only*
717+
``np.concatenate()`` on a type, but fall back to NumPy's default
718+
behavior of casting with ``np.asarray()`` for all other functions.
719+
720+
This could present a backwards compatibility concern that would
721+
discourage libraries from adopting ``__array_function__`` in an
722+
incremental fashion. For example, currently most numpy functions will
723+
implicitly convert ``pandas.Series`` objects into NumPy arrays, behavior
724+
that assuredly many pandas users rely on. If pandas implemented
725+
``__array_function__`` only for ``np.concatenate``, unrelated NumPy
726+
functions like ``np.nanmean`` would suddenly break on pandas objects by
727+
raising TypeError.
728+
729+
Even libraries that reimplement most of NumPy's public API sometimes rely upon
730+
using utility functions from NumPy without a wrapper. For example, both CuPy
731+
and JAX simply `use an alias <https://github.com/numpy/numpy/issues/12974>`_ to
732+
``np.result_type``, which already supports duck-types with a ``dtype``
733+
attribute.
734+
735+
With ``__array_ufunc__``, it's possible to alleviate this concern by
736+
casting all arguments to numpy arrays and re-calling the ufunc, but the
737+
heterogeneous function signatures supported by ``__array_function__``
738+
make it impossible to implement this generic fallback behavior for
739+
``__array_function__``.
740+
741+
We considered three possible ways to resolve this issue, but none were
742+
entirely satisfactory:
743+
744+
1. Change the meaning of all arguments returning ``NotImplemented`` from
745+
``__array_function__`` to indicate that all arguments should be coerced to
746+
NumPy arrays and the operation should be retried. However, many array
747+
libraries (e.g., scipy.sparse) really don't want implicit conversions to
748+
NumPy arrays, and often avoid implementing ``__array__`` for exactly this
749+
reason. Implicit conversions can result in silent bugs and performance
750+
degradation.
845751

846752
Potentially, we could enable this behavior only for types that implement
847753
``__array__``, which would resolve the most problematic cases like
848754
scipy.sparse. But in practice, a large fraction of classes that present a
849755
high level API like NumPy arrays already implement ``__array__``. This would
850756
preclude reliable use of NumPy's high level API on these objects.
757+
851758
2. Use another sentinel value of some sort, e.g.,
852-
``np.NotImplementedButCoercible``, to indicate that a class implementing part
853-
of NumPy's higher level array API is coercible as a fallback. If all
759+
``np.NotImplementedButCoercible``, to indicate that a class implementing
760+
part of NumPy's higher level array API is coercible as a fallback. If all
854761
arguments return ``NotImplementedButCoercible``, arguments would be coerced
855762
and the operation would be retried.
856763

@@ -863,10 +770,20 @@ handling of return values in ``__array_function__`` in either of two ways:
863770
logic an arbitrary number of times. Either way, the dispatching rules would
864771
definitely get more complex and harder to reason about.
865772

866-
At present, neither of these alternatives looks like a good idea. Reusing
867-
``__skip_array_function__()`` looks like it should suffice for most purposes.
868-
Arguably this loss in flexibility is a virtue: fallback implementations often
869-
result in unpredictable and undesired behavior.
773+
3. Allow access to NumPy's implementation of functions, e.g., in the form of
774+
a publicly exposed ``__skip_array_function__`` attribute on the NumPy
775+
functions. This would allow for falling back to NumPy's implementation by
776+
using ``func.__skip_array_function__`` inside ``__array_function__``
777+
methods, and could also potentially be used to avoid the
778+
overhead of dispatching. However, it runs the risk of potentially exposing
779+
details of NumPy's implementations for NumPy functions that do not call
780+
``np.asarray()`` internally. See
781+
`this note <https://mail.python.org/pipermail/numpy-discussion/2019-May/079541.html>`_
782+
for a summary of the full discussion.
783+
784+
These solutions would solve real use cases, but at the cost of additional
785+
complexity. We would like to gain experience with how ``__array_function__`` is
786+
actually used before making decisions that would be difficult to roll back.
870787

871788
A magic decorator that inspects type annotations
872789
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -965,8 +882,7 @@ There are two other arguments that we think *might* be important to pass to
965882
- Access to the non-dispatched implementation (i.e., before wrapping with
966883
``array_function_dispatch``) in ``ndarray.__array_function__`` would allow
967884
us to drop special case logic for that method from
968-
``implement_array_function``. *Update: This has been implemented, as the
969-
``__skip_array_function__`` attributes.*
885+
``implement_array_function``.
970886
- Access to the ``dispatcher`` function passed into
971887
``array_function_dispatch()`` would allow ``__array_function__``
972888
implementations to determine the list of "array-like" arguments in a generic
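
Reviewer note on the "Partial implementation of NumPy's API" section added in this diff: the sketch below illustrates the concern in pure Python, without importing NumPy. It assumes a toy dispatch helper (`_dispatch`) and toy functions (`concatenate`, `mean`) that only mimic the protocol's rules, and `Series` and `PlainSeries` are hypothetical classes. It shows that a type overriding a single function stops being coerced for every other function.

```python
HANDLED_FUNCTIONS = {}


def _dispatch(func, relevant_args, coercive_impl, *args):
    """Toy version of the override check; not NumPy's implementation."""
    overloaded = [arg for arg in relevant_args
                  if hasattr(type(arg), "__array_function__")]
    if not overloaded:
        # No overrides present: fall back to coercing the inputs.
        return coercive_impl(*args)
    types = tuple(dict.fromkeys(type(arg) for arg in overloaded))
    for arg in overloaded:
        result = arg.__array_function__(func, types, args, {})
        if result is not NotImplemented:
            return result
    # Every override declined, so there is no fallback to coercion.
    raise TypeError(f"no implementation found for {func.__name__!r}")


def _concatenate_impl(arrays):
    return [item for array in arrays for item in array]


def _mean_impl(values):
    data = list(values)
    return sum(data) / len(data)


def concatenate(arrays):
    return _dispatch(concatenate, arrays, _concatenate_impl, arrays)


def mean(values):
    return _dispatch(mean, (values,), _mean_impl, values)


class PlainSeries:
    """Hypothetical container with no __array_function__: always coerced."""
    def __init__(self, values):
        self.values = list(values)

    def __iter__(self):
        return iter(self.values)


class Series(PlainSeries):
    """Same container, but overriding only ``concatenate``."""
    def __array_function__(self, func, types, args, kwargs):
        if func not in HANDLED_FUNCTIONS:
            return NotImplemented
        return HANDLED_FUNCTIONS[func](*args, **kwargs)


# Register the single override that Series provides.
HANDLED_FUNCTIONS[concatenate] = lambda arrays: Series(
    item for array in arrays for item in array)

print(concatenate([Series([1, 2]), Series([3])]).values)
print(mean(PlainSeries([1, 2, 3])))
try:
    mean(Series([1, 2, 3]))
except TypeError as exc:
    print("TypeError:", exc)
```

In this toy model, adding the one `concatenate` override is what turns `mean` from a working call into a `TypeError`, which mirrors the backwards-compatibility worry the section describes for incremental adoption.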
