DO NOT MERGE: Add keepdims argument for generalized ufuncs. by mhvk · Pull Request #11019 · numpy/numpy · GitHub
Closed. mhvk wants to merge 5 commits.
16 changes: 13 additions & 3 deletions doc/release/1.15.0-notes.rst
@@ -193,7 +193,7 @@ combining these 5 compiled builds products into a single "fat" binary.
``return_indices`` keyword added for ``np.intersect1d``
-------------------------------------------------------
New keyword ``return_indices`` returns the indices of the two input arrays
that correspond to the common elements.
that correspond to the common elements.

``np.quantile`` and ``np.nanquantile``
--------------------------------------
@@ -359,8 +359,8 @@ Increased performance in ``random.permutation`` for multidimensional arrays
``permutation`` uses the fast path in ``random.shuffle`` for all input
array dimensions. Previously the fast path was only used for 1-d arrays.

Generalized ufuncs now accept ``axes`` and ``keepdims`` arguments
-----------------------------------------------------------------
Generalized ufuncs now accept ``axes``, ``axis`` and ``keepdims`` arguments
---------------------------------------------------------------------------
One can control over which axes a generalized ufunc operates by passing in an
``axes`` argument, a list of tuples with indices of particular axes. For
instance, for a signature of ``(i,j),(j,k)->(i,k)`` appropriate for matrix
@@ -376,12 +376,19 @@ tuples can be omitted. Hence, for a signature of ``(i),(i)->()`` appropriate
for an inner product, one could pass in ``axes=[0, 0]`` to indicate that the
vectors are stored in the first dimensions of the two input arguments.

As a short-cut for generalized ufuncs that are similar to reductions, i.e.,
that act on a single, shared core dimension such as the inner product example
above, one can pass an ``axis`` argument. This is equivalent to passing in
``axes`` with identical entries for all arguments with that core dimension
(e.g., for the example above, ``axes=[(axis,), (axis,)]``).

Furthermore, like for reductions, for generalized ufuncs that have inputs that
all have the same number of core dimensions and outputs with no core dimension,
one can pass in ``keepdims`` to leave a dimension with size 1 in the outputs,
thus allowing proper broadcasting against the original inputs. The location of
the extra dimension can be controlled with ``axes``. For instance, for the
inner-product example, ``keepdims=True, axes=[-2, -2, -2]`` would act on the
inner-product example, ``keepdims=True, axis=-2`` would act on the
one-but-last dimension of the input arguments, and leave a size 1 dimension in
that place in the output.
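
The effect of ``keepdims`` on the output shape can be sketched in pure Python. This is only an illustrative model, not NumPy code: the helper name ``inner_output_shape`` is hypothetical, and it assumes both inputs of an ``(i),(i)->()`` gufunc already share the same shape, so broadcasting is ignored.

```python
def inner_output_shape(shape, axis, keepdims=False):
    """Model the output shape of an (i),(i)->() gufunc acting along `axis`.

    Hypothetical helper: assumes both inputs have the same `shape`.
    """
    shape = tuple(shape)
    ndim = len(shape)
    if not -ndim <= axis < ndim:
        raise ValueError(f"axis {axis} is out of bounds for {ndim} dimensions")
    axis %= ndim  # normalize a negative axis
    if keepdims:
        # The core dimension is kept with size 1, in the same place.
        return shape[:axis] + (1,) + shape[axis + 1:]
    # Without keepdims the core dimension is simply removed.
    return shape[:axis] + shape[axis + 1:]


print(inner_output_shape((2, 3, 4), axis=-2))                 # (2, 4)
print(inner_output_shape((2, 3, 4), axis=-2, keepdims=True))  # (2, 1, 4)
```

With ``keepdims=True`` the result ``(2, 1, 4)`` broadcasts against the original ``(2, 3, 4)`` inputs, which is the stated purpose of the argument.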

@@ -401,6 +408,9 @@ is the same as::
``np.put_along_axis`` acts as the dual operation for writing to these indices
within an array.

.. note:: Implementations of ``__array_ufunc__`` should ensure that they can
handle either ``axis`` or ``axes``. In future, we may convert
``axis`` to ``axes`` before passing it on.

Changes
=======
13 changes: 12 additions & 1 deletion doc/source/reference/ufuncs.rst
@@ -360,6 +360,17 @@ advanced usage and will not typically be used.
and for generalized ufuncs for which all outputs are scalars, the output
tuples can be omitted.

*axis*

.. versionadded:: 1.15

A single axis over which a generalized ufunc should operate. This is a
short-cut for ufuncs that operate over a single, shared core dimension,
equivalent to passing in ``axes`` with entries of ``(axis,)`` for each
single-core-dimension argument and ``()`` for all others. For instance,
for a signature ``(i),(i)->()``, it is equivalent to passing in
``axes=[(axis,), (axis,), ()]``.
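
The stated equivalence between ``axis`` and ``axes`` can be written out as a small pure-Python sketch. The function ``axis_to_axes`` is a hypothetical helper for illustration only; it is not part of NumPy and takes the per-operand core-dimension counts as an assumed input.

```python
def axis_to_axes(core_num_dims, axis):
    """Expand a single `axis` into the equivalent `axes` list.

    `core_num_dims` holds the number of core dimensions of each operand
    (inputs first, then outputs); every entry must be 0 or 1.
    """
    if any(n not in (0, 1) for n in core_num_dims):
        raise TypeError("axis requires a single shared core dimension")
    # Operands with the core dimension get (axis,), all others get ().
    return [(axis,) if n == 1 else () for n in core_num_dims]


# Signature (i),(i)->(): two inputs with one core dimension, scalar output.
print(axis_to_axes([1, 1, 0], -1))  # [(-1,), (-1,), ()]
```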

*keepdims*

.. versionadded:: 1.15
@@ -370,7 +381,7 @@ advanced usage and will not typically be used.
ufuncs that operate on inputs that all have the same number of core
dimensions and with outputs that have no core dimensions, i.e., with
signatures like ``(i),(i)->()`` or ``(m,m)->()``. If used, the location of
the dimensions in the output can be controlled with ``axes``.
the dimensions in the output can be controlled with ``axes`` and ``axis``.

*casting*

12 changes: 10 additions & 2 deletions numpy/core/src/umath/override.c
@@ -51,6 +51,7 @@ normalize___call___args(PyUFuncObject *ufunc, PyObject *args,
npy_intp nin = ufunc->nin;
npy_intp nout = ufunc->nout;
npy_intp nargs = PyTuple_GET_SIZE(args);
npy_intp nkwds = PyDict_Size(*normal_kwds);
PyObject *obj;

if (nargs < nin) {
@@ -74,7 +75,7 @@ normalize___call___args(PyUFuncObject *ufunc, PyObject *args,

/* If we have more args than nin, they must be the output variables.*/
if (nargs > nin) {
if(PyDict_GetItemString(*normal_kwds, "out")) {
if(nkwds > 0 && PyDict_GetItemString(*normal_kwds, "out")) {
PyErr_Format(PyExc_TypeError,
"argument given by name ('out') and position "
"(%"NPY_INTP_FMT")", nin);
@@ -112,8 +113,15 @@ normalize___call___args(PyUFuncObject *ufunc, PyObject *args,
Py_DECREF(obj);
}
}
/* gufuncs accept either 'axes' or 'axis', but not both */
if (nkwds >= 2 && (PyDict_GetItemString(*normal_kwds, "axis") &&
PyDict_GetItemString(*normal_kwds, "axes"))) {
PyErr_SetString(PyExc_TypeError,
"cannot specify both 'axis' and 'axes'");
return -1;
}
/* finally, ufuncs accept 'sig' or 'signature'; normalize to 'signature' */
return normalize_signature_keyword(*normal_kwds);
return nkwds == 0 ? 0 : normalize_signature_keyword(*normal_kwds);
}
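
The mutual-exclusion check added in this hunk amounts to the following logic, shown here as a pure-Python sketch. The function name ``check_axis_kwargs`` is hypothetical; only the error message is taken from the diff.

```python
def check_axis_kwargs(kwds):
    """Reject keyword dicts that contain both 'axis' and 'axes'."""
    if "axis" in kwds and "axes" in kwds:
        raise TypeError("cannot specify both 'axis' and 'axes'")


check_axis_kwargs({"axis": -1})       # fine
check_axis_kwargs({"axes": [0, 0]})   # fine
try:
    check_axis_kwargs({"axis": -1, "axes": [0, 0]})
except TypeError as exc:
    print(exc)  # cannot specify both 'axis' and 'axes'
```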

static int
138 changes: 121 additions & 17 deletions numpy/core/src/umath/ufunc_object.c
@@ -565,11 +565,12 @@ get_ufunc_arguments(PyUFuncObject *ufunc,
NPY_ORDER *out_order,
NPY_CASTING *out_casting,
PyObject **out_extobj,
PyObject **out_typetup,
int *out_subok,
PyArrayObject **out_wheremask,
PyObject **out_axes,
int *out_keepdims)
PyObject **out_typetup, /* type: Tuple[np.dtype] */
int *out_subok, /* bool */
PyArrayObject **out_wheremask, /* PyArray of bool */
PyObject **out_axes, /* type: List[Tuple[T]] */
PyObject **out_axis, /* type: T */
int *out_keepdims) /* bool */
{
int i, nargs;
int nin = ufunc->nin;
@@ -826,8 +827,21 @@ get_ufunc_arguments(PyUFuncObject *ufunc,
case 'a':
/* possible axes argument for generalized ufunc */
if (out_axes != NULL && strcmp(str, "axes") == 0) {
if (out_axis != NULL && *out_axis != NULL) {
PyErr_SetString(PyExc_RuntimeError,
"cannot specify both 'axis' and 'axes'");
goto fail;
}
*out_axes = value;

bad_arg = 0;
}
else if (out_axis != NULL && strcmp(str, "axis") == 0) {
if (out_axes != NULL && *out_axes != NULL) {
PyErr_SetString(PyExc_RuntimeError,
"cannot specify both 'axis' and 'axes'");
goto fail;
}
*out_axis = value;
bad_arg = 0;
}
break;
@@ -1884,6 +1898,27 @@ _has_output_coredims(PyUFuncObject *ufunc) {
return 0;
}

/*
* Check whether the gufunc can be used with axis, i.e., that there is only
* a single, shared core dimension (which means that operands either have
* that dimension, or have no core dimensions). Returns 0 if all is fine,
* and sets an error and returns -1 if not.
*/
static int
_check_axis_support(PyUFuncObject *ufunc) {
if (ufunc->core_num_dim_ix != 1) {
PyErr_Format(PyExc_TypeError,
"%s: axis can only be used with a single shared core "
"dimension, not with the %d distinct ones implied by "
"signature %s.",
ufunc_get_name_cstr(ufunc),
ufunc->core_num_dim_ix,
ufunc->core_signature);
return -1;
}
return 0;
}
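
The condition tested by ``_check_axis_support`` is that the signature names exactly one distinct core dimension. A rough pure-Python model follows; the helper ``count_core_dim_names`` and its regex-based parsing are assumptions made for illustration, since the real signature parser lives in the C code.

```python
import re


def count_core_dim_names(signature):
    """Count the distinct core dimension names in a gufunc signature."""
    names = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", signature)
    return len(set(names))


def supports_axis(signature):
    """Model of the check: `axis` needs a single shared core dimension."""
    return count_core_dim_names(signature) == 1


print(supports_axis("(i),(i)->()"))         # True
print(supports_axis("(i,j),(j,k)->(i,k)"))  # False
```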

/*
* Check whether the gufunc can be used with keepdims, i.e., that all its
* input arguments have the same number of core dimensions, and all output
@@ -1899,7 +1934,7 @@ _check_keepdims_support(PyUFuncObject *ufunc) {
if (ufunc->core_num_dims[i] != (i < nin ? input_core_dims : 0)) {
PyErr_Format(PyExc_TypeError,
"%s does not support keepdims: its signature %s requires "
"that %s %d has %d core dimensions, but keepdims can only "
"%s %d to have %d core dimensions, but keepdims can only "
"be used when all inputs have the same number of core "
"dimensions and all outputs have no core dimensions.",
ufunc_get_name_cstr(ufunc),
@@ -1925,8 +1960,7 @@ static int
_parse_axes_arg(PyUFuncObject *ufunc, int core_num_dims[], PyObject *axes,
PyArrayObject **op, int broadcast_ndim, int **remap_axis) {
int nin = ufunc->nin;
int nout = ufunc->nout;
int nop = nin + nout;
int nop = ufunc->nargs;
int iop, list_size;

if (!PyList_Check(axes)) {
@@ -2044,6 +2078,59 @@ _parse_axes_arg(PyUFuncObject *ufunc, int core_num_dims[], PyObject *axes,
return 0;
}

/*
* Simplified version of the above, using axis to fill the remap_axis
* array, which maps default to actual axes for each operand, indexed as
* remap_axis[iop][iaxis]. The default axis order has first all broadcast
* axes and then the core axes the gufunc operates on.
*
* Returns 0 on success, and -1 on failure
*/
static int
_parse_axis_arg(PyUFuncObject *ufunc, int core_num_dims[], PyObject *axis,
PyArrayObject **op, int broadcast_ndim, int **remap_axis) {
int nop = ufunc->nargs;
int iop, axis_int;

axis_int = PyArray_PyIntAsInt(axis);
if (error_converting(axis_int)) {
return -1;
}

for (iop = 0; iop < nop; ++iop) {
int axis, op_ndim, op_axis;

/* _check_axis_support ensures core_num_dims is 0 or 1 */
if (core_num_dims[iop] == 0) {
remap_axis[iop] = NULL;
continue;
}
if (op[iop]) {
op_ndim = PyArray_NDIM(op[iop]);
}
else {
op_ndim = broadcast_ndim + 1;
}
op_axis = axis_int; /* ensure we don't modify axis_int */
if (check_and_adjust_axis(&op_axis, op_ndim) < 0) {
return -1;
}
/* Are we actually remapping away from last axis? */
if (op_axis == op_ndim - 1) {
remap_axis[iop] = NULL;
continue;
}
remap_axis[iop][op_ndim - 1] = op_axis;
for (axis = 0; axis < op_axis; axis++) {
remap_axis[iop][axis] = axis;
}
for (axis = op_axis; axis < op_ndim - 1; axis++) {
remap_axis[iop][axis] = axis + 1;
}
} /* end of for(iop) loop over operands */
return 0;
}
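
The per-operand remapping performed by ``_parse_axis_arg`` can be modelled in pure Python as below. This is a sketch under assumptions: ``remap_for_axis`` is a hypothetical name, it handles one operand at a time, and it raises a plain ``ValueError`` as a stand-in for the error set by ``check_and_adjust_axis``.

```python
def remap_for_axis(op_ndim, axis):
    """Return the remap list for one operand, or None if no remap is needed.

    Entry `i` gives the actual axis used for default position `i`; the
    last position is the core axis, the others are the broadcast axes.
    """
    if not -op_ndim <= axis < op_ndim:
        raise ValueError(f"axis {axis} is out of bounds for {op_ndim} dimensions")
    op_axis = axis + op_ndim if axis < 0 else axis
    if op_axis == op_ndim - 1:
        return None  # already the last axis, nothing to remap
    remap = [0] * op_ndim
    remap[op_ndim - 1] = op_axis          # core axis goes last
    for i in range(op_axis):
        remap[i] = i                      # axes before it stay in place
    for i in range(op_axis, op_ndim - 1):
        remap[i] = i + 1                  # axes after it shift down by one
    return remap


print(remap_for_axis(4, 1))   # [0, 2, 3, 1]
print(remap_for_axis(3, -2))  # [0, 2, 1]
print(remap_for_axis(3, -1))  # None
```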

#define REMAP_AXIS(iop, axis) ((remap_axis != NULL && \
remap_axis[iop] != NULL)? \
remap_axis[iop][axis] : axis)
@@ -2239,8 +2326,8 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,
NPY_ORDER order = NPY_KEEPORDER;
/* Use the default assignment casting rule */
NPY_CASTING casting = NPY_DEFAULT_ASSIGN_CASTING;
/* When provided, extobj, typetup, and axes contain borrowed references */
PyObject *extobj = NULL, *type_tup = NULL, *axes = NULL;
/* other possible keyword arguments */
PyObject *extobj = NULL, *type_tup = NULL, *axes = NULL, *axis = NULL;
int keepdims = -1;

if (ufunc == NULL) {
@@ -2265,10 +2352,15 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,

NPY_UF_DBG_PRINT("Getting arguments\n");

/* Get all the arguments */
/*
* Get all the arguments.
*
* Here, when provided, extobj, typetup, axes, and axis contain borrowed
* references.
*/
retval = get_ufunc_arguments(ufunc, args, kwds,
op, &order, &casting, &extobj,
&type_tup, &subok, NULL, &axes, &keepdims);
&type_tup, &subok, NULL, &axes, &axis, &keepdims);
if (retval < 0) {
goto fail;
}
@@ -2283,6 +2375,12 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,
goto fail;
}
}
if (axis != NULL) {
retval = _check_axis_support(ufunc);
if (retval < 0) {
goto fail;
}
}
/*
* If keepdims is set and true, signal all dimensions will be the same.
*/
@@ -2351,7 +2449,7 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,
}

/* Possibly remap axes. */
if (axes) {
if (axes != NULL || axis != NULL) {
remap_axis = PyArray_malloc(sizeof(remap_axis[0]) * nop);
remap_axis_memory = PyArray_malloc(sizeof(remap_axis_memory[0]) *
nop * NPY_MAXDIMS);
@@ -2362,8 +2460,14 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,
for (i=0; i < nop; i++) {
remap_axis[i] = remap_axis_memory + i * NPY_MAXDIMS;
}
retval = _parse_axes_arg(ufunc, core_num_dims, axes, op, broadcast_ndim,
remap_axis);
if (axis) {
retval = _parse_axis_arg(ufunc, core_num_dims, axis, op,
broadcast_ndim, remap_axis);
}
else {
retval = _parse_axes_arg(ufunc, core_num_dims, axes, op,
broadcast_ndim, remap_axis);
}
if(retval < 0) {
goto fail;
}
@@ -2804,7 +2908,7 @@ PyUFunc_GenericFunction(PyUFuncObject *ufunc,
/* Get all the arguments */
retval = get_ufunc_arguments(ufunc, args, kwds,
op, &order, &casting, &extobj,
&type_tup, &subok, &wheremask, NULL, NULL);
&type_tup, &subok, &wheremask, NULL, NULL, NULL);
if (retval < 0) {
goto fail;
}