ENH: added axis param for np.count_nonzero #7177
Closes gh-391.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,6 +6,7 @@ | |
import sys | ||
import warnings | ||
|
||
import numpy as np | ||
from . import multiarray | ||
from .multiarray import ( | ||
_fastCopyAndTranspose as fastCopyAndTranspose, ALLOW_THREADS, | ||
|
@@ -376,6 +377,89 @@ def extend_all(module): | |
__all__.append(a) | ||
|
||
|
||
def count_nonzero(a, axis=None): | ||
""" | ||
Counts the number of non-zero values in the array ``a``. | ||
|
||
The word "non-zero" is in reference to the Python 2.x | ||
built-in method ``__nonzero__()`` (renamed ``__bool__()`` | ||
in Python 3.x) of Python objects that tests an object's | ||
"truthfulness". For example, any number is considered | ||
truthful if it is nonzero, whereas any string is considered | ||
truthful if it is not the empty string. Thus, this function | ||
(recursively) counts how many elements in ``a`` (and in | ||
sub-arrays thereof) have their ``__nonzero__()`` or ``__bool__()`` | ||
method evaluated to ``True``. | ||
|
||
Parameters | ||
---------- | ||
a : array_like | ||
The array for which to count non-zeros. | ||
axis : int or tuple, optional | ||
Axis or tuple of axes along which to count non-zeros. | ||
Default is None, meaning that non-zeros will be counted | ||
along a flattened version of ``a``. | ||
|
||
.. versionadded:: 1.12.0 | ||
|
||
Returns | ||
------- | ||
count : int or array of int | ||
Number of non-zero values in the array along a given axis. | ||
Otherwise, the total number of non-zero values in the array | ||
is returned. | ||
|
||
See Also | ||
-------- | ||
nonzero : Return the coordinates of all the non-zero values. | ||
|
||
Examples | ||
-------- | ||
>>> np.count_nonzero(np.eye(4)) | ||
4 | ||
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]]) | ||
5 | ||
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]], axis=0) | ||
array([1, 1, 1, 1, 1]) | ||
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]], axis=1) | ||
array([2, 3]) | ||
|
||
""" | ||
if axis is None or axis == (): | ||
return multiarray.count_nonzero(a) | ||
|
||
a = asanyarray(a) | ||
|
||
if a.dtype == bool: | ||
return a.sum(axis=axis, dtype=np.intp) | ||
|
||
if issubdtype(a.dtype, np.number): | ||
return (a != 0).sum(axis=axis, dtype=np.intp) | ||
This allocates a new boolean array of the same shape as the original. I thought the whole point was to avoid doing that...
@madphysicist: When was that the whole point?
I thought wrong apparently. It just seems a bit hacky to do that with a function that is implemented in C exactly to avoid such an operation.
Hacky, a bit, but it does get the job done without too much sadness. |
||
|
||
if (issubdtype(a.dtype, np.string_) or | ||
issubdtype(a.dtype, np.unicode_)): | ||
nullstr = a.dtype.type('') | ||
return (a != nullstr).sum(axis=axis, dtype=np.intp) | ||
This allocates as well...
For future reference: see conversation above |
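The allocation concern raised in these two threads can be made concrete. The sketch below is not part of the PR. It assumes NumPy 1.12 or later (where `np.count_nonzero` accepts `axis`), and `count_nonzero_via_mask` is a hypothetical helper name used only for illustration. It shows that the compare-then-sum approach materializes a boolean temporary of one byte per input element before reducing it.

```python
import numpy as np


def count_nonzero_via_mask(a, axis):
    """Hypothetical helper mirroring the numeric branch under review.

    Returns the per-axis counts together with the size in bytes of the
    temporary boolean mask that the comparison allocates.
    """
    a = np.asanyarray(a)
    mask = (a != 0)  # full-size boolean temporary, same shape as `a`
    counts = mask.sum(axis=axis, dtype=np.intp)
    return counts, mask.nbytes


data = np.arange(12, dtype=np.int64).reshape(3, 4) % 3
counts, temp_bytes = count_nonzero_via_mask(data, axis=0)

# The temporary costs one byte per element of the input.
assert temp_bytes == data.size
# The result agrees with the axis-aware function this PR introduces.
assert np.array_equal(counts, np.count_nonzero(data, axis=0))
```

The temporary is eight times smaller than an `int64` input, which may be part of why the reviewers settled on "hacky, but it gets the job done".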
||
|
||
axis = asarray(_validate_axis(axis, a.ndim, 'axis')) | ||
counts = np.apply_along_axis(multiarray.count_nonzero, axis[0], a) | ||
|
||
if axis.size == 1: | ||
return counts | ||
else: | ||
# for subsequent axis numbers, that number decreases | ||
# by one in this new 'counts' array if it was larger | ||
# than the first axis upon which 'count_nonzero' was | ||
# applied but remains unchanged if that number was | ||
# smaller than that first axis | ||
# | ||
# this trick enables us to perform counts on object-like | ||
# elements across multiple axes very quickly because integer | ||
# addition is very well optimized | ||
return counts.sum(axis=tuple(axis[1:] - ( | ||
axis[1:] > axis[0])), dtype=np.intp) | ||
Not sure if this is what you want, but have you considered just applying
I have, and unfortunately, it doesn't quite do what I want (i.e. I get test failures). My other objection is that it would make things difficult for expansion (e.g. add an
That may be the way to go now that I think about it. Perhaps writing this as a ufunc will be easier since the existing infrastructure will provide all the
Although on second thought a reduction function may not be appropriate as a ufunc...
Exactly, though additional |
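The axis renumbering described in the code comment of the fallback branch is easy to get wrong, so here is a minimal standalone sketch of it. It is not the PR's code: `count_nonzero_multi_axis` is a hypothetical name, `axes` is assumed to be a non-empty tuple of distinct integers, and the public `np.count_nonzero` stands in for the internal `multiarray.count_nonzero`.

```python
import numpy as np


def count_nonzero_multi_axis(a, axes):
    """Count truthy elements over several axes, one reduction at a time."""
    a = np.asanyarray(a)
    # Normalize negative axis numbers (assumption: axes are distinct).
    axes = np.array([ax % a.ndim for ax in axes])

    # Collapse the first axis: each 1-D slice along it becomes one count.
    counts = np.apply_along_axis(np.count_nonzero, axes[0], a)
    if axes.size == 1:
        return counts

    # `counts` no longer has axis `axes[0]`, so every remaining axis that
    # came after it sits one position lower; earlier axes are unchanged.
    shifted = axes[1:] - (axes[1:] > axes[0])
    return counts.sum(axis=tuple(int(ax) for ax in shifted), dtype=np.intp)


numbers = np.arange(24).reshape(2, 3, 4) % 3
objects = np.array([["", "x", None], [0, 1.5, "y"]], dtype=object)

# The order in which the axes are listed does not change the result.
assert np.array_equal(count_nonzero_multi_axis(numbers, (0, 2)),
                      count_nonzero_multi_axis(numbers, (2, 0)))
```

Reducing axis 0 of a `(2, 3, 4)` array leaves a `(3, 4)` array, so what was axis 2 becomes axis 1. Reducing axis 2 first leaves `(2, 3)`, and axis 0 keeps its number.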
||
|
||
|
||
def asarray(a, dtype=None, order=None): | ||
"""Convert the input to an array. | ||
|
||
|
@@ -891,7 +975,7 @@ def correlate(a, v, mode='valid'): | |
return multiarray.correlate2(a, v, mode) | ||
|
||
|
||
def convolve(a,v,mode='full'): | ||
def convolve(a, v, mode='full'): | ||
""" | ||
Returns the discrete, linear convolution of two one-dimensional sequences. | ||
|
||
|
@@ -1752,7 +1836,7 @@ def cross(a, b, axisa=-1, axisb=-1, axisc=-1, axis=None): | |
return rollaxis(cp, -1, axisc) | ||
|
||
|
||
#Use numarray's printing function | ||
# Use numarray's printing function | ||
from .arrayprint import array2string, get_printoptions, set_printoptions | ||
|
||
|
||
|
@@ -2283,6 +2367,7 @@ def load(file): | |
# These are all essentially abbreviations | ||
# These might wind up in a special abbreviations module | ||
|
||
|
||
def _maketup(descr, val): | ||
dt = dtype(descr) | ||
# Place val in all scalar tuples: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1980,16 +1980,10 @@ array_zeros(PyObject *NPY_UNUSED(ignored), PyObject *args, PyObject *kwds) | |
static PyObject * | ||
array_count_nonzero(PyObject *NPY_UNUSED(self), PyObject *args, PyObject *kwds) | ||
{ | ||
PyObject *array_in; | ||
PyArrayObject *array; | ||
npy_intp count; | ||
|
||
if (!PyArg_ParseTuple(args, "O", &array_in)) { | ||
return NULL; | ||
} | ||
|
||
array = (PyArrayObject *)PyArray_FromAny(array_in, NULL, 0, 0, 0, NULL); | ||
if (array == NULL) { | ||
if (!PyArg_ParseTuple(args, "O&", PyArray_Converter, &array)) { | ||
If only you could add an additional axis parameter here...
Yeah...much, much harder said than done. :(
Yeah. I'm beginning to see that. I will keep trying though. Even if I succeed eventually, your solution should probably be accepted since it provides the correct functionality. A C drop-in replacement implementation should not change the API you propose. |
||
return NULL; | ||
} | ||
|
||
|
I'd consider this a bug, described in #9728.