Adding isin function for multidimensional arrays by brsr · Pull Request #8423 · numpy/numpy · GitHub

Adding isin function for multidimensional arrays #8423

Merged
merged 27 commits into from
May 5, 2017
Changes from 17 commits
17faf5a
Adding isin function for multidimensional arrays
brsr Dec 28, 2016
34a40da
Fix comments on pull request
brsr Dec 31, 2016
2b4a81b
keep in1d mostly the same, changes in isin now
brsr Feb 15, 2017
db5e5fd
screwed up the whitespace
brsr Feb 15, 2017
f63cf31
docs, convert elements to array in case it isn't already
brsr Feb 15, 2017
a9bce34
Merge branch 'master' into master
charris Feb 22, 2017
f06ed40
Merge branch 'master' into master
charris Feb 22, 2017
0938763
Merge branch 'master' into master
brsr Mar 25, 2017
e10ee1e
removing extra line
brsr Mar 25, 2017
0ec089a
support iterables that aren't array_like
brsr Mar 27, 2017
ba10c98
it's hasattr not has_attr
brsr Mar 27, 2017
e72b686
check for __array__ instead
brsr Mar 27, 2017
818337d
replace list comprehension
brsr Mar 27, 2017
6ace52e
add comment
brsr Mar 27, 2017
7712179
Responding to seberg's comments
brsr Apr 8, 2017
a2c9b6c
Removing special handling for sets
brsr Apr 9, 2017
552a193
Renaming elements to element
brsr Apr 10, 2017
fa0b0be
eric-wieser's comments
brsr Apr 12, 2017
0ff6be4
Fixes to tests
brsr Apr 12, 2017
545df63
Docstrings, further expanding test
brsr Apr 13, 2017
41e5b0b
spacing
brsr Apr 13, 2017
521d517
Actual zero-d array
brsr Apr 13, 2017
8805bbb
More docstring changes
brsr Apr 13, 2017
4d3f67c
clean up function listing
brsr Apr 13, 2017
0395f39
Update 1.13.0-notes.rst
brsr Apr 13, 2017
3d809a6
Merge branch 'master' into master
brsr Apr 28, 2017
d22cafc
discouraged, not deprecated
brsr Apr 28, 2017
6 changes: 6 additions & 0 deletions doc/release/1.13.0-notes.rst
@@ -111,6 +111,12 @@ In an N-dimensional array, the user can now choose the axis along which to look
for duplicate N-1-dimensional elements using ``numpy.unique``. The original
behaviour is recovered if ``axis=None`` (default).

``isin`` function, improving on ``in1d``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The new function ``isin`` tests whether elements in an array are also present
in another array, preserving the shape of the first array. It builds on
the existing ``in1d`` routine.

``np.gradient`` now supports unevenly spaced data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Users can now specify a not-constant spacing for data.
1 change: 1 addition & 0 deletions doc/source/reference/routines.set.rst
@@ -17,6 +17,7 @@ Boolean operations

in1d
intersect1d
isin
setdiff1d
setxor1d
union1d
2 changes: 1 addition & 1 deletion numpy/add_newdocs.py
@@ -1500,7 +1500,7 @@ def luf(lamdaexpr, *args, **kwargs):
Find the indices of elements of `x` that are in `goodvalues`.

>>> goodvalues = [3, 4, 7]
>>> ix = np.in1d(x.ravel(), goodvalues).reshape(x.shape)
>>> ix = np.isin(x, goodvalues)
>>> ix
array([[False, False, False],
[ True, True, False],
84 changes: 81 additions & 3 deletions numpy/lib/arraysetops.py
@@ -1,9 +1,10 @@
"""
Set operations for 1D numeric arrays based on sorting.
Set operations for arrays based on sorting.

:Contains:
ediff1d,
unique,
isin,
ediff1d,
intersect1d,
setxor1d,
in1d,
@@ -31,7 +32,7 @@

__all__ = [
'ediff1d', 'intersect1d', 'setxor1d', 'union1d', 'setdiff1d', 'unique',
'in1d'
'in1d', 'isin'
]


@@ -380,13 +381,18 @@ def setxor1d(ar1, ar2, assume_unique=False):
flag2 = flag[1:] == flag[:-1]
return aux[flag2]


def in1d(ar1, ar2, assume_unique=False, invert=False):
"""
Test whether each element of a 1-D array is also present in a second array.

Returns a boolean array the same length as `ar1` that is True
where an element of `ar1` is in `ar2` and False otherwise.

This function has been deprecated: use `isin` instead.

Deprecated since version 1.13.0.
Member

Same here


Parameters
----------
ar1 : (M,) array_like
@@ -411,6 +417,8 @@ def in1d(ar1, ar2, assume_unique=False, invert=False):

See Also
--------
isin : Version of this function that preserves the
shape of ar1.
numpy.lib.arraysetops : Module with a number of other functions for
Member

"See also: The module you're looking at right now" is a little weird. @brsr, you don't need to worry about it in this PR though.

performing set operations on arrays.

@@ -481,6 +489,76 @@ def in1d(ar1, ar2, assume_unique=False, invert=False):
else:
return ret[rev_idx]


def isin(element, test_elements, assume_unique=False, invert=False):
"""
Calculates `element in test_elements`, broadcasting over `element` only.
Returns a boolean array of the same shape as `elements` that is True
where an element of `elements` is in `test_elements` and False otherwise.

Parameters
----------
element : array_like
Input array.
test_elements : array_like
The values against which to test each value of `elements`.
This argument is flattened if it is an array or array_like.
See notes for behavior with non-array-like parameters.
assume_unique : bool, optional
If True, the input arrays are both assumed to be unique, which
can speed up the calculation. Default is False.
invert : bool, optional
If True, the values in the returned array are inverted (that is,
False where an element of `ar1` is in `ar2` and True otherwise).
Member
@eric-wieser eric-wieser Apr 13, 2017

These names are out of date. I'd probably ditch this wording altogether, and explain it in terms of `not in`.
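
As a rough illustration of the `not in` framing suggested here, the sketch below reuses the arrays from the docstring's Examples section. The name `by_hand` is introduced only for this illustration; it is not part of the PR.

```python
import numpy as np

element = np.array([[0, 2], [4, 6]])
test_elements = [1, 2, 4, 8]

# invert=True asks "is this value NOT in test_elements?" for every entry.
inverted = np.isin(element, test_elements, invert=True)

# The same mask, obtained by negating the non-inverted result.
negated = ~np.isin(element, test_elements)

# Pure-Python reference: element-wise `not in`, keeping the 2-D layout.
by_hand = [[value not in test_elements for value in row]
           for row in element.tolist()]
```

All three should describe the same mask, which is why the docstring could say "`not in`" instead of referring to the old `ar1`/`ar2` names.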

Default is False. ``np.isin(a, b, invert=True)`` is equivalent
to (but is faster than) ``np.invert(np.isin(a, b))``.

Returns
-------
isin : ndarray, bool
Has the same shape as `element`. The values `elements[isin]`
Member

Missed an s here

are in `test_elements`.

See Also
--------
in1d : Flattened version of this function.
numpy.lib.arraysetops : Module with a number of other functions for
performing set operations on arrays.
Notes
-----

`isin` is an element-wise function version of the python keyword `in`.
``isin(a, b)`` is roughly equivalent to
``np.array([item in b for item in a])`` if `a` and `b` are 1-D sequences.

If `test_elements` is a set (or other non-sequence collection) it will
be converted to an object array with one element, rather than an array
of the values contained in `test_elements`. Converting the set to
Member

Probably worth pointing out that this is just the np.array constructor doing its thing, and is not something weird this function chose to do. Mentioning "array constructor" somewhere should do the job
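
A minimal sketch of the point about the array constructor, assuming a reasonably recent NumPy that provides `np.isin`. The variable names are illustrative only.

```python
import numpy as np

test_set = {1, 2, 4, 8}

# np.array does not iterate over a set: it wraps the whole set in a
# zero-dimensional object array.
wrapped = np.array(test_set)

# Converting to a list first yields an ordinary 1-D array of the values.
unpacked = np.array(sorted(test_set))

element = np.array([[0, 2], [4, 6]])

# Passing the set directly compares every entry against the set object
# itself, so nothing is expected to match.
mask_from_set = np.isin(element, test_set)

# Passing a list gives the membership test most callers want.
mask_from_list = np.isin(element, sorted(test_set))
```

The surprising result comes from `np.array(test_set)`, not from anything `isin` does on its own, which is what the docstring note could make explicit.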

a list usually gives the desired behavior.

.. versionadded:: 1.13.0

Examples
--------
>>> element = np.array([[0, 2], [4, 6]])
>>> test_elements = [1, 2, 4, 8]
>>> mask = np.isin(element, test_elements)
>>> mask
array([[ False, True],
[ True, False]], dtype=bool)
>>> element[mask]
array([2, 4])
>>> mask = np.isin(element, test_elements, invert=True)
>>> mask
array([[ True, False],
[ False, True]], dtype=bool)
>>> element[mask]
array([0, 6])"""
Member
@eric-wieser eric-wieser Apr 13, 2017

nit: needs an extra newline in this docstring

element = np.array(element)
Member

Should be asarray, or asanyarray if you think this will work for masked arrays
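
For context, a small sketch of how the three conversion functions differ. It shows only the general behaviour of `np.array`, `np.asarray` and `np.asanyarray`, not what the PR ultimately chose.

```python
import numpy as np
import numpy.ma as ma

plain = np.arange(3)
masked = ma.array([1, 2, 3], mask=[False, True, False])

# np.array copies its input by default, even if it is already an ndarray.
copied = np.array(plain)

# np.asarray returns the very same object when no conversion is needed.
same = np.asarray(plain)

# np.asarray converts subclasses to a base ndarray, so the mask is lost.
base = np.asarray(masked)

# np.asanyarray lets ndarray subclasses such as MaskedArray pass through.
kept = np.asanyarray(masked)
```

So `asarray` avoids an unnecessary copy, while `asanyarray` would additionally keep the subclass, which only helps if the rest of the function handles masked input sensibly.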

return in1d(element, test_elements, assume_unique=assume_unique,
invert=invert).reshape(element.shape)


def union1d(ar1, ar2):
"""
Find the union of two arrays.
2 changes: 2 additions & 0 deletions numpy/lib/info.py
@@ -147,6 +147,8 @@
setxor1d Set exclusive-or of 1D arrays with unique elements.
in1d Test whether elements in a 1D array are also present in
another array.
isin Test whether elements in an array are also present in
another array, preserving shape of first array.
Member

Maybe "test whether each element of one array is present anywhere within another"?

No strong feelings here

union1d Union of 1D arrays with unique elements.
setdiff1d Set difference of 1D arrays with unique elements.
================ ===================
19 changes: 18 additions & 1 deletion numpy/lib/tests/test_arraysetops.py
@@ -8,7 +8,7 @@
run_module_suite, TestCase, assert_array_equal, assert_equal, assert_raises
)
from numpy.lib.arraysetops import (
ediff1d, intersect1d, setxor1d, union1d, setdiff1d, unique, in1d
ediff1d, intersect1d, setxor1d, union1d, setdiff1d, unique, in1d, isin
)


@@ -77,6 +77,23 @@ def test_ediff1d(self):
assert(isinstance(ediff1d(np.matrix(1)), np.matrix))
assert(isinstance(ediff1d(np.matrix(1), to_begin=1), np.matrix))

def test_isin(self):
# the tests for in1d cover most of isin's behavior
# if in1d is deprecated, would need to change those tests to test
# isin instead.
a = np.arange(24).reshape([2, 3, 4])
b = [0, 10, 20, 30, 1, 3, 11, 22, 33]
ec = np.zeros((2, 3, 4), dtype=bool)
ec[0, 0, 0] = True
ec[0, 0, 1] = True
ec[0, 0, 3] = True
ec[0, 2, 2] = True
ec[0, 2, 3] = True
ec[1, 2, 0] = True
ec[1, 2, 2] = True
c = isin(a, b)
assert_array_equal(c, ec)

Member
@eric-wieser eric-wieser Apr 10, 2017

Needs a test for 0d a here
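
One possible shape for such a test, as a sketch only. `check_isin_zero_d` is a hypothetical helper name, and the sketch assumes a NumPy version that provides `np.isin`.

```python
import numpy as np


def check_isin_zero_d():
    # A genuine 0-d array, not a length-1 1-D array.
    present = np.isin(np.array(3), [1, 3, 5])
    absent = np.isin(np.array(4), [1, 3, 5])
    return present, absent


present, absent = check_isin_zero_d()
```

The property worth asserting is that the result has the same (empty) shape as the 0-d input, in addition to having the right truth value.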

def test_in1d(self):
# we use two different sizes for the b array here to test the
# two different paths in in1d().
32 changes: 31 additions & 1 deletion numpy/ma/extras.py
@@ -16,7 +16,7 @@
'column_stack', 'compress_cols', 'compress_nd', 'compress_rowcols',
'compress_rows', 'count_masked', 'corrcoef', 'cov', 'diagflat', 'dot',
'dstack', 'ediff1d', 'flatnotmasked_contiguous', 'flatnotmasked_edges',
'hsplit', 'hstack', 'in1d', 'intersect1d', 'mask_cols', 'mask_rowcols',
'hsplit', 'hstack', 'isin', 'in1d', 'intersect1d', 'mask_cols', 'mask_rowcols',
'mask_rows', 'masked_all', 'masked_all_like', 'median', 'mr_',
'notmasked_contiguous', 'notmasked_edges', 'polyfit', 'row_stack',
'setdiff1d', 'setxor1d', 'unique', 'union1d', 'vander', 'vstack',
@@ -1137,15 +1137,22 @@ def setxor1d(ar1, ar2, assume_unique=False):
flag2 = (flag[1:] == flag[:-1])
return aux[flag2]


def in1d(ar1, ar2, assume_unique=False, invert=False):
"""
Test whether each element of an array is also present in a second
array.

The output is always a masked array. See `numpy.in1d` for more details.

This function has been deprecated: use `isin` instead.

Deprecated since version 1.13.0.
Member
@eric-wieser eric-wieser Apr 12, 2017

Should use the .. deprecated:: 1.13.0 syntax. Look around at other files to see how things are worded around it. Some files use .. note as well, I think

Contributor Author

For what it's worth, numpy/HOWTO_DOCUMENT.rst.txt says something different about deprecation warnings.
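
For reference, the directive form mentioned in this thread looks roughly like the docstring below. Both function names are hypothetical and serve only to show the Sphinx `deprecated` directive. A later commit in this PR is titled "discouraged, not deprecated", so the final docstrings may not use it at all.

```python
def legacy_membership(values, candidates):
    """
    Test whether each item of `values` is present in `candidates`.

    .. deprecated:: 1.13.0
        Use `membership` instead. (Both names are hypothetical.)

    """
    # Plain-Python stand-in so the example is runnable on its own.
    return [value in candidates for value in values]
```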



See Also
--------
isin : Version of this function that preserves the shape of ar1.
numpy.in1d : Equivalent function for ndarrays.

Notes
@@ -1176,6 +1183,29 @@ def in1d(ar1, ar2, assume_unique=False, invert=False):
return flag[indx][rev_idx]


def isin(element, test_elements, assume_unique=False, invert=False):
"""
Calculates `element in test_elements`, broadcasting over
`element` only.

The output is always a masked array of the same shape as `element`.
See `numpy.isin` for more details.

See Also
--------
in1d : Flattened version of this function.
numpy.isin : Equivalent function for ndarrays.

Notes
-----
.. versionadded:: 1.13.0

"""
element = ma.array(element)
Member

Again, asarray or asanyarray

return in1d(element, test_elements, assume_unique=assume_unique,
invert=invert).reshape(element.shape)


def union1d(ar1, ar2):
"""
Union of two arrays.
20 changes: 19 additions & 1 deletion numpy/ma/tests/test_extras.py
@@ -28,7 +28,7 @@
median, average, unique, setxor1d, setdiff1d, union1d, intersect1d, in1d,
ediff1d, apply_over_axes, apply_along_axis, compress_nd, compress_rowcols,
mask_rowcols, clump_masked, clump_unmasked, flatnotmasked_contiguous,
notmasked_contiguous, notmasked_edges, masked_all, masked_all_like,
notmasked_contiguous, notmasked_edges, masked_all, masked_all_like, isin,
diagflat
)
import numpy.ma.extras as mae
@@ -1435,6 +1435,24 @@ def test_setxor1d(self):
#
assert_array_equal([], setxor1d([], []))

def test_isin(self):
# the tests for in1d cover most of isin's behavior
# if in1d is deprecated, would need to change those tests to test
# isin instead.
a = np.arange(24).reshape([2, 3, 4])
mask = np.zeros([2, 3, 4])
mask[1, 2, 0] = 1
a = array(a, mask=mask)
#masked: N Y N Y N Y N Y N
b = array([0, 10, 20, 30, 1, 3, 11, 22, 33],
mask=[0, 1, 0, 1, 0, 1, 0, 1, 0])
ec = zeros((2, 3, 4), dtype=bool)
ec[0, 0, 0] = True
ec[0, 0, 1] = True
ec[0, 2, 3] = True
c = isin(a, b)
assert_array_equal(c, ec)
Member
@eric-wieser eric-wieser Apr 10, 2017

Can we have a test here for type(c)? Should it be MaskedArray or array?

Also, a test of what happens if you call np.isin instead of np.ma.isin might be interesting as well.
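
A sketch of what such checks might look like, assuming a NumPy version that has both `np.isin` and `np.ma.isin`. The small arrays are made up for illustration and are not the ones used in the PR's test. The sketch checks only result types and shapes and makes no claim about how masked entries are matched.

```python
import numpy as np
import numpy.ma as ma

a = ma.array(np.arange(6).reshape(2, 3),
             mask=[[0, 0, 0], [0, 0, 1]])
b = ma.array([0, 4, 5], mask=[0, 0, 1])

# Masked-array implementation: the docstring promises a masked array
# with the same shape as `a`.
masked_result = ma.isin(a, b)

# ndarray implementation called on the same masked inputs.
plain_result = np.isin(a, b)
```

Asserting on `type(...)` for both calls would pin down the return types the reviewer is asking about.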


def test_in1d(self):
# Test in1d
a = array([1, 2, 5, 7, -1], mask=[0, 0, 0, 0, 1])