BUG: Fix numpy.isin for timedelta dtype #21860

MilesCranmer · 2022-06-27T18:02:01Z

This PR fixes the issue discussed in #12065 and #21843, where 'timedelta64' was noted to be a subtype of numpy.integer. In principle, the new check should detect any case where int(np.min(ar2)) fails. This PR also adds unit tests for these cases.
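
For readers following along, here is a minimal sketch of the behavior in question, assuming a NumPy release that includes this fix. The array names `deltas` and `candidates` are illustrative and do not come from the PR. The point is that `dtype.kind` reports "m" for timedelta64, which separates it from the signed integers even where the scalar type hierarchy groups them together.

```python
import numpy as np

# Illustrative inputs (assumption: not taken from the PR's test suite).
deltas = np.array([0, 1, 2], dtype="timedelta64[s]")
candidates = np.array([1, 5], dtype="timedelta64[s]")

# The kind code tells timedelta64 ("m") apart from signed integers ("i").
timedelta_kind = deltas.dtype.kind
integer_kind = np.dtype(np.int64).kind

# With the default kind=None, timedelta input is expected to take the
# sort-based path rather than the integer lookup table.
mask = np.isin(deltas, candidates)
print(timedelta_kind, integer_kind, mask)
```

Only the middle element of `deltas` appears in `candidates`, so the mask marks that position alone.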

@seberg @adeak could one of you have a look at this?

Thanks,
Miles

seberg

A few comments, I would lean towards using a .dtype.kind check, although that may require an update to not use in "ui?", but in ("u", "i", "?") for new dtypes soon...
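
A rough sketch of what such a kind check might look like, assuming nothing about the final code in the PR. The names `INTEGER_LIKE_KINDS` and `is_integer_like` are hypothetical. Note that NumPy reports the boolean dtype as kind "b" ("?" is its type character), so the sketch uses "b". The last two lines illustrate one plausible reason to prefer a tuple: `in` on a string is a substring test, so a kind code longer than one character could match by accident.

```python
import numpy as np

# Hypothetical constant and helper names, for illustration only.
# NumPy reports bool as kind "b"; "?" is the bool type character.
INTEGER_LIKE_KINDS = ("u", "i", "b")


def is_integer_like(arr):
    """Return True if arr has an unsigned, signed, or boolean dtype."""
    return np.asarray(arr).dtype.kind in INTEGER_LIKE_KINDS


print(is_integer_like(np.array([1, 2, 3])))
print(is_integer_like(np.array([True, False])))
print(is_integer_like(np.array([1, 2], dtype="timedelta64[s]")))

# String membership is a substring test; tuple membership is exact.
substring_match = "ui" in "ui?"
tuple_match = "ui" in ("u", "i", "?")
print(substring_match, tuple_match)
```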

numpy/lib/arraysetops.py

MilesCranmer · 2022-06-27T19:19:23Z

One potential issue: ar2 will be converted to uint8 if kind=None, and can still end up being passed to the kind="sort" method. Whereas if kind="sort", it will not be converted beforehand.

However, this never ends up happening in practice, since kind="table" will always handle boolean arrays, as their memory consumption is always under the limit.

MilesCranmer · 2022-06-27T19:24:10Z

Actually, it could happen, if ar1 is a bool and ar2 an integer array. ar2 could be over the range limit, and then ar1 will get converted to a uint8 if and only if kind=None, not kind="sort". But it would always use the sort method... Is that problematic or fine?

seberg · 2022-06-27T19:27:34Z

It's fine, but it would be great if you could make sure we have tests "documenting" how these are expected to behave (parameterizing the methods again ideally).

MilesCranmer · 2022-06-27T19:33:25Z

Sounds good.

Just so I understand, is np.in1d([True], [1]) == [True] the desired behavior, or would you plan to have it throw an error in the future?

seberg · 2022-06-27T19:36:03Z

It does not matter right now. We just make the test pass with whatever behavior current (or previous) NumPy had. If you think it should maybe fail, add a comment to that effect. But the day where this fails is probably also the day where np.concatenate() will fail for it. So I don't think a comment matters much (it would be a big, very intentional change anyway).

seberg

Thanks, going to apply those tweaks and merge in a bit. Especially thanks for adding those tests. More tests are always good, and now looking at it, tests e.g. for datetime arrays with NaT might be nice (OTOH, those should be fine, since they use sorting/unique, which in turn should be fine!)

numpy/lib/arraysetops.py

This is probably really mainly my personal opinion...

MilesCranmer added 3 commits June 27, 2022 17:38

TST: Create in1d test for timedelta input

00e090e

MAINT: fix in1d for timedelta input

730718c

TST: in1d raise ValueError for timedelta input

7664d81

github-actions bot added the 00 - Bug label Jun 27, 2022

seberg reviewed Jun 27, 2022

MAINT: Clean up type checking for isin kind="table"

b4128a9

MilesCranmer requested a review from seberg June 27, 2022 19:14

seberg reviewed Jun 27, 2022

MilesCranmer added 2 commits June 27, 2022 21:04

TST: Add test for mixed boolean/integer in1d

c25f0c7

MAINT: Increase readability of in1d type checking

e6a076c

MilesCranmer requested a review from seberg June 27, 2022 21:13

seberg approved these changes Jun 29, 2022

STY: Apply small code style tweaks

cee8e0a

seberg added the 06 - Regression label Jun 29, 2022

seberg merged commit f9bed20 into numpy:main Jun 29, 2022
