8000 BUG: Fix numpy.isin for timedelta dtype by MilesCranmer · Pull Request #21860 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

BUG: Fix numpy.isin for timedelta dtype #21860

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 29, 2022
Merged

Conversation

MilesCranmer
Copy link
Contributor

This PR fixes the issue discussed on #12065 and #21843 where 'timedelta64' was noted to be a subtype of numpy.integer. This in principle should detect any cases where int(np.min(ar2)) fails. This PR also adds unittests for these.

@seberg @adeak could one of you have a look at this?

Thanks,
Miles

Copy link
Member
@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments, I would lean towards using a .dtype.kind check, although that may require an update to not use in "ui?", but in ("u", "i", "?") for new dtypes soon...

@MilesCranmer MilesCranmer requested a review from seberg June 27, 2022 19:14
@MilesCranmer
Copy link
Contributor Author

One potential issue: ar2 will be converted to uint8 if kind=None, and can still end up being passed to the kind="sort" method. Whereas if kind="sort", it will not be converted beforehand.

However, this never ends up happening in practice, since kind="table" will always handle boolean arrays, as their memory consumption is always under the limit.

@MilesCranmer
Copy link
Contributor Author

Actually, it could happen, if ar1 is a bool and ar2 an integer array. ar2 could be over the range limit, and then ar1 will get converted to a uint8 if and only if kind=None, not kind="sort". But it would always use the sort method... Is that problematic or fine?

@seberg
Copy link
Member
seberg commented Jun 27, 2022

Its fine, but it would be great if you could make sure we have tests "documenting" how these are expected to behave (parameterizing the methods again ideally).

@MilesCranmer
Copy link
Contributor Author

Sounds good.

Just so I understand, is np.in1d([True], [1]) == [True] the desired behavior, or would you plan to have it throw an error in the future?

@seberg
Copy link
Member
seberg commented Jun 27, 2022

It does not matter right now. We just make the test pass with whatever behavior current (or previous) NumPy had. If you think it should maybe fail, add a comment to that effect. But the day where this fails is probably also the day where np.concatenate() will fail for it. So I don't think a comment matters much (it would be a big, very intentional change anyway).

@MilesCranmer MilesCranmer requested a review from seberg June 27, 2022 21:13
Copy link
Member
@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, going to apply those tweaks and merge in a bit. Especially thanks for adding those tests. More tests area always good, and now looking at it tests e.g. for datetime arrays with NaT might be nice (OTOH, those should be fine, since they use sorting/unique, which in turn should be fine!)

This is probably really mainly my personal opinion...
@seberg seberg merged commit f9bed20 into numpy:main Jun 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Second contribution
Development

Successfully merging this pull request may close these issues.

2 participants
0