-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
BUG: Fix return shape of inverse_indices in unique_inverse #25553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Should we also just change this in the normal version? It does look much like a bug-fix when |
I don't think we should make this change in
|
No strong opinion, although it seems so odd that I think we can do it, and no time like a 2.0 release to change such a thing that looks like an oversight to begin with. (If users worked around via |
Sure, I see what you're saying. I'm happy to update the PR to modify the original (and add appropriate docs/changelog) if you'd prefer that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to fix this!
Like @seberg, I think we should ideally fix this in np.unique
itself, since it is clearly a bug.
That said, np.unique
has an axis
argument, and with that the simple reshape is no longer generally correct. It would need something like the following:
arr = np.array([[[1, 2, 3], [2, 3, 1]]]*2 +[[[1, 3, 2], [2, 1, 3]]])
axis = 1 # for testing
a, i = np.unique(arr, axis=axis, return_inverse=T
8000
rue)
if axis is None:
i.shape = arr.shape
else:
i.shape = tuple((sh if ax == axis else 1) for ax, sh in enumerate(arr.shape))
# we probably want this too, to make the indices similar to argsort.
i = np.broadcast_to(i, arr.shape)
# check result is correct, in way also suggested for np.argsort, etc.
np.all(np.take_along_axis(a, i, axis=axis) == arr)
# all true
I think this change to |
In the triage meeting we reached a consensus to call the wrong 1d shape a bug, and fix it without any deprecation. It should be mentioned in the release notes. |
Thanks - I will update this PR to change the output shape for |
fbcedb6
to
6903f6c
Compare
OK, I changed the implementation of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Only a silly nitpick from me. Approving.
Note there is a linting error, so you can fix the versionchanged at the same time... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again!
Code was written to expect a 1-D array with the inverse indices, and that got changed in NumPy in numpy/numpy#25553. Closes scipygh-19867 [skip cirrus] [skip circle]
This broke SciPy in two places, in ways that were a little nontrivial to diagnose: scipy/scipy#19868. The fix does make sense, but more complaints may roll in - this probably broke most usages of the returned reverse indices. |
Hmm, @jakevdp was clearly right to worry about breaking things. Arguably in cases like this we should run the scipy tests... Thanks for fixing! |
@rgommers do you think this change should be reverted? I do think the change in the |
I don't know, I'd give it a week or so - if no issues turn up for other libraries that test against numpy nightlies, it's probably okay to keep the change. |
One more data point: this broke some code in an unreleased project of mine ( NumPy 1.26.4:
NumPy 2.0.0rc2
I can easily fix the issue, and since we haven't heard of any other problems related to this, this doesn't open up again the question of reverting the change. Just FYI. |
Well, I thought the reason why this was different in the Array API was because the current behavior doesn't make sense... If that was the case, I would not want to revert this. But, unfortunately, that impression was seems wrong. There is no big reason for EDIT: Sorry, forgot to cross-ref gh-26738 |
There was a good argument that it is not possible to reconstruct the original array with `axis=None` without first reshaping and changing the result shape helped with it. However, it was always possible to do it for other axis values by using `np.take` rather than `np.take_along_axis`. Changing it for all axis values is unnecessary to achieve reconstruction because `np.take(arr, inverse, axis=axis)` already performed the job except for `axis=None`. Thus, this keeps the change for axis=None, but reverts numpygh-25553 for numerical axis.
Just FYI, this is now reverted except for |
There was a good argument that it is not possible to reconstruct the original array with `axis=None` without first reshaping and changing the result shape helped with it. However, it was always possible to do it for other axis values by using `np.take` rather than `np.take_along_axis`. Changing it for all axis values is unnecessary to achieve reconstruction because `np.take(arr, inverse, axis=axis)` already performed the job except for `axis=None`. Thus, this keeps the change for axis=None, but reverts numpygh-25553 for numerical axis.
There was a good argument that it is not possible to reconstruct the original array with `axis=None` without first reshaping and changing the result shape helped with it. However, it was always possible to do it for other axis values by using `np.take` rather than `np.take_along_axis`. Changing it for all axis values is unnecessary to achieve reconstruction because `np.take(arr, inverse, axis=axis)` already performed the job except for `axis=None`. Thus, this keeps the change for axis=None, but reverts numpygh-25553 for numerical axis.
Fixes #25552