Python nditer should reinsert removed axes into its iteration results · Issue #9808 · numpy/numpy · GitHub

Python nditer should reinsert removed axes into its iteration results #9808

Open
eric-wieser opened this issue Oct 2, 2017 · 11 comments

@eric-wieser
Member
eric-wieser commented Oct 2, 2017

When iterating in C, all that's returned is a pointer to the start of the array, and it's the user's job to index it using their own record of its strides and dimensions (as happens in the gufunc inner loops).

However, the Python nditer[i] API always returns 0d arrays, and does not produce a view over the axes that were removed.

My feeling is that this:

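import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

# Iterate over axis 0 only; axis 1 of each operand is removed from the iteration.
it = np.nditer((a, b), op_axes=[[0], [0]])

while not it.finished:
    print(it.value)
    it.iternext()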

should output

(array([1, 2]), array([10, 20]))
(array([3, 4]), array([30, 40]))

whereas it outputs at the moment

(array(1), array(10))
(array(3), array(30))

Can someone with a better understanding of NdIter confirm that my intuition is correct here?

@seberg
Member
seberg commented Oct 2, 2017

I don't think the Python-side API was ever really aimed at serious usage outside of testing, understanding the C side, and some work in the direction of numexpr. I guess it just mirrors the C-API too closely.
In this case, yes, one could try to change it. The change would not even affect the only thing of real interest here (the pointer to the data), so it might even be safe for someone actually using it...

But overall, I think I would rather have an ndarray.iteraxis(axis=None) or some other new iterator than try to make nditer nice enough to actually advertise. But I suppose you want to do some advanced stuff inside numpy itself, so I'm not sure what you need.
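
For illustration, here is a minimal sketch of what iterating along a single axis might look like. The name iteraxis is taken from the comment above, but this standalone function, its axis=0 default, and its behavior are assumptions made for the example; it is not an existing NumPy API and relies only on np.moveaxis.

```python
import numpy as np

def iteraxis(arr, axis=0):
    """Hypothetical helper: yield views of `arr`, one per index along `axis`."""
    # Move the chosen axis to the front; iterating an ndarray walks axis 0.
    return iter(np.moveaxis(arr, axis, 0))

a = np.array([[1, 2], [3, 4]])
rows = [row.tolist() for row in iteraxis(a, axis=0)]  # slices along axis 0
cols = [col.tolist() for col in iteraxis(a, axis=1)]  # slices along axis 1
```

Each yielded item keeps the remaining axes, which is the behavior the issue asks nditer to provide.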

@mhvk
Contributor
mhvk commented Oct 2, 2017

Hmm, it may not be intended, but we're definitely using nditer in astropy. The cases I can think of all iterate over all axes, so it is not an issue, but @eric-wieser's expectation seems reasonable.

@seberg
Member
seberg commented Oct 2, 2017

I don't think using it from Python is unintended, or a bad habit, but I personally believe it has too many quirks to call it a reasonable Python API. So it is a very advanced feature and probably not used a lot.

@eric-wieser
Member Author

I don't think that the python side API was ever really aimed for serious usage outside of ... understanding the C-side

This was my use case, and I found it very confusing that the removed axes just seemed to disappear.

@WillAyd
Contributor
WillAyd commented May 20, 2020

AFAICT this behavior can be traced to here:

I've noticed if hard-coding

        innerloopsize = 2;
        innerstride = 8;
        /* If the iterator is going over every element, return array scalars */
        ret_ndim = 1;

You get the expected result above. Obviously that's not a real solution. I tried to use innerloopsizeptr and innerstrides, but they appear to segfault on access from the !has_external_loop branch, so I assume they are coupled to some state management.

I will try to dig further and see if there is a solution here; I'm just sharing the above in the interim in case it's of use to anyone else.

@seberg
Member
seberg commented May 20, 2020

That looks right, if op_axes is used, this could be modified. Maybe we should just do that with a big release notes warning? The current method is pretty useless after all, and if you use np.stride_tricks or similar on the (assumed 0D) output array to get the old one back, then it would not actually break if we change this here...
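
As a rough sketch of the stride-tricks workaround mentioned above: each 0D value is treated as a pointer to the start of a slice, and the removed axis is reattached with as_strided. This assumes the operands are iterated in place (no buffering or casting copies) and that the caller knows which axis was removed; the shape[1:] and strides[1:] slicing is specific to this example.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])
operands = (a, b)

it = np.nditer(operands, op_axes=[[0], [0]])
restored = []
while not it.finished:
    # Each value is a 0D array located at the start of one row.
    # Reattach the removed axis 1 using the operand's own shape and strides,
    # then copy so the result does not depend on the iterator's lifetime.
    restored.append(tuple(
        as_strided(v, shape=op.shape[1:], strides=op.strides[1:]).copy()
        for v, op in zip(it.value, operands)
    ))
    it.iternext()
```

Because as_strided does no bounds checking, passing a wrong shape or stride here would read unrelated memory, so this is only suitable as an experiment.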

@eric-wieser
Member Author

I think changing this is fine. I think I remember trying to fix this in the past, and not doing so because it was hard, not because it might break compatibility.

@WillAyd
Contributor
WillAyd commented May 21, 2020

Cool, I'll give it a look over the next few days and see if I can push a PR.

@WillAyd
Contributor
WillAyd commented May 29, 2020

Any pointers on how to introspect the size of the operands at this point during iteration? I think the stride information is accessible through the **dtypes struct member of NewNpyArrayIterObject_tag

@seberg
Member
seberg commented May 29, 2020

It is a bit tricky, but basically, I do not think the NpyIter will help you at all. I do not even see that it holds on to the op_axes and you do not really need to know the iteration order.

What I think you have to do is extract where the -1 elements are, and store their dimensions and strides and the number... Probably on the python side, unless you want to add this as a helper to NpyIter itself (which is maybe not implausible).
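
One possible reading of this bookkeeping, sketched on the Python side: for a single operand, record the shape and strides of every operand axis that its op_axes entry does not map to an iterator axis. The helper name removed_axes_layout is hypothetical, and treating "removed" as "not listed in op_axes" is an assumption about what is intended.

```python
import numpy as np

def removed_axes_layout(op, op_axes):
    """Hypothetical helper: (shape, strides) of the axes of `op` that are
    absent from `op_axes`, i.e. the axes the iterator does not walk."""
    # -1 entries are placeholders with no operand axis behind them.
    kept = {ax for ax in op_axes if ax != -1}
    removed = [ax for ax in range(op.ndim) if ax not in kept]
    shape = tuple(op.shape[ax] for ax in removed)
    strides = tuple(op.strides[ax] for ax in removed)
    return shape, strides

x = np.arange(24, dtype=np.int64).reshape(2, 3, 4)
layout_a = removed_axes_layout(x, [0])       # axes 1 and 2 are removed
layout_b = removed_axes_layout(x, [1, -1])   # axes 0 and 2 are removed
```

The stored shape and strides are what would be needed to build a view over the removed axes from each data pointer.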

@eric-wieser
Member Author

Any pointers on how to introspect the size of the operands at this point during iteration? I think the stride information is accessible through the **dtypes struct member of NewNpyArrayIterObject_tag

This sounds like the issue that stumped me. I think my conclusion was that tracking that state was the caller's problem, but I don't remember too well.

4 participants