ENH: ndrange, like range, but multidimensional #12094

hmaarrfk · 2018-10-05T16:26:26Z

I've been using numpy arrays as a way to store collections of items in 2D. Mostly because of the powerful slicing numpy offers.

It became useful to iterate through the collection in various ways, often wanting to use pythonic tools like range.

I wrote what I think is a cool ndrange class that behaves like range (I hope) but in ND.

If this is in the scope of Numpy, I would appreciate feedback on how this might be merged in.

Performance concerns

@ahaldane asked on the mailing list if it would be better to implement this on top of nditer like ndindex is currently implemented for performance reasons.

It seems that itertools.product + range (the proposed ndrange implementation) is faster than nditer + multi_index (the current ndindex implementation):

In [3]: %%timeit
   ...: for it in np.ndrange((100, 100)):
   ...:     pass
   ...:
238 µs ± 1.63 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [4]: %%timeit
   ...: for it in np.ndindex((100, 100)):
   ...:     pass
   ...:
3.67 ms ± 44.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
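For readers without the branch checked out, the itertools.product + range approach timed above can be sketched in pure Python. This is only an illustration of the idea, not the code in the PR; the name `product_indices` is made up for this example.

```python
import itertools


def product_indices(shape):
    """Yield index tuples in C order (last axis varies fastest).

    Hypothetical helper, not part of NumPy or of the PR.
    """
    # One range per axis; itertools.product runs the nested loops in C.
    return itertools.product(*(range(n) for n in shape))


print(list(product_indices((2, 3))))
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```

The tuples come out in the same order that np.ndindex produces for the same shape, which is why the two are comparable in the timings.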

It seems that ndindex can be sped up by introducing

    def __init__(self, *shape):
        # [...]
        # Defining a generator expression here, as opposed to doing this
        # in __next__, seems to speed up ndindex by a factor of 1.7.
        self._index_iter = (self._it.multi_index for _ in self._it)

    def __next__(self):
        return next(self._index_iter)

This seems like a bad optimization for CPython though.
More details can be found here https://gist.github.com/hmaarrfk/a273b3d77cbec6e7b0d1c4f33ea65dd0

Previous attempts

I tried to extend the ndindex class, but it just became very ugly:
https://github.com/hmaarrfk/numpy/pull/1/files#diff-1bd953557a98073031ce66d05dbde3c8R661
Containment was also strange because an ndindex object can itself be consumed. What would it mean to check whether (0, 0, 0) is in an ndindex after a few items have been consumed?
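The containment problem can be illustrated with a small pure-Python sketch. `NDRangeSketch` is a hypothetical stand-in, not the class from the PR: it keeps one range per axis, hands out a fresh iterator on every call to iter(), and answers membership from the ranges alone. The per-axis check mirrors the `all(i in r ...)` line quoted from the diff later in this thread; the length check is an addition made for this example.

```python
import itertools


class NDRangeSketch:
    """Hypothetical stand-in for ndrange: one range per axis."""

    def __init__(self, shape):
        self._ranges = tuple(range(n) for n in shape)

    def __iter__(self):
        # A fresh iterator each time; the object itself is never consumed.
        return itertools.product(*self._ranges)

    def __contains__(self, index):
        # Membership is answered per axis, independent of iteration state.
        return len(index) == len(self._ranges) and all(
            i in r for i, r in zip(index, self._ranges)
        )


r = NDRangeSketch((2, 2, 2))
it = iter(r)
next(it)                # consume (0, 0, 0) from the iterator
print((0, 0, 0) in r)   # True
print((0, 0, 0) in it)  # False: the iterator has already moved past it
```

Note that the last line also exhausts `it`, since membership on a plain iterator falls back to scanning it.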

~~This just won't fly in python 2 because range or xrange objects cannot be sliced into the way they are in python 3.~~ I enabled python 2.7 support, but it isn't as elegant as 3.X support.

Todo:

Add docstrings
Harden tests for failures
decide on how to handle lists as inputs
Remove development comments
Finish the rest of testing for the properties of ndrange
Add tests for ravel
Are flat and ravel required to return standard numpy classes?
Squash the PRs into ENH, TST (once the fate of ravel and flat is determined).
Enforce that step cannot be provided as a parameter when start and stop aren't both specified. ndrange(5, step=2) should be illegal.
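The last item could be enforced with an argument check along these lines. This is a sketch under assumptions: the thread does not show the constructor signature, so `normalize_args`, its tuple-only arguments, and the choice of TypeError are all hypothetical.

```python
def normalize_args(start, stop=None, step=None):
    """Return (start, stop, step) tuples for a range-like N-d constructor.

    Hypothetical helper: a single positional argument is treated as
    `stop`, as with range(), and `step` is rejected in that case.
    """
    if stop is None:
        if step is not None:
            raise TypeError("step requires both start and stop")
        start, stop = (0,) * len(start), start
    if step is None:
        step = (1,) * len(stop)
    return tuple(start), tuple(stop), tuple(step)


print(normalize_args((5,)))              # ((0,), (5,), (1,))
print(normalize_args((1,), (9,), (2,)))  # ((1,), (9,), (2,))
try:
    normalize_args((5,), step=(2,))
except TypeError as exc:
    print(exc)                           # step requires both start and stop
```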

Fixes: #6393
12094-before_rebase.patch.txt

~~Patch from 2018 before rebase~~ The conflicts were minor, so I just rebased.

mattip · 2018-10-07T07:09:16Z

This should go to the mailing list for more general discussion. Please describe there the motivation and a general overview of the way you chose to implement it.

numpy/lib/index_tricks.py

eric-wieser · 2018-10-08T10:03:22Z

I'm not sure that ndrange should have a concept of an order - that sounds like a property of its iterator, just like a list object does not have a reversed attribute.

hmaarrfk · 2018-10-08T12:02:13Z

interesting idea, so the syntax would be:

for i in iter(ndrange(arr.shape)[::2], order='F'):
    print(i)

I've been debating getting rid of the order parameter altogether. I included it because, conceptually, some things are easier if you "do them along the first column first".

Edit:
This syntax doesn't even work. Oh well... I'm slowly growing in favour of just not having order and simply letting people do this:

for i in np.ndrange(arr.shape[::-1]):
    i = i[::-1]

to iterate in F order
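The same reversal trick can be tried without ndrange. The sketch below uses itertools.product as a stand-in for C-order iteration; `c_order` and `f_order` are names made up for this example.

```python
import itertools


def c_order(shape):
    """C-order index tuples (last axis varies fastest)."""
    return itertools.product(*(range(n) for n in shape))


def f_order(shape):
    """F-order index tuples (first axis varies fastest)."""
    # Iterate over the reversed shape in C order, then flip each index back.
    return (idx[::-1] for idx in c_order(shape[::-1]))


print(list(f_order((2, 3))))
# [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 2)]
```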

hmaarrfk · 2018-10-10T03:53:57Z

Not sure if we need a release note, but this is a draft

``ndrange`` a multi-dimensional range-like object
-------------------------------------------------

``ndrange`` now supersedes the ``ndindex`` object for generating multi-index
iterators. ``ndrange`` behaves much like the Python 3 ``range`` object and
allows multi-dimensional slicing, iteration, reversal and efficient containment
lookup. ``ndindex`` will continue to be maintained for backward compatibility.
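To make the draft note concrete, here is a rough sketch of how multi-dimensional slicing and reversal can fall out of per-axis Python 3 range objects. It is not the PR's implementation; `nd_slice` and `nd_reversed` are hypothetical helpers.

```python
import itertools


def nd_slice(ranges, slices):
    """Slice each axis range with the matching slice object."""
    # Python 3 range objects support slicing and return a new range.
    return tuple(r[s] for r, s in zip(ranges, slices))


def nd_reversed(ranges):
    """Iterate over the index tuples in reverse C order."""
    # Reversing every axis reverses the whole lexicographic sequence.
    return itertools.product(*(reversed(r) for r in ranges))


ranges = (range(4), range(6))
sub = nd_slice(ranges, (slice(None, None, 2), slice(1, None, 3)))
print(sub)                     # (range(0, 4, 2), range(1, 6, 3))
print(list(nd_reversed(sub)))  # [(2, 4), (2, 1), (0, 4), (0, 1)]
```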

hmaarrfk · 2018-10-10T04:02:20Z

And here is the patch for Fortran ordering. I no longer think it is useful, but maybe somebody else wants to have a go at it. The tests might be useful.

ndrange_fortran_ordering_patch_and_tests.patch.txt

numpy/lib/index_tricks.py

ahaldane · 2018-10-10T16:55:42Z

numpy/lib/index_tricks.py

+        return all(i in r for i, r in zip(index, self._ranges))
+
+    def __getitem__(self, sl):
+        # TODO: what is the correct way to handle non-tuple inputs?


The behavior here seems good. But maybe we could add a nicer error message? Currently if you pass in a list, you get:

>>> np.ndrange((1,2,3))[[0,0,0]]
TypeError: range indices must be integers or slices, not list

It would be nice if that said something about how the index can be a tuple.

Related: #9686. We want to encourage people to index with tuples instead of lists.

I don't know how to get the not list part. Checking for types in python is hard....
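One way to produce the "not list" part is `type(index).__name__`. The sketch below shows a message that also mentions tuples; `check_index` and its wording are hypothetical, and for brevity it only accepts tuples.

```python
def check_index(index):
    """Return `index` if it is a tuple, else raise a descriptive TypeError.

    Hypothetical helper, not the check used in the PR.
    """
    if not isinstance(index, tuple):
        raise TypeError(
            "ndrange indices must be a tuple of integers or slices, "
            f"not {type(index).__name__}"
        )
    return index


try:
    check_index([0, 0, 0])
except TypeError as exc:
    print(exc)
# ndrange indices must be a tuple of integers or slices, not list
```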

ahaldane · 2018-10-10T17:02:49Z

numpy/lib/index_tricks.py

+    """
+    An N-dimensional range object that returns an iterator (or reversed
+    iterator) tthat can produce a sequence of tuples from
+    start (inclusive tuple) to stop (exclusive tuple) by step.


Some of this isn't quite accurate. It doesn't return an iterator, rather, it can be iterated.

What about more closely adapting the range docstring?

I tried. Let me know what you think.

ahaldane · 2018-10-10T17:03:29Z

numpy/lib/index_tricks.py

+
+    Unlike ``ndindex``, ``ndrange`` is not an iterator.
+    You should prefer this ``ndrange`` object over ``ndindex`` as ``ndrange``
+    allows for multi-dimensional slicing.


I think this comment belongs in the nditer docstring, but isn't needed in the ndrange one.

ok. adapted.

numpy/lib/index_tricks.py

mattip · 2024-09-04T18:39:59Z

Did this ever reach the mailing list as an API change?

mattip · 2024-09-04T18:41:31Z

Yes, it did https://mail.python.org/archives/list/numpy-discussion@python.org/message/G6JAONQ4BKT4FNQIBHOAHM6BC3CZ7VO2/

hmaarrfk · 2024-09-04T20:21:27Z

Let's see how well 6-year-old code survives...

hmaarrfk · 2024-09-04T20:32:56Z

I would love to run my benchmark again in 2024, but I keep running into:

Using pip 24.2 from /home/mark/miniforge3/envs/np/lib/python3.12/site-packages/pip (python 3.12)
Non-user install because site-packages writeable
Created temporary directory: /tmp/pip-build-tracker-5gzlhngk
Initialized build tracking at /tmp/pip-build-tracker-5gzlhngk
Created build tracker: /tmp/pip-build-tracker-5gzlhngk
Entered build tracker: /tmp/pip-build-tracker-5gzlhngk
Created temporary directory: /tmp/pip-install-6hyb4gpy
Created temporary directory: /tmp/pip-ephem-wheel-cache-579d6re2
Obtaining file:///home/mark/git/numpy
  Added file:///home/mark/git/numpy to build tracker '/tmp/pip-build-tracker-5gzlhngk'
  Running command Checking if build backend supports build_editable
  Checking if build backend supports build_editable ... done
  Created temporary directory: /tmp/pip-modern-metadata-z_g4s_0q
  Running command Preparing editable metadata (pyproject.toml)

  meson-python: error: Could not find meson version 0.63.3 or newer, found .
  error: subprocess-exited-with-error

I installed my environment with:

# using conda-forge.
mamba create --name np python=3.12 meson-python ninja pkg-config python-build cython libblas libcblas liblapack compilers

tips appreciated

hmaarrfk · 2024-09-04T21:11:42Z

I used the power of copy and paste to obtain the following local benchmarks:

In [11]: %%timeit
    ...: for it in np.ndrange((100, 100)):
    ...:     pass
    ...:
143 μs ± 2.51 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [12]: %%timeit
    ...: for it in np.ndindex((100, 100)):
    ...:     pass
    ...:
1.15 ms ± 16.4 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [13]: np.__version__
'2.1.1'

In [14]: import sys

In [15]: sys.version
'3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]'

hmaarrfk · 2024-09-04T22:02:28Z

numpy/lib/_index_tricks_impl.py

+
+    Notes
+    -----
+    .. versionadded:: 2.2.0


is this correct??

Sure, assuming we review and merge before the 2.2 release :)

hmaarrfk · 2024-09-04T23:48:06Z

And green

mattip · 2024-09-05T03:44:55Z

@ahaldane @shoyer @eric-wieser as people involved in the original mailing list thread, any thoughts?

shoyer · 2024-09-05T05:14:04Z

In the many years since that mailing list discussion, I've gotten a bit more conservative in my taste for API design.

It is still not clear to me who would use this feature. It feels like a solution that exists for the sake of "elegance" or "symmetry," not actual user needs.

If the intended users are library authors doing indexing manipulations for arrays, then @asmeurer's ndindex is a much more complete solution.

hmaarrfk · 2024-09-05T11:44:25Z

There are two places where we use it:

To iterate faster than ndindex. When you add a time dimension to your problem, one of the dimensions can quite easily reach the 1000s.
ndindex just feels feature-incomplete (no skipping). This makes it hard to use generically.

I understand if this feels niche. I can try to take the concepts from here and make PRs to improve the performance of ndindex. Historically, Python 2.7 made that difficult. I couldn't figure out how to allow for slicing and the speedups without breaking functionality, so I proposed ndrange.

We use it to iterate over specific dimensions of our xarray all the time.

ahaldane · 2024-09-05T22:56:13Z

Looking at the old discussion, this again seems like a hard PR to decide what to do with. I think a more active maintainer should make a decision BDFL-style.

I'd be 60/40 in favor of accepting. The downside is the clutter of having two functions for largely the same thing, kind of like np.ones and np.ones_like. As in that case, ndrange here would be the updated way of doing what ndindex does now, with some more flexibility. The upside is that maybe a couple of people besides @hmaarrfk will find it useful. Just like many of the other obscure functions in numpy/lib, it's hard to judge the cost/benefit, since one should aim to use vectorized operations instead of ndindex anyway.

@shoyer pointed to the Quansight ndindex project, which is new since this PR was opened. There's a lot of overlap, but while that project looks great, it doesn't seem to have exactly the performance or convenience of an "nd version of python3's range" like np.ndindex or ndrange here.

hmaarrfk · 2024-09-05T23:01:53Z

I’ll close this, leaving it as a good idea for one user of numpy.

If another user thinks it is good and revives this discussion, then great!

If ndindex or another project takes off, all the better.

Keeping this as an internal pure python function in our own codebase is also fine.

Thank you all for your time and review.

I’m much more excited about spending our collective time on performance improvement ideas I have ;)

ahaldane · 2024-09-06T16:52:03Z

Fair enough. It's a nice idea on its own, too bad ndindex needs backcompat.

If you have enough interest, you might consider proposing it to the Quansight project, or just put it up on GitHub as a standalone package. That might show enough people using it to justify upstreaming it here.

hmaarrfk force-pushed the ndrange branch 3 times, most recently from 78fe0c4 to f4a8950 Compare October 5, 2018 17:00

tylerjereddy added the 25 - WIP label Oct 5, 2018

hmaarrfk force-pushed the ndrange branch 3 times, most recently from 64fcf43 to 109aef9 Compare October 6, 2018 05:46

hmaarrfk changed the title ~~WIP: Feedback request: FEAT: ndrange, like range, but multidimensiontal~~ WIP: Feedback request: FEAT: Py3 only: ndrange, like range, but multidimensiontal Oct 6, 2018

hmaarrfk changed the title ~~WIP: Feedback request: FEAT: Py3 only: ndrange, like range, but multidimensiontal~~ WIP: Feedback request: FEAT: ndrange, like range, but multidimensiontal Oct 7, 2018

hmaarrfk force-pushed the ndrange branch from 2227dcf to aee3746 Compare October 7, 2018 06:09

hmaarrfk changed the title ~~WIP: Feedback request: FEAT: ndrange, like range, but multidimensiontal~~ FEAT: ndrange, like range, but multidimensiontal Oct 7, 2018

mattip changed the title ~~FEAT: ndrange, like range, but multidimensiontal~~ ENH: ndrange, like range, but multidimensiontal Oct 7, 2018

hmaarrfk changed the title ~~ENH: ndrange, like range, but multidimensiontal~~ WIP: ENH: ndrange, like range, but multidimensiontal Oct 7, 2018

hmaarrfk changed the title ~~WIP: ENH: ndrange, like range, but multidimensiontal~~ ENH: ndrange, like range, but multidimensiontal Oct 7, 2018

hmaarrfk force-pushed the ndrange branch 2 times, most recently from 7973163 to e60bd10 Compare October 7, 2018 15:32

eric-wieser reviewed Oct 8, 2018

hmaarrfk force-pushed the ndrange branch from 98050eb to 22f33da Compare October 9, 2018 04:57

hmaarrfk force-pushed the ndrange branch from 4188a9b to f67f966 Compare October 10, 2018 03:54

hmaarrfk force-pushed the ndrange branch from f67f966 to ded479c Compare October 10, 2018 04:02

ahaldane reviewed Oct 10, 2018

ahaldane reviewed Oct 10, 2018

hmaarrfk force-pushed the ndrange branch from b286286 to e6b37a7 Compare September 4, 2024 20:19

hmaarrfk force-pushed the ndrange branch 2 times, most recently from 38e3c75 to 9902492 Compare September 4, 2024 20:29

hmaarrfk force-pushed the ndrange branch from 9902492 to ebc6fc9 Compare September 4, 2024 20:34

DOC: Add a release note about ndrange

3aaef83

hmaarrfk force-pushed the ndrange branch from ebc6fc9 to 1b0e332 Compare September 4, 2024 21:01

hmaarrfk force-pushed the ndrange branch 2 times, most recently from 23ffe4c to 69f83af Compare September 4, 2024 21:21

hmaarrfk commented Sep 4, 2024

hmaarrfk force-pushed the ndrange branch 2 times, most recently from f4e868c to 49055b5 Compare September 4, 2024 22:04

hmaarrfk added 2 commits September 4, 2024 18:22

ENH: Add a new type of iterator generator, ndrange

df70982

DOC: add ndrange to the documented routines

6ebdb86

hmaarrfk force-pushed the ndrange branch from 49055b5 to 6ebdb86 Compare September 4, 2024 22:23

mattip added triage review Issue/PR to be discussed at the next triage meeting and removed 52 - Inactive Pending author response labels Sep 5, 2024

hmaarrfk closed this Sep 5, 2024

InessaPawson removed this from Inactive PR Management Apr 19, 2025
