Allow rolling multiple axes at the same time. #7438

anntzer · 2016-03-20T01:56:46Z

Also switched the error message to match the one of ufuncs, because the
axis can actually also be negative.

seberg · 2016-03-20T10:48:50Z

Sounds like a good idea and straightforward generalization. Two comments:

I am not sure I like the use of broadcast, on the one hand broadcasting them makes sense, on the other hand though broadcast as used is much more general. And I am not sure the broadcasting itself is big gain here.
The single axis case probably got quite a lot slower here (no idea how much, but I would not be surprised by 3-10 times). It is unfortunate, since oindex would solve this generally, but.... Maybe we should keep a one axis special case?

seberg · 2016-03-20T11:58:27Z

numpy/core/numeric.py

+    shift : int or tuple of ints
+        The number of places by which elements are shifted.  If a tuple,
+        then `axis` must be a tuple of the same size, and each of the
+        given axes is shifted by the corresponding number.  If an int


Would be good to add a ..versionadded:: thingy here for the sequence of ints (also slightly prefer sequence over tuple).

I guess it should really be "array-like of ints"? I kept tuple mostly for consistency with e.g. np.sum.

Oh, OK, no I prefer tuple or sequence of ints. Thought we used sequence mostly, but if we use tuple as well, just keep it as it is. It currently is more an "array-like of ints" in implementation, but I don't like that too much to be honest (you could put in a 2x2 array...).

So to be clear. If you don't mind just ignoring the broadcasting for now, I think I would prefer something like:

try: axes = tuple(axis) except: axes = (axis,)

or the inverted try: operator.index(axis). But if the broadcasting is important to you, maybe as is, is just simpler, hmmm.

Well, I did not know that iterating over a broadcast object would result in nd-iteration (i.e. a 2x2 axis would get flattened out at that point).
I still like the broadcasting behavior, though.

I guess you could just check broadcast(...).shape to be 0 or 1D and raise an error otherwise, that would seem acceptable to me. Or multiply the length of the tuples manually if they are length 1, but probably that gets annoying.

Coming back to the tuple-vs-sequence issue, I just noticed that sum, for example, would error out if a non-tuple sequence is passed as the axis argument. Not sure if that means sum should be improved or we should only accept tuples to avoid weird edge cases...

I think it does not matter. In most of the python side stuff, we will allow any sequence and sometimes iterable (tuple(iterable) works). I don't think we have to strife for perfect consistency when it comes to tuples vs. sequences vs. iterables for the this type of arguments.

If you prefer, you can make it a strict tuple here as well, but basically I would pick whatever is easiest.

It seems we got side tracked here and forgot about the versionadded tag. It would be good to mention somewhere that roll along multiple axes was added with this tag. IIRC it could either go here, or also into the Notes section.

anntzer · 2016-03-20T15:20:58Z

The first implementation (one-axis-at-a-time, 79aae0ad4a31cc86a18a36f7c8028428f560a9cc) does not suffer from the slowdown (which is around 3 to 4-fold), but loses on multidimensional rolling (it rolls each axis one at a time). What is the status of oindex? If it's "it's going to get merged at some distant but finite point in the future" then I may as well wait for oindex to get in and then rewrite this accordingly, as there isn't any real hurry there.

seberg · 2016-03-20T15:38:14Z

numpy/core/tests/test_numeric.py

+
+        x2r = np.roll(x2, (1, 1), axis=(0, 1))
+        assert_equal(x2r, np.array([[9, 5, 6, 7, 8], [4, 0, 1, 2, 3]]))
+


I would like some tests here that actually do roll at least two dimensions at the same time. And now that I think about it, what happens if you roll the same dimension twice ;)? Some simple error cases would be good, too.

Oh, the last case does that kind of roll I guess. Negative roll might be interesting.

seberg · 2016-03-30T07:57:07Z

Yeah, it probably is about that. But there is no definite schedule, so I think you also might as well put something in here now ;). Also there is a bit of a problem with subclass handeling, which might create problems. The other option would be to actually do the manual assignments and only use slicing. In some sense I think it is the best way, but probably also tedious.

anntzer · 2016-03-30T16:34:25Z

What's wrong with subclass handling? If a subclass messes up fancy indexing I can't do much about it...

seberg · 2016-03-30T16:42:47Z

I think I would prefer to make oindex break with subclasses that have a __getitem__ but not themself implement oindex, so if we use oindex to implement it, we could break such a subclass.

seberg · 2016-03-30T16:43:08Z

(unless this function casts them to an array in any case, but I do not think it does)

anntzer · 2016-03-30T18:36:26Z

It casts using asanyarray, which seems right in this case.

seberg · 2016-03-30T19:17:19Z

Yeah, which should be fine mostly, there are some wonky cases, but those are probably limited to np.matrix ;). Likely over engineered, but what about such an approach?:

rolls = [((slice(None),)*2,)] * arr.ndim

for ax, offset in zip(axis, shift):  # assuming unique and normalized
    if offset == 0:
        continue
    # first chunk original arr and result:
    chunk_one = (slice(None, -offset), slice(offset, None))
    # second chunk original arr and result:
    chunk_two = slice(-offset, None), slice(None, offset)
    rolls[ax] = (chunk_one, chunk_two)

result = np.empty_like(arr)
for indices in itertools.product(*rolls):
    arr_index, res_index = zip(*indices)
    result[res_index] = arr[arr_index]

anntzer · 2016-03-30T21:17:26Z

Neat, and faster than even the original (1D) implementation.

seberg · 2016-03-31T06:33:31Z

Yeah, should be faster, except maybe for small arrays. Can you add a test for large rolls (rolls larger than the dimension)? Since I think there is a modulo operation missing as is and the test should have noticed that.

seberg · 2016-03-31T08:21:31Z

Another point might be difference for very weird subclasses, hmmmmm.

anntzer · 2016-03-31T08:31:54Z

Added the missing modulo on the shift, and test cases.
I'm sure you can always come up with subclasses that break some behavior of ndarray indexing but what can you do about it?

seberg · 2016-03-31T08:43:06Z

I was thinking about array wrap now. I forgot about things such as astropy units, which might be broken by the empty_like call compared to the indexing calls that were used before.

anntzer · 2016-03-31T09:01:43Z

Astropy units seem to handle this fine. Masked arrays too.

seberg · 2016-03-31T17:25:50Z

OK, if units work, I think we can give it a try as is. Did not go over the code again right now, but I think it should be fine, so will merge in a bit if no-one else has comments (or I notice something later).

seberg · 2016-04-02T19:15:56Z

numpy/core/numeric.py

-    return res
+        broadcasted = broadcast(shift, axis)
+        if len(broadcasted.shape) > 1:
+            raise ValueError("'axis' should be a scalar or a 1D sequence")


should maybe mention shift and axis.

seberg · 2016-04-02T19:17:29Z

Could you please squash the commits when done?

@seberg

A quick test suggests that this implementation from @seberg, relying on slices rather than index arrays, is 1.5~3x faster than the previous (1D) roll (depending on the axis). Also switched the error message for invalid inputs to match the one of ufuncs, because the axis can actually also be negative.

anntzer · 2016-04-02T19:23:39Z

Done.

seberg · 2016-04-02T19:32:27Z

Thanks! lets give it a shot.

seberg · 2016-04-02T19:32:46Z

Ah, will wait for the tests, so might not do it tonight ;).

seberg · 2016-04-02T20:07:00Z

Thanks again.

jakirkham · 2016-10-12T19:44:31Z

Nice was just looking for something like this to dedup some utility code. Is this going in 1.12.0 then?

rgommers · 2016-10-13T07:53:24Z

Is this going in 1.12.0 then?

yep

anntzer force-pushed the multidim-roll branch from 71796df to 0082082 Compare March 20, 2016 03:52

seberg reviewed Mar 20, 2016
View reviewed changes

seberg added 01 - Enhancement component: numpy.lib component: numpy._core and removed component: numpy.lib labels Mar 21, 2016

anntzer force-pushed the multidim-roll branch from de64ecf to 79906d8 Compare March 31, 2016 08:30

seberg reviewed Apr 2, 2016
View reviewed changes

anntzer force-pushed the multidim-roll branch from cbebfcd< 8000 /code>to d3204c8 Compare April 2, 2016 19:17

anntzer force-pushed the multidim-roll branch from d3204c8 to 612caca Compare April 2, 2016 19:23

seberg merged commit 4b228a5 into numpy:master Apr 2, 2016

homu mentioned this pull request Apr 2, 2016

Added pathlib support for several functions #6660

Merged

anntzer deleted the multidim-roll branch April 2, 2016 21:08

rgommers added this to the 1.12.0 release milestone Oct 13, 2016


		x2r = np.roll(x2, (1, 1), axis=(0, 1))
		assert_equal(x2r, np.array([[9, 5, 6, 7, 8], [4, 0, 1, 2, 3]]))

Uh oh!

Allow rolling multiple axes at the same time. #7438

Allow rolling multiple axes at the same time. #7438

Uh oh!

Conversation

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!