MAINT: refactor "for ... in range(len(" statements #19781

mwtoews · 2021-08-30T01:12:02Z

Refactor for i in range(len(items)) to for item in items, which is generally regarded as more Pythonic.

Other instances refactored to use either zip or enumerate.

Cases were identified using git grep " in range(len(", although many of the other matches were not suitable for refactoring.

mattip · 2021-08-30T07:08:13Z

There are three patterns of replacement:

replace for i in range(len(x)): with for item in x:
replace for i in range(len(x)): with for i, item in enumerate(x):
replace for i in range(len(x)): with for itemx, itemy in zip(x, y):

I think the first one is a clear win. I am not so excited about the other two. Is there a clear preference in our style guide? I think I would prefer to treat them as we do other code cleanups: only touch them if we need to do so for other reasons.
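For reference, the three replacement patterns listed above can be sketched as follows. This is only an illustration: the names names, values, doubled and record are made up here and do not come from the NumPy code base.

```python
# Hypothetical data, for illustration only.
names = ["a", "b", "c"]
values = [1.0, 2.0, 3.0]

# Pattern 1: the index was only used to fetch the item.
# Before: for i in range(len(values)): total += values[i]
total = 0.0
for value in values:
    total += value

# Pattern 2: both the index and the item are needed.
# Before: for i in range(len(values)): doubled[i] = 2 * values[i]
doubled = list(values)
for i, value in enumerate(values):
    doubled[i] = 2 * value

# Pattern 3: two sequences are walked in lockstep.
# Before: for i in range(len(names)): record[names[i]] = values[i]
record = {}
for name, value in zip(names, values):
    record[name] = value
```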

eric-wieser · 2021-08-30T08:03:52Z

IMO all three are overall wins, but won't comment on whether the wins are strong enough to justify the review time. From a quick skim I think this is some nice cleanup.

rkern · 2021-08-30T08:40:48Z

When replacing with zip(), especially in the test utilities, be sure to handle differing lengths correctly.

mwtoews · 2021-08-30T10:36:02Z

The motivation behind this PR is part style and part performance. The for statement in Python is really a foreach statement, so it's not the same as in C. The simplest iteration forms yield the best performance when scaled up (using Python 3.9.6):

ar = np.arange(10000, dtype=float)

%timeit for x in ar: _ = x
# 334 µs ± 752 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit for i, x in enumerate(ar): _ = x
# 552 µs ± 10.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit for a, b in zip(ar, ar): _, _ = a, b
# 730 µs ± 11.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit for i in range(len(ar)): _ = ar[i]
# 746 µs ± 1.93 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Where there are shorter iterations, the performance is about the same for all methods.
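A rough way to repeat this comparison outside IPython is sketched below with the standard-library timeit module. It assumes a plain Python list rather than the NumPy array used above, and the function names are invented for this sketch, so the absolute numbers will differ and depend on the machine.

```python
import timeit

# Plain list as a hypothetical stand-in for the array in the timings above.
items = [float(i) for i in range(1000)]

def loop_items():
    for x in items:
        _ = x

def loop_enumerate():
    for i, x in enumerate(items):
        _ = x

def loop_zip():
    for a, b in zip(items, items):
        _, _ = a, b

def loop_index():
    for i in range(len(items)):
        _ = items[i]

# Best of 3 repeats, 100 calls each; results are machine dependent.
timings = {
    func.__name__: min(timeit.repeat(func, number=100, repeat=3))
    for func in (loop_items, loop_enumerate, loop_zip, loop_index)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f} s per 100 calls")
```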

Mark any edits that are less readable, and I'll cull them.

mwtoews · 2021-08-30T10:38:57Z

"When replacing with zip(), especially in the test utilities, be sure to handle differing lengths correctly."

Thanks, I was paying attention to this aspect, as zip will stop on the shortest iteration. But also x[i], y[i] would have raised IndexError before if they were different lengths.
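To make the difference concrete, here is a small sketch of the two behaviours. The helper names compare_by_index and compare_by_zip are hypothetical and only exist for this illustration.

```python
def compare_by_index(x, y):
    """Old style: index both sequences over the length of x."""
    pairs = []
    for i in range(len(x)):
        pairs.append((x[i], y[i]))  # raises IndexError if y is shorter than x
    return pairs

def compare_by_zip(x, y):
    """New style: zip() stops silently at the shorter sequence."""
    return list(zip(x, y))

long_seq = [1, 2, 3]
short_seq = [1, 2]

try:
    compare_by_index(long_seq, short_seq)
    index_raised = False
except IndexError:
    index_raised = True

zipped = compare_by_zip(long_seq, short_seq)  # third element is dropped
```

Note that the indexed loop only raises when the second sequence is the shorter one. With the arguments swapped, compare_by_index(short_seq, long_seq) also passes silently.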

mattip · 2021-08-30T14:02:37Z

numpy/core/records.py

-    for i in range(len(arrayList)):
-        _array[_names[i]] = arrayList[i]
+    for name, item in zip(_names, arrayList):
+        _array[name] = item


Could you merge the check in lines 667-671 into this loop so we iterate over arrayList only once?

Makes sense, done. Also improved check error message to also show name of the field.

numpy/compat/_inspect.py

mattip · 2021-08-30T14:06:14Z

numpy/core/tests/test_umath_complex.py

-        for i in range(len(x)):
-            assert_almost_equal(y[i], y_r[i])
+        for yi, y_ri in zip(y, y_r):
+            assert_almost_equal(yi, y_ri)


Couldn't this be written more compactly without the loop as assert_almost_equal(y, y_r)? What am I missing?

Edit: same thing in the following changes in this file

Must have been an oversight, as assert_almost_equal takes array_like args, and seems to work with complex types. Done 3x.

mattip · 2021-08-30T14:10:08Z

numpy/distutils/misc_util.py

-            args[i] = '"%s"' % (a)
+    for idx, arg in enumerate(args):
+        if ' ' in arg and arg[0] not in '"\'':
+            args[idx] = f'"{arg}"'


I don't see anywhere in numpy or scipy that uses this function. Can we deprecate it?

I've backed out this change, and will follow-up in a separate issue/PR later.

xref #19811

mattip · 2021-08-30T14:11:26Z

numpy/f2py/f2py2e.py

-            cb_rules.buildcallbacks(lst[i])
+    for item in lst:
+        if '__user__' in item['name']:
+            cb_rules.buildcallbacks(item)


The changes here look like a nice cleanup

mattip · 2021-08-30T14:12:54Z

numpy/lib/polynomial.py

-                coefstr = '(%s + %sj)' % (fmt_float(real(coeffs[k])),
-                                          fmt_float(imag(coeffs[k])))
+                coefstr = '(%s + %sj)' % (fmt_float(real(coeff)),
+                                          fmt_float(imag(coeff)))


This looks better now

mattip · 2021-08-30T14:15:10Z

OK, I had a deeper look. Only one of these looks out of place, the rest are either improvements or "doesn't matter". Some of the code can be cleaned up further.

rkern · 2021-08-30T14:36:17Z

"Thanks, I was paying attention to this aspect, as zip will stop on the shortest iteration. But also x[i], y[i] would have raised IndexError before if they were different lengths."

Yes, and that needs to continue to be the case, especially for the functions in the tests. That IndexError is part of the test.

mwtoews

@mattip thanks, comments addressed in updated commit (via push -f)

numpy/compat/_inspect.py

mwtoews · 2021-08-31T09:14:40Z

numpy/core/records.py

-    for i in range(len(arrayList)):
-        _array[_names[i]] = arrayList[i]
+    for name, item in zip(_names, arrayList):
+        _array[name] = item


Makes sense, done. Also improved check error message to also show name of the field.

mwtoews · 2021-08-31T09:17:58Z

numpy/core/tests/test_umath_complex.py

-        for i in range(len(x)):
-            assert_almost_equal(y[i], y_r[i])
+        for yi, y_ri in zip(y, y_r):
+            assert_almost_equal(yi, y_ri)


Must have been an oversight, as assert_almost_equal takes array_like args, and seems to work with complex types. Done 3x.

mwtoews · 2021-08-31T09:19:27Z

numpy/distutils/misc_util.py

-            args[i] = '"%s"' % (a)
+    for idx, arg in enumerate(args):
+        if ' ' in arg and arg[0] not in '"\'':
+            args[idx] = f'"{arg}"'


I've backed out this change, and will follow-up in a separate issue/PR later.

mattip · 2021-08-31T12:37:00Z

numpy/lib/tests/test_function_base.py

-    for i in range(len(desired)):
-        assert_array_equal(res[i], desired[i])
+    for result_item, desired_item in zip(res, desired):
+        assert_array_equal(result_item, desired_item)


Does this need an if len(res) != len(desired): raise ... or is it guaranteed that they will be equal length in all the tests?

It seems this function is not used, so it is now removed.

mattip · 2021-08-31T12:37:37Z

numpy/lib/tests/test_shape_base.py

-    for i in range(len(desired)):
-        assert_array_equal(res[i], desired[i])
+    for result_item, desired_item in zip(res, desired):
+        assert_array_equal(result_item, desired_item)


Does this need an if len(res) != len(desired): raise ... or is it guaranteed that they will be equal length in all the tests?

I've added a length check, with a comment referencing PEP 618 (Python 3.10), at which point these functions should be refactored to use zip(..., strict=True).

It also turns out there was one failure (test_integer_split_2D_rows), where the lengths were not equal, which is simple to fix in this PR.
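A minimal sketch of such a length-checked helper is shown below. It is not the code merged in this PR: the error message is made up, and plain == stands in for assert_array_equal so the example runs without NumPy.

```python
def compare_results(res, desired):
    """Compare two sequences element-wise, requiring equal lengths."""
    # zip() alone would silently stop at the shorter input, so check first.
    # On Python 3.10+ (PEP 618), zip(res, desired, strict=True) raises
    # ValueError on a length mismatch and could replace this check.
    if len(res) != len(desired):
        raise ValueError("Iterables have different lengths")
    for result_item, desired_item in zip(res, desired):
        # Assumption: plain equality instead of assert_array_equal.
        assert result_item == desired_item

compare_results([1, 2, 3], [1, 2, 3])  # passes

try:
    compare_results([1, 2, 3, 4], [1, 2, 3])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

With a check like this, a result that has one element more than the target is reported instead of being ignored, which is the kind of mismatch described for test_integer_split_2D_rows.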

mattip · 2021-08-31T12:38:40Z

This looks good now, just two small nits; if they are broken now, they were broken before too.

mwtoews · 2021-09-01T10:04:51Z

@mattip the main changes to review are in numpy/lib/tests/test_shape_base.py

mwtoews · 2021-09-01T10:07:29Z

numpy/lib/tests/test_shape_base.py

@@ -392,7 +392,7 @@ def test_integer_split_2D_rows(self):
        assert_(a.dtype.type is res[-1].dtype.type)

        # Same thing for manual splits:
-        res = array_split(a, [0, 1, 2], axis=0)
+        res = array_split(a, [0, 1], axis=0)
        tgt = [np.zeros((0, 10)), np.array([np.arange(10)]),
               np.array([np.arange(10)])]
        compare_results(res, tgt)


the alternative change is to keep this res, and add a fourth array to tgt.

mattip · 2021-09-01T13:52:06Z

Thanks @mwtoews for working through all this

github-actions bot added the 03 - Maintenance label Aug 30, 2021

mwtoews force-pushed the foreach-item branch 2 times, most recently from f0b865a to 5f6cd40 Compare August 30, 2021 03:07

mattip reviewed Aug 30, 2021

mwtoews force-pushed the foreach-item branch from 5f6cd40 to ff8c293 Compare August 31, 2021 09:21

mwtoews commented Aug 31, 2021

mattip reviewed Aug 31, 2021

MAINT: refactor "for ... in range(len(" statements

64f15a9

mwtoews force-pushed the foreach-item branch from ff8c293 to 64f15a9 Compare September 1, 2021 09:51

mwtoews commented Sep 1, 2021

mattip merged commit 6829957 into numpy:main Sep 1, 2021

mwtoews mentioned this pull request Sep 2, 2021

DEP: Deprecate quote_args (from numpy.distutils.misc_util) #19811

Merged

mwtoews deleted the foreach-item branch September 22, 2021 00:42
