8000 Removed unnecessary call to reversed in for loop by RScrusoe · Pull Request #125690 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

Removed unnecessary call to reversed in for loop #125690

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

RScrusoe
Copy link

In random module's shuffle method while iterating over the list in reversed fashion, the code was using 2 generators (reversed, range). But as we know, we can simply use range generator alone (with proper start, end and step) to get the desired reversed iteration.

This is my first PR into my favourite programming language. Your feedback/suggestions are welcome and appreciated!

Note: As this is a minor change, have not opened an issue

@RScrusoe RScrusoe requested a review from rhettinger as a code owner October 18, 2024 11:40
@ghost
Copy link
ghost commented Oct 18, 2024

All commit authors signed the Contributor License Agreement.
CLA signed

@bedevere-app
Copy link
bedevere-app bot commented Oct 18, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@RScrusoe
Copy link
Author

I am not able to add "skip issue" & "skip news", can the maintainers add it? or maybe I can add, but not able to find it?

@eendebakpt
Copy link
Contributor

It original code indeed used two generators, and this PR only one. But is this PR more readable (I think not) or faster (benchmark)?

Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
@bedevere-app
Copy link
bedevere-app bot commented Oct 18, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@RScrusoe
Copy link
Author

@eendebakpt Thank you for the review. Fixed it!

For the comment on readability, I really believe, that people (who read this code) would know the range iterator and how it is used with start/end/step params.

On the benchmark: I am not sure on how to benchmark this particular small thing, as there can be too many things that would interfere with benchmarking code which would try to benchmark this particular difference (at least, I am not so sure of how to perf test it). But, looking at proposed code, it is clear that an additional function call (stack frame allocation) is saved.

@eendebakpt
Copy link
Contributor

You can do a quick benchmark (using ipython) with

import random
x = list(range(100))
%timeit random.shuffle(x)

That does not tell everything, but is a first indication of whether the code is faster or not.

You do save a function call, but in the pr the arguments to range are more complex, so the end result could be faster, slower or ... about the same in performance.

@TeamSpen210
Copy link

range does have a __reversed__ method, meaning both versions will produce the same iterator object. So the question is whether the Python-level arithmetic is faster than calling reversed().

@eendebakpt
Copy link
Contributor

There was a detailed discussion in #108598 on improving the performance of random.shuffle.

@pochmann Is there a reason why for the proposal3b from #108598 (comment) no PR was created?

@AA-Turner
Copy link
Member

I suspect this is unlikely to be worth it, absent incredible microbenchmarks.

A

@AA-Turner AA-Turner added the pending The issue will be closed if no feedback is provided label Oct 19, 2024
@RScrusoe
Copy link
Author

Sorry for late reply!
PFB my attempt at benchmarks
I spawned a fresh EC2 instance (c5.large) with ubuntu 22.04
Code that I ran:

import random
import timeit

def test_shuffle():
    # Generate a small list
    small_list = list(range(10))
    random.shuffle(small_list)

if __name__ == "__main__":
    # Benchmark the shuffle function over a large number of iterations
    iterations = 1000000  # Number of times to run the shuffle
    time_taken = timeit.timeit("test_shuffle()", setup="from __main__ import test_shuffle", number=iterations)

    print(f"Time taken to shuffle small lists {iterations} times: {time_taken:.4f} seconds")

I specifically tested small lists but large number of times shuffling call.

My benchmark finding:
Original:
Screenshot 2024-10-21 at 11 17 29 AM

The PR version:
Screenshot 2024-10-21 at 11 15 58 AM

Math:
Avg of 3 runs with original code: 2.5849
Avg of 3 runs with PR code: 2.5240
The PR code is better by %: 2.35%
(this is for 1 million calls to the shuffle function)

Suggestions/feedback is welcome

@bedevere-app
Copy link
bedevere-app bot commented Oct 21, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@rhettinger
Copy link
Contributor
rhettinger commented Oct 21, 2024

The code uses reversed because it is more clear than the range variation. See PEP 322

Note the range object has custom C code implementing reverse iteration. So it is quite speedy.

There aren't two levels of stack during iteration. The reversed call it just a pass through to the underlying range.__reversed__ method.

Thanks for the suggestion, but I will decline.

@rhettinger rhettinger closed this Oct 21, 2024
@AA-Turner AA-Turner removed the pending The issue will be closed if no feedback is provided label Oct 21, 2024
@RScrusoe
Copy link
Author

@rhettinger Thanks for the feedback, makes sense!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0