-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
bpo-27212: Modify islice recipe to consume initial values preceding start #6195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Doc/library/itertools.rst
Outdated
@@ -435,8 +435,13 @@ loops that truncate the stream. | |||
# islice('ABCDEFG', 2, 4) --> C D | |||
# islice('ABCDEFG', 2, None) --> C D E F G | |||
# islice('ABCDEFG', 0, None, 2) --> A C E G | |||
# it = iter('abcdefghi') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please omit the three additional comment lines and only include the code changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead, please update Lib/test/test_itertools.py to add a new class, TestPurePythonRoughEquivalents() that has a test_islice_recipe() method that runs all the test_islice() tests against the pure python version of islice().
While we're at it, let's modify the consume() recipe to have n default to None:
- def consume(iterator):
+ def consume(iterator, n=None):
Let's also add a test to Lib/test/test_itertools.py libreftest section so that the consume() code gets exercised
>>> it = iter(range(10))
>>> consume(it, 3)
>>> next(it)
3
>>> consume(it)
>>> next(it, 'Done')
'Done'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EDIT: Upon reflection, I think you want the pure python version to pass all the tests otherwise you wouldn't have suggested the new test class. Your code example is so well written and succinct that I was afraid to mess it up by adding code for it to pass the tests. But I think it would be best for me to take a shot at it first and then I can always remove the change if that wasn't your intent. Sorry for the noise.
Thank you for the suggestions. I have a question about how far the pure python recipe should be changed to get the tests to work. With adding the tests for the islice_recipe, I'm able to do this to avoid duplicating code:
class TestPurePythonRoughEquivalents(unittest.TestCase):
@staticmethod
def islice(iterable, *args):
s = slice(*args)
it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
try:
nexti = next(it)
except StopIteration:
# Consume *iterable* up to the *start* position.
for i, element in zip(range(s.start or 0), iterable):
pass
return
for i, element in enumerate(iterable):
if i == nexti:
yield element
try:
nexti = next(it)
except StopIteration:
return
def test_islice_recipe(self):
global islice
orig_islice = islice
islice = self.islice
TestBasicOps().test_islice()
islice = orig_islice
However, test_islice()
in TestBasicOps()
has tests that check for invalid arguments with TypeError and ValueError, which the current recipe doesn't have. Should I expand the recipe for these cases?
Also, there's a test for issue 10323 that fails with the pure python version:
c = count()
self.assertEqual(list(islice(c, 1, 3, 50)), [1])
self.assertEqual(next(c), 3)
It returns 2 instead of 3.
And the test for pickling fails with TypeError: can't pickle generator objects
.
I don't know the way to get around the pickle error, but for the others, should I modify the recipe to pass all the tests?
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to modify the recipe or try to pass all of the regular islice() tests. Instead, we just want to test the recipe to make sure its basic functionality is okay. The pure python rough equivalent is allowed to not be a class, not be pickleable, and to omit error checks, etc, but it does need to be good at slicing :-) The problem we're fixing here arose because the islice(it, n, n) case wasn't tested to see if it advanced t 8000 he iterator. That is core functionality, so it warrants a test.
Also remember, the main purpose of the pure python rough equivalents it to augment the docs to provide additional insights that aren't easily conveyed in English. It needs to be short and expressive. A true exact equivalent, drop-in substitute would likely lose the pithy explanatory value.
There's no need to work very hard on this one. We want to make a minimal change to the recipe to make it correct with regards to basic functionality, and then add more testing to make sure the job was done right. The latter step should be easy as well -- just copy the relevant lines from test_islice().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for explaining that.
A few notes on what the changes I made.
- I changed the recipe for the stop point. This had a test referencing bpo-10323 and, to your point, it affects the slicing. But, it also makes the recipe less pithy, so it may be too much of an edge case to worry about in the recipe? I wanted to highlight it in this commit even though you ultimately may not want to add it.
- There were no existing test cases where start = stop in test_islice, so I added one under the comment for consume. I thought maybe it should replace the existing test case for consume since it's more of an edge case?
# Test number of items consumed SF #1171417
it = iter(range(10))
self.assertEqual(list(islice(it, 3)), list(range(3)))
self.assertEqual(list(it), list(range(3, 10)))
it = iter(range(10))
self.assertEqual(list(islice(it, 3, 3)), [])
self.assertEqual(list(it), list(range(3, 10)))
I added the second test to reflect this issue.
Thanks!
Doc/library/itertools.rst
Outdated
s = slice(*args) | ||
it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1)) | ||
it = iter(range(0, s.stop or sys.maxsize, s.step or 1)) | ||
for i in zip(range(0, s.start or 0), iterable): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking the additional code just covers a special case, so it would be better to put the new code inside the exception handler. That way, it doesn't interfere with out understanding of the common case and it doesn't slow down the common case. It also allows us to put in a clarifying comment about why the code is there. Also, the variable on the for-loop should be "i, element" to make clear what the zip components are. Trapping the tuple in a single pair named i is likely confusing because it means a tuple in one place and an index offset in other places:
s = slice(*args)
it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
try:
nexti = next(it)
except StopIteration:
# consume the iterable up the *start* position:
for i, element in zip(range(s.start or 0), iterable):
pass
return
for i, element in enumerate(iterable):
if i == nexti:
yield element
nexti = next(it)
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase And if you don't make the requested changes, you will be poked with soft cushions! |
Doc/library/itertools.rst
Outdated
return | ||
for i, element in enumerate(iterable): | ||
if i == nexti: | ||
yield element | ||
nexti = next(it) | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move the try/except outside the for-loop:
try:
for i, element in enumerate(iterable):
if i == nexti:
yield element
nexti = next(it)
except StopIteration:
# Consume to *stop*.
for i, element in zip(range(i + 1, stop), iterable):
pass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Doc/library/itertools.rst
Outdated
try: | ||
nexti = next(it) | ||
except StopIteration: | ||
# Consume to *stop*. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, keep this consume-to-stop code as you currently have it. I'm taking a while to mull over whether this needs to be part of the recipe or whether to count this distracting and exotic corner case as the "rough" part of the rough equivalent.
it = iter(range(10)) | ||
self.assertEqual(list(self.islice(it, 3, 3)), []) | ||
self.assertEqual(list(it), list(range(3, 10))) | ||
# Test that slice finishes in predictable state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the part that I'm still thinking about.
@@ -0,0 +1,2 @@ | |||
Modify documentation for the :func:`islice` recipe to consume initial values | |||
up to start. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
up to the start index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except for the one little nit, this looks ready to go. Thanks for your work.
Lib/test/test_itertools.py
Outdated
# Consume to *stop*. | ||
for i, element in zip(range(i + 1, stop), iterable): | ||
pass | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's drop the "return" on the final line so that it doesn't look like an early-out as opposed to a normal end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Raymond, thank you for the reviews and guiding me through this PR. Were you interested in having the other Python rough equivalent functions added to the new test class? I'd like to contribute those changes if it's something you think should be added. |
Thanks @csabella for the PR, and @rhettinger for merging it 🌮🎉.. I'm working now to backport this PR to: 3.6, 3.7. |
GH-6266 is a backport of this pull request to the 3.7 branch. |
…tart (pythonGH-6195) (cherry picked from commit da1734c) Co-authored-by: Cheryl Sabella <cheryl.sabella@gmail.com>
…tart (pythonGH-6195) (cherry picked from commit da1734c) Co-authored-by: Cheryl Sabella <cheryl.sabella@gmail.com>
GH-6267 is a backport of this pull request to the 3.6 branch. |
|
Thanks @csabella for the PR, and @rhettinger for merging it 🌮🎉.. I'm working now to backport this PR to: 2.7. |
Sorry, @csabella and @rhettinger, I could not cleanly backport this to |
Cheryl, can you work on a Python 2.7 backport? |
Yes, I'll work on the backport. I had hoped to do it yesterday but just didn't get the time. I'll try to do it today. |
I apologize, but I don't quite understand how to consume the iterator up to nexti = next(it)
for i, element in zip(xrange(start), iterable):
pass
|
You're right. The try/excepts need to stay now that we're doing something more interesting than just a return in the except-block. :-) For Python 2.7, the range() and zip() should become xrange() and izip(). |
…ding start (pythonGH-6195) (cherry picked from commit da1734c)
GH-6339 is a backport of this pull request to the 2.7 branch. |
https://bugs.python.org/issue27212