8000 bpo-27212: Modify islice recipe to consume initial values preceding start by csabella · Pull Request #6195 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

bpo-27212: Modify islice recipe to consume initial values preceding start #6195

New issue 8000

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 27, 2018

Conversation

csabella
Copy link
Contributor
@csabella csabella commented Mar 22, 2018

@@ -435,8 +435,13 @@ loops that truncate the stream.
# islice('ABCDEFG', 2, 4) --> C D
# islice('ABCDEFG', 2, None) --> C D E F G
# islice('ABCDEFG', 0, None, 2) --> A C E G
# it = iter('abcdefghi')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please omit the three additional comment lines and only include the code changes.

Copy link
Contributor
@rhettinger rhettinger Mar 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, please update Lib/test/test_itertools.py to add a new class, TestPurePythonRoughEquivalents() that has a test_islice_recipe() method that runs all the test_islice() tests against the pure python version of islice().

While we're at it, let's modify the consume() recipe to have n default to None:

- def consume(iterator):
+ def consume(iterator, n=None):

Let's also add a test to Lib/test/test_itertools.py libreftest section so that the consume() code gets exercised

>>> it = iter(range(10))
>>> consume(it, 3)
>>> next(it)
3
>>> consume(it)
>>> next(it, 'Done')
'Done'

Copy link
Contributor Author
@csabella csabella Mar 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EDIT: Upon reflection, I think you want the pure python version to pass all the tests otherwise you wouldn't have suggested the new test class. Your code example is so well written and succinct that I was afraid to mess it up by adding code for it to pass the tests. But I think it would be best for me to take a shot at it first and then I can always remove the change if that wasn't your intent. Sorry for the noise.


Thank you for the suggestions. I have a question about how far the pure python recipe should be changed to get the tests to work. With adding the tests for the islice_recipe, I'm able to do this to avoid duplicating code:

class TestPurePythonRoughEquivalents(unittest.TestCase):

    @staticmethod
    def islice(iterable, *args):
        s = slice(*args)
        it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
        try:
            nexti = next(it)
        except StopIteration:
            # Consume *iterable* up to the *start* position.
            for i, element in zip(range(s.start or 0), iterable):
                pass
            return
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                try:
                    nexti = next(it)
                except StopIteration:
                    return

    def test_islice_recipe(self):
        global islice
        orig_islice = islice
        islice = self.islice
        TestBasicOps().test_islice()
        islice = orig_islice

However, test_islice() in TestBasicOps() has tests that check for invalid arguments with TypeError and ValueError, which the current recipe doesn't have. Should I expand the recipe for these cases?

Also, there's a test for issue 10323 that fails with the pure python version:

        c = count()
        self.assertEqual(list(islice(c, 1, 3, 50)), [1])
        self.assertEqual(next(c), 3)

It returns 2 instead of 3.

And the test for pickling fails with TypeError: can't pickle generator objects.

I don't know the way to get around the pickle error, but for the others, should I modify the recipe to pass all the tests?

Thanks!

Copy link
Contributor
@rhettinger rhettinger Mar 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to modify the recipe or try to pass all of the regular islice() tests. Instead, we just want to test the recipe to make sure its basic functionality is okay. The pure python rough equivalent is allowed to not be a class, not be pickleable, and to omit error checks, etc, but it does need to be good at slicing :-) The problem we're fixing here arose because the islice(it, n, n) case wasn't tested to see if it advanced t 8000 he iterator. That is core functionality, so it warrants a test.

Also remember, the main purpose of the pure python rough equivalents it to augment the docs to provide additional insights that aren't easily conveyed in English. It needs to be short and expressive. A true exact equivalent, drop-in substitute would likely lose the pithy explanatory value.

There's no need to work very hard on this one. We want to make a minimal change to the recipe to make it correct with regards to basic functionality, and then add more testing to make sure the job was done right. The latter step should be easy as well -- just copy the relevant lines from test_islice().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for explaining that.

A few notes on what the changes I made.

  1. I changed the recipe for the stop point. This had a test referencing bpo-10323 and, to your point, it affects the slicing. But, it also makes the recipe less pithy, so it may be too much of an edge case to worry about in the recipe? I wanted to highlight it in this commit even though you ultimately may not want to add it.
  2. There were no existing test cases where start = stop in test_islice, so I added one under the comment for consume. I thought maybe it should replace the existing test case for consume since it's more of an edge case?
        # Test number of items consumed     SF #1171417
        it = iter(range(10))
        self.assertEqual(list(islice(it, 3)), list(range(3)))
        self.assertEqual(list(it), list(range(3, 10)))

        it = iter(range(10))
        self.assertEqual(list(islice(it, 3, 3)), [])
        self.assertEqual(list(it), list(range(3, 10)))

I added the second test to reflect this issue.

Thanks!

s = slice(*args)
it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
it = iter(range(0, s.stop or sys.maxsize, s.step or 1))
for i in zip(range(0, s.start or 0), iterable):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking the additional code just covers a special case, so it would be better to put the new code inside the exception handler. That way, it doesn't interfere with out understanding of the common case and it doesn't slow down the common case. It also allows us to put in a clarifying comment about why the code is there. Also, the variable on the for-loop should be "i, element" to make clear what the zip components are. Trapping the tuple in a single pair named i is likely confusing because it means a tuple in one place and an index offset in other places:

    s = slice(*args)
    it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
    try:
        nexti = next(it)
    except StopIteration:
        # consume the iterable up the *start* position:
        for i, element in zip(range(s.start or 0), iterable):
            pass
        return
    for i, element in enumerate(iterable):
        if i == nexti:
            yield element
            nexti = next(it)

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be poked with soft cushions!

return
for i, element in enumerate(iterable):
if i == nexti:
yield element
nexti = next(it)
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move the try/except outside the for-loop:

    try:
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                nexti = next(it)
    except StopIteration:
        # Consume to *stop*.
        for i, element in zip(range(i + 1, stop), iterable):
            pass

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

try:
nexti = next(it)
except StopIteration:
# Consume to *stop*.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, keep this consume-to-stop code as you currently have it. I'm taking a while to mull over whether this needs to be part of the recipe or whether to count this distracting and exotic corner case as the "rough" part of the rough equivalent.

it = iter(range(10))
self.assertEqual(list(self.islice(it, 3, 3)), [])
self.assertEqual(list(it), list(range(3, 10)))
# Test that slice finishes in predictable state.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the part that I'm still thinking about.

@@ -0,0 +1,2 @@
Modify documentation for the :func:`islice` recipe to consume initial values
up to start.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

up to the start index.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor
@rhettinger rhettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for the one little nit, this looks ready to go. Thanks for your work.

# Consume to *stop*.
for i, element in zip(range(i + 1, stop), iterable):
pass
return
Copy link
Contributor
@rhettinger rhettinger Mar 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's drop the "return" on the final line so that it doesn't look like an early-out as opposed to a normal end.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@csabella
Copy link
Contributor Author

Raymond, thank you for the reviews and guiding me through this PR.

Were you interested in having the other Python rough equivalent functions added to the new test class? I'd like to contribute those changes if it's something you think should be added.

@rhettinger rhettinger merged commit da1734c into python:master Mar 27, 2018
@miss-islington
Copy link
Contributor

Thanks @csabella for the PR, and @rhettinger for merging it 🌮🎉.. I'm working now to backport this PR to: 3.6, 3.7.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-6266 is a backport of this pull request to the 3.7 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Mar 27, 2018
…tart (pythonGH-6195)

(cherry picked from commit da1734c)

Co-authored-by: Cheryl Sabella <cheryl.sabella@gmail.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Mar 27, 2018
…tart (pythonGH-6195)

(cherry picked from commit da1734c)

Co-authored-by: Cheryl Sabella <cheryl.sabella@gmail.com>
@bedevere-bot
Copy link

GH-6267 is a backport of this pull request to the 3.6 branch.

rhettinger pushed a commit that referenced this pull request Mar 27, 2018
…tart (GH-6195) (#GH-6266)

(cherry picked from commit da1734c)

Co-authored-by: Cheryl Sabella <cheryl.sabella@gmail.com>
rhettinger pushed a commit that referenced this pull request Mar 27, 2018
…tart (GH-6195) (GH-6267)

(cherry picked from commit da1734c)

Co-authored-by: Cheryl Sabella <cheryl.sabella@gmail.com>
@rhettinger
Copy link
Contributor
  • This will also need a backport to 2.7 but it won't need the try/except around the next(i) because the StopIteration just bubbles up to end the iteration.

  • No need to put in more recipe tests unless we're modifying them for some reason.

@miss-islington
Copy link
Contributor

Thanks @csabella for the PR, and @rhettinger for merging it 🌮🎉.. I'm working now to backport this PR to: 2.7.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Sorry, @csabella and @rhettinger, I could not cleanly backport this to 2.7 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker da1734c58d2f97387ccc9676074717d38b044128 2.7

@csabella csabella deleted the issue27212 branch March 27, 2018 09:52
@rhettinger
Copy link
Contributor

Cheryl, can you work on a Python 2.7 backport?

@csabella
Copy link
Contributor< 10000 /span> Author

Yes, I'll work on the backport. I had hoped to do it yesterday but just didn't get the time. I'll try to do it today.

@csabella
Copy link
Contributor Author

This will also need a backport to 2.7 but it won't need the try/except around the next(i) because the StopIteration just bubbles up to end the iteration.

I apologize, but I don't quite understand how to consume the iterator up to start without the try/except block. If I remove the try/except, then it doesn't get to the loop after the next():

   nexti = next(it)
   for i, element in zip(xrange(start), iterable):
       pass
>>> i = iter(xrange(10))
>>> list(islice(i, 3, 3))
[]
>>> list(i)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> 

@rhettinger
Copy link
Contributor

You're right. The try/excepts need to stay now that we're doing something more interesting than just a return in the except-block. :-)

For Python 2.7, the range() and zip() should become xrange() and izip().

csabella added a commit to csabella/cpython that referenced this pull request Apr 1, 2018
@bedevere-bot
Copy link

GH-6339 is a backport of this pull request to the 2.7 branch.

rhettinger pushed a commit that referenced this pull request Apr 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0