8000 bpo-34002: Minor efficiency and clarity improvements in email package. by selik · Pull Request #7999 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

bpo-34002: Minor efficiency and 8000 clarity improvements in email package. #7999

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Sep 20, 2019

Conversation

selik
Copy link
Contributor
@selik selik commented Jun 28, 2018

Creating a copy of a list, then repeatedly popping the first element could be much slower than necessary if the list is long. Further explanations in the commit messages

Automerge-Triggered-By: @maxking

selik added 2 commits June 28, 2018 12:36
Comparing ``len(a) > ``len(a - b)`` is essentially looking for an
intersection between the two sets. If set ``b`` does not intersect ``a``
then ``len(a - b)`` will be equal to ``len(a)``. This logic is more
clearly expressed as ``a & b``.
Copying the list, then repeatedly popping the first element was
unnecessarily slow. I also cleaned up a couple other inefficiencies.
There's no need to unpack a tuple, then re-pack and append it. The list
can be created with the first element instead of empty. Secondly, the
``endswith`` method returns a bool, so there's no need for an if-
statement to set ``encoding`` to True or False.
Copy link
Member
@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please open an issue on the tracker for discussing this PR.

# Copy params so we don't mess with the original
params = params[:]
new_params = []
new_params = [params[0]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if params[0] is not a 2-tuple?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's possible for the first params element, though the others are implicitly checked by the unpacking for-loop.

Looking at the usage of decode_params, it's only use is in email.message._get_params_preserve which in turn has 3 call sites. All 3 calls unpack the elements as (k, v) pairs. The bad data would cause an immediate error, given how the method is used. The error would be the same as today, though the traceback would not be as deep.

If that's not satisfactory, I'll revert to:

key, value = params[0]
new_params = [(key, value)]

``a.intersection(b)`` method is more clear of purpose than ``not
a.isdisjoint(b)`` and avoids an unnecessary set construction that ``a &
set(b)`` performs.
@selik selik changed the title Minor efficiency and clarity improvements bpo-34002: minor efficiency and clarity improvements in email package Jun 29, 2018
@selik
Copy link
Contributor Author
selik commented Jun 29, 2018

I created an issue in the tracker and changed the name of this pull-request.

https://bugs.python.org/issue34002

encoded = True
else:
encoded = False
for name, value in params[1:]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

params[1:] still creates a copy of the list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. The real benefit is avoiding repeated .pop(0).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the code can be made more optimal if avoid a copy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me that the final value of 'encoded' only depends on the final value of name. If so, the loop is equivalent to

if params:
    encoded = params[-1][0].endswith('*')

Otherwise, one can skip-iterate without copying with

it = iter(params)
try:
    next(it)
except StopIteration:
    pass
for name, value in it:
   ...

Copy link
Contributor Author
@selik selik Sep 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each iteration of the loop uses the encoded value. The iter and nexting would be a better technique.

it = iter(params)
new_params = [next(it)]
for name, value in it:
    ...

While it reads slightly worse, the isdisjoint method will stop when it
finds a counterexample and returns a bool, rather than looping over the
entire iterable and constructing a new set.
@maxking
Copy link
Contributor
maxking commented Jun 14, 2019

This PR seems to be a good and looks like has reviews from several core devs.
Since there isn't any change of behaviour or bugfix, I don't know if there is a need for NEWS file here. It also looks like @selik has made all the requested changes.

Can @serhiy-storchaka, @terryjreedy @warsaw @bitdancer merge this with label to skip news entry? Unless we do do need a NEWS entry, in which case it would be good to know that.

@terryjreedy
< 8000 details-menu class="dropdown-menu dropdown-menu-sw show-more-popover color-fg-default" style="width:185px" src="" preload > Copy link
Member

Though sometimes disagreeing on meaning of 'trivial', we generally want an issue and news entry for non-trivial changes, and this is non-trivial. Someone should add a short news entry. It will inform anyone looking for email changes that might have caused an obscure bug that we made a change that we think should have no visible effect other than performance.

@maxking
Copy link
Contributor
maxking commented Jun 14, 2019

Thanks @terryjreedy, that totally makes sense.

@selik can you please add a news entry for this PR?

@matrixise matrixise requested a review from maxking September 17, 2019 05:03
@maxking
Copy link
Contributor
maxking commented Sep 19, 2019

I am okay with the changes, like I mentioned before, but like @terryjreedy mentioned, it would be good to have a NEWS entry.

I'd push a NEWS entry myself if I had access to push to his PR (which can be done using the flag "Allow changes from maintainers flag", I think.

@selik
Copy link
Contributor Author
selik commented Sep 19, 2019

I believe the box is checked to allow edits from maintainers

@terryjreedy
Copy link
Member

@maxking I am able to edit the files, so you should be able to create a branch, add a blurb, and push.

Copy link
Contributor
@maxking maxking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@maxking maxking changed the title bpo-34002: minor efficiency and clarity improvements in email package bpo-34002: Minor efficiency and clarity improvements in email package. Sep 20, 2019
@maxking maxking merged commit 2702638 into python:master Sep 20, 2019
@maxking
Copy link
Contributor
maxking commented Sep 20, 2019

Thanks @selik for this :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants
0