8000 bpo-34002: Minor efficiency and clarity improvements in email package. by selik · Pull Request #7999 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

bpo-34002: Minor efficiency and clarity improvements in email package. #7999

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Sep 20, 2019
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Change while/pop to a for-loop
Copying the list, then repeatedly popping the first element was
unnecessarily slow. I also cleaned up a couple other inefficiencies.
There's no need to unpack a tuple, then re-pack and append it. The list
can be created with the first element instead of empty. Secondly, the
``endswith`` method returns a bool, so there's no need for an if-
statement to set ``encoding`` to True or False.
  • Loading branch information
selik committed Jun 28, 2018
commit 51e14f829804827661822a997b687c981b9d972a
14 changes: 3 additions & 11 deletions Lib/email/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -259,21 +259,13 @@ def decode_params(params):

params is a sequence of 2-tuples containing (param name, string value).
"""
# Copy params so we don't mess with the original
params = params[:]
new_params = []
new_params = [params[0]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if params[0] is not a 2-tuple?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's possible for the first params element, though the others are implicitly checked by the unpacking for-loop.

Looking at the usage of decode_params, it's only use is in email.message._get_params_preserve which in turn has 3 call sites. All 3 calls unpack the elements as (k, v) pairs. The bad data would cause an immediate error, given how the method is used. The error would be the same as today, though the traceback would not be as deep.

If that's not satisfactory, I'll revert to:

key, value = params[0]
new_params = [(key, value)]

# Map parameter's name to a list of continuations. The values are a
# 3-tuple of the continuation number, the string value, and a flag
# specifying whether a particular segment is %-encoded.
rfc2231_params = {}
name, value = params.pop(0)
new_params.append((name, value))
while params:
name, value = params.pop(0)
if name.endswith('*'):
encoded = True
else:
encoded = False
for name, value in params[1:]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

params[1:] still creates a copy of the list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. The real benefit is avoiding repeated .pop(0).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the code can be made more optimal if avoid a copy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me that the final value of 'encoded' only depends on the final value of name. If so, the loop is equivalent to

if params:
    encoded = params[-1][0].endswith('*')

Otherwise, one can skip-iterate without copying with

it = iter(params)
try:
    next(it)
except StopIteration:
    pass
for name, value in it:
   ...

Copy link
Contributor Author
@selik selik Sep 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each iteration of the loop uses the encoded value. The iter and nexting would be a better technique.

it = iter(params)
new_params = [next(it)]
for name, value in it:
    ...

encoded = name.endswith('*')
value = unquote(value)
mo = rfc2231_continuation.match(name)
if mo:
Expand Down
0