gh-128508: Add some docstrings to xml.dom.minidom by srinivasreddy · Pull Request #128477 · python/cpython · GitHub

gh-128508: Add some docstrings to xml.dom.minidom #128477

Open
wants to merge 38 commits into
base: main
Changes from 1 commit
38 commits
21bb374
Update and Add docstrings for functions and methods in minidom.py module
srinivasreddy Jan 4, 2025
f16761b
Update Lib/xml/dom/minidom.py
srinivasreddy Jan 6, 2025
eb0208b
Update Lib/xml/dom/minidom.py
srinivasreddy Jan 6, 2025
f4f6334
Update Lib/xml/dom/minidom.py
srinivasreddy Jan 6, 2025
70617ea
Add `check-readthedocs` pre-commit hook (#128453)
sobolevn Jan 4, 2025
90b344f
gh-128152: Argument Clinic: ignore pre-processor directives inside C …
erlend-aasland Jan 4, 2025
34c317a
GH-127381: pathlib ABCs: remove `PathBase.move()` and `move_into()` (…
barneygale Jan 4, 2025
5f7ba82
Docs: mark up json.dump() using parameter list (#128482)
erlend-aasland Jan 4, 2025
23d11be
pathlib tests: create `walk()` test hierarchy without using class und…
barneygale Jan 4, 2025
e6a8d38
gh-126719: Clarify math.fmod docs (#127741)
StanFromIreland Jan 4, 2025
1592145
Docs: amend json.dump() post gh-128482 (#128489)
erlend-aasland Jan 4, 2025
1a8ee69
gh-127954: Document PyObject_DelItemString (#127986)
rruuaanng Jan 4, 2025
61c3e8a
gh-127553: Remove outdated TODO comment in _pydatetime (#127564)
bombs-kim Jan 4, 2025
69f8e4b
gh-115765: Document and enforce Autoconf 2.72 requirement (#128502)
erlend-aasland Jan 4, 2025
7f767cd
gh-128437: Add `BOLT_COMMON_FLAGS` with `-update-debug-sections` (gh-…
zanieb Jan 5, 2025
2f8b072
gh-128137: Update PyASCIIObject to handle interned field with the ato…
corona10 Jan 5, 2025
539b638
gh-128504: Upgrade doctest to ubuntu-24.04 (#128506)
Damien-Chen Jan 5, 2025
a4c5f4f
Docs: fix `MessageDefect` references in email.policy docs (#128468)
koyuki7w Jan 5, 2025
c612744
gh-98188: Fix EmailMessage.get_payload to decode data when CTE value …
RanKKI Jan 6, 2025
b0a9129
Revert __repr__ change
srinivasreddy Jan 6, 2025
3f71f1f
Merge branch 'main' into gh-63882-doc_strings
srinivasreddy Jan 6, 2025
8b7ff8e
Update the docstring for _clone_node(...)
srinivasreddy Jan 6, 2025
afa51ef
Update docstring for cloneNode(...)
srinivasreddy Jan 6, 2025
14bb77e
Update docstrings for cloneNode(...)
srinivasreddy Jan 6, 2025
a4840f3
Convert doc string to imperative mode
srinivasreddy Jan 6, 2025
38be045
Update docstring as recommended by argument clinic
srinivasreddy Jan 6, 2025
cca0fda
Update docstring as recommended by argument clinic
srinivasreddy Jan 6, 2025
8df6795
Undo removing comment
srinivasreddy Jan 6, 2025
727af86
Update docstrings as it was done in argument clinic
srinivasreddy Jan 6, 2025
2da171f
Add space back
srinivasreddy Jan 6, 2025
ecb4a54
Update doc strings
srinivasreddy Jan 6, 2025
8b63730
Update Lib/xml/dom/minidom.py
srinivasreddy Jan 6, 2025
49a12e6
Update Lib/xml/dom/minidom.py
srinivasreddy Jan 6, 2025
298a20f
Merge branch 'main' into gh-63882-doc_strings
srinivasreddy Jan 16, 2025
b97d812
Address review comments
srinivasreddy Jan 16, 2025
19d245c
Update docstrings
srinivasreddy Jan 16, 2025
7b1666d
Update Lib/xml/dom/minidom.py
srinivasreddy May 13, 2025
c68cb4d
Merge branch 'main' into gh-63882-doc_strings
srinivasreddy May 13, 2025
gh-98188: Fix EmailMessage.get_payload to decode data when CTE value has extra text (#127547)

Up to this point message handling has been very strict with regards to content encoding values: mixed case was accepted, but trailing blanks or other text would cause decoding failure, even if the first token was a valid encoding.  By Postel's Rule we should go ahead and decode as long as we can recognize that first token.  We have not thought of any security or backward compatibility concerns with this fix.

This fix does introduce a new technique/pattern to the Message code: we look to see if the header has a 'cte' attribute, and if so we use that.  This effectively promotes the header API exposed by HeaderRegistry to an API that any header parser "should" support.  This seems like a reasonable thing to do.  It is not, however, a requirement, as the string value of the header is still used if there is no cte attribute.

The full fix (ignore any trailing blanks or blank-separated trailing text) applies only to the non-compat32 API.  compat32 is only fixed to the extent that it now ignores trailing spaces.  Note that the HeaderRegistry parsing still records a HeaderDefect if there is extra text.

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2 people authored and srinivasreddy committed Jan 6, 2025
commit c612744439e3b9dcd3d508d4569f3570ffeecbf4
8 changes: 6 additions & 2 deletions Lib/email/message.py
@@ -286,8 +286,12 @@ def get_payload(self, i=None, decode=False):
         if i is not None and not isinstance(self._payload, list):
             raise TypeError('Expected list, got %s' % type(self._payload))
         payload = self._payload
-        # cte might be a Header, so for now stringify it.
-        cte = str(self.get('content-transfer-encoding', '')).lower()
+        cte = self.get('content-transfer-encoding', '')
+        if hasattr(cte, 'cte'):
+            cte = cte.cte
+        else:
+            # cte might be a Header, so for now stringify it.
+            cte = str(cte).strip().lower()
         # payload may be bytes here.
         if not decode:
             if isinstance(payload, str) and utils._has_surrogates(payload):
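The lookup changed in the hunk above can be read in isolation as a small duck-typed helper. The following is only a sketch of that pattern, not code from the patch: `resolve_cte` and `FakeParsedHeader` are hypothetical names introduced here for illustration.

```python
def resolve_cte(header_value):
    """Return a normalized content-transfer-encoding name.

    Hypothetical helper mirroring the branch logic in the hunk above.
    """
    if hasattr(header_value, 'cte'):
        # Parsed header objects expose the already-normalized first token.
        return header_value.cte
    # Plain strings (or legacy Header objects): stringify, trim, lowercase.
    return str(header_value).strip().lower()


class FakeParsedHeader(str):
    """Hypothetical stand-in for a parsed header carrying a `cte` attribute."""
    cte = 'base64'


print(resolve_cte('Base64 '))                             # base64
print(resolve_cte(FakeParsedHeader('base64 some text')))  # base64
print(resolve_cte('base64 some text'))                    # base64 some text
```

The last call shows why the commit message describes the string fallback as a partial fix: trailing whitespace is stripped, but trailing text is left in place.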
44 changes: 44 additions & 0 deletions Lib/test/test_email/test_email.py
@@ -810,6 +810,16 @@ def test_unicode_body_defaults_to_utf8_encoding(self):
             w4kgdGVzdGFiYwo=
             """))

+    def test_string_payload_with_base64_cte(self):
+        msg = email.message_from_string(textwrap.dedent("""\
+            Content-Transfer-Encoding: base64
+
+            SGVsbG8uIFRlc3Rpbmc=
+            """), policy=email.policy.default)
+        self.assertEqual(msg.get_payload(decode=True), b"Hello. Testing")
+        self.assertDefectsEqual(msg['content-transfer-encoding'].defects, [])
+
+

 # Test the email.encoders module
 class TestEncoders(unittest.TestCase):
@@ -2352,6 +2362,40 @@ def test_missing_header_body_separator(self):
         self.assertDefectsEqual(msg.defects,
                                 [errors.MissingHeaderBodySeparatorDefect])

+    def test_string_payload_with_extra_space_after_cte(self):
+        # https://github.com/python/cpython/issues/98188
+        cte = "base64 "
+        msg = email.message_from_string(textwrap.dedent(f"""\
+            Content-Transfer-Encoding: {cte}
+
+            SGVsbG8uIFRlc3Rpbmc=
+            """), policy=email.policy.default)
+        self.assertEqual(msg.get_payload(decode=True), b"Hello. Testing")
+        self.assertDefectsEqual(msg['content-transfer-encoding'].defects, [])
+
+    def test_string_payload_with_extra_text_after_cte(self):
+        msg = email.message_from_string(textwrap.dedent("""\
+            Content-Transfer-Encoding: base64 some text
+
+            SGVsbG8uIFRlc3Rpbmc=
+            """), policy=email.policy.default)
+        self.assertEqual(msg.get_payload(decode=True), b"Hello. Testing")
+        cte = msg['content-transfer-encoding']
+        self.assertDefectsEqual(cte.defects, [email.errors.InvalidHeaderDefect])
+
+    def test_string_payload_with_extra_space_after_cte_compat32(self):
+        cte = "base64 "
+        msg = email.message_from_string(textwrap.dedent(f"""\
+            Content-Transfer-Encoding: {cte}
+
+            SGVsbG8uIFRlc3Rpbmc=
+            """), policy=email.policy.compat32)
+        pasted_cte = msg['content-transfer-encoding']
+        self.assertEqual(pasted_cte, cte)
+        self.assertEqual(msg.get_payload(decode=True), b"Hello. Testing")
+        self.assertDefectsEqual(msg.defects, [])
+
+

 # Test RFC 2047 header encoding and decoding
 class TestRFC2047(TestEmailBase):
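The tests above exercise both policies because the object returned for the header differs between them. A minimal sketch of that difference, assuming only the public `email` API; the `RAW` message and the variable names are made up for illustration:

```python
import email
import email.policy

# The header value deliberately ends with a space, as in the tests above.
RAW = "Content-Transfer-Encoding: base64 \n\nSGVsbG8uIFRlc3Rpbmc=\n"

modern = email.message_from_string(RAW, policy=email.policy.default)
legacy = email.message_from_string(RAW, policy=email.policy.compat32)

modern_cte = modern['content-transfer-encoding']
legacy_cte = legacy['content-transfer-encoding']

# The default policy yields a parsed header object with a `cte` attribute;
# compat32 yields a plain string, so get_payload must fall back to str().
print(type(modern_cte).__name__, hasattr(modern_cte, 'cte'))
print(type(legacy_cte).__name__, hasattr(legacy_cte, 'cte'))
```

Whether `get_payload(decode=True)` then returns the decoded bytes for such a message depends on whether the running interpreter includes this fix, so the sketch stops short of asserting that.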
5 changes: 5 additions & 0 deletions Lib/test/test_email/test_headerregistry.py
@@ -837,6 +837,11 @@ def cte_as_value(self,
             '7bit',
             [errors.InvalidHeaderDefect]),

+        'extra_space_after_cte': (
+            'base64 ',
+            'base64',
+            []),
+
     }


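The new `extra_space_after_cte` case, together with the neighbouring defect case, can be reproduced directly against the header registry. This is a sketch, assuming the standard `email.headerregistry.HeaderRegistry` factory; the variable names are illustrative:

```python
from email import errors
from email.headerregistry import HeaderRegistry

registry = HeaderRegistry()

# Trailing whitespace only: the first token is recovered, no defect recorded.
spaced = registry('Content-Transfer-Encoding', 'base64 ')
# Trailing text: the first token is still recovered, but the extra text is flagged.
junk = registry('Content-Transfer-Encoding', 'base64 some text')

for header in (spaced, junk):
    defect_names = [type(d).__name__ for d in header.defects]
    print(header.cte, defect_names)
```

This matches the commit message's note that HeaderRegistry parsing still records a defect when there is extra text, even though decoding now proceeds.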
1 change: 1 addition & 0 deletions Misc/ACKS
@@ -1129,6 +1129,7 @@ Gregor Lingl
 Everett Lipman
 Mirko Liss
 Alexander Liu
+Hui Liu
 Yuan Liu
 Nick Lockwood
 Stephanie Lockwood
@@ -0,0 +1,3 @@
+Fix an issue in :meth:`email.message.Message.get_payload` where data
+cannot be decoded if the Content Transfer Encoding mechanism contains
+trailing whitespaces or additional junk text. Patch by Hui Liu.