gh-129349: Accept bytes in bytes.fromhex()/bytearray.fromhex() #129844

lordmauve · 2025-02-08T10:00:48Z

Change bytes.fromhex() and bytearray.fromhex() to accept a bytes object interpreted as ASCII.

This matches the behaviour of int(), for example:

>>> int(b'123abc', 16)
1194684
>>> bytes.fromhex(b'123abc')
b'\x12:\xbc'
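
On interpreters that predate this change, the same result can be approximated by decoding the input as ASCII first. The sketch below is only an illustration of the intended semantics; `fromhex_ascii` is a hypothetical helper name and is not part of the PR.

```python
def fromhex_ascii(data):
    """Approximate the proposed behaviour on any Python 3 version.

    Hypothetical helper: str is passed through unchanged, while bytes and
    bytearray are decoded as ASCII before calling bytes.fromhex().
    """
    if isinstance(data, (bytes, bytearray)):
        data = data.decode("ascii")
    return bytes.fromhex(data)


# int() already accepts ASCII bytes, which is the parity the PR aims for.
print(int(b"123abc", 16))          # 1194684
print(fromhex_ascii(b"123abc"))    # b'\x12:\xbc'
print(fromhex_ascii("12 3a bc"))   # spaces between byte pairs are allowed
```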

Fixes #129349

Issue: bytes.fromhex() should parse a bytes #129349

picnixz

We need more tests where invalid non-hexadecimal bytes are given (e.g., b"à", '\uD834\uDD1E'.encode() or bytes with NULs) and where non-bytes/str/bytearray objects are also passed.
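
A rough sketch of the kinds of inputs such tests might cover is shown below. It does not reproduce the tests added in Lib/test/test_bytes.py. `fromhex_like` is a hypothetical stand-in for the patched method, so that the checks run on any Python 3 version. Two details are assumptions of this sketch: non-ASCII bytes are built with `str.encode()` because a bytes literal cannot contain `à`, and the surrogate pair is written as the single code point U+1D11E because lone surrogates cannot be encoded to UTF-8.

```python
def fromhex_like(data):
    """Hypothetical stand-in for the patched bytes.fromhex()."""
    if isinstance(data, str):
        return bytes.fromhex(data)
    # memoryview() raises TypeError for objects without buffer support.
    with memoryview(data) as view:
        # UnicodeDecodeError is a subclass of ValueError.
        text = view.tobytes().decode("ascii")
    return bytes.fromhex(text)


def raises(exc_type, value):
    """Return True if fromhex_like(value) raises exc_type."""
    try:
        fromhex_like(value)
    except exc_type:
        return True
    return False


invalid_values = [
    "à".encode(),            # non-ASCII bytes
    "\U0001D11E".encode(),   # code point whose UTF-16 surrogates are D834 DD1E
    b"12\x0034",             # embedded NUL byte
    b"zz",                   # ASCII, but not hexadecimal
]
invalid_types = [1, 1.5, None, ["12"]]

print(all(raises(ValueError, value) for value in invalid_values))
print(all(raises(TypeError, value) for value in invalid_types))
```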

cc @vstinner

Doc/library/stdtypes.rst

Lib/test/test_bytes.py

Misc/NEWS.d/next/Core_and_Builtins/2025-02-08-09-55-33.gh-issue-129349.PkcG-l.rst

Objects/bytesobject.c

picnixz · 2025-02-08T13:20:20Z

Although the use case of git cat-file --batch is relevant, I still think it's a little too niche for this. I think the output of git cat-file --batch should be post-processed (and you should indeed call .decode() instead) but others might disagree.
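
The post-processing suggested here might look roughly like the following. This is a sketch that assumes the default `git cat-file --batch` header layout of `<oid> <type> <size>` followed by a newline; `parse_batch_header` is a hypothetical name.

```python
def parse_batch_header(line: bytes) -> tuple[bytes, str, int]:
    """Split one header line into (raw object id, type, size).

    Assumes the default header layout "<oid> <type> <size>\n".
    """
    oid_hex, obj_type, size = line.rstrip(b"\n").split(b" ")
    # Decode explicitly so that bytes.fromhex() receives a str.
    raw_oid = bytes.fromhex(oid_hex.decode("ascii"))
    return raw_oid, obj_type.decode("ascii"), int(size)


header = b"e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 blob 0\n"
oid, kind, size = parse_batch_header(header)
print(oid.hex(), kind, size)
```

The explicit `.decode("ascii")` is the extra step that the PR would make unnecessary.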

picnixz

picnixz · 2025-02-09T17:22:33Z

Note: The PR you wrote is of good quality for a first contribution. It's just that the feature can be quite niche, especially if it's added on a built-in class.

lordmauve · 2025-02-09T18:35:00Z

binascii seems a little obscure given that int(bytes, 16) works and binascii docs say:

> Normally, you will not use these functions directly but use wrapper modules like base64

It seems like a logical extension that bytes.fromhex() should support bytes and it still leaves a place for binascii as the version that supports the buffer protocol.

It's not my first credited contribution, just the first under my own GitHub account maybe.

picnixz · 2025-02-09T22:28:48Z

> It's not my first credited contribution, just the first under my own GitHub account maybe.

Sorry I got confused with another account (but the fact that the PR is good remains).

> It seems like a logical extension that bytes.fromhex() should support bytes and it still leaves a place for binascii as the version that supports the buffer protocol.

Yes, but OTOH, it's a built-in and except for your use case, I couldn't find other use cases where binascii.unhexlify couldn't fit the bill (or just a call to .decode() before). So I would defer the final decision to Victor and/or Serhiy on that matter.
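
For comparison, the two alternatives mentioned here can be sketched as follows. The snippet uses only existing standard-library calls and does not rely on the behaviour added by this PR.

```python
import binascii

hex_bytes = b"123abc"

# binascii.unhexlify() takes bytes-like input directly.
via_binascii = binascii.unhexlify(hex_bytes)

# Without this PR, bytes.fromhex() needs a str, hence the .decode().
via_fromhex = bytes.fromhex(hex_bytes.decode("ascii"))

assert via_binascii == via_fromhex == b"\x12:\xbc"

# One behavioural difference: fromhex() skips spaces, unhexlify() rejects them.
spaced = "12 3a bc"
assert bytes.fromhex(spaced) == b"\x12:\xbc"
try:
    binascii.unhexlify(spaced)
except binascii.Error as exc:
    print("unhexlify rejected whitespace:", exc)
```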

vstinner

LGTM

Objects/bytesobject.c

Misc/NEWS.d/next/Core_and_Builtins/2025-02-08-09-55-33.gh-issue-129349.PkcG-l.rst

serhiy-storchaka

Maybe add also a test for memoryview or array.array?

Doc/library/stdtypes.rst

Misc/NEWS.d/next/Core_and_Builtins/2025-02-08-09-55-33.gh-issue-129349.PkcG-l.rst

lordmauve · 2025-03-07T11:13:17Z

> Maybe add also a test for memoryview or array.array?

OK, added this, along with support for the buffer protocol (PyBUF_SIMPLE).
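
To illustrate what "any object that supports the buffer protocol" covers, the sketch below reads ASCII hex digits out of several buffer-exporting objects from Python code. It is not the C implementation in Objects/bytesobject.c; `hex_text_from_buffer` is a hypothetical helper that goes through `memoryview()` so that it runs on older interpreters too.

```python
import array


def hex_text_from_buffer(obj):
    """Hypothetical helper: read ASCII hex digits from a buffer-protocol object."""
    with memoryview(obj) as view:
        return view.tobytes().decode("ascii")


sources = [
    b"123abc",
    bytearray(b"123abc"),
    memoryview(b"123abc"),
    array.array("B", b"123abc"),
]

results = [bytes.fromhex(hex_text_from_buffer(src)) for src in sources]
print(results)
```

Each of the four inputs exposes the same six ASCII bytes, so every entry in `results` is the same three-byte value.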

Doc/library/stdtypes.rst

picnixz · 2025-03-07T11:25:44Z

Doc/whatsnew/3.14.rst

@@ -354,6 +354,10 @@ Other language changes
  (with :func:`format` or :ref:`f-strings`).
  (Contributed by Sergey B Kirpichev in :gh:`87790`.)

+* The :func:`bytes.fromhex` and :func:`bytearray.fromhex` methods now accept
+  ASCII :class:`bytes` and :term:`bytes-like objects <bytes-like object>`.


Question: are bytes also bytes-like objects? If so, you can just link the term. Or, more generally, isn't it any object that supports the buffer protocol?

Yes, but it seems less accessible to users to just link bytes-like object because that dives into a description of the buffer protocol which is lower level than the audience I was writing for in builtins docs.

Maybe the fault is with :term:`bytes-like object`, because it could just list types that duck-type like bytes.

So I hedged and did both.

You could say "that support the buffer protocol, such as memoryviews" and add a link to whatever example of a type that supports the buffer protocol you used.

I prefer to say "bytes and bytes-like", it's more explicit.

Lib/test/test_bytes.py

Objects/bytesobject.c

Doc/library/stdtypes.rst

Objects/bytesobject.c

vstinner · 2025-03-12T07:53:25Z

Oh no, there is now a conflict on clinic/ files. You can merge main into your branch and re-run make clinic.

Fixes python#129349

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

vstinner · 2025-03-12T10:40:56Z

Merged, thank you @lordmauve!

…ython#129844) Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com> Co-authored-by: Victor Stinner <vstinner@python.org>

bedevere-app bot added the awaiting review label Feb 8, 2025

bedevere-app bot mentioned this pull request Feb 8, 2025

bytes.fromhex() should parse a bytes #129349

Closed

picnixz reviewed Feb 8, 2025

lordmauve added a commit to lordmauve/cpython that referenced this pull request Feb 9, 2025

Apply suggestions from @picnixz in python#129844

ad5cc10

picnixz reviewed Feb 9, 2025

vstinner approved these changes Feb 20, 2025

Objects/bytesobject.c (resolved)

bedevere-app bot added awaiting merge and removed awaiting review labels Feb 20, 2025

vstinner reviewed Feb 26, 2025

Misc/NEWS.d/next/Core_and_Builtins/2025-02-08-09-55-33.gh-issue-129349.PkcG-l.rst (outdated, resolved)

serhiy-storchaka approved these changes Feb 27, 2025

Doc/library/stdtypes.rst (outdated, resolved)

Doc/library/stdtypes.rst (outdated, resolved)

Misc/NEWS.d/next/Core_and_Builtins/2025-02-08-09-55-33.gh-issue-129349.PkcG-l.rst (outdated, resolved)

lordmauve added a commit to lordmauve/cpython that referenced this pull request Mar 7, 2025

Apply suggestions from @picnixz in python#129844

f771a5e

lordmauve force-pushed the lordmauve/issue129349 branch from ad5cc10 to 1a18438 Compare March 7, 2025 11:12

vstinner reviewed Mar 7, 2025

Doc/library/stdtypes.rst (resolved)

picnixz reviewed Mar 7, 2025

vstinner reviewed Mar 8, 2025

Doc/library/stdtypes.rst (outdated, resolved)

vstinner reviewed Mar 8, 2025

Objects/bytesobject.c (outdated, resolved)

lordmauve force-pushed the lordmauve/issue129349 branch from 91566c9 to ba05050 Compare March 9, 2025 12:51

lordmauve added a commit to lordmauve/cpython that referenced this pull request Mar 10, 2025

Apply suggestions from @picnixz in python#129844

82c35f0

lordmauve force-pushed the lordmauve/issue129349 branch from be4b9a4 to 1e035db Compare March 10, 2025 20:07

lordmauve added 5 commits March 12, 2025 09:43

Accept bytes in bytes.fromhex()/bytearray.fromhex()

81d7ea2

Fixes python#129349

Update documentation

ea02542

Apply suggestions from @picnixz in python#129844

76d28ce

Fix sphinx-lint

cff091c

Mention bytearray in docs and blurb

be92188

lordmauve and others added 5 commits March 12, 2025 09:43

Use buffer protocol to support all byte-like objects

b0e3eb5

Insert missing braces

ef92b4e

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

Update Doc/library/stdtypes.rst

685f583

Add test for invalid type in bytes.fromhex()

f859c3c

PEP-7

31b9ab0

lordmauve force-pushed the lordmauve/issue129349 branch from 1e035db to 31b9ab0 Compare March 12, 2025 09:45

vstinner merged commit e0637ce into python:main Mar 12, 2025
42 checks passed

bedevere-app bot removed the awaiting merge label Mar 12, 2025

lordmauve deleted the lordmauve/issue129349 branch March 12, 2025 12:54
