-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
gh-129349: Accept bytes in bytes.fromhex()/bytearray.fromhex() #129844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need more tests where invalid non-hexadecimal bytes are given (e.g., b"à"
, '\uD834\uDD1E'.encode()
or bytes with NULs) and where non-bytes/str/bytearray objects are also passed.
cc @vstinner
Misc/NEWS.d/next/Core_and_Builtins/2025-02-08-09-55-33.gh-issue-129349.PkcG-l.rst
Outdated
Show resolved
Hide resolved
Although the use case of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry I reviewed the PR before thinking more about this. Actually, we have binascii.unhexlify
that already supports bytes inputs. And the docs say:
Similar functionality (accepting only text string arguments, but more liberal towards whitespace) is also accessible using the bytes.fromhex() class method.
So I think the text-only feature for bytes.fromhex
was actually desired.
I also tried to search on GitHub whether there are use cases for this, but I couldn't find any that actually used a non-string input. So I think this modification is likely not ideal. We might want to really restrict the input type for built-in types and perhaps be less restrictive with binascii
.
Thus, sorry for having reviewed the PR (which seemed I endorsed the change), but I don't think we need to make that change. For your use-case, I think you can definitely use binascii.unhexlify
instead (for instance binascii.unhexlify(b'012abc')
would return b'\x01\x2a\xbc
as you tested).
Note that binascii
actually supports any object implementing the buffer protocol, so it's likely the one that needs to be used in such cases (assuming that whitespaces are cleaned). See https://github.com/python/cpython/blob/main/Python/pystrhex.c.
TL;DR: I'm rather -0.5 for the inclusion of this feature but if others disagree and find it good to align it with binascii.unhexlify
, then I won't be against.
Note: The PR you wrote is of good quality for a first contribution. It's just that the feature can be quite niche, especially if it's added on a built-in class. |
It seems like a logical extension that It's not my first credited contribution, just the first under my own GitHub account maybe. |
Sorry I got confused with another account (but the fact that the PR is good remains).
Yes, but OTOH, it's a built-in and except for your use case, I couldn't find other use cases where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Misc/NEWS.d/next/Core_and_Builtins/2025-02-08-09-55-33.gh-issue-129349.PkcG-l.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add also a test for memoryview
or array.array
?
Misc/NEWS.d/next/Core_and_Builtins/2025-02-08-09-55-33.gh-issue-129349.PkcG-l.rst
Outdated
Show resolved
Hide resolved
ad5cc10
to
1a18438
Compare
Ok, added this and support for the |
@@ -354,6 +354,10 @@ Other language changes | |||
(with :func:`format` or :ref:`f-strings`). | |||
(Contrubuted by Sergey B Kirpichev in :gh:`87790`.) | |||
|
|||
* The :func:`bytes.fromhex` and :func:`bytearray.fromhex` methods now accept | |||
ASCII :class:`bytes` and :term:`bytes-like objects <bytes-like object>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: are bytes also bytes-like objects? if so, you can just link the term. Or more generally, isn't it objects that support the buffer protocol?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but it seems less accessible to users to just link bytes-like object
because that dives into a description of the buffer protocol which is lower level than the audience I was writing for in builtins docs.
Maybe the fault is with :term:\
bytes-like object`` because it could just list types that duck-type like bytes.
So I hedged and did both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could sa "that support the buffer protocol such as memoryviews" and add a link to whatever example of a type that supports the buffer protocol you used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to say "bytes and bytes-like", it's more explicit.
91566c9
to
ba05050
Compare
be4b9a4
to
1e035db
Compare
Oh no, there is now a conflict on |
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
1e035db
to
31b9ab0
Compare
Merged, thank you @lordmauve! |
…ython#129844) Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com> Co-authored-by: Victor Stinner <vstinner@python.org>
Change
bytes.fromhex()
andbytearray.fromhex()
to accept abytes
object interpreted as ASCII.This matches the behaviour of
int
e.gFixes #129349