gh-101178: Add Ascii85, base85, and Z85 support to binascii by kangtastic · Pull Request #102753 · python/cpython · GitHub

gh-101178: Add Ascii85, base85, and Z85 support to binascii #102753


Open. kangtastic wants to merge 12 commits into base: main. Changes from 7 commits.
73 changes: 73 additions & 0 deletions Doc/library/binascii.rst
@@ -77,6 +77,79 @@ The :mod:`binascii` module defines the following functions:
Added the *newline* parameter.


.. function:: a2b_ascii85(string, /, *, fold_spaces=False, wrap=False, ignore=b"")

   Convert Ascii85 data back to binary and return the binary data.

   Valid Ascii85 data contains characters from the Ascii85 alphabet in groups
   of five (except for the final group, which may have from two to five
   characters). Each full group of five encodes a 32-bit value in the range
   from ``0`` to ``2 ** 32 - 1``, inclusive. The special character ``z`` is
   accepted as a short form of the group ``!!!!!``, which encodes four
   consecutive null bytes.

   If *fold_spaces* is true, the special character ``y`` is also accepted as a
   short form of the group ``+<VdL``, which encodes four consecutive spaces.
   Note that neither short form is permitted if it occurs in the middle of
   another group.

   If *wrap* is true, the input is expected to begin with ``<~`` and end with
   ``~>``, as in the Adobe Ascii85 format.

   *ignore* is an optional bytes-like object that specifies characters to
   ignore in the input.

   Invalid Ascii85 data will raise :exc:`binascii.Error`.


.. function:: b2a_ascii85(data, /, *, fold_spaces=False, wrap=False, width=0, pad=False)

   Convert binary data to a formatted sequence of ASCII characters in Ascii85
   coding. The return value is the converted data.

   If *fold_spaces* is true, four consecutive spaces are encoded as the
   special character ``y`` instead of the sequence ``+<VdL``.

   If *wrap* is true, the output begins with ``<~`` and ends with ``~>``, as
   in the Adobe Ascii85 format.

   If *width* is provided and greater than 0, the output is split into lines
   of no more than the specified width separated by the ASCII newline
   character.

   If *pad* is true, the input is padded with null bytes to a multiple of
   4 bytes before encoding.

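The group arithmetic described for the two Ascii85 functions can be sketched in pure Python. This is an illustration only, not the PR's C implementation: `encode_group` is a hypothetical helper, and the cross-checks use the existing `base64.a85encode` rather than the proposed `binascii` functions, which are not assumed to be available.

```python
import base64

def encode_group(chunk: bytes) -> bytes:
    """Encode exactly four bytes as one five-character Ascii85 group."""
    if len(chunk) != 4:
        raise ValueError("a full group is exactly four bytes")
    value = int.from_bytes(chunk, "big")  # 32-bit value in 0 .. 2**32 - 1
    digits = []
    for _ in range(5):
        value, remainder = divmod(value, 85)
        digits.append(33 + remainder)  # the alphabet starts at "!" (ASCII 33)
    return bytes(reversed(digits))  # most significant digit first

spaces = encode_group(b"    ")      # the group that "y" abbreviates
nulls = encode_group(b"\0\0\0\0")   # the group that "z" abbreviates

# Cross-check against the existing pure-Python encoder in base64.
assert spaces == base64.a85encode(b"    ")
assert base64.a85encode(b"    ", foldspaces=True) == b"y"
assert base64.a85encode(b"\0\0\0\0") == b"z"
assert base64.a85encode(b"    ", adobe=True) == b"<~" + spaces + b"~>"
```

The `foldspaces` and `adobe` keywords of `base64.a85encode` correspond to the *fold_spaces* and *wrap* parameters in the proposed signatures, as the wrapper module later in this diff shows.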

.. function:: a2b_base85(string, /, *, strict_mode=False, z85=False)

   Convert base85 data back to binary and return the binary data.
   More than one line may be passed at a time.

   If *strict_mode* is true, only valid base85 data will be converted.
   Invalid base85 data will raise :exc:`binascii.Error`.

   If *z85* is true, the base85 data uses the Z85 alphabet.
   See `Z85 specification <https://rfc.zeromq.org/spec/32/>`_ for more information.

   Valid base85 data contains characters from the base85 alphabet in groups
   of five (except for the final group, which may have from two to five
   characters). Each group encodes 32 bits of binary data in the range from
   ``0`` to ``2 ** 32 - 1``, inclusive.


.. function:: b2a_base85(data, /, *, pad=False, newline=True, z85=False)

   Convert binary data to a line of ASCII characters in base85 coding.
   The return value is the converted line.

   If *pad* is true, the input is padded with null bytes to a multiple of
   4 bytes before encoding.

   If *newline* is true, a newline character is appended to the result.

   If *z85* is true, the Z85 alphabet is used for conversion.
   See `Z85 specification <https://rfc.zeromq.org/spec/32/>`_ for more information.

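The relationship between the base85 and Z85 alphabets can be illustrated with a short sketch. It assumes the alphabets used by the existing `base64.b85encode` and by the Z85 specification; `encode85`, `B85`, and `Z85` are hypothetical names for this example, and the sketch only handles input whose length is a multiple of four bytes.

```python
import base64

# Alphabet used by base64.b85encode (Git/Mercurial-style base85).
B85 = (b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
       b"!#$%&()*+-;<=>?@^_`{|}~")
# Alphabet from the Z85 specification.
Z85 = (b"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
       b".-:+=^!/*?&<>()[]{}@%$#")

def encode85(data: bytes, alphabet: bytes) -> bytes:
    """Encode whole 4-byte groups as 5 characters from the given alphabet."""
    if len(data) % 4:
        raise ValueError("this sketch only handles whole 4-byte groups")
    out = bytearray()
    for i in range(0, len(data), 4):
        value = int.from_bytes(data[i:i + 4], "big")
        group = bytearray(5)
        for pos in range(4, -1, -1):  # fill least significant digit last
            value, remainder = divmod(value, 85)
            group[pos] = alphabet[remainder]
        out += group
    return bytes(out)

# Test vector from the Z85 specification.
sample = bytes([0x86, 0x4F, 0xD2, 0x6F, 0xB5, 0x59, 0xF7, 0x5B])
b85_text = encode85(sample, B85)
z85_text = encode85(sample, Z85)

assert b85_text == base64.b85encode(sample)
# Both encodings share the same digits, so one is a translation of the other.
assert z85_text == b85_text.translate(bytes.maketrans(B85, Z85))
```

Because only the alphabet differs, a single *z85* flag on the base85 functions is enough to select between the two encodings.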

.. function:: a2b_qp(data, header=False)

Convert a block of quoted-printable data back to binary and return the binary
5 changes: 5 additions & 0 deletions Include/internal/pycore_global_objects_fini_generated.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 5 additions & 0 deletions Include/internal/pycore_global_strings.h
@@ -454,6 +454,7 @@ struct _Py_global_strings {
        STRUCT_FOR_ID(flags)
        STRUCT_FOR_ID(flush)
        STRUCT_FOR_ID(fold)
        STRUCT_FOR_ID(fold_spaces)
        STRUCT_FOR_ID(follow_symlinks)
        STRUCT_FOR_ID(format)
        STRUCT_FOR_ID(format_spec)
@@ -636,6 +637,7 @@ struct _Py_global_strings {
        STRUCT_FOR_ID(outpath)
        STRUCT_FOR_ID(overlapped)
        STRUCT_FOR_ID(owner)
        STRUCT_FOR_ID(pad)
        STRUCT_FOR_ID(pages)
        STRUCT_FOR_ID(parent)
        STRUCT_FOR_ID(password)
@@ -792,11 +794,14 @@ struct _Py_global_strings {
        STRUCT_FOR_ID(weekday)
        STRUCT_FOR_ID(which)
        STRUCT_FOR_ID(who)
        STRUCT_FOR_ID(width)
        STRUCT_FOR_ID(withdata)
        STRUCT_FOR_ID(wrap)
        STRUCT_FOR_ID(writable)
        STRUCT_FOR_ID(write)
        STRUCT_FOR_ID(write_through)
        STRUCT_FOR_ID(year)
        STRUCT_FOR_ID(z85)
        STRUCT_FOR_ID(zdict)
    } identifiers;
    struct {
5 changes: 5 additions & 0 deletions Include/internal/pycore_runtime_init_generated.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

20 changes: 20 additions & 0 deletions Include/internal/pycore_unicodeobject_generated.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

57 changes: 57 additions & 0 deletions Lib/_base64.py
@@ -0,0 +1,57 @@
"""C accelerator wrappers for originally pure-Python parts of base64."""

from binascii import Error, a2b_ascii85, a2b_base85, b2a_ascii85, b2a_base85
from base64 import _bytes_from_decode_data, bytes_types
Review comment (Member):

We should avoid import cycles like this; it can make refactoring in the future harder.



# Base 85 encoder functions in base64 silently convert input to bytes.
def _bytes_from_encode_data(b):
    return b if isinstance(b, bytes_types) else memoryview(b).tobytes()


# Functions in binascii raise binascii.Error instead of ValueError.
def raise_valueerror(func):
    def _func(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Error as e:
            raise ValueError(e) from None
    return _func


@raise_valueerror
def _a85encode(b, *, foldspaces=False, wrapcol=0, pad=False, adobe=False):
    b = _bytes_from_encode_data(b)
    return b2a_ascii85(b, fold_spaces=foldspaces,
                       wrap=adobe, width=wrapcol, pad=pad)


@raise_valueerror
def _a85decode(b, *, foldspaces=False, adobe=False, ignorechars=b' \t\n\r\v'):
    b = _bytes_from_decode_data(b)
    return a2b_ascii85(b, fold_spaces=foldspaces,
                       wrap=adobe, ignore=ignorechars)


@raise_valueerror
def _b85encode(b, pad=False):
    b = _bytes_from_encode_data(b)
    return b2a_base85(b, pad=pad, newline=False)


@raise_valueerror
def _b85decode(b):
    b = _bytes_from_decode_data(b)
    return a2b_base85(b, strict_mode=True)


@raise_valueerror
def _z85encode(s):
    s = _bytes_from_encode_data(s)
    return b2a_base85(s, newline=False, z85=True)


@raise_valueerror
def _z85decode(s):
    s = _bytes_from_decode_data(s)
    return a2b_base85(s, strict_mode=True, z85=True)
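The effect of the `raise_valueerror` decorator can be tried in isolation. The sketch below reuses the decorator from the diff but applies it to `binascii.unhexlify` as a stand-in, because the proposed `a2b_ascii85` and `a2b_base85` functions are not assumed to exist in the running interpreter; `decode_hex` is a hypothetical name.

```python
import binascii

def raise_valueerror(func):
    """Re-raise binascii.Error from func as a plain ValueError."""
    def _func(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except binascii.Error as e:
            # "from None" hides the original binascii.Error from the traceback.
            raise ValueError(e) from None
    return _func

@raise_valueerror
def decode_hex(data):
    # Stand-in for the binascii decoders added by the PR.
    return binascii.unhexlify(data)

try:
    decode_hex(b"zz")  # not valid hexadecimal
except ValueError as exc:
    caught = exc
```

Since `binascii.Error` is already a subclass of `ValueError`, the decorator does not change which `except` clauses match; it only narrows the raised type to exactly `ValueError`, which is what the pure-Python base85 functions in `base64` raise.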
21 changes: 21 additions & 0 deletions Lib/base64.py
@@ -576,6 +576,27 @@ def decodebytes(s):
    return binascii.a2b_base64(s)


# Use accelerated implementations of originally pure-Python parts if possible.
try:
    from _base64 import (_a85encode, _a85decode, _b85encode,
                         _b85decode, _z85encode, _z85decode)
Review comment (Member):

Given these are already in a private module, you can remove the prefix. That means the _copy_attributes function only needs to copy __doc__, and __module__ can be set to the static 'base64'.

Reply (Author):

Done.

    from functools import update_wrapper
Review comment (Member):

Functools is an expensive import; I would copy the relevant parts of update_wrapper() locally.

    update_wrapper(_a85encode, a85encode)
    update_wrapper(_a85decode, a85decode)
    update_wrapper(_b85encode, b85encode)
    update_wrapper(_b85decode, b85decode)
    update_wrapper(_z85encode, z85encode)
    update_wrapper(_z85decode, z85decode)
    a85encode = _a85encode
    a85decode = _a85decode
    b85encode = _b85encode
    b85decode = _b85decode
    z85encode = _z85encode
    z85decode = _z85decode
except ImportError:
    pass
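The two review suggestions above (drop the `functools` import, copy only `__doc__`, and set `__module__` to a static `'base64'`) might look roughly like the following. This is a sketch under assumptions, not the code the PR ended up with: `_copy_metadata` is a hypothetical helper, and the two `a85encode` definitions are stand-ins for the pure-Python and accelerated functions.

```python
def _copy_metadata(accelerated, original, module="base64"):
    """Copy the parts of update_wrapper() needed here, without functools."""
    accelerated.__doc__ = original.__doc__
    accelerated.__module__ = module
    return accelerated

# Stand-in for the pure-Python function defined in base64.py.
def a85encode(b):
    """Encode bytes-like object b using Ascii85 and return a bytes object."""
    raise NotImplementedError("stand-in only")

pure = a85encode

# Stand-in for the accelerated function; with the underscore prefix removed,
# __name__ and __qualname__ already match and need no copying.
def a85encode(b):
    raise NotImplementedError("stand-in only")

fast = _copy_metadata(a85encode, pure)
```

Copying only these two attributes keeps `help()` output and the reported module unchanged for callers while avoiding the import cost the reviewer mentions.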


# Usable as a script...
def main():
    """Small main program"""