8000 Use-after-free in `unicode_escape` decoder with error handler · Issue #133767 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

Use-after-free in unicode_escape decoder with error handler #133767

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sethmlarson opened this issue May 9, 2025 · 2 comments
Closed

Use-after-free in unicode_escape decoder with error handler #133767

sethmlarson opened this issue May 9, 2025 · 2 comments
Assignees
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes 3.12 only security fixes 3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker topic-unicode type-crash A hard crash of the interpreter, possibly with a core dump type-security A security issue

Comments

@sethmlarson sethmlarson added the type-crash A hard crash of the interpreter, possibly with a core dump label May 9, 2025
@sethmlarson sethmlarson changed the title Use-after-free in unicode escape decoder with error handler Use-after-free in unicode_escape decoder with error handler May 9, 2025
@serhiy-storchaka serhiy-storchaka self-assigned this May 9, 2025
@picnixz picnixz added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-unicode type-security A security issue 3.11 only security fixes 3.10 only security fixes 3.9 only security fixes 3.12 only security fixes 3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes labels May 10, 2025
serhiy-storchaka added a commit that referenced this issue May 12, 2025
…rror handler (GH-129648)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 12, 2025
…h an error handler (pythonGH-129648)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue May 12, 2025
…der with an error handler (pythonGH-129648)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit that referenced this issue May 13, 2025
…th an error handler (GH-129648) (GH-133942)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@bedevere-app
Copy link
bedevere-app bot commented May 19, 2025

GH-134255 is a backport of this pull request to the 3.12 branch.

encukou pushed a commit that referenced this issue May 20, 2025
…th an error handler (GH-129648) (GH-133944)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue May 20, 2025
…der with an error handler (pythonGH-129648) (pythonGH-133944)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue May 20, 2025
…der with an error handler (pythonGH-129648) (pythonGH-133944)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
(cherry picked from commit a75953b)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue May 20, 2025
…der with an error handler (pythonGH-129648) (pythonGH-133944)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
(cherry picked from commit a75953b)
(cherry picked from commit 0c33e5b)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue May 20, 2025
…er with an error handler (pythonGH-129648) (pythonGH-133944)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
(cherry picked from commit a75953b)
(cherry picked from commit 0c33e5b)
(cherry picked from commit 8b528ca)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Yhg1s pushed a commit that referenced this issue May 26, 2025
…th an error handler (GH-129648) (GH-133944) (#134337)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
ambv pushed a commit that referenced this issue Jun 2, 2025
…th an error handler (GH-129648) (GH-133944) (GH-134341)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
(cherry picked from commit a75953b)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
ambv pushed a commit that referenced this issue Jun 2, 2025
…th an error handler (GH-129648) (GH-133944) (GH-134345)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
(cherry picked from commit a75953b)
(cherry picked from commit 0c33e5b)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
ambv pushed a commit that referenced this issue Jun 2, 2025
…h an error handler (GH-129648) (GH-133944) (#134346)

* [3.9] gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648) (GH-133944)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
(cherry picked from commit a75953b)
(cherry picked from commit 0c33e5b)
(cherry picked from commit 8b528ca)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@ambv ambv closed this as completed Jun 3, 2025
mcepl pushed a commit to openSUSE-Python/cpython that referenced this issue Jun 4, 2025
… an error handler

Cut disused recode_encoding logic in _PyBytes_DecodeEscape.

All call sites pass NULL for `recode_encoding`, so this path is
completely untested.  That's been true since before Python 3.0.
It adds significant complexity to this logic, so it's best to
take it out.

All call sites now have a literal NULL, and that's been true since
commit 768921c eliminated a conditional (`foo ? bar : NULL`) at
the call site in Python/ast.c where we're parsing a bytes literal.
But even before then, that condition `foo` had been a constant
since unadorned string literals started meaning Unicode, in commit
572dbf8 aka v3.0a1~1035 .

The `unicode` parameter is already unused, so mark it as unused too.
The code that acted on it was also taken out before Python 3.0, in
commit 8d30cc0 aka v3.0a1~1031 .

The function (PyBytes_DecodeEscape) is exposed in the API, but it's
never been documented.

Fixes: bsc#1243273 (CVE-2025-4516)
Fixes: gh#python#133767
From-PR: gh#python/cpython!134346
Patch: CVE-2025-4516-DecodeError-handler.patch
@mcepl
Copy link
Contributor
mcepl commented Jun 4, 2025

I have uploaded above my draft of the backport to 3.6. Any review or comments would be more than welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes 3.12 only security fixes 3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker topic-unicode type-crash A hard crash of the interpreter, possibly with a core dump type-security A security issue
Projects
Development

No branches or pull requests

5 participants
0