[3.12] gh-135034: Normalize link targets in tarfile, add `os.path.realpath(strict='allow_missing')` (GH-135037) by Yhg1s · Pull Request #135066 · python/cpython

Merged: 10 commits, Jun 3, 2025
[3.12] gh-135034: Normalize link targets in tarfile, add `os.path.realpath(strict='allow_missing')` (GH-135037)

Addresses CVEs 2024-12718, 2025-4138, 2025-4330, and 2025-4517.
(cherry picked from commit 3612d8f)

Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Signed-off-by: Łukasz Langa <lukasz@langa.pl>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Seth Michael Larson <seth@python.org>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
5 people authored and Yhg1s committed Jun 3, 2025
commit c358142cab7ce621a2745262a90df967b357f61c
33 changes: 29 additions & 4 deletions Doc/library/os.path.rst
@@ -377,10 +377,26 @@ the :mod:`glob` module.)
links encountered in the path (if they are supported by the operating
system).

-If a path doesn't exist or a symlink loop is encountered, and *strict* is
-``True``, :exc:`OSError` is raised. If *strict* is ``False``, the path is
-resolved as far as possible and any remainder is appended without checking
-whether it exists.
By default, the path is evaluated up to the first component that does not
exist, is a symlink loop, or whose evaluation raises :exc:`OSError`.
All such components are appended unchanged to the existing part of the path.

Some errors that are handled this way include "access denied", "not a
directory", or "bad argument to internal function". Thus, the
resulting path may be missing or inaccessible, may still contain
links or loops, and may traverse non-directories.

This behavior can be modified by keyword arguments:

If *strict* is ``True``, the first error encountered when evaluating the path is
re-raised.
In particular, :exc:`FileNotFoundError` is raised if *path* does not exist,
or another :exc:`OSError` if it is otherwise inaccessible.

If *strict* is :py:data:`os.path.ALLOW_MISSING`, errors other than
:exc:`FileNotFoundError` are re-raised (as with ``strict=True``).
Thus, the returned path will not contain any symbolic links, but the named
file and some of its parent directories may be missing.

.. note::
This function emulates the operating system's procedure for making a path
@@ -399,6 +415,15 @@ the :mod:`glob` module.)
.. versionchanged:: 3.10
The *strict* parameter was added.

.. versionchanged:: next
The :py:data:`~os.path.ALLOW_MISSING` value for the *strict* parameter
was added.

.. data:: ALLOW_MISSING

Special value used for the *strict* argument in :func:`realpath`.

.. versionadded:: next

.. function:: relpath(path, start=os.curdir)

20 changes: 20 additions & 0 deletions Doc/library/tarfile.rst
@@ -249,6 +249,15 @@ The :mod:`tarfile` module defines the following exceptions:
Raised to refuse extracting a symbolic link pointing outside the destination
directory.

.. exception:: LinkFallbackError

Raised to refuse emulating a link (hard or symbolic) by extracting another
archive member, when that member would be rejected by the filter location.
The exception that was raised to reject the replacement member is available
as :attr:`!BaseException.__context__`.

.. versionadded:: next


The following constants are available at the module level:

@@ -1039,6 +1048,12 @@ reused in custom filters:
Implements the ``'data'`` filter.
In addition to what ``tar_filter`` does:

- Normalize link targets (:attr:`TarInfo.linkname`) using
:func:`os.path.normpath`.
Note that this removes internal ``..`` components, which may change the
meaning of the link if the path in :attr:`!TarInfo.linkname` traverses
symbolic links.

- :ref:`Refuse <tarfile-extraction-refuse>` to extract links (hard or soft)
that link to absolute paths, or ones that link outside the destination.

@@ -1067,6 +1082,10 @@ reused in custom filters:

Return the modified ``TarInfo`` member.

.. versionchanged:: next

Link targets are now normalized.


.. _tarfile-extraction-refuse:

@@ -1093,6 +1112,7 @@ Here is an incomplete list of things to consider:
* Extract to a :func:`new temporary directory <tempfile.mkdtemp>`
to prevent e.g. exploiting pre-existing links, and to make it easier to
clean up after a failed extraction.
* Disallow symbolic links if you do not need the functionality.
* When working with untrusted data, use external (e.g. OS-level) limits on
disk, memory and CPU usage.
* Check filenames against an allow-list of characters
34 changes: 34 additions & 0 deletions Doc/whatsnew/3.12.rst