PEP NNN: Recording provenance of installed packages by fridex · Pull Request #2988 · python/peps · GitHub

PEP NNN: Recording provenance of installed packages #2988

Closed
98 changes: 98 additions & 0 deletions pep-0705.rst
@@ -0,0 +1,98 @@
PEP: 705
Title: Recording the provenance of installed packages
Author: Fridolin Pokorny <fridolin.pokorny at gmail.com>,
        Trishank Karthik Kuppusamy <karthik@trishank.com>
Sponsor:
PEP-Delegate:
@CAM-Gerlach CAM-Gerlach Jan 31, 2023

Suggested change
PEP-Delegate:
PEP-Delegate: Paul Moore <p.f.moore@gmail.com>

@pfmoore , as the standing PEP Delegate for packaging metadata PEPs, are you going to assume this role (assuming this is accepted as a PEP with the appropriate discussion and reworking), or would we need to look for someone else (or submit to the SC)?

Alternatively, if no PEP delegate is assigned for now, then this header should simply be omitted

Suggested change
PEP-Delegate:

Discussions-To: https://discuss.python.org/t/pep-705-recording-provenance-of-installed-packages/23340
Status: Draft
Type: Process
Content-Type: text/x-rst
Created: 30-Jan-2023
Post-History: `03-Dec-2021 <https://discuss.python.org/t/pip-installation-reports/12316>`__,
              `30-Jan-2023 <https://discuss.python.org/t/pep-705-recording-provenance-of-installed-packages/23340>`__


Abstract
========

This PEP describes a way to record the provenance of installed Python packages.
The record is created by an installer and is made available to users as a
JSON file, ``direct_url.json``, in the ``.dist-info`` directory. The PEP
extends :pep:`610` to cover packages installed from a package index.


Motivation
==========

Installing a Python package involves downloading the package from a source and
extracting its content to an appropriate place. After the installation process
is done, information about the artifact used as well as its source is generally
lost. Nevertheless, there are use cases for keeping records of artifacts used
for installing packages and their provenance.

Python wheels can be built with different compiler flags or for
different wheel tags. In both cases, an installer might have to choose among
multiple candidate wheels (possibly from different package indexes), and it is
helpful to be able to find out immediately which artifact was actually used
during the installation. With this information, tools that report installed
software, such as tools producing a software bill of materials (SBOM), can give
more accurate reports.

The motivation described in this PEP is a direct extension of :pep:`610`.
Besides recording information about packages installed from a URL, installers
SHOULD also record information for packages installed from Python package
indexes or from the filesystem.


Examples
========
Comment on lines +49 to +50

You've jumped right to the examples before introducing any actual specification (and I don't see a section titled such anywhere). I would suggest starting with a clear, precise and unambiguous specification first, which is a preliminary requirement to be able to accept your PEP.


The following is an example of a ``direct_url.json`` file, stored in the
``.dist-info`` directory alongside the files specified in :pep:`376` and
further adjusted in :pep:`627`. The file stores a JSON document describing the
artifact used to install a package as well as its source. The record is similar
to the ``direct_url.json`` described in :pep:`610`:

.. code-block:: json

    {
        "archive_info": {
            "hash": "sha256=714ac14496c3e68c99c29b00845f7a2b85f3bb6f1078fd9f72fd20f0570002b2"
        },
        "url": "https://files.pythonhosted.org/packages/ed/35/a31aed2993e398f6b09a790a181a7927eb14610ee8bbf02dc14d31677f1c/packaging-23.0-py3-none-any.whl"
    }

If a source distribution was used to build a wheel file that was subsequently
installed, the ``url`` MUST state the URL of the source distribution used.

When a package is installed from a file in a local directory,
``direct_url.json`` SHOULD preserve the path to the file used:

.. code-block:: json

    {
        "archive_info": {
            "hash": "sha256=b9c46cc36662a7949f34b52d8ec7bb59c0d74ba08ba6cb9ce9adc1d8676d9526"
        },
        "url": "file:///home/user/wheels/Flask-2.2.2-py3-none-any.whl"
    }

When a package is installed by providing a URL directly, :pep:`610` still
applies.
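
As an illustration of how a consumer, such as an SBOM tool, might read the
record back, the sketch below loads ``direct_url.json`` for an installed
distribution through the standard library's ``importlib.metadata``. The helper
name ``read_provenance`` is hypothetical; it is not defined by this PEP or by
any installer.

```python
import json
from importlib.metadata import PackageNotFoundError, distribution
from typing import Optional


def read_provenance(dist_name: str) -> Optional[dict]:
    """Return the parsed direct_url.json of an installed distribution.

    Hypothetical helper: returns None when the distribution is not
    installed or when its .dist-info directory has no direct_url.json.
    """
    try:
        # read_text() returns None if the metadata file does not exist.
        text = distribution(dist_name).read_text("direct_url.json")
    except PackageNotFoundError:
        return None
    return json.loads(text) if text else None
```

A caller could then inspect ``record["url"]`` and
``record["archive_info"]["hash"]`` whenever the returned value is not ``None``.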

In both the package-index and the local-file example, the JSON document contains the following entries:

* ``archive_info.hash`` - MUST be present, with a value of the form
  ``<hash-algorithm>=<expected-hash>``. Currently, ``sha256`` is the only
  supported hash algorithm.

* ``url`` - MUST be present and MUST point to the source from which the package
  was obtained. For security reasons, the value MUST be stripped of any
  sensitive authentication information.
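
The two entries above can be sketched in Python as follows. This is a minimal
illustration under stated assumptions, not a reference implementation: the
function names ``strip_credentials`` and ``build_provenance_record`` are
hypothetical, and the sketch simply drops everything before the last ``@`` in
the network location, which may be stricter than what an installer needs.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url: str) -> str:
    """Remove any ``user:password@`` prefix from the URL's network location."""
    parts = urlsplit(url)
    # Everything after the last "@" is the host and optional port.
    netloc = parts.netloc.rpartition("@")[2]
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


def build_provenance_record(artifact: bytes, url: str) -> dict:
    """Build the record for an artifact's bytes and the URL it came from.

    Hypothetical helper: the hash uses the ``sha256=<hex digest>`` form
    and the URL is sanitized before being stored.
    """
    digest = hashlib.sha256(artifact).hexdigest()
    return {
        "archive_info": {"hash": f"sha256={digest}"},
        "url": strip_credentials(url),
    }
```

An installer following this sketch would serialize the returned dictionary with
``json.dump`` into ``direct_url.json`` inside the ``.dist-info`` directory.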

Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.