PEP 788: Minor clarity changes and improvements by ZeroIntensity · Pull Request #4474 · python/peps · GitHub
Merged on Jun 27, 2025 (5 commits)
Changes from 4 commits
191 changes: 102 additions & 89 deletions peps/pep-0788.rst
@@ -62,7 +62,7 @@ of a Python process.

The "current interpreter" refers to the interpreter-state
pointer on an :term:`attached thread state`, as returned by
:c:func:`PyThreadState_GetInterpreter`.
:c:func:`PyThreadState_GetInterpreter` or :c:func:`PyInterpreterState_Get`.

Native and Python Threads
-------------------------
Expand Down Expand Up @@ -162,8 +162,17 @@ This affects CPython itself, and there's not much that can be done
to fix it with the current API. For example,
`python/cpython#129536 <https://github.com/python/cpython/issues/129536>`_
remarks that the :mod:`ssl` module can emit a fatal error when used at
finalization, because a daemon thread got hung while holding the lock.

finalization, because a daemon thread got hung while holding the lock
for :data:`sys.stderr`, and then a finalizer tried to write to it.
Ideally, a thread should be able to temporarily prevent the interpreter
from hanging it while it holds the lock.

However, it's generally unsafe to acquire Python locks (for example,
:class:`threading.Lock`) in finalizers, because the garbage collector
might run while the lock is held, which would deadlock if another finalizer
tried to acquire the lock. This does not apply to many C locks, such as with
:data:`sys.stderr`, because Python code cannot be run while the lock is held.
This PEP intends to fix this problem for C locks, not Python locks.

Daemon Threads are not the Problem
**********************************
@@ -179,9 +188,9 @@ threads is that they're a large cause of problems in the interpreter:
down upon runtime finalization. As in they have pointers to global state for
the interpreter.

In practice, daemon threads are useful for simplifying many threading applications
in Python, and since the program is about to close in most cases, it's not worth
the added complexity to try and gracefully shut down a thread.
However, in practice, daemon threads are useful for simplifying many threading
applications in Python, and since the program is about to close in most cases,
it's not worth the added complexity to try and gracefully shut down a thread.

When I’ve needed daemon threads, it’s usually been the case of “Long-running,
uninterruptible, third-party task” in terms of the examples in the linked issue.
@@ -196,7 +205,7 @@ As noted by this PEP, extension modules are free to create their own threads
and attach thread states for them. Similar to daemon threads, Python doesn't
try and join them during finalization, so trying to remove daemon threads
as a whole would involve trying to remove them from the C API, which would
require a massive API change.
require a much more massive API change.

Realize however that even if we get rid of daemon threads, extension
module code can and does spawn its own threads that are not tracked by
@@ -216,7 +225,7 @@ needs to already have an :term:`attached thread state` for the thread. If
there's no guarantee of that, then :func:`atexit.register` cannot be safely
called without the risk of hanging the thread. This shifts the contract
of joining the thread to the caller rather than the callee, which again,
isn't done in practice.
isn't reliable enough in practice to be a viable solution.

For example, large C++ applications might want to expose an interface that can
call Python code. To do this, a C++ API would take a Python object, and then
@@ -252,8 +261,12 @@ The GIL-state APIs are Buggy and Confusing

There are currently two public ways for a user to create and attach a
:term:`thread state` for their thread; manual use of :c:func:`PyThreadState_New`
and :c:func:`PyThreadState_Swap`, and :c:func:`PyGILState_Ensure`. The latter,
:c:func:`PyGILState_Ensure`, is `the most common <https://grep.app/search?q=pygilstate_ensure>`_.
and :c:func:`PyThreadState_Swap`, or the convenient :c:func:`PyGILState_Ensure`.

The latter, :c:func:`PyGILState_Ensure`, is significantly more common, having
`nearly 3,000 hits <https://grep.app/search?q=pygilstate_ensure>`_ in a code
search, whereas :c:func:`PyThreadState_New` has
`less than 400 hits <https://grep.app/search?q=PyThreadState_New>`_.

``PyGILState_Ensure`` Generally Crashes During Finalization
***********************************************************
@@ -263,7 +276,7 @@ always match the documentation. Instead of hanging the thread during finalizatio
as previously noted, it's possible for it to crash with a segmentation
fault. This is a `known issue <https://github.com/python/cpython/issues/124619>`_
that could be fixed in CPython, but it's definitely worth noting
here. Incidentally, acceptance and implementation of this PEP will likely fix
here, because acceptance and implementation of this PEP will likely fix
the existing crashes caused by :c:func:`PyGILState_Ensure`.

The Term "GIL" is Tricky for Free-threading
@@ -279,28 +292,7 @@ created by the authors of this PEP:
omit ``PyGILState_Ensure`` in fresh threads.

Again, :c:func:`PyGILState_Ensure` gets an :term:`attached thread state`
for the thread on both with-GIL and free-threaded builds. To demonstrate,
:c:func:`PyGILState_Ensure` is very roughly equivalent to the following:

.. code-block:: c

    PyGILState_STATE
    PyGILState_Ensure(void)
    {
        PyThreadState *existing = PyThreadState_GetUnchecked();
        if (existing == NULL) {
            // Chooses the interpreter of the last attached thread state
            // for this thread. If Python has never run in this thread, the
            // main interpreter is used.
            PyInterpreterState *interp = guess_interpreter();
            PyThreadState *tstate = PyThreadState_New(interp);
            PyThreadState_Swap(tstate);
            return opaque_tstate_handle(tstate);
        } else {
            return opaque_tstate_handle(existing);
        }
    }

for the thread on both with-GIL and free-threaded builds.
An attached thread state is always needed to call the C API, so
:c:func:`PyGILState_Ensure` still needs to be called on free-threaded builds,
but with a name like "ensure GIL", it's not immediately clear that that's true.
@@ -331,8 +323,8 @@ subinterpreter, but then called :c:func:`PyGILState_Ensure`, the thread would
have an :term:`attached thread state` pointing to the main interpreter,
not the subinterpreter. This means that any :term:`GIL` assumptions about the
object are wrong! There isn't any synchronization between the two GILs, so both
the thread (who thinks it's in the subinterpreter) and the main thread could try
to increment the reference count at the same time, causing a data race!
the thread and the main thread could try to increment the object's reference count
at the same time, causing a data race.

An Interpreter Can Concurrently Deallocate
------------------------------------------
@@ -342,12 +334,17 @@ The other way of creating a native thread that can invoke Python,
for supporting subinterpreters (because :c:func:`PyThreadState_New` takes an
explicit interpreter, rather than assuming that the main interpreter was
requested), but is still limited by the current hanging problems in the C API.
Manual creation of thread states ("manual" in contrast to the implicit creation
of one in :c:func:`PyGILState_Ensure`) does not solve any of the aforementioned
thread-safety issues with thread states.

In addition, subinterpreters typically have a much shorter lifetime than the
main interpreter, so there's a much higher chance that an interpreter passed
to a thread will have already finished and have been deallocated. So, passing
that interpreter to :c:func:`PyThreadState_New` will most likely crash the program
because of a use-after-free on the interpreter-state.
main interpreter, so if there was no synchronization between the calling thread
and the created thread, there's a much higher chance that an interpreter-state
passed to a thread will have already finished and have been deallocated,
causing use-after-free crashes. As of writing, this is a relatively
theoretical problem, but it's likely this will become more of an issue
in newer versions with the recent acceptance of :pep:`734`.

Rationale
=========
@@ -367,17 +364,30 @@ thread being hung.
This means that interfacing Python (for example, in a C++ library) will need
a reference to the interpreter in order to safely call the object, which is
definitely more inconvenient than assuming the main interpreter is the right
choice, but there's not really another option.
choice, but there's not really another option. A future proposal could perhaps
make this cleaner by adding a tracking mechanism for an object's interpreter
(such as a field on :c:type:`PyObject`).

Generally speaking, a strong interpreter reference should be short-lived. An
interpreter reference should act similar to a lock, or a "critical section",
where the interpreter must not hang the thread or deallocate. For example,
when acquiring an IO lock, a strong interpreter reference should be acquired
before locking, and then released once the lock is released.

Weak References
***************

This proposal also comes with weak references to an interpreter that don't
prevent it from shutting down, but can be promoted to a strong reference when
the user decides that they want to call the C API. Promotion of a weak reference
to a strong reference can fail if the interpreter has already finalized, or
reached a point during finalization where it can't be guaranteed that the
thread won't hang.
the user decides that they want to call the C API. A weak reference will
typically live much longer than a strong reference. This is useful for many of
the asynchronous situations stated previously, where the thread itself
shouldn't prevent the desired interpreter from shutting down, but should still
be able to execute Python when needed.

For example, a (non-reentrant) event handler may store a weak interpreter
reference in its ``void *arg`` parameter, and then that weak reference will
be promoted to a strong reference when it's time to call Python code.
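
The event-handler pattern described above can be sketched with a small toy
model. Nothing in it is CPython API: ``toy_interp``, ``toy_weakref``,
``toy_weakref_as_strong``, ``toy_strong_close`` and ``toy_on_event`` are
hypothetical names invented for illustration. The model is single-threaded, so
it leaves out the atomics and lifetime management that a real implementation
would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an interpreter; NOT CPython's PyInterpreterState. */
typedef struct {
    long strong_refs;   /* outstanding strong references */
    bool refs_closed;   /* true once new strong references are refused */
} toy_interp;

/* Toy weak reference: remembers the interpreter without keeping it alive. */
typedef struct {
    toy_interp *interp;
} toy_weakref;

/* Promote a weak reference to a strong one.
   Returns 0 on success, or -1 if the interpreter no longer accepts
   strong references. */
int
toy_weakref_as_strong(toy_weakref *wr, toy_interp **out)
{
    if (wr->interp == NULL || wr->interp->refs_closed) {
        return -1;
    }
    wr->interp->strong_refs++;
    *out = wr->interp;
    return 0;
}

/* Release a strong reference obtained from toy_weakref_as_strong(). */
void
toy_strong_close(toy_interp *interp)
{
    assert(interp->strong_refs > 0);
    interp->strong_refs--;
}

/* Hypothetical event handler: the weak reference arrives via void *arg
   and is promoted only for the duration of the callback. */
int
toy_on_event(void *arg)
{
    toy_weakref *wr = (toy_weakref *)arg;
    toy_interp *interp = NULL;
    if (toy_weakref_as_strong(wr, &interp) < 0) {
        return -1;  /* too late to run Python; skip the callback */
    }
    /* A real handler would call Python code here. */
    toy_strong_close(interp);
    return 0;
}
```

In this sketch the strong reference exists only while the handler runs, so a
handler that is registered but idle does not delay shutdown.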

Deprecation of the GIL-state APIs
---------------------------------
@@ -389,16 +399,18 @@ subinterpreters:

- :c:func:`PyGILState_Ensure`: :c:func:`PyThreadState_Swap` & :c:func:`PyThreadState_New`
- :c:func:`PyGILState_Release`: :c:func:`PyThreadState_Clear` & :c:func:`PyThreadState_Delete`
- :c:func:`PyGILState_GetThisThreadState`: :c:func:`PyThreadState_Get`
- :c:func:`PyGILState_GetThisThreadState`: :c:func:`PyThreadState_Get` (roughly)
- :c:func:`PyGILState_Check`: ``PyThreadState_GetUnchecked() != NULL``

This PEP specifies a ten-year deprecation for these functions (while remaining
in the stable ABI), mainly because it's expected that the migration will be a
little painful, because :c:func:`PyThreadState_Ensure` and
:c:func:`PyThreadState_Release` aren't drop-in replacements for
This PEP specifies a deprecation for these functions (while remaining
in the stable ABI), because :c:func:`PyThreadState_Ensure` and
:c:func:`PyThreadState_Release` will act as more-correct replacements for
:c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release`, due to the
requirement of a specific interpreter. The exact details of this deprecation
aren't too clear, see :ref:`pep-788-deprecation`.
requirement of a specific interpreter.

The exact details of this deprecation aren't too clear. It's likely that
the usual five-year deprecation (as specified by :pep:`387`) will be too
short, so for now, these functions will have no specific removal date.

Specification
=============
@@ -407,19 +419,22 @@ Interpreter References to Prevent Shutdown
------------------------------------------

An interpreter will keep a reference count that's managed by users of the
C API. When the interpreter starts finalizing, it will wait until its reference count
reaches zero before proceeding to a point where threads will be hung. This will
happen around the same time when :class:`threading.Thread` objects are joined,
but note that this *is not* the same as joining the thread; the interpreter will
only wait until the reference count is zero, and then proceed. The interpreter
must not hang threads until this reference count has reached zero.
C API. When the interpreter starts finalizing, it will wait until its reference
count reaches zero before proceeding to a point where threads will be hung and
it may deallocate its state. The interpreter will wait on its reference count
around the same time when :class:`threading.Thread` objects are joined, but
note that this *is not* the same as joining the thread; the interpreter will
only wait until the reference count is zero, and then proceed.
After the reference count has reached zero, threads can no longer prevent the
interpreter from shutting down.
interpreter from shutting down (thus :c:func:`PyInterpreterRef_Get` and
:c:func:`PyInterpreterWeakRef_AsStrong` will fail).

A weak reference to the interpreter won't prevent it from finalizing, but can
be safely accessed after the interpreter no longer supports strong references,
and even after the interpreter has been deleted. But, at that point, the weak
reference can no longer be promoted to a strong reference.
A weak reference to an interpreter won't prevent it from finalizing, and can
be safely accessed after the interpreter no longer supports creating strong
references, and even after the interpreter-state has been deleted. Deletion
and duplication of the weak reference will always be allowed, but promotion
(:c:func:`PyInterpreterWeakRef_AsStrong`) will always fail after the
interpreter reaches a point where strong references have been waited on.
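
The interpreter-side rule can be illustrated with a minimal state-machine
sketch. It is not CPython code: ``model_interp``, ``model_ref_get``,
``model_ref_close`` and ``model_finalize_poll`` are hypothetical names, and the
blocking wait is modelled as a polling step so the example stays
single-threaded. The sketch assumes that new strong references are still
granted while finalization is waiting, which is one reading of the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the interpreter's reference count. */
typedef struct {
    long refcount;      /* strong references held by C API users */
    bool refs_closed;   /* true once finalization saw the count at zero */
} model_interp;

/* Models taking a strong reference: 0 on success, -1 once closed. */
int
model_ref_get(model_interp *interp)
{
    if (interp->refs_closed) {
        return -1;
    }
    interp->refcount++;
    return 0;
}

/* Models releasing a strong reference. */
void
model_ref_close(model_interp *interp)
{
    assert(interp->refcount > 0);
    interp->refcount--;
}

/* One polling step of finalization. Returns true when finalization may
   move past the wait; false means strong references are outstanding and
   the interpreter must keep waiting. */
bool
model_finalize_poll(model_interp *interp)
{
    if (interp->refcount > 0) {
        return false;
    }
    interp->refs_closed = true;
    return true;
}
```

Once ``model_finalize_poll`` returns true, every later ``model_ref_get`` call
fails, mirroring the point after which threads can no longer prevent shutdown.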

Strong Interpreter References
*****************************
@@ -583,14 +598,25 @@ existing and new ``PyThreadState`` APIs. Namely:
instead.

All of the ``PyGILState`` APIs are to be removed from the non-limited C API in
Python 3.25. They will remain available in the stable ABI for compatibility.
a future Python version. They will remain available in the stable ABI for
compatibility.

It's worth noting that :c:func:`PyThreadState_Get` and
:c:func:`PyThreadState_GetUnchecked` aren't perfect replacements for
:c:func:`PyGILState_GetThisThreadState`, because
:c:func:`PyGILState_GetThisThreadState` is able to return a thread state even
when it is :term:`detached <attached thread state>`. This PEP intentionally
doesn't leave a perfect replacement for this, because the GIL-state pointer
(which holds the last used thread state by the thread) is only useful for
those implementing :c:func:`PyThreadState_Ensure` or similar. It's not a
common API to want as a user.

Backwards Compatibility
=======================

This PEP specifies a breaking change with the removal of all the
``PyGILState`` APIs from the public headers of the non-limited C API in 10
years (Python 3.25).
``PyGILState`` APIs from the public headers of the non-limited C API in a
future version.

Security Implications
=====================
@@ -676,24 +702,16 @@ held. Any future finalizer that wanted to acquire the lock would be deadlocked!
/* Python interpreter has shut down */
return NULL;
}
/* Temporarily hold a strong reference to ensure that the
lock is released. */
if (PyThreadState_Ensure(ref) < 0) {
PyErr_NoMemory();
PyInterpreterRef_Close(ref);
return NULL;
}

Py_BEGIN_ALLOW_THREADS;
acquire_some_lock();
Py_END_ALLOW_THREADS;

/* Do something while holding the lock.
The interpreter won't finalize during this period. */
// ...

release_some_lock();
PyThreadState_Release();
Py_END_ALLOW_THREADS;
PyInterpreterRef_Close(ref);
Py_RETURN_NONE;
}
@@ -780,8 +798,9 @@ This is the same code, rewritten to use the new functions:
Example: A Daemon Thread
************************

Native daemon threads are still a use-case, and as such,
they can still be used with this API:
With this PEP, daemon threads are very similar to how native threads are used
in the C API today. After calling :c:func:`PyThreadState_Ensure`, simply
release the interpreter reference, allowing the interpreter to shut down.

.. code-block:: c

@@ -1038,21 +1057,15 @@ under that category.
Open Issues
===========

.. _pep-788-deprecation:

When Should the GIL-state APIs be Removed?
------------------------------------------
There are currently no open issues for this PEP.

:c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release` have been around
for over two decades, and it's expected that the migration will be difficult.
Currently, the plan is to remove them in 10 years (opposed to the 5 years
required by :pep:`387`), but this is subject to further discussion, as it's
unclear if that's enough (or too much) time.
Acknowledgements
================

In addition, it's unclear whether to remove them at all. A
:term:`soft deprecation <soft deprecated>` could reasonably fit for these
functions if it's determined that a full ``PyGILState`` removal would
be too disruptive for the ecosystem.
This PEP is based on prior work, feedback, and discussions from many people,
including Victor Stinner, Antoine Pitrou, Da Woods, Sam Gross, Matt Page,
Ronald Oussoren, Matt Wozniski, Eric Snow, Steve Dower, Petr Viktorin,
and Gregory P. Smith.

Copyright
=========