PEP 554: Seasonal Updates. by ericsnowcurrently · Pull Request #944 · python/peps · GitHub

PEP 554: Seasonal Updates. #944

Merged 6 commits on Mar 23, 2019
Fix typos, formatting, and small clarifications.
ericsnowcurrently committed Mar 23, 2019
commit c5271cf8988a4a7d4aed62dba7aeff3de90d9894
112 changes: 63 additions & 49 deletions pep-0554.rst
@@ -15,7 +15,7 @@ Abstract

CPython has supported multiple interpreters in the same process (AKA
"subinterpreters") since version 1.5 (1997). The feature has been
available via the C-API. [c-api]_ Subinterpreters operate in
available via the C-API. [c-api]_ Subinterpreters operate in
`relative isolation from one another <Interpreter Isolation_>`_, which
provides the basis for an
`alternative concurrency model <Concurrency_>`_.
Expand Down Expand Up @@ -152,7 +152,7 @@ For sharing data between interpreters:
| | | receiving end of the channel and wait. |
| | | Associate the interpreter with the channel. |
+---------------------------+-------------------------------------------------+
| .send_nowait(obj) | | Like send(), but Fail if not received. |
| .send_nowait(obj) | | Like send(), but fail if not received. |
+---------------------------+-------------------------------------------------+
| .send_buffer(obj) | | Send the object's (PEP 3118) buffer to the |
| | | receiving end of the channel and wait. |
@@ -494,8 +494,8 @@ each with different goals. Most center on correctness and usability.

One class of concurrency models focuses on isolated threads of
execution that interoperate through some message passing scheme. A
notable example is `Communicating Sequential Processes`_ (CSP), upon
which Go's concurrency is based. The isolation inherent to
notable example is `Communicating Sequential Processes`_ (CSP) (upon
which Go's concurrency is roughly based). The isolation inherent to
subinterpreters makes them well-suited to this approach.

Shared data
@@ -521,9 +521,9 @@ There are a number of valid solutions, several of which may be
appropriate to support in Python. This proposal provides a single basic
solution: "channels". Ultimately, any other solution will look similar
to the proposed one, which will set the precedent. Note that the
implementation of ``Interpreter.run()`` can be done in a way that allows
for multiple solutions to coexist, but doing so is not technically
a part of the proposal here.
implementation of ``Interpreter.run()`` will be done in a way that
allows for multiple solutions to coexist, but doing so is not
technically a part of the proposal here.

Regarding the proposed solution, "channels", it is a basic, opt-in data
sharing mechanism that draws inspiration from pipes, queues, and CSP's
@@ -534,7 +534,8 @@ channels have two operations: send and receive. A key characteristic
of those operations is that channels transmit data derived from Python
objects rather than the objects themselves. When objects are sent,
their data is extracted. When the "object" is received in the other
interpreter, the data is converted back into an object.
interpreter, the data is converted back into an object owned by that
interpreter.
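
The "data, not objects" idea above can be illustrated with a rough sketch.
It is not the PEP's mechanism: ``pickle`` is used here only as a stand-in
for whatever extraction the runtime performs, ``extract()`` and
``rebuild()`` are hypothetical names, and a list is used purely to make the
independence of the two objects visible (the PEP initially limits channels
to a few simple immutable types).

```python
import pickle


def extract(obj):
    """Reduce an object to raw data (pickle is only a stand-in here)."""
    return pickle.dumps(obj)


def rebuild(data):
    """Build a fresh object from the data on the receiving side."""
    return pickle.loads(data)


original = [1, 2, 3]
received = rebuild(extract(original))

# The receiver gets an equal but distinct object, so mutating it
# has no effect on the sender's object.
received.append(4)
```

Because only the data crosses the boundary, neither side ever holds a
reference to an object owned by the other.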

To make this work, the mutable shared state will be managed by the
Python runtime, not by any of the interpreters. Initially we will
@@ -589,11 +590,11 @@ Finally, some potential isolation is missing due to the current design
of CPython. Work is currently under way to address gaps in this
area:

* interpreters share the GIL
* interpreters share memory management (e.g. allocators, gc)
* GC is not run per-interpreter [global-gc]_
* at-exit handlers are not run per-interpreter [global-atexit]_
* extensions using the ``PyGILState_*`` API are incompatible [gilstate]_
* interpreters share memory management (e.g. allocators, gc)
* interpreters share the GIL

Existing Usage
--------------
@@ -683,7 +684,7 @@ The module also provides the following class:
"channels" keyword argument is provided (and is a mapping of
attribute names to channels) then it is added to the interpreter's
execution namespace (the interpreter's "__main__" module). If any
of the values are not are not RecvChannel or SendChannel instances
of the values are not RecvChannel or SendChannel instances
then ValueError gets raised.

This may not be called on an already running interpreter. Doing
@@ -763,21 +764,25 @@ whether an object is shareable or not:
a cross-interpreter way, whether via a proxy, a copy, or some other
means.

This proposal provides two ways to do share such objects between
This proposal provides two ways to share such objects between
interpreters.

First, shareable objects may be passed to ``run()`` as keyword arguments,
where they are effectively injected into the target interpreter's
``__main__`` module. This is mainly intended for sharing meta-objects
(e.g. channels) between interpreters, as it is less useful to pass other
objects (like ``bytes``) to ``run``.
First, channels may be passed to ``run()`` via the ``channels``
keyword argument, where they are effectively injected into the target
interpreter's ``__main__`` module. While passing arbitrary shareable
objects this way is possible, doing so is mainly intended for sharing
meta-objects (e.g. channels) between interpreters. It is less useful
to pass other objects (like ``bytes``) to ``run`` directly.

Second, the main mechanism for sharing objects (i.e. their data) between
interpreters is through channels. A channel is a simplex FIFO similar
to a pipe. The main difference is that channels can be associated with
zero or more interpreters on either end. Unlike queues, which are also
many-to-many, channels have no buffer.
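
To make "no buffer" concrete, here is a minimal single-process sketch that
uses threads in place of interpreters. ``UnbufferedChannel`` is a
hypothetical stand-in, not the proposed API: it only shows that ``send()``
does not return until some receiver has taken the object.

```python
import threading


class UnbufferedChannel:
    """Hypothetical thread-based stand-in for an unbuffered channel."""

    def __init__(self):
        self._send_lock = threading.Lock()         # one hand-off at a time
        self._item_ready = threading.Semaphore(0)  # signals a waiting item
        self._item_taken = threading.Semaphore(0)  # signals a completed recv
        self._item = None

    def send(self, obj):
        with self._send_lock:
            self._item = obj
            self._item_ready.release()   # wake exactly one receiver
            self._item_taken.acquire()   # block until the object is received

    def recv(self):
        self._item_ready.acquire()       # block until a sender hands off
        obj = self._item
        self._item_taken.release()       # let the sender continue
        return obj


channel = UnbufferedChannel()
results = []
receiver = threading.Thread(target=lambda: results.append(channel.recv()))
receiver.start()
channel.send(b"spam")   # returns only after the receiver has the object
receiver.join()
```

Any number of threads may call ``send()`` or ``recv()`` on the same object,
which mirrors the many-to-many association described above, but at most
one object is in flight at any moment.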

The ``interpreters`` module provides the following functions and
classes related to channels:

``create_channel()``::

Create a new channel and return (recv, send), the RecvChannel and
@@ -802,24 +807,25 @@ many-to-many, channels have no buffer.
``RecvChannel(id)``::

The receiving end of a channel. An interpreter may use this to
receive objects from another interpreter. At first only bytes will
be supported.
receive objects from another interpreter. At first only a few of
the simple, immutable builtin types will be supported.

id:

The channel's unique ID.
The channel's unique ID. This is shared with the "send" end.

interpreters:

The list of associated interpreters: those that have called
the "recv()" or "__next__()" methods and haven't called
"release()" (and the channel hasn't been explicitly closed).
the "recv()" method and haven't called "release()" (and the
channel hasn't been explicitly closed).

recv():

Return the next object (i.e. the data from the sent object) from
the channel. If none have been sent then wait until the next
send. This associates the current interpreter with the channel.
send. This associates the current interpreter with the "recv"
end of the channel.

If the channel is already closed then raise ChannelClosedError.
If the channel isn't closed but the current interpreter already
@@ -848,7 +854,7 @@ many-to-many, channels have no buffer.
to 0, the channel is actually marked as closed. The Python
runtime will garbage collect all closed channels, though perhaps
not immediately. Note that "release()" is automatically called
in behalf of the current interpreter when the channel is no longer
on behalf of the current interpreter when the channel is no longer
used (i.e. has no references) in that interpreter.

This operation is idempotent. Return True if "release()" has not
@@ -857,21 +863,21 @@ many-to-many, channels have no buffer.
close(force=False):

Close both ends of the channel (in all interpreters). This means
that any further use of the channel raises ChannelClosedError. If
the channel is not empty then raise ChannelNotEmptyError (if
"force" is False) or discard the remaining objects (if "force"
is True) and close it.
that any further use of the channel anywhere raises
ChannelClosedError. If the channel is not empty then raise
ChannelNotEmptyError (if "force" is False) or discard the
remaining objects (if "force" is True) and close it.
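
The ``close(force=False)`` behavior described above can be sketched as
follows. This is only an illustration under stated assumptions:
``ClosableChannel`` is a hypothetical class, it holds pending objects in a
``deque`` for simplicity (the proposed channels are unbuffered), and the
exception classes are defined locally rather than imported from any real
module.

```python
from collections import deque


class ChannelClosedError(Exception):
    """Local stand-in for the error raised on use of a closed channel."""


class ChannelNotEmptyError(Exception):
    """Local stand-in for the error raised by a non-forced close."""


class ClosableChannel:
    """Hypothetical stand-in showing only the close() semantics."""

    def __init__(self):
        self._items = deque()
        self._closed = False

    def send(self, obj):
        if self._closed:
            raise ChannelClosedError("channel is closed")
        self._items.append(obj)

    def recv(self):
        if self._closed:
            raise ChannelClosedError("channel is closed")
        return self._items.popleft()

    def close(self, force=False):
        if self._items and not force:
            raise ChannelNotEmptyError("objects are still pending")
        self._items.clear()   # discard anything left (only when forced)
        self._closed = True


channel = ClosableChannel()
channel.send(b"spam")

try:
    channel.close()               # not empty and not forced
except ChannelNotEmptyError:
    first_close = "refused"
else:
    first_close = "closed"

channel.close(force=True)         # discards b"spam" and closes

try:
    channel.send(b"eggs")         # any further use fails
except ChannelClosedError:
    later_use = "rejected"
else:
    later_use = "accepted"
```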


``SendChannel(id)``::

The sending end of a channel. An interpreter may use this to send
objects to another interpreter. At first only bytes will be
supported.
objects to another interpreter. At first only a few of
the simple, immutable builtin types will be supported.

id:

The channel's unique ID.
The channel's unique ID. This is shared with the "recv" end.

interpreters:

@@ -882,8 +888,9 @@ many-to-many, channels have no buffer.

Send the object (i.e. its data) to the receiving end of the
channel. Wait until the object is received. If the
object is not shareable then ValueError is raised. Currently
only bytes are supported.
object is not shareable then ValueError is raised. This
associates the current interpreter with the "send" end of the
channel.

If the channel is already closed then raise ChannelClosedError.
If the channel isn't closed but the current interpreter already
@@ -892,9 +899,10 @@ many-to-many, channels have no buffer.

send_nowait(obj):

Send the object to the receiving end of the channel. If the other
end is not currently receiving then raise NotReceivedError.
Otherwise this is the same as "send()".
Send the object to the receiving end of the channel. If no
interpreter is currently receiving (waiting on the other end)
then raise NotReceivedError. Otherwise this is the same as
"send()".

send_buffer(obj):

@@ -918,9 +926,9 @@ many-to-many, channels have no buffer.
Close both ends of the channel (in all interpreters). No matter
what, the "send" end of the channel is closed immediately. If the
channel is empty then close the "recv" end immediately too.
Otherwise wait until the channel is empty before closing it (if
"force" is False) or discard the remaining items and close
immediately (if "force" is True).
Otherwise, if "force" is False, close the "recv" end (and hence
the full channel) once the channel becomes empty; or, if "force"
is True, discard the remaining items and close immediately.

Note that ``send_buffer()`` is similar to how
``multiprocessing.Connection`` works. [mp-conn]_
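
For comparison, this is what the ``multiprocessing.Connection`` behavior
referred to above looks like in practice. The example uses only the
existing standard-library API (``Pipe()``, ``send_bytes()``,
``recv_bytes()``) and runs in a single process to stay short; it says
nothing about how ``send_buffer()`` itself would be implemented.

```python
import multiprocessing

# Pipe(duplex=False) returns (receiving end, sending end).
recv_conn, send_conn = multiprocessing.Pipe(duplex=False)

payload = bytearray(b"spam and eggs")   # any bytes-like object works
send_conn.send_bytes(payload)           # sends the raw bytes as one message
received = recv_conn.recv_bytes()       # arrives as a new bytes object

send_conn.close()
recv_conn.close()
```

Unlike the proposed channels, a pipe is buffered by the operating system,
which is why the send above completes before anything has been received.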
@@ -937,6 +945,7 @@ Open Questions
Open Implementation Questions
=============================

.. XXX
Does every interpreter think that their thread is the "main" thread?
--------------------------------------------------------------------

@@ -949,6 +958,7 @@ or not. This presents a problem in cases where "main thread" is meant
to imply "main thread in the main interpreter" [main-thread]_, where
the main interpreter is the initial one.

.. XXX
Disallow subinterpreters in the main thread?
--------------------------------------------

@@ -1048,10 +1058,11 @@ Syntactic Support

The ``Go`` language provides a concurrency model based on CSP, so
it's similar to the concurrency model that subinterpreters support.
``Go`` provides syntactic support, as well several builtin concurrency
primitives, to make concurrency a first-class feature. Conceivably,
similar syntactic (and builtin) support could be added to Python using
subinterpreters. However, that is *way* outside the scope of this PEP!
However, ``Go`` also provides syntactic support, as well as several builtin
concurrency primitives, to make concurrency a first-class feature.
Conceivably, similar syntactic (and builtin) support could be added to
Python using subinterpreters. However, that is *way* outside the scope
of this PEP!

Multiprocessing
---------------
@@ -1072,26 +1083,29 @@ raise an ImportError if unsupported.

Alternately we could support opting in to subinterpreter support.
However, that would probably exclude many more modules (unnecessarily)
than the opt-out approach.
than the opt-out approach. Also, note that PEP 489 defined that an
extension's use of the PEP's machinery implies support for
subinterpreters.

The scope of adding the ModuleDef slot and fixing up the import
machinery is non-trivial, but could be worth it. It all depends on
how many extension modules break under subinterpreters. Given the
relatively few cases we know of through mod_wsgi, we can leave this
for later.
how many extension modules break under subinterpreters. Given that
there are relatively few cases we know of through mod_wsgi, we can
leave this for later.

Poisoning channels
------------------

CSP has the concept of poisoning a channel. Once a channel has been
poisoned, and ``send()`` or ``recv()`` call on it will raise a special
poisoned, any ``send()`` or ``recv()`` call on it would raise a special
exception, effectively ending execution in the interpreter that tried
to use the poisoned channel.

This could be accomplished by adding a ``poison()`` method to both ends
of the channel. The ``close()`` method can be used in this way
(mostly), but these semantics are relatively specialized and can wait.
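
A rough sketch of what poisoning might look like, assuming a
``poison()`` method along the lines suggested above.
``PoisonableChannel`` and ``ChannelPoisonedError`` are hypothetical names,
and the sketch uses a buffered ``queue.Queue`` for brevity rather than the
unbuffered semantics of the proposed channels.

```python
import queue


class ChannelPoisonedError(Exception):
    """Hypothetical exception raised when a poisoned channel is used."""


class PoisonableChannel:
    """Hypothetical stand-in; not part of the proposed API."""

    def __init__(self):
        self._items = queue.Queue()
        self._poisoned = False

    def poison(self):
        self._poisoned = True

    def send(self, obj):
        if self._poisoned:
            raise ChannelPoisonedError("channel is poisoned")
        self._items.put(obj)

    def recv(self):
        if self._poisoned:
            raise ChannelPoisonedError("channel is poisoned")
        return self._items.get()


channel = PoisonableChannel()
channel.send(b"spam")
first = channel.recv()

channel.poison()
try:
    channel.send(b"eggs")     # any use after poisoning raises
except ChannelPoisonedError:
    outcome = "poisoned"
else:
    outcome = "sent"
```

If the exception is left unhandled, it ends execution in whichever
interpreter touched the channel, which is the effect described above.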

.. XXX
Sending channels over channels
------------------------------

@@ -1161,7 +1175,7 @@ Per Antoine Pitrou [async]_::
on (probably a file descriptor?).

A possible solution is to provide async implementations of the blocking
channel methods (``__next__()``, ``recv()``, and ``send()``). However,
channel methods (``recv()`` and ``send()``). However,
the basic functionality of subinterpreters does not depend on async and
can be added later.
