PEP 544: Protocols by ilevkivskyi · Pull Request #224 · python/peps · GitHub

PEP 544: Protocols #224

Merged: 40 commits, Mar 18, 2017

Commits
3fa3e48
Collect various ideas
ilevkivskyi Mar 5, 2017
51b0ca0
Some formatting and reordering
ilevkivskyi Mar 5, 2017
993f7b3
Add some examples
ilevkivskyi Mar 5, 2017
e384336
Add planned link templates
ilevkivskyi Mar 6, 2017
7ea5d41
Add links + minor changes
Mar 6, 2017
cdcf62f
Polishing rationale
Mar 6, 2017
1ffed9b
Some more reshuffling and formatting
Mar 6, 2017
72ceae6
Add more examples
Mar 6, 2017
6bea2e8
Add more examples to existing approaches
Mar 6, 2017
d5972c3
Typos, reordering, and few more details (backport)
ilevkivskyi Mar 6, 2017
57d375f
Update list of protocols in typing
ilevkivskyi Mar 6, 2017
9d4d685
Defining protocols plus minor changes and formatting
ilevkivskyi Mar 7, 2017
82258d5
Explicitly declaring implementation and other changes
ilevkivskyi Mar 7, 2017
5d9fb7c
More polishing
ilevkivskyi Mar 7, 2017
a6e6d9e
Edit rejected/postponed ideas
ilevkivskyi Mar 7, 2017
3175013
Runtime things, reorder links
ilevkivskyi Mar 7, 2017
cbff669
Runtime decorator
ilevkivskyi Mar 7, 2017
dfccd06
Backward compatible part and last bits
Mar 8, 2017
60f4d52
Some clarifications
ilevkivskyi Mar 9, 2017
60e7f7f
Add links in text
ilevkivskyi Mar 9, 2017
c90aa1c
Caption style, add cross-refs
Mar 9, 2017
b008de1
Remove redundant links; + minor changes
ilevkivskyi Mar 10, 2017
02cca5c
One more tiny change
ilevkivskyi Mar 10, 2017
7d89b6b
Merge remote-tracking branch 'upstream/master' into protocols
ilevkivskyi Mar 10, 2017
0f3732a
Copyediting changes
JelleZijlstra Mar 10, 2017
95fbf58
Merge pull request #1 from JelleZijlstra/patch-2
ilevkivskyi Mar 10, 2017
cb65bff
Rename PEP with a valid number to get the build running
ilevkivskyi Mar 10, 2017
817bf2f
Reflow to 79 characters
ilevkivskyi Mar 10, 2017
2d89ba9
fix typo
JelleZijlstra Mar 10, 2017
0efcbff
Some grammar tweaks
brettcannon Mar 10, 2017
ebd4b17
Merge pull request #3 from brettcannon/patch-1
ilevkivskyi Mar 10, 2017
0de36be
Implement Guido's idea of EIBTI plus minor comments
ilevkivskyi Mar 11, 2017
767c58b
Fix typo
ilevkivskyi Mar 11, 2017
efc3154
Make implementation enforcement optional; fix order of Protocolbase
ilevkivskyi Mar 12, 2017
7d714c3
Add missing @abstractmethod decorators
ilevkivskyi Mar 13, 2017
d4ab050
Minor clarification
ilevkivskyi Mar 13, 2017
d9d21c2
Implement Jukka's and David's comments; few more minor things
ilevkivskyi Mar 16, 2017
4dfbfb2
Implement most comments by Łukasz; few more to do
ilevkivskyi Mar 17, 2017
d51420e
More changes in response to comments
Mar 18, 2017
f6240c8
Remove one reamining 'All'
ilevkivskyi Mar 18, 2017

Implement most comments by Łukasz; few more to do
ilevkivskyi committed Mar 17, 2017
commit 4dfbfb202cabd6058f00e55ecb7dfa01d7b73faf
103 changes: 51 additions & 52 deletions pep-0544.txt
@@ -43,7 +43,7 @@ this conforms to PEP 484::
The same problem appears with user-defined ABCs: they must be explicitly
subclassed or registered. This is particularly difficult to do with library
types as the type objects may be hidden deep in the implementation
-of the library. Moreover, extensive use of ABCs might impose additional
+of the library. Also, extensive use of ABCs might impose additional
runtime costs.
Contributor

> Moreover, extensive use of ABCs might impose additional runtime costs.

Do we have any numbers? We know ABCs aren't free but I don't know if this is worth mentioning unless it's a major factor. The rationale for protocols to be the pythonic, dynamic and idiomatic version of ABCs is strong enough.

Member Author

class C(collections.abc.Iterable):
    def __iter__(self):
        return []

is 2x slower than

class C:
    def __iter__(self):
        return []

The numbers for typing.Iterable are even worse, but I have some ideas on how to improve those. The 2x slowdown is still not much, so I use "extensive" and "might". If you think it is not necessary we could remove this altogether.
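
The comment does not say which operation was timed. A minimal `timeit` sketch for getting comparable numbers might look like the following. It assumes that instantiation and class creation are the operations of interest, the names `WithABC`, `Plain`, and `measure` are hypothetical, and `__iter__` returns `iter([])` so that it yields a real iterator. No ratio is claimed here, since results depend on the interpreter version.

```python
import collections.abc
import timeit


class WithABC(collections.abc.Iterable):
    def __iter__(self):
        return iter([])


class Plain:
    def __iter__(self):
        return iter([])


def make_with_abc():
    # Class creation goes through ABCMeta, which does extra bookkeeping.
    class C(collections.abc.Iterable):
        def __iter__(self):
            return iter([])
    return C


def make_plain():
    class C:
        def __iter__(self):
            return iter([])
    return C


def measure(number=2000):
    """Return total seconds for `number` repetitions of each operation."""
    return {
        "instantiate_abc": timeit.timeit(WithABC, number=number),
        "instantiate_plain": timeit.timeit(Plain, number=number),
        "create_abc": timeit.timeit(make_with_abc, number=number),
        "create_plain": timeit.timeit(make_plain, number=number),
    }


if __name__ == "__main__":
    for name, seconds in measure().items():
        print(f"{name}: {seconds:.6f}s")
```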


The intention of this PEP is to solve all these problems
@@ -62,6 +62,9 @@ using structural [wiki-structural]_ subtyping::
def collect(items: Iterable[int]) -> int: ...
result: int = collect(Bucket()) # Passes type check

+Note that ABCs in ``typing`` module already provide structural behavior
+at runtime, ``isinstance(Bucket(), Iterable)`` returns ``True``.
+The main goal of this proposal is to support such behavior statically.
The same functionality will be provided for user-defined protocols, as
specified below. The above code with a protocol class matches common Python
conventions much better. It is also automatically extensible and works
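
The runtime behavior mentioned in the added lines can be checked directly. The sketch below is only an illustration: the body of `Bucket` is an assumption (the hunk does not show it), and `NotIterable` is a hypothetical counterexample.

```python
from typing import Iterable, Iterator, List


class Bucket:
    """Never mentions Iterable; it only defines the right methods."""

    def __init__(self) -> None:
        self._items: List[int] = [1, 2, 3]  # assumed contents

    def __len__(self) -> int:
        return len(self._items)

    def __iter__(self) -> Iterator[int]:
        return iter(self._items)


class NotIterable:
    """Hypothetical class with no __iter__ at all."""


def collect(items: Iterable[int]) -> int:
    return sum(items)


# The runtime check is structural: it looks for __iter__, not for a base class.
bucket_is_iterable = isinstance(Bucket(), Iterable)
other_is_iterable = isinstance(NotIterable(), Iterable)
result = collect(Bucket())
```

What the proposal adds is the static side: a type checker accepting `collect(Bucket())` without `Bucket` inheriting from `Iterable`.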
@@ -107,7 +110,7 @@ Existing Approaches to Structural Subtyping
Before describing the actual specification, we review and comment on existing
approaches related to structural subtyping in Python and other languages:

-* Zope interfaces [zope-interfaces]_ was one of the first widely used
+* ``zope.interface`` [zope-interfaces]_ was one of the first widely used
approaches to structural subtyping in Python. It is implemented by providing
special classes to distinguish interface classes from normal classes,
to mark interface attributes, and to explicitly declare implementation.
@@ -188,10 +191,10 @@ approaches related to structural subtyping in Python and other languages:
assert isinstance(MyIterable(), Iterable)

Such behavior seems to be a perfect fit for both runtime and static behavior
-of protocols. The main goal of this proposal is to support such behavior
-statically. In addition, to allow users achieving such runtime behavior
-for user defined protocols a special ``@runtime`` decorator will be
-provided, see detailed `discussion`_ below.
+of protocols. As discussed in `rationale`_, we propose to add static support
+for such behavior. In addition, to allow users to achieve such runtime
+behavior for *user defined* protocols a special ``@runtime`` decorator will
+be provided, see detailed `discussion`_ below.

* TypeScript [typescript]_ provides support for user defined classes and
interfaces. Explicit implementation declaration is not required and
@@ -339,8 +342,8 @@ Static methods, class methods, and properties are equally allowed
in protocols.

To define a protocol variable, one must use PEP 526 variable
-annotations in the class body. Attributes defined in the body of a method
-by assignment via ``self`` are not allowed. The rationale
+annotations in the class body. Additional attributes *only* defined in
+the body of a method by assignment via ``self`` are not allowed. The rationale
for this is that the protocol class implementation is often not shared by
subtypes, so the interface should not depend on the default implementation.
Member

Well in some sense that same rationale would apply to default (i.e. non-abstract) implementations.

A defensible position could be that as long as a type checker can verify that an implementation that doesn't explicitly inherit from the protocol class still defines the variable, there's no great reason to disallow variables, since they could just be interpreted as a shorthand for a setter and a getter method.

Also note that mypy (to take one example) doesn't actually check whether an attribute is always set -- it only checks that if it is used it has the right type.

All this gets pretty messy though and I am leaning towards not allowing protocol variables at all, other than read-only properties (possibly abstract). I don't recall having seen real-world examples of them (but if there are examples that might sway me).

Member Author

Maybe this point needs some clarifications. First, I don't see why we need to force people to write:

class Point2D(Protocol):
    @property
    def x(self) -> int:
        ...
    @property
    def y(self) -> int:
        ...

instead of just

class Point2D(Protocol):
    x: int
    y: int

I think almost every function parameter annotated with a named tuple or a similar struct-like class could be replaced with a protocol annotation. (Also, this will be a natural counterpart of TypedDict.) Second, I don't think we need to specify how "smart" type checkers should be. Maybe we could only mention that it is reasonable to expect that a type checker will recognize

class Coordinates:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

as implicitly implementing Point2D.
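
Put together, the two snippets in this comment might be exercised as in the sketch below. It assumes the `typing.Protocol` class that later shipped in Python 3.8 matches the proposal closely enough for this purpose, and `manhattan_norm` is a hypothetical consumer added for illustration.

```python
from typing import Protocol


class Point2D(Protocol):
    # Protocol variables declared with PEP 526 annotations.
    x: int
    y: int


class Coordinates:
    # No inheritance from Point2D; the attributes are set in __init__.
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y


def manhattan_norm(point: Point2D) -> int:
    """Hypothetical function that only needs the 'x' and 'y' attributes."""
    return abs(point.x) + abs(point.y)


distance = manhattan_norm(Coordinates(3, -4))
```

Whether a given type checker accepts the call depends on it recognizing `self.x` and `self.y` as attribute declarations, which is the point under discussion.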

Contributor

> All this gets pretty messy though and I am leaning towards not allowing protocol variables at all, other than read-only properties

Disallowing protocol variables entirely significantly limits the scope of structural typing. There are many examples in the stdlib alone (like the stdout.buffer field, named tuples and enums used structurally, etc.).

Only allowing read-only properties is less limiting but still so. More importantly, it is confusing to the user, like Ivan's example shows above:

  • if we specify Point2D as a read-only property, should the type checker reject a writable implementation?
  • if not, it feels strange to use a property to define something that is just an attribute in the implementation.

I can imagine users of this rewriting their attributes to be properties just to satisfy the protocol specification.

Member

I'm not adamantly against protocol variables, but I think we should at least settle what to do about read-only attributes. When you use @property (not using the @f.setter syntax) or use PEP 526 syntax inside a NamedTuple you get a read-only attribute; otherwise PEP 526 syntax gets you a writable attribute. Which should be the default for Protocols, and how should you be able to declare the other kind?

Note that this is not about immutability -- that's in the programmer's head (or perhaps guided by whether __hash__ is defined). This is purely about whether attribute assignment should be allowed.
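
The three cases named in this comment behave as follows at runtime. This is a small illustration only; the class names and the `can_assign` helper are hypothetical.

```python
from typing import NamedTuple


class WithProperty:
    def __init__(self, x: int) -> None:
        self._x = x

    @property
    def x(self) -> int:  # no @x.setter is defined
        return self._x


class PointTuple(NamedTuple):
    x: int  # PEP 526 syntax inside a NamedTuple


class PlainClass:
    x: int  # PEP 526 syntax in an ordinary class

    def __init__(self, x: int) -> None:
        self.x = x


def can_assign(obj: object) -> bool:
    """Report whether 'obj.x = 0' is allowed at runtime."""
    try:
        setattr(obj, "x", 0)
        return True
    except AttributeError:
        return False


property_writable = can_assign(WithProperty(1))
namedtuple_writable = can_assign(PointTuple(1))
plain_writable = can_assign(PlainClass(1))
```

Only the plain class accepts assignment, so a protocol declaring `x: int` has to pick one of these two behaviors as its default.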

Examples::
@@ -403,7 +406,7 @@ subtyping -- the semantics of inheritance is not changed. Examples::
represent(nice) # OK
represent(another) # Also OK

-Note that there are no conceptual difference between explicit an implicit
+Note that there is no conceptual difference between explicit and implicit
Member

Well, hm, there seem to be plenty of subtle differences, e.g. the requirement to implement everything only in an implicit subclass.

Member Author

> e.g. the requirement to implement everything only in an implicit subclass

This is exactly what I mean on the next line.

Member

Still, starting with "there is no conceptual difference..." seems too strong.
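
The difference being debated can be seen in a small sketch. It assumes the runtime API that later shipped in Python 3.8, where the draft's `@runtime` decorator is spelled `typing.runtime_checkable`; `Greeter`, `Explicit`, `Implicit`, and `Partial` are hypothetical names.

```python
from abc import abstractmethod
from typing import Protocol, runtime_checkable


@runtime_checkable
class Greeter(Protocol):
    @abstractmethod
    def name(self) -> str:
        raise NotImplementedError

    def greet(self) -> str:  # default implementation
        return "Hello, " + self.name()


class Explicit(Greeter):
    # Explicit subclass: inherits greet() "for free".
    def name(self) -> str:
        return "explicit"


class Implicit:
    # Implicit subtype: must define every protocol member itself.
    def name(self) -> str:
        return "implicit"

    def greet(self) -> str:
        return "Hi, " + self.name()


class Partial:
    # Defines only one member, so it is not structurally compatible.
    def name(self) -> str:
        return "partial"


explicit_greeting = Explicit().greet()
implicit_ok = isinstance(Implicit(), Greeter)
partial_ok = isinstance(Partial(), Greeter)
```

Both `Explicit` and `Implicit` are acceptable wherever a `Greeter` is expected; the difference is only in how much each class has to write.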

subtypes, the main benefit of explicit subclassing is to get some protocol
methods "for free". In addition, type checkers can statically verify that
the class actually implements the protocol correctly::
@@ -421,28 +424,26 @@ the class actually implements the protocol correctly::

# Type checker might warn that 'intensity' is not defined

-The general philosophy is that protocols are mostly like regular ABCs,
-but a static type checker will handle them specially. Subclassing a protocol
-class would not turn the subclass into a protocol unless it also has
-``typing.Protocol`` as an explicit base class. Without this base, the class
-is "downgraded" to a regular ABC that cannot be used with structural
-subtyping. See section on `extending`_ for details of defining subprotocols.

A class can explicitly inherit from multiple protocols and also form normal
classes. In this case methods are resolved using normal MRO and a type checker
verifies that all subtyping are correct. The semantics of ``@abstractmethod``
is not changed, all of them must be implemented by an explicit subclass
before it could be instantiated.
Member

could -> can



.. _extending :

Merging and extending protocols
-------------------------------

-Subprotocols are also supported. A subprotocol can be defined
-by having both one or more protocols as immediate base classes and also
-having ``typing.Protocol`` as an immediate base class::
+The general philosophy is that protocols are mostly like regular ABCs,
+but a static type checker will handle them specially. Subclassing a protocol
Member

(At runtime maybe we could say that they behave like an implicit register() call exists?)

+class would not turn the subclass into a protocol unless it also has
+``typing.Protocol`` as an explicit base class. Without this base, the class
+is "downgraded" to a regular ABC that cannot be used with structural
+subtyping.
Member

I think we need a rationale for this rule. I think there probably is one, but it needs to be stated clearly, and I'm not sure what it is -- perhaps that we don't want to accidentally have some class act as a protocol just because one of its base classes happens to be one. That is, we still (slightly) prefer nominal subtyping over structural subtyping (in the static typing world).
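
The rule under discussion has a visible runtime counterpart in the `typing.Protocol` implementation that later shipped in Python 3.8, which is assumed here. Protocol classes cannot be instantiated, while a class that subclasses a protocol without listing `Protocol` as a base is an ordinary class. The names below are hypothetical.

```python
from typing import Protocol


class Closable(Protocol):
    def close(self) -> None: ...


class Resource(Closable, Protocol):
    # Lists Protocol explicitly, so this is a subprotocol.
    def open(self) -> None: ...


class FileHandle(Closable):
    # No Protocol base: a regular class that implements Closable.
    def close(self) -> None:
        self.closed = True


def try_instantiate(cls: type) -> bool:
    """Return True if calling cls() succeeds, False on TypeError."""
    try:
        cls()
        return True
    except TypeError:
        return False


closable_instantiable = try_instantiate(Closable)
resource_instantiable = try_instantiate(Resource)
filehandle_instantiable = try_instantiate(FileHandle)
```

This shows only the runtime side. Whether a type checker treats `FileHandle` as a target for structural subtyping is a static question that the sketch cannot demonstrate.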


+A subprotocol can be defined by having *both* one or more protocols as
+immediate base classes and also having ``typing.Protocol`` as an immediate
+base class::

from typing import Sized, Protocol

@@ -499,7 +500,7 @@ protocols will be useful for representing self-referential data structures
like trees in an abstract fashion::

 class Traversable(Protocol):
-    leafs: Iterable['Traversable']
+    leaves: Iterable['Traversable']


Using Protocols
@@ -516,13 +517,13 @@ relationships are subject to the following rules:
* A concrete type or a protocol ``X`` is a subtype of another protocol ``P``
if and only if ``X`` implements all protocol members of ``P``. In other
words, subtyping with respect to a protocol is always structural.
Member

If X is a protocol itself, "implements" gets a different meaning, right? I think in that case it just means that it must define all of P's methods right? Maybe it would be less ambiguous if you separated the two cases into separate bullets.

Member Author

> Maybe it would be less ambiguous if you separated the two cases into separate bullets.

I agree.

-* Edge case: for recursive protocols, structural subtyping is decided
-positively for situations where such decision depends on itself. Continuing
-the previous example::
+* Edge case: for recursive protocols, a class is considered a subtype of
+the protocol in situations where such decision depends on itself.
+Continuing the previous example::

 class Tree(Generic[T]):
     def __init__(self, value: T,
-                 leafs: 'List[Tree[T]]') -> None:
+                 leaves: 'List[Tree[T]]') -> None:
         self.value = value
         self.leafs = leafs

@@ -539,8 +540,8 @@ using structural compatibility instead of compatibility defined by
inheritance relationships.


-``Union[]`` and ``All[]``
--------------------------
+Unions and intersections of protocols
+-------------------------------------

``Union`` of protocol classes behaves the same way as for non-protocol
classes. For example::
@@ -562,28 +563,28 @@ classes. For example::
return 0
finish(GoodJob()) # OK

-In addition, we propose to add another special type construct
-``All`` that represents intersection types. Although for normal types
-it is not very useful, there are many situations where a variable should
-implement more than one protocol. Annotation by ``All[Proto1, Proto2, ...]``
-means that a given variable or parameter must implement all protocols
-``Proto1``, ``Proto2``, etc. either implicitly or explicitly. Example::
+One can use multiple inheritance to define an intersection of protocols.
+Example::

-from typing import Sequence, Hashable, All
+from typing import Sequence, Hashable
+
+class HashableFloats(Sequence[float], Hashable, Protocol):
+    pass

-def cached_func(args: All[Sequence[float], Hashable]) -> float:
+def cached_func(args: HashableFloats) -> float:
     ...
-cached_func((1, 2, 3)) # OK, tuple is hashable and sequence
+cached_func((1, 2, 3)) # OK, tuple is both hashable and sequence

-The interaction between union and intersection types is specified by PEP 483,
-and basically reflects the corresponding interactions for sets.
+If the this will prove to be a widely used scenario. A special ``Intersection``
+type may be added in future as specified by PEP 483.
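
The multiple-inheritance approach to intersections can be sketched with user-defined protocols. The example assumes the `typing.Protocol` and `typing.runtime_checkable` API that later shipped in Python 3.8. It uses hypothetical protocols (`SupportsClose`, `SupportsFlush`) instead of `Sequence[float]` and `Hashable`, so that it does not depend on which `typing` classes are treated as protocols at runtime.

```python
from typing import List, Protocol, runtime_checkable


class SupportsClose(Protocol):
    def close(self) -> None: ...


class SupportsFlush(Protocol):
    def flush(self) -> None: ...


@runtime_checkable
class ClosableFlushable(SupportsClose, SupportsFlush, Protocol):
    """Intersection: requires the members of both base protocols."""


class Stream:
    def __init__(self) -> None:
        self.log: List[str] = []

    def close(self) -> None:
        self.log.append("close")

    def flush(self) -> None:
        self.log.append("flush")


class OnlyClose:
    def close(self) -> None:
        pass


def shutdown(stream: ClosableFlushable) -> None:
    stream.flush()
    stream.close()


stream = Stream()
shutdown(stream)
```

A class that provides only one of the two methods, such as `OnlyClose`, does not satisfy the combined protocol.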


``Type[]`` with protocols
-------------------------

Variables and parameters annotated with ``Type[Proto]`` accept only concrete
Contributor

How is Type[C] defined to work with a non-protocol ABCs? It seems that they should both work the same.

Member Author

I think it could work like this for all ABCs. But it does not work like this in mypy now, and it is not specified in PEP 484. I feel like this should be specified here rather than in PEP 484.

By the way, there is an old high-priority issue for this python/mypy#1843 and my PR python/mypy#2853 that didn't receive a review for a month :-)

Contributor

I agree this PEP is the right place to finish specifying it. I find the existing explanation missing the "why". It is neatly demonstrated in python/mypy#1843:

  • we don't care for the additional error around passing a Protocol or an ABC
  • we do care that treating Type[P] as any concrete subtype of P silences errors that signify invalid behavior only if ABCs or Protocols are passed

Member

Hm, I'm not sure I agree that this should be specified in this PEP for cases other than protocols (though it's fine to discuss it in this thread). I'm also not sure about the sense of @ambv's second bullet, since "we care" is somehow ambiguous in my mind (do we find it a significant mistake or a significant feature?).

IIRC it's pretty subtle to prohibit passing an abstract class object (or a protocol object) to a parameter that requires something annotated with Type[], since the thing you are passing might itself have such a type (e.g. cls in a class method).

Member Author

@gvanrossum

> Hm, I'm not sure I agree that this should be specified in this PEP for cases other than protocols.

As I understand, there are only two options:

  • this behavior only applies to protocols
  • it applies to all ABCs (including protocols)

I think the simplest solution is to say here that this behavior applies to protocols, but don't say that it doesn't apply to ABCs. Then, when we decide on ABCs, we could update PEP 484 independently.

> IIRC it's pretty subtle to prohibit passing an abstract class object (or a protocol object) to a parameter that requires something annotated with Type[], since the thing you are passing might itself have such a type (e.g. cls in a class method).

My PR, that I mentioned above, actually takes care of this and a few other edge cases, but it has never been reviewed...

Member

I'm still torn about what we should actually specify. I could imagine a function that takes a protocol class and an object and checks whether the object is an instance of that protocol class using isinstance(). In this case, if the protocol class supports isinstance() (using @runtime) I think there's no harm in it. And for ABCs (where isinstance() is always well-defined) there's definitely no harm in it. OTOH many other things are harmful, e.g. calling abstract class methods or instantiation. But my intention in python/mypy#1843 was to shove this under the rug, at least until we have an official notation to differentiate a concrete subclass of a given ABC (or protocol). Also, at least for ABCs, instantiation is always a problem (because Liskov doesn't get checked there).

Honestly I'm not sure what exactly to do here, especially for instantiation, and would prefer the PEP to state that this is an unresolved issue -- it may fail at runtime. (Maybe we could have a special decorator on __init__ to state that Liskov should apply? That would solve the separate instantiation concern for ABCs.)
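
The two uses contrasted in this thread, harmless `isinstance()` versus unsafe instantiation, can be illustrated with a plain ABC. `Base`, `Concrete`, `build`, and `is_instance_of` are hypothetical names chosen for the sketch.

```python
from abc import ABC, abstractmethod
from typing import Type


class Base(ABC):
    @abstractmethod
    def meth(self) -> int: ...


class Concrete(Base):
    def meth(self) -> int:
        return 42


def build(cls: Type[Base]) -> int:
    # Instantiation: fails at runtime if 'cls' is still abstract.
    return cls().meth()


def is_instance_of(cls: Type[Base], obj: object) -> bool:
    # isinstance() is well-defined even for the abstract class itself.
    return isinstance(obj, cls)


concrete_result = build(Concrete)

try:
    build(Base)
    abstract_failed = False
except TypeError:
    abstract_failed = True

harmless_check = is_instance_of(Base, Concrete())
```

Restricting `Type[Proto]` to concrete classes rules out the failing `build(Base)` call, at the cost of also rejecting uses like `is_instance_of(Base, ...)`.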

-(non-protocol) subtypes of ``Proto``. For example::
+(non-protocol) subtypes of ``Proto``. The main reason for this is to allow
+instantiation of parameters with such type. For example::

class Proto(Protocol):
@abstractmethod
@@ -605,9 +606,10 @@ The same rule applies to variables::
var = Concrete # OK
var().meth() # OK

-Assigning a protocol class to a variable is allowed if it is not explicitly
-typed, and such assignment creates a type alias. For non-protocol classes,
-the behavior of ``Type[]`` is not changed.
+Assigning an ABC or a protocol class to a variable is allowed if it is
+not explicitly typed, and such assignment creates a type alias.
+For normal (non-abstract) classes, the behavior of ``Type[]`` is
+not changed.


``NewType()`` and type aliases
@@ -688,18 +690,15 @@ Using Protocols in Python 2.7 - 3.5
Variable annotation syntax was added in Python 3.6, so that the syntax
for defining protocol variables proposed in `specification`_ section can't
be used in earlier versions. To define these in earlier versions of Python
-one can use abstract properties::
+one can use properties::

 class Foo(Protocol):
-    @abstractproperty
-    def c(self) -> int: ...
+    @property
+    def c(self) -> int:
+        return 42 # Default value can be provided for property...

     @abstractproperty
-    def c(self) -> int: # Default value can be provided for property.
-        return 0
-
-    @property
-    def e(self) -> int: # Note, this is not a protocol member.
+    def d(self) -> int: # ... or it can be abstract
         return 0

In Python 2.7 the function type comments should be used as per PEP 484.
Member

I think they should also be allowed for Python 3 (even 3.6+) and the text earlier that introduces these should mention this. (It's fine of course to recommend PEP 526 syntax and to use it in all examples except one showing the fallback syntax.)
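
A property-based protocol member also works in Python 3, as the sketch below suggests. It assumes the `typing.Protocol` and `typing.runtime_checkable` API that later shipped in Python 3.8, and the names `HasCount`, `Counter`, `LazyCounter`, and `Empty` are hypothetical.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasCount(Protocol):
    @property
    def count(self) -> int: ...  # member declared as a property


class Counter:
    def __init__(self) -> None:
        self.count = 3  # satisfied by a plain attribute


class LazyCounter:
    @property
    def count(self) -> int:  # satisfied by a property
        return 5


class Empty:
    pass


def read_count(obj: HasCount) -> int:
    return obj.count


counts = [read_count(Counter()), read_count(LazyCounter())]
empty_matches = isinstance(Empty(), HasCount)
```

Both a plain attribute and a property satisfy the member at runtime, which is why the two declaration styles can coexist.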

@@ -722,7 +721,7 @@ effects on the core interpreter and standard library except in the
a protocol or not. Add a class attribute ``__protocol__ = True``
if that is the case. Verify that a protocol class only has protocol
base classes in the MRO (except for object).
-* Implement ``@runtime`` that adds all attributes to ``__subclsshook__()``.
+* Implement ``@runtime`` that adds all attributes to ``__subclasshook__()``.
* All structural subtyping checks will be performed by static type checkers,
such as ``mypy`` [mypy]_. No additional support for protocol validation will
be provided at runtime.
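
For readers unfamiliar with the hook named in the `@runtime` bullet, a hand-written `__subclasshook__` on an ordinary ABC looks roughly like this. It is a sketch of the standard `abc` mechanism under hypothetical names (`SupportsClose`, `Door`, `Wall`), not the implementation proposed in this pull request.

```python
from abc import ABC


class SupportsClose(ABC):
    @classmethod
    def __subclasshook__(cls, subclass):
        if cls is SupportsClose:
            # Structural check: find the first 'close' defined in the MRO.
            for klass in subclass.__mro__:
                if "close" in klass.__dict__:
                    # Setting 'close = None' explicitly opts a class out.
                    return klass.__dict__["close"] is not None
        # Fall back to the normal nominal and registration checks.
        return NotImplemented


class Door:
    def close(self) -> None:
        pass


class Wall:
    pass


door_matches = isinstance(Door(), SupportsClose)
wall_matches = isinstance(Wall(), SupportsClose)
```

A decorator along the lines of `@runtime` would presumably generate such a hook from the protocol's members rather than requiring it to be written by hand.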