Invariance, contravariance, and covariance for containers #10427

Closed
hlovatt opened this issue May 6, 2021 · 11 comments
Labels
bug mypy got something wrong

Comments

@hlovatt
hlovatt commented May 6, 2021

Mypy doesn't diagnose type problems correctly for containers with invariant, contravariant, or covariant content types.

To Reproduce

Run the following code through Mypy (note Mypy errors in comments):

from typing import TypeVar, Generic

class T: ...
class M(T): ...
class B(M): ...

InMT = TypeVar('InMT', bound=M)
ContraMT = TypeVar('ContraMT', bound=M, contravariant=True)
CoMT = TypeVar('CoMT', bound=M, covariant=True)

class In(Generic[InMT]):
    x: InMT
class Contra(Generic[ContraMT]):
    x: ContraMT
class Co(Generic[CoMT]):
    x: CoMT

t = T()
m = M()
b = B()

m_in: In[M] = In()
m_contra: Contra[M] = Contra()
m_co: Co[M] = Co()

m_in.x = t  # mypy: Incompatible types in assignment (expression has type "T", variable has type "M").
m_in.x = m
m_in.x = b

m_contra.x = t  # mypy: Incompatible types in assignment (expression has type "T", variable has type "M").
m_contra.x = m
m_contra.x = b

m_co.x = t  # mypy: Incompatible types in assignment (expression has type "T", variable has type "M").
m_co.x = m
m_co.x = b

Expected Behavior

  1. Error message for m_in.x = t is wrong since the variable is of type InMT not M.
  2. m_in.x = b should be an error because a B is not an InMT (only an M is).
  3. m_contra.x = t should not be an error because a T is a ContraMT.
  4. m_contra.x = b should be an error because a B is not a ContraMT (only an M or a T is).
  5. As in point 1, the error message for m_co.x = t is wrong since the variable is of type CoMT, not M.

Actual Behavior

See comments in code snippet.

Your Environment

  • Mypy version used: 0.812
  • Mypy command-line flags: None
@erictraut

If the container and the variable x are mutable, then the TypeVar that defines the type of x must be invariant. That means the classes Contra and Co above are incorrect, and we can discount any behaviors that stem from them.

That leaves class In and the statement m_in.x = b (item number 2 in the list of expected behaviors). I agree this is a bug. I don't think the other expected behaviors are bugs in mypy.

@erictraut

Thinking about this more, I think mypy's behavior is correct in its entirety. As I mentioned, I think the code is buggy in the cases where the type variable is defined as covariant and contravariant. In the case where it's invariant, you should be able to assign subtypes to the attribute x.

I recommend closing because the current behavior in mypy looks right to me.
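
For readers following along, here is a minimal sketch of the invariant case, reusing the class names from the original report. `Box` is a hypothetical name introduced only for this illustration. Assigning a subtype instance to the attribute is ordinary subtyping; invariance only restricts how `Box[B]` and `Box[M]` relate to each other.

```python
from typing import Generic, TypeVar


class T: ...
class M(T): ...
class B(M): ...


InMT = TypeVar("InMT", bound=M)  # invariant by default


class Box(Generic[InMT]):
    """Hypothetical mutable container, similar to `In` in the report."""

    def __init__(self, x: InMT) -> None:
        self.x = x


m_box: Box[M] = Box(M())

# Accepted: the attribute has declared type M, and B is a subtype of M.
m_box.x = B()

# Invariance matters when whole containers are compared. A type checker
# is expected to reject the commented-out assignment below, because
# Box[B] is not a subtype of Box[M].
b_box: Box[B] = Box(B())
# other: Box[M] = b_box

print(type(m_box.x).__name__)  # B
```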

@davidhalter

I agree that mypy is correct here and would also recommend closing.

The basic assumption that the variances matter there is wrong. You should always be able to assign subtypes. The variances matter when it comes to inheritance, for example.

However having a covariant/contravariant x is a bit weird. For inheritance this does not make a lot of sense, because it is always both input and output (i.e. invariant). So type checkers should probably reject such definitions, but I'm not sure how much it would break backwards compatibility now that it's been around for a long time.
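
A small sketch of why a mutable attribute typed with a covariant TypeVar is questionable. The method `only_on_b` and the function `overwrite_with_m` are hypothetical names added for this illustration. A checker that does not validate attribute variance for ordinary classes is expected to accept all of this, yet it fails at runtime.

```python
from typing import Generic, TypeVar


class M: ...


class B(M):
    def only_on_b(self) -> str:  # hypothetical B-only method
        return "b"


CoMT = TypeVar("CoMT", bound=M, covariant=True)


class Co(Generic[CoMT]):
    """Mutable attribute typed with a covariant TypeVar."""

    def __init__(self, x: CoMT) -> None:
        self.x = x


def overwrite_with_m(box: Co[M]) -> None:
    # Valid for a Co[M]: storing an M in an attribute of type M.
    box.x = M()


b_box: Co[B] = Co(B())

# Declared covariance makes Co[B] acceptable where Co[M] is expected.
overwrite_with_m(b_box)

# b_box.x is still typed as B, but it now holds a plain M.
try:
    b_box.x.only_on_b()
    failed = False
except AttributeError:
    failed = True

print(failed)  # True
```

This is the "both input and output" situation: the attribute is read and written, so only an invariant TypeVar keeps both directions safe.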

That means the classes Contra and Co above are incorrect, and we can discount any behaviors that stem from them.

Does this mean that pyright flags those usages as incorrect?

@erictraut

Does this mean that pyright flags those usages as incorrect?

Similar to mypy, pyright validates variance only for protocol classes. I agree with you that enforcing this for non-protocols would cause a lot of churn for existing code bases. Pyright also implements PEP 695, which computes TypeVar variance automatically when the new syntax is used.
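
As a sketch of where declared variance is checked and useful, read-only and write-only views can be spelled as protocols. `Readable`, `Writable`, `Cell`, `read_m`, and `write_b` are hypothetical names for this illustration, not part of any library.

```python
from typing import Protocol, TypeVar


class M: ...
class B(M): ...


T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)


class Readable(Protocol[T_co]):
    """Read-only view: T_co appears only in return position."""

    def get(self) -> T_co: ...


class Writable(Protocol[T_contra]):
    """Write-only view: T_contra appears only in parameter position."""

    def put(self, value: T_contra) -> None: ...


class Cell:
    """Concrete cell holding an M; matches both protocols structurally."""

    def __init__(self, value: M) -> None:
        self._value = value

    def get(self) -> M:
        return self._value

    def put(self, value: M) -> None:
        self._value = value


def read_m(src: Readable[M]) -> M:
    return src.get()


def write_b(dst: Writable[B]) -> None:
    # A Writable[M] can stand in for a Writable[B] (contravariance):
    # anything that accepts an M also accepts a B.
    dst.put(B())


cell = Cell(M())
write_b(cell)
print(type(read_m(cell)).__name__)  # B
```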

@hlovatt
Author
hlovatt commented Aug 20, 2023

If the container and the variable x are mutable, then the TypeVar that defines the type of x must be invariant. That means the classes Contra and Co above are incorrect, and we can discount any behaviors that stem from them.

That leaves class In and the statement m_in.x = b (item number 2 in the list of expected behaviors). I agree this is a bug. I don't think the other expected behaviors are bugs in mypy.

I don't think what you say is true. For example, a function that accepts a mutable covariant argument can read from that argument, and a function that accepts a mutable contravariant argument can write to that argument. This demonstrates that there are type-safe usages for both covariant and contravariant type hinting of mutable variables. For example, using the T, M, and B classes from before and first establishing a base case that everyone can agree is correct:

TV = TypeVar("TV")


class BoxCovariantChecked(Generic[TV]):
    def __init__(self, x: TV) -> None:
        self._x = x
        self._t = type(x)

    def check(self, x: TV) -> None:
        if not isinstance(x, self._t):
            raise TypeError(f"x is of type {type(x)}, which is not a {self._t}.")

    @property
    def x(self) -> TV:
        x = self._x
        self.check(x)
        return x

    @x.setter
    def x(self, x: TV) -> None:
        self.check(x)
        self._x = x


tb = BoxCovariantChecked(T())
tb.x = T()
print(tb.x)  # <__main__.T object at 0x100c50290>
tb.x = M()
print(tb.x)  # <__main__.M object at 0x100c50210>
tb.x = B()
print(tb.x)  # <__main__.B object at 0x100c50290>

mb = BoxCovariantChecked(M())
# Runtime TypeError: x is of type <class '__main__.T'>, which is not a <class '__main__.M'>.
# mypy error: Incompatible types in assignment (expression has type "T", variable has type "M")  [assignment]
# mb.x = T()
print(mb.x)  # <__main__.M object at 0x102e9fc10>
mb.x = M()
print(mb.x)  # <__main__.M object at 0x102e9fc50>
mb.x = B()
print(mb.x)  # <__main__.B object at 0x102e9fc10>

bb = BoxCovariantChecked(B())
# Runtime TypeError: x is of type <class '__main__.T'>, which is not a <class '__main__.B'>.
# mypy error: Incompatible types in assignment (expression has type "T", variable has type "B")  [assignment]
# bb.x = T()
print(bb.x)  # <__main__.B object at 0x104e9fc10>
# Runtime TypeError: x is of type <class '__main__.M'>, which is not a <class '__main__.B'>.
# mypy error: Incompatible types in assignment (expression has type "M", variable has type "B")  [assignment]
# bb.x = M()
print(bb.x)  # <__main__.B object at 0x104e9fc50>
bb.x = B()
print(bb.x)  # <__main__.B object at 0x104e9fc10>

As can be seen above, the runtime type checks and mypy agree, and both are correct. The lines that fail are commented out so that the code runs to completion; see the comments for the errors.

Apply the above baseline classes and instances to a covariant function:

def checked_read_from_an_mbox(b: BoxCovariantChecked[CoMT]):
    t = b._t
    if not issubclass(t, M):
        raise TypeError(f"b is of type BoxCovariantChecked[{t}], which is not a BoxCovariantChecked[M].")
    print(b.x)


# Runtime TypeError: b is of type BoxCovariantChecked[<class '__main__.T'>], which is not a BoxCovariantChecked[M].
# mypy error: Value of type variable "CoMT" of "checked_read" cannot be "T"  [type-var]
# checked_read_from_an_mbox(tb)
checked_read_from_an_mbox(mb)
checked_read_from_an_mbox(bb)

Again, runtime and mypy agree, and they are correct. Again, the line that causes the error is commented out to allow the code to run.

But if we convert the read function into a write function using contravariance, we get mypy errors.

def checked_write_of_an_m_to_an_mbox(b: BoxCovariantChecked[ContraMT]):
    t = b._t
    if not issubclass(M, t):  # Note type check reversed, super-type test!
        raise TypeError(f"b is of type BoxCovariantChecked[{t}], which is not a supertype of BoxCovariantChecked[M].")
    # mypy error: Incompatible types in assignment (expression has type "M", variable has type "ContraMT")  [assignment]
    #  Which is incorrect because an M is of type ContraMT.
    #  Also note that the BoxCovariantChecked instance allows this assignment, demonstrating it is correct.
    b.x = M()
    print(b.x)


# mypy error: Value of type variable "ContraMT" of "checked_write_of_an_m" cannot be "T"  [type-var]
#  Which is incorrect, a contra of an M can be a T.
checked_write_of_an_m_to_an_mbox(tb)
checked_write_of_an_m_to_an_mbox(mb)
# Runtime TypeError: b is of type BoxCovariantChecked[<class '__main__.B'>],
#   which is not a supertype of BoxCovariantChecked[M].
# mypy passes this - it shouldn't!
# checked_write_of_an_m_to_an_mbox(bb)

Again, the problem lines are commented out to allow the code to run, with the errors given in comments.

I still think my original example is correct; this second example shows the problem in another way that may be easier to follow.

@erictraut

By "covariant function", I presume that you mean a function that makes use of a function-scoped type variable that is defined as covariant, as in your example checked_read_from_an_mbox. The problem is that variance doesn't apply to function-scoped type variables, so the concept of "covariant function" doesn't make sense. Variance applies only to class-scoped type variables and is completely ignored by type checkers for function-scoped type variables.

Here's a simple thought experiment that should help make it clear why this is the case. As you probably know, the type parameter for list is invariant. If you have a function with an input parameter annotated with list[T_co] where T_co is a function-scoped type parameter defined as covariant, that doesn't mean that you have somehow changed list to act covariantly. The variance of type parameter T_co is irrelevant in this case because the variance of list comes from its class definition, which defines this type parameter as invariant. Perhaps type checkers could help eliminate this confusion by emitting a warning if a type variable defined as covariant or contravariant is bound to a function scope. Unfortunately, it's pretty common practice to use a T_co or T_contra in multiple places within a file, including both class and function scopes, so this warning would generate a lot of noise for existing code bases.
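
The thought experiment can be sketched as follows. `first`, `count_list`, and `count_seq` are hypothetical names for this illustration. The variance that matters comes from the class definitions of `list` (invariant) and `Sequence` (covariant), not from how the function's own TypeVar was declared.

```python
from collections.abc import Sequence
from typing import TypeVar


class M: ...
class B(M): ...


T_co = TypeVar("T_co", covariant=True)


def first(items: list[T_co]) -> T_co:
    # T_co is function-scoped here, so its declared variance is expected
    # to be ignored; list stays invariant regardless.
    return items[0]


def count_list(items: list[M]) -> int:
    # list is invariant in its type parameter.
    return len(items)


def count_seq(items: Sequence[M]) -> int:
    # Sequence is read-only, so its type parameter is covariant.
    return len(items)


bs: list[B] = [B(), B()]

# A type checker is expected to reject this call: list[B] is not list[M].
# count_list(bs)

# Accepted: list[B] is a Sequence[B], and Sequence[B] is a Sequence[M].
print(count_seq(bs))  # 2
print(type(first(bs)).__name__)  # B
```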

The concept of variance confuses many Python developers, so you're not alone here. This is exacerbated by the way TypeVars have historically been defined in Python, where the definition is separated from the scope to which they are bound. PEP 695, which is implemented in Python 3.12, introduces a new dedicated syntax for type parameters that aims to reduce this confusion. It largely eliminates the need to understand variance. (Full disclosure: I'm the primary author of this PEP.)

I don't think the error for checked_write_of_an_m_to_an_mbox(tb) is a false positive, nor do I think checked_write_of_an_m_to_an_mbox(bb) is a false negative. Mypy is doing the right thing in both cases. You're seeing runtime behaviors that don't match these error conditions because the implementation of checked_write_of_an_m_to_an_mbox violates its interface contract. In other words, this is a bug in your code, not in mypy. It's analogous to the following code, which generates no type violation errors but crashes at runtime because the function is making invalid assumptions that violate its interface contract.

def func(v: list[float]):
    # Verify that all elements of the list are floats.
    # (This is an invalid assumption for `list[float]`.)
    assert all(type(x) == float for x in v)

func([1])

@hlovatt
Author
hlovatt commented Aug 23, 2023

Thanks for pushing PEP 695, I think it will be a big improvement. I wanted to try out PEP 695 on my examples; unfortunately, although I can get Python 3.12.0rc1 installed, it doesn't look like Mypy 1.5.1 supports PEP 695. Is that correct?

@erictraut
erictraut commented Aug 24, 2023

Pyright is currently the only Python type checker that supports PEP 695. (Full disclosure: I'm the primary author of pyright.)

There is a tracking issue for mypy. I don't think anyone has started on it yet, so it could be quite a bit of time before you'll see support in mypy.

@hlovatt
Author
hlovatt commented Aug 24, 2023

OK will give that a go.

@hlovatt
Author
hlovatt commented Aug 25, 2023

Thanks for writing pyright; I haven't used it before, but it seems very good :)

Consider this code:

"""Think I pinched this example from a C# answer on Stackoverflow!"""

from dataclasses import dataclass, field


@dataclass
class Person:
    name: str


@dataclass
class Teacher(Person):
    ...


@dataclass
class StudentTeacher(Teacher):
    ...


@dataclass
class School:
    people: list[Person] = field(default_factory=list)
    teachers: list[Teacher] = field(default_factory=list)
    student_teachers: list[StudentTeacher] = field(default_factory=list)

    @staticmethod
    def add_to_list[T](items: list[T], item: T) -> None: 
        items.append(item)

    def add_teacher(self, name: str) -> None:
        teacher = Teacher(name)
        School.add_to_list(self.people, teacher)
        School.add_to_list(self.teachers, teacher)
        # School.add_to_list(self.student_teachers, teacher) - "list[StudentTeacher]" is incompatible with "list[Teacher | StudentTeacher]"

def main():
    school = School()
    teacher = Teacher("Ellory")
    school.people.append(teacher)
    school.teachers.append(teacher)
    # school.student_teachers.append(teacher) - "Teacher" is incompatible with "StudentTeacher"

    school.add_teacher("Ash")

    print(f"{school.people=}")
    print(f"{school.teachers=}")
    print(f"{school.student_teachers=}")

if __name__ == "__main__":
    main()

This works and catches the errors (commented out), but with incorrect explanations. For example, School.add_to_list(self.student_teachers, teacher) is rejected because of "list[StudentTeacher] is incompatible with list[Teacher | StudentTeacher]", which makes no sense.

If we compare the Java code for the key method, add_to_list, the following definitions all work:

  static <S, T extends S> void addToList(List<S> list, T item) { // Covariance only!
  static void addToList(List<? super Teacher> list, Teacher item) { // The obvious way!
  static <T> void addToList(List<? super T> list, T item) { // More general!

All methods working is a nice feature because they all make sense and programmers might arrive at any of these options depending on code history and their experiences. PEP 695 code, as checked by pyright, isn't so straightforward.

  def add_to_list[S, T: S](items: list[S], item: T) -> None:  # Would have thought this works, but no!
  def add_to_list[T](items: list[T], item: Teacher) -> None:  # This doesn't work, which is a pity because likely to be tried!
  def add_to_list[T](items: list[T], item: T) -> None:  # This is the only version that works and it's not intuitive!

The last form, the only one that works, doesn't read well: you would expect the two Ts, in list[T] and T, to refer to the same type if the call is to type check, but they don't. In the line School.add_to_list(self.people, teacher), the first T is Person and the second is Teacher! Contrast that with Java, which makes it clear they are different, but related, types.

I don't know if you have considered this for 695, but you could have default as covariant and annotate for contravariant (:>=) and invariant (:==). I suggest annotation for contravariant and invariant, since you nearly always want covariant. Examples above would become:

  def add_to_list[S, T: S](items: list[S], item: T) -> None:  # S and T both covariant and T has an upper bound of S.
  def add_to_list(items: list[_:>= Teacher], item: Teacher) -> None:  # Anonymous super type, lower bounded by Teacher.
  def add_to_list[T](items: list[_:>= T], item: T) -> None:  # Anonymous super bounded T.

An advantage of having types covariant by default is that it is clear that list[Person] can accept Teachers. At the moment the default is invariant and therefore it looks like people.append(teacher) will fail, but it works!

Boat may have sailed on changes to PEP though :(.

@erictraut

The mypy issue tracker is perhaps not the best forum for this discussion, but I want to respond to your comments to clarify some misunderstandings. Hopefully this will assist others who come across this discussion.

During development of PEP 695, there was discussion about explicit vs implicit (auto) variance. You can see that this suggestion appears in the rejected ideas section. The problem with explicit variance is that most developers get it wrong because they don't understand variance. It's a complex topic, and even people who think they understand it often do not ;). The decision was made to go with auto variance because this is something that type checkers can calculate. This is the approach taken by TypeScript, and it works well there.

...is rejected because of "list[StudentTeacher] is incompatible with list[Teacher | StudentTeacher]" which makes no sense.

This error makes sense if you understand the notion of invariance. Pyright even spells this out with the added note Type parameter "_T@list" is invariant, but "StudentTeacher" is not the same as "Teacher | StudentTeacher". It is because of this invariance that list[StudentTeacher] is not compatible with list[Teacher | StudentTeacher]. This has nothing to do with PEP 695; the type parameter for list in Python has always been defined as invariant, so you'd see the same errors reported if you switched the above code to use the pre-PEP 695 syntax. In fact, most of your observations and criticisms above have nothing to do with the PEP 695 syntax; these are existing behaviors and limitations of the Python type system and are unaffected by PEP 695.

Note that variance doesn't apply to function-scoped type variables like T in your code sample above. Variance applies only to class-scoped type variables, such as the one used in the list class definition.
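
A minimal sketch of the same function written with the pre-PEP 695 syntax, to illustrate that the behavior does not depend on the new syntax. The comments describe how a type checker is expected to solve T at each call site; the classes are simplified stand-ins for the dataclasses in the sample above.

```python
from typing import TypeVar


class Person: ...
class Teacher(Person): ...
class StudentTeacher(Teacher): ...


T = TypeVar("T")  # function-scoped below, so variance does not apply


def add_to_list(items: list[T], item: T) -> None:
    items.append(item)


people: list[Person] = []
teachers: list[Teacher] = []
student_teachers: list[StudentTeacher] = []
teacher = Teacher()

# T is solved as Person for both parameters: list[Person] matches
# exactly, and a Teacher is assignable to a parameter of type Person.
add_to_list(people, teacher)

# T is solved as Teacher.
add_to_list(teachers, teacher)

# No single T satisfies both arguments: list is invariant, so items
# would require T to be exactly StudentTeacher, and a Teacher is not a
# StudentTeacher. A type checker is expected to reject this call.
# add_to_list(student_teachers, teacher)

print(len(people), len(teachers), len(student_teachers))  # 1 1 0
```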

If you want to report bugs or suggest improvements for pyright, please file issues in the pyright issue tracker. If you want to suggest extensions to the type system (including the PEP 695 syntax) feel free to post to the python/typing forum.

I recommend closing this issue. Mypy is behaving correctly here.

@JelleZijlstra closed this as not planned on Aug 25, 2023