NumPy 2.2 brought a lot of typing improvements, but it also introduced some regressions (at least for, and maybe especially for, mypy users).
So maybe this exercise is mainly useful to me to make sense of the mega-issue in gh-27957.
My own take-away is that we need the user documentation (gh-28077), not just for users, but also to better understand who has to change their typing and why. That means understanding two points:

How many users, and what kind of users, are affected:
- Early adopters ("shaping users") of the still-unsupported shape typing may be few?
- mypy users with unannotated code are maybe quite many.

And what they need to do:
- Removing shape types seems easy (if unfortunate).
- Adding --allow-redefinition is easy; avoiding mypy may be more work (maybe unavoidable).
- Are there other work-arounds? Maybe scipy-lectures is "special" or could hide generic types outside the code users see...
One other thing that I would really like to see is also the "alternatives". Maybe there are none, but I would at least like to spell it out, as in:
Due to ..., the only thing that might let us avoid these regressions is to hide them away as from numpy.typing_future import ndarray, and that is impractical/impossible because...
CC @jorenham although it is probably boring to you, also please feel free to amend or expand.
Issues that require user action
User issues due to (necessarily) incomplete typing
There are two things that came up where NumPy used to have less precise or wrong typing, but correcting it to be more precise (while remaining necessarily incomplete, as full precision may require a new PEP) means that type checking can fail:
floating is now used as a supertype of float64 (rather than identity) meaning it (correctly) matches float32, float, etc.
Incomplete typing means functions may return floating rather than float64 even when they clearly return float64.
(N.B.: the NumPy runtime is slightly fuzzy about this, since np.dtype(np.floating) gives float64, but with a warning, because the conversion does not have a well-defined meaning.)
There is now some support for shape typing
Previously, users could add shapes, but these were ignored.
Shape typing should not be used currently, because most functions will return shape-generic results, meaning that even correct shape types will typically still fail type checking.
(Users could choose to use this, but probably would need to cast explicitly often.)
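To make the two points concrete, a minimal sketch (my illustration, assuming numpy>=2.2 stubs; the gh-28071 example is cited in the regression list below):

```python
import numpy as np

# Point 1: this is statically NDArray[floating], even though at runtime it
# is clearly a float64 array (the example from gh-28071):
a = np.zeros(2, dtype=np.float64) + np.float64(1.0)

# Point 2: most operations return shape-generic results, so a precise shape
# annotation will typically fail to type check:
b: np.ndarray[tuple[int], np.dtype[np.signedinteger]] = np.arange(2) + 1
```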
There is a mypy-specific angle in gh-27957 to both of these, because mypy defaults to inferring the type of a variable at its first assignment (and one always runs into this). That first assignment (e.g. an array creation) is likely to include the correct shape and float64 type, but later re-assignments will then fail. mypy has --allow-redefinition, although it doesn't fix this fully, at least for nested scopes in for-loops; mypy may improve this.

The user impact is that:
- mypy fails even for unannotated code, for float64 and shape types, due to imprecise NumPy type stubs. These previously passed, whether intentional or not.
- float64 passing previously was arguably a bug, but this is still a regression.
(I, @seberg, cannot tell how problematic these are, or what options we have to try to make this easier on downstream, short of, or up to and including, reverting.)

Simple regressions fixed or fixable in NumPy
- ndarray.__setitem__ with object_ dtype in NumPy 2.2 (#27964)
- The floating change has at least one issue that seems very much fixable with follow-ups, see "TYP: inconsistent static typing of float64 addition" (#28071); e.g. numpy.zeros(2, dtype=numpy.float64) + numpy.float64(1.0) is clearly float64.
- ndarray.item never typechecks (#27977)
- np.ndarray.tolist return type seems broken in numpy 2.2.0 (#27944)
- np.dtype and np.ndarray.dtype in numpy 2.2.0 (#27945)

Type-checker issues that may impact NumPy

Thanks for this analysis; it's spot on 👌🏻
Boring means predictable, and predictable means preventable. So by definition, regressions and bugs aren't boring 😉.

Maybe this is type-able if we use @deprecated in combination with the optype.typing.Just trick 🤔.
I believe that all cases of unannotated code that were valid with numpy<2.2 but invalid with >=2.2 (such as #27957) are limited to mypy, and do not occur with (based)?pyright.
There are two distinct numpy==2.2.0 changes that could cause this. One is related to the ongoing work to support shape-typing, and the other has to do with changes made to float64 and complex128. Both these cases are only an issue for mypy users. But all that's needed to fix this is to help mypy a bit by adding a single type annotation.
shape-typing
The shape-typing mypy issues are all caused by #27211, which changed the shape-type "default" of ndarray from Any to tuple[int, ...]. Without this change, shape-typing simply wouldn't be possible to achieve in practice (which has to do with the undefined behavior of @overload in the presence of Any).
With numpy>=2.2.0, the following (valid) code will cause mypy (and only mypy) to report an error:
```python
import numpy as np

x = np.arange(2)
x = x + 1
```

```
Incompatible types in assignment (expression has type "ndarray[tuple[int, ...], dtype[signedinteger[Any]]]", variable has type "ndarray[tuple[int], dtype[signedinteger[Any]]]")
```
Since numpy 2.2.0, the return type of np.arange(2) is an ndarray with a 1-dimensional shape-type, i.e. ndarray[tuple[int], dtype[signedinteger]].
But x + 1 does not (yet) take the shape-types of the input into account, and in this case returns NDArray[signedinteger], which resolves to ndarray[tuple[int, ...], dtype[signedinteger]].
It's not allowed to assign tuple[int, ...] to tuple[int], so it also isn't allowed to assign ndarray[tuple[int, ...], _] to a ndarray[tuple[int], _].
And Mypy only looks at x = np.arange(2) when it infers the type of x, so it immediately determines that x: ndarray[tuple[int], dtype[signedinteger]], even if this is contradicted on the very next line.
So to work around this, we need to help mypy a bit by explicitly annotating x:
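The workaround snippet itself did not survive in this extract; a minimal sketch of the kind of annotation that helps (assuming numpy>=2.2):

```python
import numpy as np
import numpy.typing as npt

# Annotate x up front with the shape-generic type, so mypy does not pin it
# to the more precise ndarray[tuple[int], ...] returned by np.arange(2):
x: npt.NDArray[np.signedinteger] = np.arange(2)
x = x + 1  # OK now
```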
float64 and complex128

Before 2.2.0, float64 and complex128 were incorrectly annotated as type-aliases of floating[_64Bit] and complexfloating[_64Bit, _64Bit], respectively. These changes were made in #27334, and apply to complex128 in the same way as float64. So for the sake of brevity, I'll limit this to float64.
This change fixes two large issues:
At runtime, float64 subclasses floating, but the stubs incorrectly annotated float64 as a (restricted) alias of floating. The result is that assigning a floating[Any] value to x: float64 will now be rejected, as it should be (because it's type-unsafe).
float64 is also a subclass of builtins.float, but the stubs did not reflect this. This was causing e.g. x: float = np.float64(42) to be falsely rejected on numpy<2.2.0, even though it's perfectly type-safe.
So, for example, mypy (and only mypy) will reject the last line of the following example:
```python
import numpy as np

y = np.array([], np.float64)
y = y + 1
```

```
Incompatible types in assignment (expression has type "ndarray[tuple[int, ...], dtype[floating[Any]]]", variable has type "ndarray[tuple[int, ...], dtype[float64]]")
```
The np.array([], np.float64) expression evaluates to an NDArray[float64] type. But adding 1 to it results in an NDArray[floating] type. And since numpy>=2.2.0 it's no longer the case that floating is assignable to float64 (because that'd be type-unsafe), so we also can't assign NDArray[floating] to NDArray[float64].
This causes mypy to (falsely) reject y = y + 1, just like it rejected x + 1 in the shape-typing example. We can also work around this in the same way:
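Important: note that mypy will (still) accept the following. (The original snippet was not preserved in this extract; this is a sketch consistent with the surrounding text, including the y4 variant referenced below:)

```python
import numpy as np
import numpy.typing as npt

# Annotate y with the abstract scalar type, so that re-assigning the
# (statically) NDArray[floating] result is allowed:
y: npt.NDArray[np.floating] = np.array([], np.float64)
y = y + 1  # OK

# The variant referenced below: mypy (still) accepts this, and infers
# y4 + 1 as ndarray[tuple[int, ...], dtype[floating[Any]]]:
y4: npt.NDArray[np.floating] = np.array([], np.float64)
y4 = y4 + 1  # OK
```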
We see here that y4 + 1 is inferred as ndarray[tuple[int, ...], dtype[floating[Any]]], which is identical to the type of y + 1.
So the errors that mypy reports are not caused by regressions or bugs; they are the consequence of a necessary improvement. And this is why these reported mypy errors are, in fact, false positives.
I suppose this is related, and a symptom of the fact that many methods and functions in NumPy 2.2 erase shape information, giving rise to problems similar to those mentioned above.
Under mypy this results in a type error for the last line
foo.py:6: error: Incompatible types in assignment (expression has type "ndarray[tuple[int, ...], dtype[float64]]", target has type "ndarray[tuple[int, int, int], dtype[float64]]") [assignment]
many methods and functions in Numpy 2.2 erase shape information
The lack of shape-typing support for these functions has always been the case. In NumPy 2.2 we made several functions, including numpy.zeros, transparent to shape-types.
giving rise to similar problems as mentioned above
```python
import numpy as np
from numpy import float64

a = np.zeros((3, 3, 3), dtype=float64)
output = [a] * 3
output[0] = a[0:1, :, :]
```
Even if NumPy would have had perfect shape-typing support, your example would still be flagged as invalid, by both mypy and pyright:
The type of output is a list, and list is an invariant type. With perfect shape-typing, it would only accept float64 arrays of shape (3,3,3). But you assign an array of shape (1,3,3). This is type-unsafe, and therefore flagged as invalid by both mypy and pyright.
This comment builds a straw man argument. First of all, I don't claim that 2.2 introduced shape erasure, but that this information is absent in 2.2, which creates conflicts with other recent changes.
Second, this was a minimal example. A variable would have the same problem. But in any case, lists are invariant, but the inferred shape for the elements by MyPy is tuple[int, int, int], not (3,3,3). Hence the assignment is correct.
If NumPy now chooses to fix the size of the arrays in the type, that would be yet another extremely unfortunate choice, because having inhomogeneous tensor sizes with a similar rank is a valid application. If the dimensions must be fixed, that should be declared by the user.
And I didn't say that you did claim that 🤷🏻. It's just that I wanted to minimize the probability that someone else would misinterpret it that way (because exactly that has happened before, and it caused a lot of confusion).
A variable would have the same problem.
No, it would be a different error code, and would be limited to mypy, whereas your example is also invalid on pyright.
the inferred shape for the elements by MyPy is tuple[int, int, int] not (3,3,3)
That's indeed the type of a at the moment, but it's far from optimal. So there's a good chance that it'll be changed to a more specific shape-type, such as tuple[Literal[3], Literal[3], Literal[3]]. Once we support shape-typing in ndarray.__getitem__, then a[0:1, :, :] might return tuple[int, int, int], tuple[Literal[1], Literal[3], Literal[3]], or something else entirely. My point is that the shape of an array is part of its type. So assigning an array to a list of arrays with a different shape is invalid.
If you would've assigned a to a variable instead, then you'd only see an error when you use mypy, and Pyright, for example, would allow it. See the shape-typing section in #28076 (comment) for why exactly that is, and how you can work around it.
If NumPy now chooses to fix the size of the arrays in the type, that would be yet another extremely unfortunate choice, because having inhomogeneous tensor sizes with a similar rank is a valid application.
Hmm I don't really understand I'm afraid 🤔.
As the name suggests, the purpose of shape-typing is to statically describe the shape of arrays, not only the number of dimensions. And since the shape-type parameter and the tuple types are covariant, I don't see how that would make same-rank operations invalid.
But at the moment, using Literal as axis-type won't really work, because type-checkers tend to upcast e.g. Literal[42] to int when operated upon. There have been some ideas put forward, such as a LiteralInt, and refinement types, as a solution to this (see e.g. https://docs.google.com/presentation/d/11IKAfpS_ODE_TXmBK4BlVzx4stAcOAECOS-LF_sAzhM/edit). But both are incredibly complicated to implement, and I don't expect a solution anytime soon.
So for the foreseeable future, "shape-typing" will actually mean "rank-typing" (i.e. with the tuple-of-ints types). Anything more than that will realistically require a PEP, and a rather beefy one at that.
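For illustration (my sketch, not from the thread; it relies on the shape-transparent np.zeros mentioned earlier), rank-typing looks like this:

```python
import numpy as np

# The rank is encoded in the length of the tuple type, while each axis
# length is just `int`:
Vector = np.ndarray[tuple[int], np.dtype[np.float64]]       # rank 1
Matrix = np.ndarray[tuple[int, int], np.dtype[np.float64]]  # rank 2

v: Vector = np.zeros(3)
m: Matrix = np.zeros((2, 2))
```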
Hmm I don't really understand I'm afraid 🤔.
As the name suggests, the purpose of shape-typing is to statically describe the shape of arrays, not only the number of dimensions
Any additional future type information can lead to similar things. Maybe shape-typing beyond rank-typing won't happen (maybe not ever), but when/if it does, there will be code like the example above that starts failing type-checking, because the exact shape now needs to be explicitly erased.
If I understand correctly, this is a clear example where shape typing (not restricted to mypy) is inconvenient, because the user must explicitly choose a less restrictive type. I.e., it fails Stéfan's rule of "untyped code should always pass".
We can decide that the advantages of this are larger than the disadvantages especially long-term. But the truth is that I doubt old discussions/pushes about shape typing really took these downsides into account. So we need to be very clear about them and understand how much they affect users (compared to the long-term benefits of correct shapes).
Right now, we are in the unfortunate situation that shape typing has very limited use, but does (occasionally?) require users to explicitly erase the shape.
If I understand correctly, this is a clear example where shape typing (not restricted to mypy) is inconvenient, because the user must explicitly choose a less restrictive type. I.e., it fails Stéfan's rule of "untyped code should always pass".
If I remember correctly, Stéfan's rule only applied to valid untyped code. So it doesn't apply to the example of @juanjosegarciaripoll: list is invariant, and therefore only accepts arrays that have the same exact shape. Updating a list with an array of a different shape is type-unsafe. But before NumPy 2.2, type-checkers wouldn't tell you that.
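This is what I mean with "type-unsafe" (the snippet itself was not preserved in this extract; the following is a reconstruction consistent with the pauli and list_matmul names referenced below):

```python
import numpy as np

# 2x2 Pauli matrices; a type checker infers the list's element type from these:
pauli = [np.array([[0, 1], [1, 0]]), np.array([[1, 0], [0, -1]])]

def list_matmul(matrices: list[np.ndarray]) -> np.ndarray:
    out = np.eye(2)
    for m in matrices:
        out = out @ m
    return out

pauli.append(np.zeros((3, 3)))  # a square matrix of a *different* shape: perfect
                                # shape-typing would flag this append as invalid
list_matmul(pauli)              # ...and at runtime it raises ValueError regardless
```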
Maybe it's not the best example, but the outcome is the same if you append a non-square matrix, e.g. of shape (1, 2). Type-checkers are meant to prevent such mistakes, and NumPy 2.2 makes it easier for them to do so.
Updating a list with an array of a different shape is type-unsafe
But I think the important thing to accept is that this is only type-unsafe if you assume that typing shapes is the correct/useful level of abstraction for (NumPy) array typing! And that was not the status quo.
We can argue that shapes clearly are important, just like in your example. But that doesn't mean it is right, or even useful, for all code.
And I think we have to accept that it may be nicer if in the example you would have to type it as pauli : List[SquareMatrix] = to check the shapes, rather than pauli defaulting to it. Of course, I don't think that is possible...
But I think the important thing to accept is that this is only type-unsafe if you assume that typing shapes is the correct/useful level of abstraction for (NumPy) array typing! And that was not the status quo.
That's a very good point. And I agree that we should've thought it through better. But I'm not sure if that would've been enough, given that it was one of those "unknown unknowns".
But now that we have mypy_primer running, I'm a lot more confident that we can prevent such mistakes in the future.
But either way, even if we would've done everything right, then that list_matmul example would still raise a ValueError; with or without shape-types. Without shape-typing it would theoretically be type-safe, but at runtime, it's just as unsafe.
I have a related question, I think, though if it should go somewhere else let me know - I tried to follow the above comments regarding float and np.floating but I'm a bit lost. What is the "right" way to handle float vs np.floating in user code? I'm running afoul of np.floating[Any] not being assignable to float, but all I'm doing is taking the output of np.max() and trying to use it where a float is expected (in this case, in a matplotlib function).
In the NumPy 2.2.0 release we made np.float64 an actual subclass of both np.floating and float (the builtins one). Before 2.2.0, it was simply an alias of np.floating, which was incorrect for several reasons, as I outlined in #27334.
One of the consequences of this incorrect definition on numpy<2.2, is that type-checkers allowed you to assign x: np.floating to y: np.float64. But since float64 is a subclass of floating (at runtime), that shouldn't be allowed, because it is type-unsafe. To see why, consider this example:
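(The example did not survive in this extract; a minimal reconstruction consistent with the next sentence:)

```python
import numpy as np

x: np.floating = np.float32(1.0)  # fine: float32 is a floating
y: np.float64 = x                 # rejected since 2.2: x is in fact a float32
```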
So you're effectively assigning a float32 to a float64 here, which is clearly type-unsafe. Before NumPy 2.2, this was just as invalid as it is now. But because float64 was incorrectly defined, type checkers couldn't see that this was, in fact, an error.
I understand that it can be frustrating to have to change a lot of your annotations. But it's not because of a regression that you have to do that. It's because your annotations were type-unsafe, and NumPy 2.2 made it possible for type-checkers to help you fix it.
Like so, if you'll pardon the extremely short non-reproducible example to get the idea:
```python
exts = (np.min(xvals), np.max(xvals), np.min(yvals), np.max(yvals))
ax.imshow(twodeearray, extent=exts)
```
The checker in question is basedpyright running "standard" checks. I'm not quite sure what to do about such a thing without hacking up my code...
The matplotlib stubs annotate the extent parameter of Axes.imshow as:
https://github.com/matplotlib/matplotlib/blob/c887ecbc753763ff4232041cc84c9c6a44d20fd4/lib/matplotlib/axes/_axes.pyi#L480
So it doesn't accept floating, and before NumPy 2.2.0, it also didn't accept float64. If you think that it should accept something like tuple[floating * 4], then you should probably raise that with matplotlib.
So it seems like NumPy is (and I guess always has been) actually incompatible with the built-in float? Okay ... then is asking matplotlib to change their behavior (which no I don't think is the right thing) the only option to satisfy the type-checkers? Or using cast? I'm honestly seeking advice here since the very top of this thread is talking about user documentation.
So it seems like NumPy is (and I guess always has been) actually incompatible with the built-in float?
Before NumPy 2.2, type-checkers rejected x: float = np.float64(), but this is now allowed. But for the same reasons as I mentioned before, illegal: np.float64 = float() is not allowed.
Okay ... then is asking matplotlib to change their behavior (which no I don't think is the right thing) the only option to satisfy the type-checkers?
If at runtime matplotlib accepts both float and np.floating input, then the type annotations should also reflect that. So asking them to fix it is indeed an option. You could also consider submitting a PR yourself.
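(Not from the thread, but a common user-side workaround: convert to builtin float at the boundary, which is runtime-safe and satisfies annotations like matplotlib's:)

```python
import numpy as np

xvals = np.linspace(0.0, 1.0, 100)
yvals = np.linspace(-1.0, 1.0, 100)

# Converting at the boundary yields builtin floats, which any API
# annotated with `float` will accept:
exts = (float(np.min(xvals)), float(np.max(xvals)),
        float(np.min(yvals)), float(np.max(yvals)))
```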
Based on your explanation, I think the thing that is biting me the most is this, from the original post:
Incomplete typing means functions may return floating rather than float64 even when they clearly return float64.
In particular, for me, it's fftfreq returning an array with type floating[Any], which then propagates down through the rest of my code. I'll keep watching for updates!
Yea, that's understandable. I've had similar issues in a library I maintain that uses NumPy, so I understand how annoying it can be when you're forced to cast(np.float64, why_isnt_this_annotated_as_f64).
For what it's worth, we're putting a lot of work into improving the type signatures, e.g. by narrowing the return type in cases like yours. You can follow the progress at https://github.com/numpy/numtype, and you're welcome to help us out if you feel like it, e.g. by raising issues or opening PRs for sub-optimally annotated functions like fftfreq (for which I already opened numpy/numtype#339, btw).

Thanks for all the info! I had run across numtype before but not realized that it was basically the future of numpy typing. I'll check it out!
Add full type annotations to gPhoton/moviemaker/_steps.py.

This code uses dicts as pseudo-records a lot, and therefore the typing is sloppier than I would ideally like.

For clarity, sm_make_map was folded into sm_make_maps and the loop unrolled.

So as not to be using one dict for two radically different things, up in make_movies, the alternative threads/no-threads code paths needed to be split apart.

Some of the functions in calibrate.py were incorrectly annotated in an earlier commit; this is corrected now that I can see what their callers actually supply.

In a few places we use the experimental shape typing from numpy 2.2; this should be removed with prejudice if it causes any problems whatsoever (see numpy/numpy#28076), but it does seem to work for the very limited case this code wants, i.e. "this is a 2-d matrix".
Several types in numpy 2.2+ are defined as type aliases with no underlying class (such as np.float32).

Looking at microsoft/pyright#9051, they declined to fix it themselves, and suggested instead that the user must add a # pyright: ignore or # type: ignore directive to suppress this error.
NumPy is working to resolve these (numpy/numpy#28076) and has already done so with np.float64 (which I can verify in our errors); see numpy/numpy#27957.