ENH: make typing module available at runtime #16558
ArrayLike = Any
DtypeLike = Any
_SupportsArray = Any
from numpy.typing import ArrayLike, DtypeLike, _SupportsArray
We also execute all the code in `pass`, so we're also testing here that you can really import these things at runtime.
@@ -0,0 +1,3 @@
from ._array_like import _SupportsArray, ArrayLike
There isn't much code in this package, but since typing is so verbose it would be a little painful to keep `Any`, `overload`, ... out of the public namespace if we crammed everything in a `typing.py` file. So instead make a package and use this init to make sure we're only exporting exactly what we mean to.
from numpy import ndarray
from ._dtype_like import DtypeLike

if sys.version_info >= (3, 8):
This is the "no hard dependency on `typing_extensions`" dance mentioned in the PR description.
Would it be an idea to issue an `ImportWarning` if `HAVE_PROTOCOL = False`?
I'd worry about that because it's pretty common for projects to run their test suites with warnings turned into errors, and I could see something like e.g. a SciPy test run that doesn't have mypy installed and errors out because of the warning.
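To make the concern concrete, here is a minimal sketch of what happens when a warning is raised under an "error" filter. The function `import_with_fallback` and its message are hypothetical stand-ins for an import-time fallback path, not code from this PR.

```python
import warnings


def import_with_fallback() -> None:
    # Hypothetical stand-in for the import-time fallback path.
    warnings.warn(
        "Protocol is unavailable; falling back to Any",
        ImportWarning,
    )


def fails_under_error_filter() -> bool:
    """Return True if the warning is escalated to an exception."""
    with warnings.catch_warnings():
        # Similar in spirit to running a test suite with -W error.
        warnings.simplefilter("error")
        try:
            import_with_fallback()
        except ImportWarning:
            return True
    return False


print(fails_under_error_filter())
```

Under Python's default filters an `ImportWarning` is ignored, so the warning would only be noticed by projects that escalate warnings, which is exactly the group it would break.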
Hmm, that might be a problem, yes.
Does NumPy have a logger where such information could be displayed? I suspect that silently setting `_SupportsArray` to `Any` (and by extension `ArrayLike`) could result in some unexpected issues (at least from an end user perspective).
If not, then it should be mentioned in the documentation.
No logger; when most of the action is happening in C code (potentially with the GIL released), logging doesn't work great. I added a big warning to the top of the documentation in 4a120f0.
Ah, so be it.
At least it is documented now, which is the most important thing.
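The fallback discussed in this thread can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: the class name `SupportsArraySketch` and the `__array__` signature are assumptions, while `HAVE_PROTOCOL` and the `Any` fallback follow the snippets quoted above.

```python
import sys
from typing import Any

if sys.version_info >= (3, 8):
    # Protocol is part of the standard library from 3.8 onwards.
    from typing import Protocol
    HAVE_PROTOCOL = True
else:
    try:
        # Optional backport; deliberately not a hard dependency.
        from typing_extensions import Protocol
    except ImportError:
        HAVE_PROTOCOL = False
    else:
        HAVE_PROTOCOL = True

if HAVE_PROTOCOL:
    class SupportsArraySketch(Protocol):
        # Assumed signature, for illustration only.
        def __array__(self, dtype: Any = ...) -> Any: ...
else:
    # Silent fallback: every object is accepted by type checkers.
    SupportsArraySketch = Any

print(HAVE_PROTOCOL)
```

The silent `Any` branch is the part that needs documenting, since nothing at runtime tells the user that the protocol was replaced.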
import sys
from typing import Any, overload, Sequence, Tuple, Union

from numpy import ndarray
This is an example of what I was trying to describe in the PR description where if we do `from numpy import ndarray`, then mypy goes and looks in `__init__.pyi`, finds `ndarray`, and types it correctly. But if we were to do `from ..core.multiarray import ndarray`, then it would find no stubs for that file and fall back on treating `ndarray` as `Any`, which would be bad.
Closes numpy#16550.

This makes `np.typing.ArrayLike` and `np.typing.DtypeLike` available at runtime in addition to typing time. Some things to consider:

- `ArrayLike` uses protocols, which are only in the standard library in 3.8+, but are backported in `typing_extensions`. This conditionally imports `Protocol` and sets `_SupportsArray` to `Any` at runtime if the module is not available to prevent NumPy from having a hard dependency on `typing_extensions`. Since e.g. mypy already includes `typing_extensions` as a dependency, anybody actually doing type checking will have it set correctly.
- We are starting to hit the edges of "the fiction of the stubs". In particular, they could just cram everything into `__init__.pyi` and ignore the real structure of NumPy. But now that typing is available at runtime, we have to e.g. carefully import `ndarray` from `numpy` in the typing module and not from `..core.multiarray`, because otherwise mypy will think you are talking about a different ndarray. We will probably need to do some shuffling of the stubs into more fitting locations to mitigate weirdness like this.
Force-pushed from ef540a5 to 8e8a8f1.
Now that I think of it, there is a much simpler solution to this problem: keep the current [...]
from typing import Any as _Any
DtypeLike = _Any
ArrayLike = _Any
That is, the types are correctly defined at typing time and unconditionally defined to be whatever at runtime. It's short, avoids the "where do I import from" question, and the "is [...]
The drawback is that runtime introspection of the types is now impossible, but we mainly intended this as a syntactic convenience, i.e. we wanted people to be able to [...]
What do people think?
I like this solution, though I feel if we ever decide to expose a [...] Nevertheless, this is not an issue if we're just exposing [...]
Though I do feel a value more descriptive than [...] What about a plain string?
DtypeLike = "numpy.typing.DtypeLike"
ArrayLike = "numpy.typing.ArrayLike"
The reason I like [...]
Right, using [...]
Come to think of it, this non-run-time only solution will also run into issues if [...]
Hm yeah that’s absolutely convinced me that making this work at runtime is worth it.
HAVE_PROTOCOL = True

if HAVE_PROTOCOL:
    class _SupportsArray(Protocol):
Note for the future:
If `_SupportsArray` ever becomes public we should make it usable for run-time `isinstance()` and `issubclass()` checks, similar to the likes of `SupportsInt` (ref).
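For reference, a minimal sketch of what run-time checks could look like using the standard library's `runtime_checkable` decorator (Python 3.8+). The names `SupportsArraySketch` and `Wrapper` are hypothetical, and the `__array__` signature is simplified for illustration.

```python
from typing import Any, Protocol, SupportsInt, runtime_checkable


@runtime_checkable
class SupportsArraySketch(Protocol):
    # Simplified, assumed signature.
    def __array__(self) -> Any: ...


class Wrapper:
    """Hypothetical class implementing the array protocol."""

    def __array__(self) -> Any:
        return [1, 2, 3]


# SupportsInt already supports this style of check.
int_supports_int = isinstance(3, SupportsInt)

# The sketch protocol behaves the same way once decorated.
wrapper_supports_array = isinstance(Wrapper(), SupportsArraySketch)
int_supports_array = isinstance(3, SupportsArraySketch)

print(int_supports_int, wrapper_supports_array, int_supports_array)
```

Note that such checks only test for the presence of the method, not its signature or return type.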
This is the direction Python is moving in anyway.
Eh, kind of? Nevertheless, as of now the [...]
>>> from __future__ import annotations
>>> from numpy.typing import ArrayLike
>>> def func(ar: ArrayLike[int]):  # This will work fine, even if ArrayLike is set to Any at runtime
...     pass
>>> type_alias = ArrayLike[int]  # This won't
Traceback (most recent call last):
    ...
TypeError: typing.Any is not subscriptable
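The same distinction can be reproduced without NumPy. In this sketch `ArrayLikeStandIn` is a hypothetical stand-in for the runtime fallback, and a quoted annotation is used in place of `from __future__ import annotations` so the snippet can be pasted anywhere; both defer evaluation of the annotation.

```python
from typing import Any

# Hypothetical stand-in for an alias that falls back to Any at runtime.
ArrayLikeStandIn = Any


def func(ar: "ArrayLikeStandIn[int]") -> None:
    # The annotation is stored as a string and is not evaluated here.
    pass


def alias_can_be_created() -> bool:
    """Return True if subscripting the stand-in works at runtime."""
    try:
        ArrayLikeStandIn[int]
    except TypeError:
        return False
    return True


print(func.__annotations__["ar"])
print(alias_can_be_created())
```

So deferred annotations hide the problem only until something evaluates them, for example a module-level type alias or a library that resolves annotations at runtime.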
Ok, I've added initial documentation for the [...]
Typing `ArrayLike` correctly relies on `Protocol`, so warn users that they should be on 3.8+ or install `typing-extensions` if they want everything to work as expected.
After all the discussion above, how do people feel about the approach taken here?
The release notes for typing work should probably go under [...]
According to README.rst in [...]
The list of available sections is in the [...]
See numpy#16558 (comment). It was previously an "improvement".
A few questions, I am new to the world of typing. Feel free to point me to background reading if that is easier than answering the questions directly. Does this slow down import time? Could you give a higher-level picture of when availability of typing at runtime is desired, what use-case this answers? Is it typical for libraries (vs. user code) to provide such an ability?
It shouldn't in that it [...] I opted not to add any [...]
Right, so it's important to be clear that this doesn't enable anything new*; see e.g. numpy/numpy-stubs#66 (comment) for a discussion of ways to use the things in [...]
from numpy.typing import ArrayLike
x: ArrayLike = [1, 2, 3, 4]
*Though I will note that there are packages that do runtime introspection of types (pydantic); if anybody wanted to do something like that with [...]
I think that it is atypical. It's hard to say why; some things that come to mind: [...]
If so inclined, @ethanhs might be able to offer better insights on this question.
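As a rough illustration of what "runtime introspection of types" means here, the sketch below uses the standard library's `get_origin` and `get_args` (Python 3.8+). `ArrayLikeSketch` is a simplified stand-in that omits `_SupportsArray`; it is not the alias defined in this PR.

```python
from typing import Any, Sequence, Union, get_args, get_origin

# Simplified stand-in for the union discussed in this PR.
ArrayLikeSketch = Union[bool, int, float, complex, Sequence]


def describe(tp: Any) -> tuple:
    """Return the (origin, args) pair that introspection tools rely on."""
    return get_origin(tp), get_args(tp)


union_origin, union_args = describe(ArrayLikeSketch)
any_origin, any_args = describe(Any)

print(union_origin, union_args)
print(any_origin, any_args)
```

A real union exposes its members to tools that inspect annotations, whereas a plain `Any` alias exposes nothing, which is the drawback of the "define everything as `Any` at runtime" alternative.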
else:
    _SupportsArray = Any

ArrayLike = Union[bool, int, float, complex, _SupportsArray, Sequence]
It would be nice to support buffer protocols, but Python's typing doesn't support that yet. In the meantime I would suggest adding `memoryview` and a comment referencing python/typing#593
Since `memoryview`s are `Sequence`s we do allow them currently, e.g.
import numpy as np
x = b'foobar'
v = memoryview(x)
np.array(v)
passes mypy. I will add the comment though.
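The runtime side of this claim can be checked with the standard library alone. This is only a sketch of the runtime registration; what mypy accepts is determined by the type stubs, which is a separate matter.

```python
from collections.abc import Sequence

view = memoryview(b"foobar")

# memoryview is registered as a virtual subclass of Sequence.
is_sequence = isinstance(view, Sequence)

# It also supports the usual sequence operations.
length = len(view)
first_byte = view[0]

print(is_sequence, length, first_byte)
```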
We should add a test explicitly for memoryviews though; I'll open an issue.
I'd highly recommend commenting on this B.P.O issue asking for support for a Buffer protocol in typing, as that is the better place than the typing issue: https://bugs.python.org/issue27501
numpy/typing/__init__.py
Outdated
.. code-block:: python

    np.array(x**2 for x in range(10))

is valid NumPy code which will create an object array. The types will
I think this example would be better motivated by showing the output, REPL style:
In [2]: np.array(x**2 for x in range(10))
...:
Out[2]: array(<generator object <genexpr> at 0x1118c5a20>, dtype=object)
Could also substitute "object array" -> "0-dimensional object array"
Most readers will probably not realize that this code works in an unexpected way otherwise!
numpy/typing/__init__.py
Outdated
is valid NumPy code which will create an object array. The types will
complain about this usage however.
The language "the types will complain" sounds a little weird to me.
Types don't complain, they just are :).
Instead, I would say "Type checkers will complain"
is valid NumPy code which will create an object array. The types will
complain about this usage however.
Can we add a brief note on the suggested work-around?
The obvious way would be to add a comment disabling typing:
np.array(x**2 for x in range(10))  # type: ignore
Are there other recommended options?
I think we've also discussed making checks less strict if `dtype=object` is specified, e.g.,
np.array((x**2 for x in range(10)), dtype=object)
I don't know if that works yet. If it does, perhaps we should mention it, too.
> Are there other recommended options?

The other way we test for (https://github.com/numpy/numpy/blob/master/numpy/tests/typing/pass/array_like.py#L43) is adding an explicit `Any` annotation. I've added examples of both methods to the docs.

> I don't know if that works yet. If it does, perhaps we should mention it, too.

Seems like @seberg would probably know the answer to that?
The overhead of importing the standard library's [...] So I think it should be fine to import [...]
Oh, weird, I wrote out a comment but I suppose I didn't hit the comment button. Anyway, making types available at runtime should be the default in my opinion. There are a few reasons why: [...]
Thanks for that perspective @ethanhs. Anybody else have strong opinions on importing [...]
Thanks @person142