Description
Back in #17719 the first steps were taken toward introducing static typing support for array dtypes.
Since the dtype has a substantial effect on the semantics of an array, there is a lot of type-safety
to be gained if the various function-annotations in numpy can actually utilize this information.
Examples of this would be the rejection of string-arrays for arithmetic operations, or inferring the
output dtype of mixed float/integer operations.
The Plan
With this in mind I'd ideally like to implement some basic dtype support throughout the main numpy
namespace (xref #16546) before the release of 1.22.
Now, what does "basic" mean in this context? Namely, any array-/dtype-like that can be parametrized
w.r.t. np.generic. Notably this excludes builtin scalar types and character codes (literal strings), as the
only way of implementing the latter two is via excessive use of overloads.
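To make the distinction concrete, here is a minimal sketch of how a dtype argument that is a subclass of np.generic can be captured by a single type variable, while builtin scalar types and character codes fall through to an untyped fallback. The names make_array and SCT are hypothetical and used only for illustration; this is not how the numpy stubs are actually written.

```python
from __future__ import annotations

from typing import Any, Type, TypeVar, overload

import numpy as np
import numpy.typing as npt

# Hypothetical type variable: matches any numpy scalar type (np.float64, np.int8, ...)
SCT = TypeVar("SCT", bound=np.generic)


@overload
def make_array(value: Any, dtype: Type[SCT]) -> npt.NDArray[SCT]: ...
@overload
def make_array(value: Any, dtype: Any = ...) -> npt.NDArray[Any]: ...
def make_array(value, dtype=None):
    """Hypothetical thin wrapper around np.array, used only to show the overloads."""
    return np.array(value, dtype=dtype)


# A single overload covers every np.generic subclass: the dtype can be inferred.
a = make_array(1, np.float64)

# float and "f8" are not np.generic subclasses, so only the fallback overload
# applies here and a static type checker sees an array of unknown dtype.
b = make_array(1, float)
c = make_array(1, "f8")
```

At runtime all three calls produce a float64 array; the difference is only in what a static type checker can see. Covering float or "f8" statically would need one additional overload per builtin type or per literal string, which is the "excessive use of overloads" mentioned above.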
With this in mind, I realistically only expect dtype-support for builtin scalar types (e.g. func(..., dtype=float))
to be added with the help of a mypy plugin, e.g. via injecting a type-check-only method into the likes of
builtins.int that holds some sort of explicit reference to np.int_.
Examples
Two examples wherein the dtype can be automatically inferred:
from typing import TYPE_CHECKING
import numpy as np
AR_1 = np.array(np.float64(1))
AR_2 = np.array(1, dtype=np.float64)
if TYPE_CHECKING:
    reveal_type(AR_1)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating*[numpy.typing._64Bit*]]]"
    reveal_type(AR_2)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating*[numpy.typing._64Bit*]]]"
Three examples wherein dtype-support is substantially more difficult to implement:
AR_3 = np.array(1.0)
AR_4 = np.array(1, dtype=float)
AR_5 = np.array(1, dtype="f8")
if TYPE_CHECKING:
    reveal_type(AR_3)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[Any]]"
    reveal_type(AR_4)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[Any]]"
    reveal_type(AR_5)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[Any]]"
In the latter three cases one can always manually declare the dtype of the array:
import numpy.typing as npt
AR_6: npt.NDArray[np.float64] = np.array(1.0)
if TYPE_CHECKING:
    reveal_type(AR_6)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating*[numpy.typing._64Bit*]]]"
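Once the dtype is declared, it can also be used in function signatures. The following is only a sketch of the kind of type-safety mentioned in the description; normalize, AR_F and AR_S are hypothetical names, and the exact error message (if any) depends on the type checker and numpy version in use.

```python
from __future__ import annotations

from typing import Any

import numpy as np
import numpy.typing as npt


def normalize(arr: npt.NDArray[np.floating[Any]]) -> npt.NDArray[np.floating[Any]]:
    """Hypothetical helper: scale `arr` so that its entries sum to one."""
    return arr / arr.sum()


AR_F: npt.NDArray[np.float64] = np.array([1.0, 3.0])
AR_S: npt.NDArray[np.str_] = np.array(["a", "b"])

result = normalize(AR_F)  # accepted: np.float64 is a subclass of np.floating

# normalize(AR_S)  # a type checker is expected to reject this call, since
#                  # np.str_ is not a subclass of np.floating
```

The __future__ import keeps the annotations unevaluated at runtime, so the sketch does not rely on np.floating being subscriptable outside of type checking.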