8000 ENH: Add annotations to the last 8 functions in numpy.core.fromnumeric by BvB93 · Pull Request #16729 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

ENH: Add annotations to the last 8 functions in numpy.core.fromnumeric #16729

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Aug 24, 2020
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
110 changes: 110 additions & 0 deletions numpy/__init__.pyi
Original file line number Diff line number Diff line change
Expand Up @@ -1231,3 +1231,113 @@ def amin(
initial: _NumberLike = ...,
where: _ArrayLikeBool = ...,
) -> Union[number, ndarray]: ...

# TODO: `np.prod()``: For object arrays `initial` does not necasarily
# have to be a numerical scalar.
# The only requirement is that it is compatible
# with the `.__mult__()` method(s) of the passed array's elements.

# Note that the same holds for `np.sum()` and `.__add__()`.

@overload
def prod(
a: _Number,
axis: Optional[_ShapeLike] = ...,
dtype: DtypeLike = ...,
out: Optional[ndarray] = ...,
keepdims: bool = ...,
initial: _NumberLike = ...,
where: _ArrayLikeBool = ...,
) -> _Number: ...
@overload
def prod(
a: ArrayLike,
axis: None = ...,
dtype: DtypeLike = ...,
out: Optional[ndarray] = ...,
keepdims: Literal[False] = ...,
initial: _NumberLike = ...,
where: _ArrayLikeBool = ...,
) -> number: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am assuming we are allowing number subtypes for one but not the other why is that ?
In other words, why is there a difference in return types of the above two overloads ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key difference is that the first one is a TypeVar while the latter is not.

To illustrate, in the first overload an arbitrary number subclass (denoted by _Number)
is provided as input and a value of the same type is returned.

In the second overload we know that an instance of np.int_ is returned,
but there is a catch: np.int_ is a platform-dependent alias for one of the number subclasses,
which is not something we've been able to easily express as of yet.

numpy/numpy/__init__.pyi

Lines 511 to 517 in 92665ab

# TODO(alan): Platform dependent types
# longcomplex, longdouble, longfloat
# bytes, short, intc, intp, longlong
# half, single, double, longdouble
# uint_, int_, float_, complex_
# float128, complex256
# float96

As a workaround the latter case is now simply annotated as returning a number instance and
while correct, this is obviously not as specific as it could be.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay got it, if my understanding is correct, the functionality is exactly the same if we use number of _Number here (second overload), but it is something we want to change in the future ?

Copy link
Member Author
@BvB93 BvB93 Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

..., but it is something we want to change in the future ?

Correct, once we've dealt with these platform-dependent generic subclasses.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh hang on, I'm actually confusing the issues pertaining int_ and number here, sorry.

The problem with returningnumber is more straightforward.
As ArrayLike is not yet Generic with respect to the data type we don't have a way yet to correlate the input and output types.
In the future we should be able to use expressions such as the one below (or at least something similar):

>>> from typing import TypeVar
>>> import numpy as np
>>> from numpy.typing import ArrayLike

>>> T = TypeVar('T', bound=np.number)

>>> def func(a: ArrayLike[T]) -> T:
...     pass

If you're interested, see #16759 for more details.

@overload
def prod(
a: ArrayLike,
axis: Optional[_ShapeLike] = ...,
dtype: DtypeLike = ...,
out: Optional[ndarray] = ...,
keepdims: bool = ...,
initial: _NumberLike = ...,
where: _ArrayLikeBool = ...,
) -> Union[number, ndarray]: ...
def cumprod(
a: ArrayLike,
axis: Optional[int] = ...,
dtype: DtypeLike = ...,
out: Optional[ndarray] = ...,
) -> ndarray: ...
def ndim(a: ArrayLike) -> int: ...
def size(a: ArrayLike, axis: Optional[int] = ...) -> int: ...
@overload
def around(
a: _Number, decimals: int = ..., out: Optional[ndarray] = ...
) -> _Number: ...
@overload
def around(
a: _NumberLike, decimals: int = ..., out: Optional[ndarray] = ...
) -> number: ...
@overload
def around(
a: ArrayLike, decimals: int = ..., out: Optional[ndarray] = ...
) -> ndarray: ...
@overload
def mean(
a: ArrayLike,
axis: None = ...,
dtype: DtypeLike = ...,
out: Optional[ndarray] = ...,
keepdims: Literal[False] = ...,
) -> number: ...
@overload
def mean(
a: ArrayLike,
axis: Optional[_ShapeLike] = ...,
dtype: DtypeLike = ...,
out: Optional[ndarray] = ...,
keepdims: bool = ...,
) -> Union[number, ndarray]: ...
@overload
def std(
a: ArrayLike,
axis: None = ...,
dtype: DtypeLike = ...,
out: Optional[ndarray] = ...,
ddof: int = ...,
keepdims: Literal[False] = ...,
) -> number: ...
@overload
def std(
a: ArrayLike,
axis: Optional[_ShapeLike] = ...,
dtype: DtypeLike = ...,
out: Optional[ndarray] = ...,
ddof: int = ...,
keepdims: bool = ...,
) -> Union[number, ndarray]: ...
@overload
def var(
a: ArrayLike,
axis: None = ...,
dtype: DtypeLike = ...,
out: Optional[ndarray] = ...,
ddof: int = ...,
keepdims: Literal[False] = ...,
) -> number: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

number isn't correct when out is specified:

>>> np.var([1], out=np.zeros(()))
array(0.)

Copy link
Member
@eric-wieser eric-wieser Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This applies for most but not all functions in this patch with out.

< 685C details-menu class="dropdown-menu dropdown-menu-sw show-more-popover color-fg-default" style="width:185px" src="" preload > Copy link
Member Author
@BvB93 BvB93 Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This applies for most but not all functions in this patch with out.

Considering the widespread nature of this issue I'd propose to save it for a future maintenance pull request and fix it all at once (as there are already quite a few older annotated functions with out parameter).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the meantime then, it would probably be a good idea to annotate these with Union[number, ndarray] - weak annotations are better than incorrect ones.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it turns out is fairly easy way to fix this: out: Optional[ndarray] must be changed into out: None when a number is returned.

I have noticed some inconsistent behavior when using out in combination with passing a number though,
as in such case another number is returned rather than a ndarray.
Builtin scalars, builtin sequences and ndarrays do all return an ndarray (see example below).

In [1]: import numpy as np                                                                                                  

In [2]: np.__version__                                                                                                      
Out[2]: '1.20.0.dev0+62f26cf'

In [3]: np.var(np.int64(1), out=np.zeros(()))                                                                               
Out[3]: 0.0

In [4]: np.var(1, out=np.zeros(()))                                                                                         
Out[4]: array(0.)

In [5]: np.var([1], out=np.zeros(()))                                                                                       
Out[5]: array(0.)

In [6]: np.var(np.ones(1), out=np.zeros(()))                                                                                
Out[6]: array(0.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is arguably a bug in np.var, when out is given, it must be the return value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

number isn't correct when out is specified:

Fixed as of 5a42440.

Copy link
Member Author
@BvB93 BvB93 Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is arguably a bug in np.var, when out is given, it must be the return value.

I'll double check which (currently annotated) functions are affected by this this and create an issue afterwards (see #16734).

@overload
def var(
a: ArrayLike,
axis: Optional[_ShapeLike] = ...,
dtype: DtypeLike = ...,
out: Optional[ndarray] = ...,
ddof: int = ...,
keepdims: bool = ...,
) -> Union[number, ndarray]: ...
28 changes: 28 additions & 0 deletions numpy/tests/typing/fail/fromnumeric.py
Original file line number Diff line number Diff line change
Expand Up @@ -124,3 +124,31 @@
np.amin(a, out=1.0) # E: No overload variant of "amin" matches argument type
np.amin(a, initial=[1.0]) # E: No overload variant of "amin" matches argument type
np.amin(a, where=[1.0]) # E: List item 0 has incompatible type

np.prod(a, axis=1.0) # E: No overload variant of "prod" matches argument type
np.prod(a, out=False) # E: No overload variant of "prod" matches argument type
np.prod(a, keepdims=1.0) # E: No overload variant of "prod" matches argument type
np.prod(a, initial=int) # E: No overload variant of "prod" matches argument type
Copy link
Member
@eric-wieser eric-wieser Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is legal for suitable a:

>>> a = np.array([], dtype=object)
>>> np.prod(a, initial=int)
int

Or for an almost plausible use:

>>> from ctypes import c_uint8
>>> shape = np.array([1, 2, 3])
>>> np.prod(shape.astype(object), initial=c_uint8)
# a ctypes array type
numpy.core.fromnumeric.c_ubyte_Array_1_Array_2_Array_3

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incidentally, this also proves that the return type of np.prod can be anything, not just number.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incidentally, this also proves that the return type of np.prod can be anything, not just number.

If I'm not mistaken this is limited to, the notoriously difficult to type, object arrays.
If so I feel it'd be better to leave things as they are for now, especially since this appears to be a very specific (seemingly undocumented?) sit 10000 uation.

The alternative would be to set number and the return type to Any,
which is practically useless from a typing standpoint.

Copy link
Member
@eric-wieser eric-wieser Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

especially since this appears to be a very specific (seemingly undocumented?) situation.

How so? initial is documented as a "scalar", not as a number.

Perhaps the compromise is to stick # TODO: This is actually legal or similar by each test case that currently expects and results in an error, but should not.

Copy link
Member Author
@BvB93 BvB93 Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How so? initial is documented as a "scalar", not as a number.

That's actually a good point; I more or less conflated scalar with numerical scalar here.

Perhaps the compromise is to stick # TODO: This is actually legal or similar by each test case that currently expects and results in an error, but should not.

I agree, I'll push an update in a bit.

As for the future:

What I've gathered is, in the case of object arrays, that np.prod() simply falls back to the .__mult__() method of the passed arrays individual elements and hence all that's required of initial is that its type is compatible with aforementioned elements.
For object arrays we may or may not be able to easily express this with a __mult__ Protocol once ndarray is generic over its dtype, though this will depend on the exact implementation details.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the compromise is to stick # TODO: This is actually legal or similar by each test case that currently expects and results in an error, but should not.

Added in 3bc51b8.

Copy link
Member
@eric-wieser eric-wieser Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For object arrays we may or may not be able to easily express this with a __mult__ Protocol once ndarray is generic over its dtype, though this will depend on the exact implementation details.

I wouldn't personally bother. Simply having prod(ndarray[int], initial: int) -> int, prod(ndarray[object], initial: object) -> object etc is all that's really needed. The details of exactly what the object loops of ufuncs do and do not support is not really interesting.

np.prod(a, where=1.0) # E: No overload variant of "prod" matches argument type

np.cumprod(a, axis=1.0) # E: Argument "axis" to "cumprod" has incompatible type
np.cumprod(a, out=False) # E: Argument "out" to "cumprod" has incompatible type

np.size(a, axis=1.0) # E: Argument "axis" to "size" has incompatible type

np.around(a, decimals=1.0) # E: No overload variant of "around" matches argument type
np.around(a, out=type) # E: No overload variant of "around" matches argument type

np.mean(a, axis=1.0) # E: No overload variant of "mean" matches argument type
np.mean(a, out=False) # E: No overload variant of "mean" matches argument type
np.mean(a, keepdims=1.0) # E: No overload variant of "mean" matches argument type

np.std(a, axis=1.0) # E: No overload variant of "std" matches argument type
np.std(a, out=False) # E: No overload variant of "std" matches argument type
np.std(a, ddof='test') # E: No overload variant of "std" matches argument type
np.std(a, keepdims=1.0) # E: No overload variant of "std" matches argument type

np.var(a, axis=1.0) # E: No overload variant of "var" matches argument type
np.var(a, out=False) # E: No overload variant of "var" matches argument type
np.var(a, ddof='test') # E: No overload variant of "var" matches argument type
np.var(a, keepdims=1.0) # E: No overload variant of "var" matches argument type
64 changes: 64 additions & 0 deletions numpy/tests/typing/pass/fromnumeric.py
Original file line number Diff line number Diff line change
Expand Up @@ -183,3 +183,67 @@
np.amin(B, axis=0)
np.amin(A, keepdims=True)
np.amin(B, keepdims=True)

np.prod(a)
np.prod(b)
np.prod(c)
np.prod(A)
np.prod(B)
np.prod(A, axis=0)
np.prod(B, axis=0)
np.prod(A, keepdims=True)
np.prod(B, keepdims=True)

np.cumprod(a)
np.cumprod(b)
np.cumprod(c)
np.cumprod(A)
np.cumprod(B)

np.ndim(a)
np.ndim(b)
np.ndim(c)
np.ndim(A)
np.ndim(B)

np.size(a)
np.size(b)
np.size(c)
np.size(A)
np.size(B)

np.around(a)
np.around(b)
np.around(c)
np.around(A)
np.around(B)

np.mean(a)
np.mean(b)
np.mean(c)
np.mean(A)
np.mean(B)
np.mean(A, axis=0)
np.mean(B, axis=0)
np.mean(A, keepdims=True)
np.mean(B, keepdims=True)

np.std(a)
np.std(b)
np.std(c)
np.std(A)
np.std(B)
np.std(A, axis=0)
np.std(B, axis=0)
np.std(A, keepdims=True)
np.std(B, keepdims=True)

np.var(a)
np.var(b)
np.var(c)
np.var(A)
np.var(B)
np.var(A, axis=0)
np.var(B, axis=0)
np.var(A, keepdims=True)
np.var(B, keepdims=True)
64 changes: 64 additions & 0 deletions numpy/tests/typing/reveal/fromnumeric.py
Original file line number Diff line number Diff line change
Expand Up @@ -203,3 +203,67 @@
reveal_type(np.amin(B, axis=0)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.amin(A, keepdims=True)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.amin(B, keepdims=True)) # E: Union[numpy.number, numpy.ndarray]

reveal_type(np.prod(a)) # E: numpy.number
reveal_type(np.prod(b)) # E: numpy.float32
reveal_type(np.prod(c)) # E: numpy.number
reveal_type(np.prod(A)) # E: numpy.number
reveal_type(np.prod(B)) # E: numpy.number
reveal_type(np.prod(A, axis=0)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.prod(B, axis=0)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.prod(A, keepdims=True)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.prod(B, keepdims=True)) # E: Union[numpy.number, numpy.ndarray]

reveal_type(np.cumprod(a)) # E: numpy.ndarray
reveal_type(np.cumprod(b)) # E: numpy.ndarray
reveal_type(np.cumprod(c)) # E: numpy.ndarray
reveal_type(np.cumprod(A)) # E: numpy.ndarray
reveal_type(np.cumprod(B)) # E: numpy.ndarray

reveal_type(np.ndim(a)) # E: int
reveal_type(np.ndim(b)) # E: int
reveal_type(np.ndim(c)) # E: int
reveal_type(np.ndim(A)) # E: int
reveal_type(np.ndim(B)) # E: int

reveal_type(np.size(a)) # E: int
reveal_type(np.size(b)) # E: int
reveal_type(np.size(c)) # E: int
reveal_type(np.size(A)) # E: int
reveal_type(np.size(B)) # E: int

reveal_type(np.around(a)) # E: numpy.number
reveal_type(np.around(b)) # E: numpy.float32
reveal_type(np.around(c)) # E: numpy.number
reveal_type(np.around(A)) # E: numpy.ndarray
reveal_type(np.around(B)) # E: numpy.ndarray

reveal_type(np.mean(a)) # E: numpy.number
reveal_type(np.mean(b)) # E: numpy.number
reveal_type(np.mean(c)) # E: numpy.number
reveal_type(np.mean(A)) # E: numpy.number
reveal_type(np.mean(B)) # E: numpy.number
reveal_type(np.mean(A, axis=0)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.mean(B, axis=0)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.mean(A, keepdims=True)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.mean(B, keepdims=True)) # E: Union[numpy.number, numpy.ndarray]

reveal_type(np.std(a)) # E: numpy.number
reveal_type(np.std(b)) # E: numpy.number
reveal_type(np.std(c)) # E: numpy.number
reveal_type(np.std(A)) # E: numpy.number
reveal_type(np.std(B)) # E: numpy.number
reveal_type(np.std(A, axis=0)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.std(B, axis=0)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.std(A, keepdims=True)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.std(B, keepdims=True)) # E: Union[numpy.number, numpy.ndarray]

reveal_type(np.var(a)) # E: numpy.number
reveal_type(np.var(b)) # E: numpy.number
reveal_type(np.var(c)) # E: numpy.number
reveal_type(np.var(A)) # E: numpy.number
reveal_type(np.var(B)) # E: numpy.number
reveal_type(np.var(A, axis=0)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.var(B, axis=0)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.var(A, keepdims=True)) # E: Union[numpy.number, numpy.ndarray]
reveal_type(np.var(B, keepdims=True)) # E: Union[numpy.number, numpy.ndarray]
0