8000 MAINT: Introduced a couple of fixes for the np.core.fromnumeric functions by BvB93 · Pull Request #74 · numpy/numpy-stubs · GitHub
[go: up one dir, main page]

Skip to content
This repository was archived by the owner on Jun 10, 2020. It is now read-only.

MAINT: Introduced a couple of fixes for the np.core.fromnumeric functions #74

Merged
merged 9 commits into from
May 1, 2020
Merged

MAINT: Introduced a couple of fixes for the np.core.fromnumeric functions #74

merged 9 commits into from
May 1, 2020

Conversation

BvB93
Copy link
Member
@BvB93 BvB93 commented Apr 25, 2020

Follow-up on #71 (comment).

#67 Fixes & Updates

  • Integers and booleans can often be used interchangeably (in fact, bool is an int subclass in builtin python). Created_ArrayLikeIntOrBool to reflect this compatibility.
  • Rename _ScalarGeneric to _ScalarGenericDT; add a new _ScalarGeneric TypeVar which is only bound to np.generic.
  • np.choose() can only accept array-like objects consisting of integers (or booleans); reflect this in its annotation.
  • np.argpartition() will return an np.integer if a scalar is passed; not an np.ndarray.

Further Notes

  • It seems np.partition() might actually bugged in NumPy 1.17.3.
    it's default value for axis is set to -1 instead of None. Consequently, it will raise an AxisError if a scalar is passed unless one explicitly passes axis=None to the function.
>>> import numpy as np

>>> np.partition(0, 0, axis=None)
array([0])

>>> np.partition(0, 0)
Traceback (most recent call last):
  ...
AxisError: axis -1 is out of bounds for array of dimension 0

Bas van Beek added 4 commits April 24, 2020 15:05
New functions:
* ``argmax()``
* ``argmin()``
* ``searchsorted()``
* ``resize()``
* ``squeeze()``
* ``diagonal()``
* ``trace()``
* ``ravel()``
* ``nonzero()``
* ``shape()``
* ``compress()``

Updates:
* Integers and booleans can often be used interchangebly (in fact, ``bool`` is an ``int`` subclass in builtin python).
  Updated ``_ArrayLikeInt`` to reflect this compatibility.
* Rename ``_ScalarGeneric`` to ``_ScalarGenericPlus``;
  add a new ``_ScalarGeneric`` TypeVar which is *only* bound to ``np.generic``.
* ``np.choose()`` cannot only accept integers (or booleans); reflect this in its annotation.
* It seems ``np.partition()`` is bugged in NumPy 1.17.3;
  it's default value for ``axis`` is set to ``-1`` instead of ``None``.
  Consquently, it will raise an ``AxisError`` if a scalar is passed *unless* one explicitly passes ``axis=None`` to the function.
* ``np.argpartition()`` will return a ``np.generic`` if a scalar is passed; not an ``np.ndarray``.
* ``np.searchsorted()`` always takes an 1D array; replace ``Sequence[_ArrayLike]`` with ``Sequence[_Scalar]``.
* Added ``_ArrayLikeNested`` to ``np.swapaxes()``
@@ -898,6 +913,23 @@ def partition(
kind: _PartitionKind = ...,
order: Union[None, str, Sequence[str]] = ...,
) -> ndarray: ...
@overload
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behaviour of np.argpartition() here is a bit odd.
If an np.ndarray is passed an np.ndarray is returned and if a np.generic is passed a np.generic is returned, so far so good.

However, if a builtin scalar is passed it also returns a np.ndarray, rather than a np.generic.

>>> import numpy as np

>>> type(np.argpartition(0, 0))
numpy.ndarray

>>> type(np.argpartition(np.int64(0), 0))
numpy.int64
8000

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch. Sounds like it could potentially be worthy of opening an issue against NumPy.

_ScalarGeneric = TypeVar(
"_ScalarGeneric", bound=Union[dt.datetime, dt.timedelta, generic]
# Integers and booleans can generally be used interchangeably
_ScalarInt = TypeVar("_ScalarInt", bound=Union[integer, bool_])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think _ScalarInt is the best name here; seems like it should be _ScalarIntOrBool. I mostly don't want this to accidentally get cargo-culted to other places where NumPy int/bool diverge; e.g. arithmetic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on #74 (comment) I'd argue that it is a suitable name. It can still be changed though if you don't agree,

Besides, for arithmetic it would (most likely) be more convenient to use a typevar of np.number anyway.

# Integers and booleans can generally be used interchangeably
_ScalarInt = TypeVar("_ScalarInt", bound=Union[integer, bool_])
_ScalarGeneric = TypeVar("_ScalarGeneric", bound=generic)
_ScalarGenericPlus = TypeVar(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly _ScalarGenericPlus is not very descriptive of the type here. Also this definition would allow passing through subclasses of datetime and timedelta; are we confident that will work correctly?

8000
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like _ScalarGenericDT instead?

About your second concern, some quick testing suggests that subclassing does indeed work:

>>> import datetime as dt
>>> import numpy as np

>>> class A(dt.timedelta): ...
>>> np.take(A(1), 0).__class__
__main__.A

>>> class B(dt.datetime): ...
>>> np.take(B(2000, 1, 1), 0).__class__
__main__.B

)

# An array-like object consisting of integers
_Int = Union[int, integer]
_Int = Union[int, integer, bool, bool_]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above about naming.

_ArrayLikeBoolNested = Any # TODO: wait for support for recursive types

# Integers and booleans can generally be used interchangeably
_ArrayLikeInt = Union[
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above about naming.

@overload
def choose(
a: _ArrayLike,
a: _ArrayLikeInt,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So while passing bools technically works, given that the the docstring says

"This array must contain integers in [0, n-1], where n is the number of choices"

it seems to me that this is more an accident than the intent. I'm not sure that we should allow this in the types as it seems like not a best practice. @rgommers WDYT about this case? (We could also solicit opinions on the mailing list.)

Copy link
Member Author
@BvB93 BvB93 Apr 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accident might not be the best term here.
The general compatibility between int and bool stems from the fact the bool is a subclass of the former.

>>> import numpy as np

>>> issubclass(int, bool)
True

>>> True == 1 and False == 0
True

>>> np.bool_(True) == 1 and np.bool_(False) == 0
True

Union[int, bool] is consequently equivalent to just plain int, but as np.bool_ is not a subclass of np.integer this does not hold for np.bool_. Even though the latter two classes behave (more or less; arithmetic is a bit of an exception here) identically to int and bool.

This also means that we could remove np.bool_ from the _ArrayLikeInt union, but properly removing bool would require adding a new overload along the lines of
choose(a: bool, ...) -> NoReturn.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but as np.bool_ is not a subclass of np.integer this does not hold for np.bool_

And this is why I don't want to include it. Yes, including int gets you bool, but that's intentionally not true for numpy.bool_, so I'm not sure we should make an extra effort to bring it back here. To be clear, when I said "accident" I meant "that np.bool_ works".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, I've implemented the name changes in #74.

@@ -898,6 +913,23 @@ def partition(
kind: _PartitionKind = ...,
order: Union[None, str, Sequence[str]] = ...,
) -> ndarray: ...
@overload
Copy link
Member 8000

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch. Sounds like it could potentially be worthy of opening an issue against NumPy.

Bas van Beek added 4 commits April 27, 2020 23:30
* Renamed ``_ScalarGenericPlus `` to ``_ScalarGenericDT`` (#74 (comment)).
* Renamed ``_Int`` to ``_IntOrBool`` (#74 (comment)).
* Renamed ``_ArrayLikeInt`` to ``_ArrayLikeIntOrBool`` (#74 (comment)).
@person142 person142 merged commit 44de2bb into numpy:master May 1, 2020
@person142
Copy link
Member

Thanks @BvB93!

@BvB93 BvB93 deleted the fromnumeric2 branch May 1, 2020 09:10
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0