BUG: modelling various integer types as a specialized version of the generic class breaks type-checking in pyright #23007

ischurov · 2023-01-13T14:41:30Z

Describe the issue:

I want to make sure that some variable is an instance of np.uint64. To this end, I use the following:

assert isinstance(idx, np.uint64)

However, in that case I have the following error on that line during type-checking with pyright:

Second argument to "isinstance" must be a class or tuple of classes
  TypeVar or generic type with type arguments not allowed

As the developers of pyright explained, this is due to fact that type stubs model np.uint64 as a specialized version of the generic class unsignedinteger (uint64 = unsignedinteger[_64Bit]), while in fact it is a proper subclass of unsignedinteger. As specialized versions of a class (like list[int]) are not allowed in isinstance, the type-checker reports an error, making it impossible to use this assert at all.

Are there any reasons to model these types (like np.uint64, np.uint32, etc) in this way (as a specialized versions) instead of making separate types?

Reproduce the code example:

import numpy as np
import pandas as pd

def foo(x: np.uint64):
    pass
series = pd.Series([1, 2, 3], index=np.array([10, 20, 30], dtype='uint64'))
idx = series.index[0]
assert isinstance(idx, np.uint64)
foo(idx)

Error message:

(type alias) uint64: Type[unsignedinteger[_64Bit]]
Second argument to "isinstance" must be a class or tuple of classes
  TypeVar or generic type with type arguments not allowed

Runtime information:

1.24.1
3.10.8 | packaged by conda-forge | (main, Nov 22 2022, 08:26:04) [GCC 10.4.0]
[{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2'],
'not_found': ['AVX512F',
'AVX512CD',
'AVX512_KNL',
'AVX512_KNM',
'AVX512_SKX',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL']}},
{'architecture': 'Haswell',
'filepath': '/vol/tcm10/ischurov/.conda/envs/latsym2/lib/libopenblasp-r0.3.21.so',
'internal_api': 'openblas',
'num_threads': 1,
'prefix': 'libopenblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.21'}]
None

Context for the issue:

Initially, I had a code that looked like the following:

def foo(x: np.uint64):
    pass
series = pd.Series([1, 2, 3], index=np.array([10, 20, 30], dtype='uint64'))
idx = series.index[0]
foo(idx)

On the type-checking, I have an error: Argument of type "Scalar | Unknown" cannot be assigned to parameter "x" of type "uint64" in function "foo". That's because the type-checker cannot infer that index of pd.Series has dtype uint64. Due to open issue in pandas, it is not yet possible to specify pd.Series.index's dtype using type hints. To give the type-checker information that idx has type np.uint64, I want to add the following assert before running foo:

assert isinstance(idx, np.uint64)

And that leads to type-checker error, as explained above.

The text was updated successfully, but these errors were encountered:

adampauls · 2023-06-20T18:43:12Z

Gentle nudge here. Is this a difficult change?

BvB93 · 2023-06-21T09:54:24Z

Gentle nudge here. Is this a difficult change?

Sort of. With how difficult it is to statically infer the bit sizes of returned numpy scalars & array dtypes it is currently not viable to model the various integer types as proper subclasses (rather than just parametrized subtypes). Returning the correct subclass is important as one can not freely downcast to a subclass without using heavy handed tools such as typing.cast or # type: ignore comments.

class Foo:
    pass

class Bar(Foo):
    pass

# Incompatible types in assignment (expression has type "Foo", variable has type "Bar")  [assignment]
var: Bar = Foo()

On the other hand, if the supertype is an Any-parametrized generic it is possible to freely cast this Any parameter to a subtype due its role as a wildcard. This was the motivation for modeling various np.number subclasses as parametrized generics rather than proper subclasses, and until we have better typing tools available for better capturing numpy's dtype promotion rules converting them to proper subclasses is simply not viable, requiring a substantial improvement in the quality of the statically inferred returned dtypes that we simply cannot provide.

There is one other solution I can think of here: avoid even trying to deal with bit sizes and treat the likes of np.int64, np.int32, etc. as np.signedinteger aliases. With the current state of the numpy's type annotations this would honestly be a rather minor regression (even more so in recent mypy versions due to the regression mentioned herein: #23764). This would remove the ability for type checkers to distinguish between e.g. np.int64 and np.int32, but in practice this more often than not too difficult already (forcing use of signedinteger[Any] as a compromise).

int8: typing.TypeAlias = signedinteger
int16: typing.TypeAlias = signedinteger
int32: typing.TypeAlias = signedinteger
int64: typing.TypeAlias = signedinteger
# etc.

adampauls · 2023-06-22T04:47:30Z

I would be happy with that solution, for whatever that is worth. Have you considered a solution that has two type annotations for each dtype:

class int8(signedInteger[_8Bit]):
    ...

Int8 =  signedinteger[_8Bit]

This requires some intelligence on behalf of the user, since they have to understand that sometimes a type that should be int8 can only be inferred to Int8, but I'm not sure how often this comes up. I expect it comes up infrequently relative to the annoyance of not being able to do an isinstance check without pyright complaining. I think this might be intuitive to Python typing users, however, since a similar issue happens with list/Sequence and dict/Mapping etc.

This second solution can also get better with time, either by adding increasingly precise overloads or in some cases by making use of new features like the Self type.

adampauls · 2023-06-27T17:01:36Z

I started the solution above here. I think it looks pretty good. There are still some test failures, but I think it's worth figuring out if there's any interest before fixing them.

ischurov added the 00 - Bug label Jan 13, 2023

seberg added the 41 - Static typing label Jan 14, 2023

adampauls mentioned this issue Jun 14, 2023

Numpy dtypes flagged as errors in isinstance checks. microsoft/pyright#5298

Closed

adampauls mentioned this issue Jun 20, 2023

TYP: Replace type aliases in stubs for unsigned integer types with subclasses #24007

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: modelling various integer types as a specialized version of the generic class breaks type-checking in pyright #23007

BUG: modelling various integer types as a specialized version of the generic class breaks type-checking in pyright #23007

BUG: modelling various integer types as a specialized version of the generic class breaks type-checking in pyright #23007

BUG: modelling various integer types as a specialized version of the generic class breaks type-checking in pyright #23007

Comments

Describe the issue:

Reproduce the code example:

Error message:

Runtime information:

Context for the issue: