typing.get_origin behaves differently between standard collections and typing collections #95539

danielkatzan · 2022-08-01T17:21:56Z

Bug report

typing.get_origin behaves differently for standard collections and for typing collections.
More specifically:

get_origin(dict) is None, while get_origin(Dict) is <class 'dict'>
the same holds for other collections as well

This has been true since Python 3.9, where support for standard collections was added.
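A minimal sketch of the difference described above, assuming Python 3.9 or later. It only uses typing.get_origin and the aliases named in this report; the parameterized forms are included for comparison.

```python
from typing import Dict, List, get_origin

# Bare aliases from the typing module resolve to the builtin class.
bare_typing = (get_origin(Dict), get_origin(List))
print(bare_typing)

# Bare builtin collections have no origin at all.
bare_builtin = (get_origin(dict), get_origin(list))
print(bare_builtin)

# Once parameterized, both spellings agree.
parameterized = (get_origin(Dict[str, int]), get_origin(dict[str, int]))
print(parameterized)
```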

To my understanding, dict should be a drop-in replacement for Dict, which is not true for code that relies on get_origin.

I suggest updating get_origin to return the relevant class for builtin collections, as opposed to None, which it returns today.

Note that this might be a breaking change, as pointed out here, but it does make Dict and dict behave more similarly.


serhiy-storchaka · 2022-08-01T18:05:15Z

The fact that get_origin(Dict) returns dict is not documented. It looks to me like an implementation artifact.

Many things in the typing module are implementation artifacts. The implementation was rewritten several times, so some behavior of the current code is now different from the behavior of the earlier implementations, and some behavior was reproduced for backward compatibility even if it now does not make much sense. types.GenericAlias and types.UnionType were written from scratch and do not reproduce every peculiarity of the typing module.

What is the use case for get_origin(Dict) and get_origin(dict)?

danielkatzan · 2022-08-01T19:16:06Z

This had a side effect on the behavior of the pydantic package, which has some logic to determine the SHAPE of a type hint here, and it relies on get_origin to do so. As a result, Dict and dict are treated as different types, with further side effects in the rest of the package.

An example was documented here

serhiy-storchaka · 2022-08-01T20:14:20Z

Does this code work as intended? I see a few suspicious things:

It contains the condition origin is dict or origin is Dict. But both get_origin(Dict[x, y]) and get_origin(Dict) return dict, not Dict. Is the second part of the condition really needed?
If this condition is true, the code evaluates get_args(self.type_)[1]. But get_args(Dict) is an empty tuple, and you get an IndexError. This code does not work with Dict, you should not expect it to work with dict either.
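The two points above can be checked with a short sketch. It uses only typing.get_origin and typing.get_args, not pydantic's internal wrapper, so it illustrates the standard-library behavior rather than the package's actual code path.

```python
from typing import Dict, get_args, get_origin

# The origin of a parameterized typing.Dict is the builtin class, never typing.Dict itself.
origin = get_origin(Dict[str, int])
print(origin is dict)
print(origin is Dict)

# A bare typing.Dict has an origin but carries no type arguments.
bare_args = get_args(Dict)
print(bare_args)

# Indexing the empty tuple is what raises the IndexError mentioned above.
try:
    value_type = bare_args[1]
except IndexError:
    value_type = None
print(value_type)
```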

danielkatzan · 2022-08-01T20:35:10Z

the code actually works

Second part of the if might be redundant, I'm not sure
Note it doesn't call get_args from the typing package, but an internal implementation that wraps the get_args from typing

but now that you point it out, some adjustments will still need to be made to support get_args on dict

Anyway, regardless of that package's code, usage, or assumptions, I was just thinking that since dict and Dict are meant to be interchangeable (please correct me if I'm wrong), then it makes sense to try and make them behave as similar as possible (and specifically in the typing package)

It's also fine to leave it as is if there is a disagreement here
WDYT?

gvanrossum · 2022-08-01T20:54:36Z

I was just thinking that since dict and Dict are meant to be interchangeable (please correct me if I'm wrong), then it makes sense to try and make them behave as similar as possible (and specifically in the typing package)

Makes sense to me.

serhiy-storchaka · 2022-08-02T07:51:44Z

If you need to use a wrapper for get_args in any case, you can add a wrapper for get_origin as well.

If you make get_origin(dict) return dict, you can break some unrelated code, just as many functions were broken when isinstance(list[int], type) returned True. On the other hand, it can make some code simpler or more correct. We need to analyze the usage of get_origin in many different contexts before changing it.

It is also not clear what change you need. Should it only work with a limited set of builtin Python types and types defined in the stdlib? Should it work with an arbitrary type? Or any type with __class_getitem__? What if __class_getitem__ is None or a method that always raises a TypeError?
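One way the wrapper suggested above could look, as a rough sketch only. The name get_origin_compat and the allowlist _BARE_GENERICS are hypothetical, and the choice of an explicit allowlist is just one possible answer to the question of which types should be covered; it sidesteps the __class_getitem__ edge cases by never inspecting that attribute.

```python
import collections
from typing import Any, Dict, List, Optional, get_origin

# Hypothetical allowlist: which types belong here is the open question in this thread.
_BARE_GENERICS = (
    dict, list, set, frozenset, tuple, type,
    collections.deque, collections.defaultdict,
)


def get_origin_compat(tp: Any) -> Optional[Any]:
    """Like typing.get_origin, but treat allowlisted bare classes as their own origin."""
    origin = get_origin(tp)
    if origin is not None:
        return origin
    # Identity comparison avoids surprises from metaclasses that override __eq__.
    if any(tp is candidate for candidate in _BARE_GENERICS):
        return tp
    return None


print(get_origin_compat(dict), get_origin_compat(Dict))
print(get_origin_compat(List[int]), get_origin_compat(int))
```

Because the fallback runs only when typing.get_origin returns None, existing results for parameterized and typing aliases are left unchanged.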

danielkatzan · 2022-08-02T08:59:13Z

Yep, can definitely do that, but I think the discussion here is regardless of the usage of the pydantic package

I think it's a general discussion on what is the expected behaviour of get_origin

I think the guideline should be to make standard collections behave as similar as possible to the typing collections in order to allow for easier transition between them (as I understand the collections from the typing module are expected to be deprecated in the future)

So, for any standard collection, get_origin should return the same value it would have returned for the equivalent object from the typing package, i.e. this list.

For context, I'm a user of the pydantic package, just trying to take my first steps in open source contributing :). I came across that bug opened for pydantic, which led me to this issue. I'm not aware of all the implications or nuances such a change might involve.
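To make the proposed rule concrete, the sketch below prints what get_origin returns today for a few alias and standard-collection pairs. The PAIRS list is hand-picked for illustration and is not the linked list itself.

```python
import collections
import typing
from typing import get_origin

# Hand-picked (typing alias, standard collection) pairs; illustrative, not exhaustive.
PAIRS = [
    (typing.Dict, dict),
    (typing.List, list),
    (typing.Set, set),
    (typing.FrozenSet, frozenset),
    (typing.Tuple, tuple),
    (typing.Deque, collections.deque),
    (typing.DefaultDict, collections.defaultdict),
]

for alias, standard in PAIRS:
    # Current behavior: the alias yields the standard class, the bare class yields None.
    print(alias, get_origin(alias), get_origin(standard))
```

Under the proposal, the third column would match the second instead of being None.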

danielkatzan added the type-bug (An unexpected behavior, bug, or error) label Aug 1, 2022

danielkatzan mentioned this issue Aug 1, 2022

Support GenericAlias in typing #84576

Closed

kumaraditya303 added the topic-typing label Aug 1, 2022
