typing.get_origin behaves differently between standard collections and typing collections · Issue #95539 · python/cpython · GitHub

Open
danielkatzan opened this issue Aug 1, 2022 · 7 comments
Labels
topic-typing type-bug An unexpected behavior, bug, or error

Comments

@danielkatzan

Bug report

typing.get_origin behaves differently for standard collections and typing collections.
More specifically:

get_origin(dict) is None, while get_origin(Dict) is <class 'dict'>
the same holds for other collections as well
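
The difference can be reproduced with a short script. This is a minimal sketch assuming Python 3.9 or later; it uses only typing.get_origin and the aliases named in the report, plus list/List as one more pair for comparison.

```python
from typing import Dict, List, get_origin

# Bare builtin classes: get_origin finds no origin.
print(get_origin(dict))  # None
print(get_origin(list))  # None

# Bare typing aliases: get_origin reports the builtin class.
print(get_origin(Dict))  # <class 'dict'>
print(get_origin(List))  # <class 'list'>

# Once subscripted, both spellings agree.
print(get_origin(dict[str, int]) is get_origin(Dict[str, int]))  # True
```

So the mismatch only shows up for the unsubscripted forms.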

This has been true since Python 3.9, where support for standard collections was added.

To my understanding, dict should be usable as a drop-in replacement for Dict, which is not the case for code that relies on get_origin.

I suggest updating get_origin to return the relevant class for builtin collections, as opposed to None, which it returns today.

Note: this might be a breaking change, as pointed out here, but it does make Dict and dict behave more similarly.

@danielkatzan danielkatzan added the type-bug An unexpected behavior, bug, or error label Aug 1, 2022
@serhiy-storchaka
Member

The fact that get_origin(Dict) returns dict is not documented. It looks to me like an implementation artifact.

Many things in the typing module are implementation artifacts. The implementation was rewritten several times, so some behavior of the current code is now different from the behavior of the earlier implementations, and some behavior was reproduced for backward compatibility even if it now does not make much sense. types.GenericAlias and types.UnionType were written from scratch and do not reproduce every peculiarity of the typing module.

What is the use case for get_origin(Dict) and get_origin(dict)?

@danielkatzan
Author
danielkatzan commented Aug 1, 2022

This had a side effect on the behavior of the pydantic package, which has some logic to determine the SHAPE of a type hint here. That logic relies on get_origin, so Dict and dict are classified as different types, with further side effects later in the rest of the package.

An example was documented here

@serhiy-storchaka
Member

Does this code work as intended? I see a few suspicious things:

  1. It contains the condition origin is dict or origin is Dict. But both get_origin(Dict[x, y]) and get_origin(Dict) return dict, not Dict. Is the second part of the condition really needed?
  2. If this condition is true, the code evaluates get_args(self.type_)[1]. But get_args(Dict) is an empty tuple, and you get an IndexError. This code does not work with Dict, you should not expect it to work with dict either.
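
Both points can be checked directly against the typing module. The sketch below does not use the pydantic code under discussion (which, per the reply that follows, goes through its own wrapper); it only exercises typing.get_origin and typing.get_args, and the variable name bare_args is introduced here for illustration.

```python
from typing import Dict, get_args, get_origin

# Point 1: the origin is the builtin class, never the typing alias.
print(get_origin(Dict[str, int]) is dict)  # True
print(get_origin(Dict[str, int]) is Dict)  # False

# Point 2: the bare alias carries no arguments.
bare_args = get_args(Dict)
print(bare_args)                 # ()
print(get_args(Dict[str, int]))  # (<class 'str'>, <class 'int'>)

# Indexing the empty tuple is what raises IndexError.
try:
    bare_args[1]
except IndexError as exc:
    print("IndexError:", exc)
```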

@danielkatzan
Author

The code actually works:

  1. The second part of the if might be redundant, I'm not sure
  2. Note that it doesn't call get_args from the typing package, but an internal implementation that wraps the get_args from typing

But now that you point it out, some adjustments will still need to be made to support get_args on dict.

Anyway, regardless of that package's code, usage, or assumptions, I was just thinking that since dict and Dict are meant to be interchangeable (please correct me if I'm wrong), it makes sense to try to make them behave as similarly as possible (and specifically in the typing package).

It's also fine to leave it as is if there is a disagreement here.
WDYT?

@gvanrossum
Member

I was just thinking that since dict and Dict are meant to be interchangeable (please correct me if I'm wrong), then it makes sense to try and make them behave as similar as possible (and specifically in the typing package)

Makes sense to me.

@serhiy-storchaka
Member

If you need to use a wrapper for get_args in any case, you can add a wrapper for get_origin as well.
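
One way such a wrapper could look is sketched below. Everything here is an assumption made for illustration: the name get_origin_compat and the allowlist _BARE_GENERICS are hypothetical, not part of typing or pydantic, and the allowlist is deliberately restricted to a handful of stdlib containers.

```python
import collections
import typing

# Hypothetical allowlist: the bare classes this sketch treats as their own origin.
_BARE_GENERICS = frozenset(
    {list, dict, set, frozenset, tuple, collections.deque, collections.defaultdict}
)


def get_origin_compat(tp):
    """Like typing.get_origin, but return the class itself for allowlisted bare classes."""
    origin = typing.get_origin(tp)
    if origin is None and isinstance(tp, type) and tp in _BARE_GENERICS:
        return tp
    return origin


print(get_origin_compat(dict))            # <class 'dict'>
print(get_origin_compat(typing.Dict))     # <class 'dict'>
print(get_origin_compat(dict[str, int]))  # <class 'dict'>
print(get_origin_compat(int))             # None
```

Keeping the change in a caller-side wrapper leaves typing.get_origin untouched, so unrelated code that expects None for bare classes keeps working.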

If you make get_origin(dict) return dict, you can break some unrelated code, just as many functions were broken when isinstance(list[int], type) returned True. On the other hand, it can make some code simpler or more correct. We need to analyze the usage of get_origin in many different contexts before changing it.

It is also not clear what change you need. Should it only work with a limited set of builtin Python types and types defined in the stdlib? Should it work with an arbitrary type? Or any type with __class_getitem__? What if __class_getitem__ is None or a method that always raises a TypeError?

@danielkatzan
Author

Yep, I can definitely do that, but I think the discussion here is independent of how the pydantic package uses it.

I think it's a general discussion about what the expected behaviour of get_origin is.

I think the guideline should be to make standard collections behave as similarly as possible to the typing collections, in order to allow for an easier transition between them (as I understand it, the collections from the typing module are expected to be deprecated in the future).

So, for any standard collection, get_origin should return the same value it would have returned for the equivalent object from the typing package, i.e. this list

For context, I'm a user of the pydantic package, just trying to take my first steps in open source contributing :), and I came across that bug opened for pydantic, which led me to this issue. I'm not aware of all the implications or nuances such a change might involve.
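
To make the requested equivalence concrete, the sketch below walks a few typing aliases and compares what get_origin returns for the alias with what it returns today for the corresponding runtime class. The selection of aliases is an assumption chosen for illustration; it is not the list linked above.

```python
from typing import DefaultDict, Deque, Dict, FrozenSet, List, Set, Tuple, get_origin

typing_aliases = [List, Dict, Set, FrozenSet, Tuple, Deque, DefaultDict]

for alias in typing_aliases:
    runtime_class = get_origin(alias)       # e.g. dict for Dict
    current = get_origin(runtime_class)     # None today for the bare class
    print(alias, runtime_class, current)
```

Under the behaviour proposed here, the last column would match the middle one instead of being None.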
