-
-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Allow single-item discriminated unions #6405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Deploying with
|
| Latest commit: |
4213a62
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://a2761ace.pydantic-docs2.pages.dev |
| Branch Preview URL: | https://allow-single-item-discrimina.pydantic-docs2.pages.dev |
|
Update: the issue described in the comment linked above is resolved on the current However, I would still be in favor of merging this for the reasons described above. Obviously it's less urgent though given it's not necessary to resolve a reported issue. |
|
please review |
|
wouldn't the right thing to do be to just return the inner schema? why have a union at all if there's just 1 thing? |
|
I can imagine scenarios where it's useful to get the discriminated union behavior even for a single union case (note: this is in contrast with the behavior of "normal" unions, where I don't believe using a single-item-union would provide any even-theoretically-beneficial difference in behavior). In particular, rather than reporting all the possible errors if the discriminator is missing, it just reports one. More concretely, here is an example: from typing import Literal, Annotated
from pydantic import BaseModel, TypeAdapter, Field, ValidationError
class Model1(BaseModel):
kind: Literal['one']
x: str
y: int
z: float
w: bool
class Model2(BaseModel):
kind: Literal['two']
x: int
y: str
z: bool
w: float
data = {'kind': 'two', 'x': 1, 'y': 'a', 'z': False, 'w': 1.5}
adapter = TypeAdapter(Model1)
try:
adapter.validate_python(data)
except ValidationError as exc:
print(exc)
"""
4 validation errors for Model1
kind
Input should be 'one' [type=literal_error, input_value='two', input_type=str]
For further information visit https://errors.pydantic.dev/2.0.2/v/literal_error
x
Input should be a valid string [type=string_type, input_value=1, input_type=int]
For further information visit https://errors.pydantic.dev/2.0.2/v/string_type
y
Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='a', input_type=str]
For further information visit https://errors.pydantic.dev/2.0.2/v/int_parsing
w
Input should be a valid boolean [type=bool_type, input_value=1.5, input_type=float]
For further information visit https://errors.pydantic.dev/2.0.2/v/bool_type
"""
adapter_with_discriminator = TypeAdapter(Annotated[Model1, Field(..., discriminator='kind')])
try:
adapter_with_discriminator.validate_python(data)
except ValidationError as exc:
print(exc)
"""
1 validation error for tagged-union[Model1]
Input tag 'two' found using 'kind' does not match any of the expected tags: 'one' [type=union_tag_invalid, input_value={'kind': 'two', 'x': 1, '...', 'z': False, 'w': 1.5}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.0.2/v/union_tag_invalid
"""In a world where you know you'll get one of two inputs, and you can distinguish by the kind field, but you only want one of the two to pass validation, I can see the error from Another case that seems like it could be useful would be if you have logic for determining what items should be in a tagged union where having a single item is still sensible: from typing import Literal, Annotated, Union
from pydantic import BaseModel, TypeAdapter, Field, ValidationError
class Cat(BaseModel):
kind: Literal['cat'] = 'cat'
class Dog(BaseModel):
kind: Literal['dog'] = 'dog'
class Fish(BaseModel):
kind: Literal['fish'] = 'fish'
def get_pet_adapter(cats_allowed: bool, dogs_allowed: bool, fish_allowed: bool) -> TypeAdapter:
allowed_kinds = []
if cats_allowed:
allowed_kinds.append(Cat)
if dogs_allowed:
allowed_kinds.append(Dog)
if fish_allowed:
allowed_kinds.append(Fish)
return TypeAdapter(Annotated[Union[tuple(allowed_kinds)], Field(discriminator='kind')])
adapter = get_pet_adapter(cats_allowed=False, dogs_allowed=True, fish_allowed=False)
adapter.validate_python({'kind': 'dog'})While this could potentially be worked around by simplifying the union, I still feel there is utility in being able to say "if the discriminator field fails validation, don't bother validating anything else". Ultimately, the reasons why I think we should do this are:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arguably this removes the distinction between treatment of a single type and a 1-element union when using a discriminator.
Presumably this would (very slightly) negatively impact performance for users who use this, because pydantic_core would first validate the tag and then prodeed to validate the whole type (which would include the tag again) - i.e. it's not an optimization in the one-member case, unlike for multiple members.
An argument to not do this would be that by forbidding it we don't give users the opportunity to deoptimize themselves. Users can already handle the case of single members differently in the examples @dmontagu illustrated above.
Overall this seems like a tradeoff of usability versus performance. I think because the error (of trying to use discriminant for a non-union type, i.e. single-element union) fails at model build time, I'm currently falling just on the side of not allowing this.
We could benchmark the performance impact; maybe the overhead added is irrelevant, in which case I'd be👍
@davidhewitt note that this is opt-in — you still have to say you want to use a discriminator to get this behavior. Given that, I feel like it's not really a tradeoff in performance, because you can just choose not to include the discriminator field in the Am I misunderstanding something, or do you think even with that it should still be an error? |
|
@davidhewitt also to be clear:
This PR would change that: from typing import Literal
from pydantic import BaseModel, Field
class Model1(BaseModel):
kind: Literal['one']
x: str
y: int
z: float
w: bool
class M(BaseModel):
x: Model1 = Field(discriminator='kind')
data_one = {'kind': 'one', 'x': 'a', 'y': 1, 'z': 1.5, 'w': True}
data_two = {'kind': 'two', 'x': 1, 'y': 'a', 'z': False, 'w': 1.5}
print(M(x=data_one))
#> x=Model1(kind='one', x='a', y=1, z=1.5, w=True)
M(x=data_two)
"""
pydantic_core._pydantic_core.ValidationError: 1 validation error for M
x
Input tag 'two' found using 'kind' does not match any of the expected tags: 'one' [type=union_tag_invalid, input_value={'kind': 'two', 'x': 1, '...', 'z': False, 'w': 1.5}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.0.2/v/union_tag_invalid
""" |
My perception was that the error at build time is good because it stops users adding a performance overhead. But on reflection I retract this opinion; I'm in favour of the usability improvement proposed here. If it's a performance hit, that's something we can solve in the implementation. |
I don't have proof yet as I can't reproduce the issue, but I suspect this may resolve the issue brought up in #5163 (comment).
Even if not, I think this is an improvement worth making, as I don't see any reason we should error if you produce a single-item discriminated union. In particular, it may be desirable to do this as a way to immediately fail validation if the tag is wrong rather than producing all the possible associated errors.
(The
test_discriminated_single_varianttest I've added shows this in action.)Selected Reviewer: @davidhewitt