How to have an “optional” field but if present required to conform to non None value? · Issue #1223 · pydantic/pydantic · GitHub
How to have an “optional” field but if present required to conform to non None value? #1223

Closed
mgcdanny opened this issue Feb 12, 2020 · 38 comments

@mgcdanny


How can I have an optional field where None is not allowed? Meaning a field may be missing but if it is present it should not be None.

from pydantic import BaseModel

class Foo(BaseModel):
    count: int
    size: float = None  # how to make this an optional float? 

 >>> Foo(count=5)
 Foo(count=5, size=None)  # GOOD - "size" is not present, value of None is OK

 >>> Foo(count=5, size=None)
 Foo(count=5, size=None) # BAD - if field "size" is present, it should be a float

 # BONUS
 >>> Foo(count=5)
 Foo(count=5)  # BEST - "size" is not present, it is not required to be present, so we don't care about validating it at all.  Using Foo.json(exclude_unset=True) handles this for us, which is fine.

I cross-posted to SO:
https://stackoverflow.com/questions/60191270/python-pydantic-how-to-have-an-optional-field-but-if-present-required-to-con

@samuelcolvin
Member

use a validator

from typing import Optional

from pydantic import BaseModel, validator

class Foo(BaseModel):
    count: int
    size: Optional[float] = None

    @validator('size')
    def prevent_none(cls, v):
        assert v is not None, 'size may not be None'
        return v
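
For illustration, here is roughly how that model behaves (a minimal sketch, assuming the class above):

Foo(count=5)             # OK: size was not provided, so the validator is skipped and size stays None
Foo(count=5, size=1.5)   # OK: size is a float

Foo(count=5, size=None)  # raises pydantic.ValidationError: size may not be None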

@dmontagu
Contributor
dmontagu commented Feb 12, 2020

Just to add onto @samuelcolvin's answer, this works because, by default, validators aren't called on arguments that are not provided (though there is a keyword argument on @validator that can make this happen). (This is described in the docs linked above.)

(Another route would be to use a sentinel value (e.g., object()) as the default (this might require some Config changes to make it work, I'm not 100% sure), and add an always=True validator that converts the exact default value to None, and raises an error if None was provided.)
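
A rough sketch of that sentinel route (untested, Pydantic v1 style; _SENTINEL and the error message are made up for illustration):

from typing import Optional
from pydantic import BaseModel, validator

_SENTINEL = object()

class Foo(BaseModel):
    count: int
    size: Optional[float] = _SENTINEL  # type: ignore[assignment]

    @validator('size', pre=True, always=True)
    def sentinel_to_none(cls, v):
        if v is _SENTINEL:
            # the field was never provided: silently fall back to None
            return None
        if v is None:
            # the field was provided explicitly as None: reject it
            raise ValueError('size may not be None')
        return v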


Note that the partial lack of idempotency may cause trouble with certain frameworks (like FastAPI) which may convert the model to dict form (which may have the None), and then call the initializer again on the dumped data. In particular, as of now you are likely to run into this issue if you specify the model as a response_model to a FastAPI endpoint.

There may be a way to achieve a similar pattern that can track whether the root source of the value was default initialization (defeating the idempotency issues), but you'd have to avoid the use of None. (Something that you ensure JSON-encodes to None might work though, if that's the context you are working in.)

@connebs
Contributor
connebs commented Feb 22, 2020

This is irking me at the moment. Passing in a field as None is fundamentally different from not passing in a field at all. Right now, pydantic conflates the two by using the Optional type for both use-cases. In code:

Foo(x=1, y=None)

is different (with regards to the intentions of the programmer) from

Foo(x=1)

There might be more use-cases, but my own (and I think the most obvious) one is that sometimes I want to do a partial update: pass in some values, validate them (with pydantic) and then update them elsewhere (say: in my datastore).

But. I don't want the values to be null. I want them to be either some type or I don't want them to exist at all. Having to use a custom validator for this wherever I need it is a lot of extra effort.

At a library level, it seems there are two ways of dealing with this.

One is to allow BaseModel subclasses to have a special configuration option that changes the behaviour to something like this:

from pydantic import Required

class Foo(BaseModel):
    a: Required[int]
    b: int
    c: Optional[int]
    d: Required[Optional[int]]
    class Config:
        require_by_default = False  # default: True

This would result in these requirements:

  • a to be present and only an int
  • b to be an int (but not necessarily present)
  • c does not need to be present, and if it is, it can be int or None
  • d needs to be present, and can be int or None

The other option would be to add a custom type that supports b, so that you don't need a custom config option.

from pydantic import NotNone

class Foo(BaseModel):
    a: int
    b: NotNone[int]
    c: Optional[int]

The problem with this solution is that it does not support use case d, which seems like a good use-case to support.

However, in both cases, b demonstrates the behaviour desired in this issue.

I would actually prefer it if the first option was the default behavior for pydantic, but at this point clearly that is not on the table.

@connebs
Contributor
connebs commented Mar 2, 2020

@samuelcolvin @dmontagu Would there be any willingness to add this functionality to pydantic? I would be willing to start a PR if so. I personally am a big fan of option 1's functionality, as it allows for all possible combinations of providing data to a pydantic class, and I think it is a better reflection of what Optional[x] truly is (just Union[x, None]). The current way of doing it is straightforward but, as can be seen in this issue, falls short for some use-cases in terms of developer intent.

The set of fields actually passed at instantiation is already stored anyway, so it doesn't seem like this would hurt much performance-wise.

Some alternatives to calling the "must be present" pydantic type Required could be: Present, Provided, Needed, Mandatory, etc.

@dmontagu
Contributor
dmontagu commented Mar 3, 2020

I wouldn't necessarily have a problem adding such functionality, but I'll warn you that if your goal is to use this with FastAPI, you may well run into problems until FastAPI is updated to make less use of dump-then-reparse in various places.


After some review of the specs, I think the approach described in your first bullet is a substantially better reflection of OpenAPI/JSON Schema semantics. Because of that, I'm more open to a config-flag-based approach than I might otherwise be.

However, I see two major obstacles:

  1. JSON Schema and OpenAPI's interpretation differs from the interpretation of the word Optional in the majority of statically typed languages, including mypy-compatible python (I think this stems from the fact that fields in structs can't just be missing, unlike in JSON/python objects). So I think we probably can't/shouldn't adopt this naming convention (i.e., using Optional and Nullable generic types with these semantics), convenient as it would be for OpenAPI/JSON Schema compatibility.

As a result, I think we should avoid the use of Optional alongside Nullable and Required, since its interpretation differs depending on context.

  2. Mypy, the built-in dataclasses library, and basically all python static analysis tools (including IDEs, etc.) treat types as "required" by default, rather than "optional" by default (where I am using the JSON schema notion of required vs. optional here). As a result, these semantics are likely to either work poorly with existing development tools, or add an enormous maintenance burden for plugins (mypy, pycharm, vscode, etc.).

As a result, I think we need to keep annotated fields required by default.

Between the above two points, I think it could make sense to add Unrequired and Nullable as new generic types with the following semantics:

  • Unrequired[X] is equivalent to Optional[X] with a validator that the value is not None when specified.
  • Nullable[X] is like Optional[X], but the value must be specified. (This is similar to how Optional[X] works with the dataclasses package.)
  • Unrequired[Nullable[X]] would be equivalent to the semantics currently represented by Optional[X].
  • Nullable[Unrequired[X]] is undefined/disallowed; in general, Unrequired should never occur as a parameter of another generic. So ideally something like List[Unrequired[int]] would raise a TypeError during the model class creation.

Note that adding another config flag, especially one that modifies fundamental behavior like whether a type is required/optional by default, is likely to introduce many subtle bugs, add a lot of maintenance burden, and make it difficult to refactor things as we discover better approaches to implementation. So despite the comment I made above that I am more open to a config-based approach than I would normally be, I think in this case we should really, really try to avoid it if possible.

But I think maybe the approach using the Unrequired and Nullable generics described above might make this unnecessary.


Note that adding support for Unrequired and Nullable is likely to require a non-trivial amount of effort to add good support for PyCharm (and other IDEs) and mypy. That said, I think adding such support would at least be straightforward, if not quick and easy (unlike, for example, modifying the behavior of pydantic's GenericModel).

@connebs
Contributor
connebs commented Mar 3, 2020

Thanks for the great thoughts @dmontagu! Need to go over in more detail, but I just had an idea that I thought I'd throw out there (and is somewhat inspired by your previous comment in the thread) before I go to bed and forget it:

What if, instead of Pydantic loading in missing fields as None, it loaded them in as some sort of sentinel object+type, let's say for simplicity an object/type called Missing. Then, Optional[X] would work for the "can't be missing but can be None" use case, while for "can be missing but not None", we'd introduce an Unrequired[X] (or otherwise named) generic type that under the hood is (similar to Optional[X] before it) just Union[X, Missing].

Then, for the "can be missing or None", you could either have Optional[Unrequired[X]], but perhaps more ergonomically the developer could just use Union[X, None, Missing]. Perhaps you could even have a shorthand for that which would just be Disposable[X] or something.

This would presumably alleviate some of the burden with regard to type checking with mypy and the like, no?

That way you don't technically even need generic types like Nullable, Unrequired or Disposable – you could just load missing fields in as the Missing type and leave the pydantic user to write Union[X, Missing], etc.

@dmontagu
Contributor
dmontagu commented Mar 4, 2020

Technically that information is already tracked -- you can look inside instance.__fields_set__ to see whether the value was missing or not. So I think things still boil down to what semantics we actually want, and given a decision there, what's the easiest way to implement that.

The obvious challenges I see with adding a new type to represent Missing are:

  1. It would likely throw mypy/IDEs for a loop (without extensive plugin work)
  2. It would likely add a lot of complexity related to converting Missing to None when dumping the data to a dict and/or JSON
  3. It would be a big departure from the way things work now, and I think it's probably a bad idea to break too much existing code, even with a major version bump. Maybe it would be possible to introduce in such a way that existing code was unlikely to break, but it seems like it would be challenging.

Note that adding additional generic types wouldn't require changing the behavior of any existing code. To the extent that we needed to change existing logic to avoid serializing Unrequired properties, that too would likely be a much more localized change than what would need to happen if we used a different approach to represent unspecified values.

@samuelcolvin
Member
samuelcolvin commented Mar 4, 2020

I think this is a duplicate of #1223 (at least the latter part of this discussion).

I'm inclined to stick to the current logic that:

class Model(BaseModel):
    a: Optional[int]  # this field is required but can be given None (to be CHANGED in v2)
    b: Optional[int] = None  # field is not required, can be given None or an int (current behaviour)
    c: int = None  # this field isn't required but must be an int if it is provided (current behaviour)

The only other thing I would consider (but am currently opposed to, having read through #1223) is Nullable[] or RequiredOptional[] which is equivalent to Optional[] but requires a value to be provided, e.g. same as a above without a breaking change.

Unrequired is (no pun intended) not required as far as I can see, since it's the same as c above.

@samuelcolvin
Member

Oh, I see that c would break mypy. Given that this case seems very rare, can't we stick with the validator approach?

If not, I think Unrequired is a confusing name; how about DisallowNone? Though I'm very far from convinced it needs to be added to pydantic: it would work perfectly well as a custom type.
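
For reference, such a custom type could look roughly like this (a sketch using Pydantic v1's __get_validators__ hook; DisallowNone is not an existing pydantic type, and it is float-based here only to keep the sketch short):

from pydantic import BaseModel

class DisallowNone(float):
    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v):
        # reject an explicit None; missing values never reach the validator
        if v is None:
            raise ValueError('value may not be None')
        return cls(v)

class Foo(BaseModel):
    count: int
    size: DisallowNone = None  # not required, but must be a float when provided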

@connebs
Contributor
connebs commented Mar 4, 2020

The issue is that c is not very rare – it comes up any time someone wants to validate a partial update of some data with pydantic. Which for me happens a lot. Why pass through an entire representation of the data in question when I can instead only pass in the subset of data that I want to update? In fact, I guarantee this is a fairly common pattern for many web apps, at the very least.

The deserialization/serialization lib I used before pydantic (marshmallow) handled this by having a required param for fields that can't be missing and an allow_none param for "can be None". So the default behavior was actually c.
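
For comparison, the marshmallow version reads roughly like this (field names are only illustrative):

from marshmallow import Schema, fields

class FooSchema(Schema):
    count = fields.Int(required=True)   # must be present
    size = fields.Float()               # may be missing, but not None (allow_none defaults to False)
    note = fields.Str(allow_none=True)  # may be None when provided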

@samuelcolvin
Member
samuelcolvin commented Mar 5, 2020

I disagree.

Since c can be None, e.g. when it's not provided. Therefore there's no harm in also allowing it to be None when it is provided - e.g. the a or b case.

I therefore continue to hold the opinion that "X is not required, but if it is supplied it may not be None" is not particularly common.

I'll wait to be proved wrong by 👍 on this issue or duplicate issues.


Since there are two workarounds (validators or a custom type), I'm not that interested in continuing this conversation or adding DisallowNone on the basis of opinion alone.

Let's wait to see if others agree with you.

@connebs
Contributor
connebs commented Mar 5, 2020

The only reason c can be None is that pydantic returns missing fields as None and conflates disparate behaviours.

How common it is does not change the fact that explicitly passing in some field as None to a pydantic BaseModel is different from not passing in a value at all. Different inputs should have different outputs in the final object.

There is clearly a lot of confusion around how Optional behaves in pydantic. There are many issues in this repo related to that fact. I think a big reason for this is due to:

  1. more than one way to do some things
  2. all permutations not actually being easily possible.

If pydantic actually returned a Missing type for missing fields, you wouldn't need unintuitive magic syntax like here to allow for "Required Optional fields". The syntax would just be Optional[X] for that behavior, which is intuitive and makes sense given that Optional is just Union[X, None].

I recognize that arguably the real reason for all this is because Python's typing module decided to go with calling it Optional instead of Nullable which is not what most other languages would call it and becomes confusing when you throw more libs into the mix.

@dmontagu I'm not sure what you mean by concern (1): How is this not already covered by the functionality of type checkers? Pydantic would return either Missing or X for "CanBeMissing" fields and None or X for "Optional" fields. This seems well within what type checkers are already doing on their own, no?
As to (2), why not just not dump fields that weren't present at init in the first place, like other data serialization libraries do. If a field is missing, you just don't dump it to a dict or json, because if it wasn't in the incoming data (and you intentionally specified that you are OK with it being "missing") why would you put it in the output? The only time you would put it in the output is if you specified some default value, in which case you still wouldn't have a problem.

@dmontagu
Contributor
dmontagu commented Mar 5, 2020

@acnebs I agree that it is unfortunate that both missing or specified-as-null are conflated. But for better or worse I think this is ultimately a fairly pythonic convention -- if you have a keyword argument with a default value, you can't tell whether the keyword argument provided was specified as the default value or just not provided. Some may see this as a flaw in the language's design (e.g. Rust users), but at this point it's certainly conventional/idiomatic python.

How common it is does not change the fact that explicitly passing in some field as None to a pydantic BaseModel is different from not passing in a value at all. Different inputs should have different outputs in the final object.

The way you are describing this makes me think you might not be aware that you can obtain the precise set of fields that were set during initialization by using model.dict(exclude_unset=True) or checking model.__fields_set__. If your point is that the current functionality is too un-ergonomic, that's a reasonable perspective, but I just want to make it clear that this capability does currently exist. (Note that it isn't especially well-supported by FastAPI right now though for FastAPI-specific reasons.)
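
For example (a minimal illustration of the two mechanisms above; the Item model is made up):

from typing import Optional
from pydantic import BaseModel

class Item(BaseModel):
    name: Optional[str] = None
    price: Optional[float] = None

item = Item(name='book')
print(item.__fields_set__)            # {'name'}
print(item.dict())                    # {'name': 'book', 'price': None}
print(item.dict(exclude_unset=True))  # {'name': 'book'}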

If pydantic actually returned a Missing type for missing fields, you wouldn't need unintuitive magic syntax like here to allow for "Required Optional fields". The syntax would just be Optional[X] for that behavior, which is intuitive and makes sense given that Optional is just Union[X, None].

Personally I am inclined to agree that it might have resulted in less confusion to have x: Optional[X] in a pydantic model behave similarly to other types, and similar to how it would work in a dataclass (required unless a default (e.g. None) is explicitly specified). But in practice this has rarely been an issue. At this point I could go either way in terms of this behavior in v2; it would be a large enough breaking change that I could see an argument against, despite the fact that I personally find it to be the more intuitive/pythonic approach. (Also, perhaps not everyone agrees with that perspective anyway..)

I recognize that arguably the real reason for all this is because Python's typing module decided to go with calling it Optional instead of Nullable which is not what most other languages would call it and becomes confusing when you throw more libs into the mix.

You seem to be approaching the problem from a very JSON-centric perspective, but I would argue pydantic should be somewhat more concerned with type safety than following JSON conventions, and the approach used by pydantic is the approach used by mypy.

Also, I would contest the claim that "most other languages would call this concept Nullable" -- when type-checked with mypy, python's Optional types have essentially the same semantics as Rust's Option, Kotlin's Option, Swift's Optional, C++'s std::optional, etc. (more info here). The only languages I'm familiar with that prefer the term Nullable are C# and TypeScript, and I'm not familiar with any language/type-system besides that of TypeScript that even distinguishes between undefined and null as field values. (And I think most developers coming from other languages would sooner consider this a wart of JavaScript than a feature, despite the minor benefits around simplifying efficient serialization.) But I admittedly don't claim to be an expert on these issues.

At any rate, not everyone is using pydantic strictly for JSON-oriented parsing, so I'm not sure it makes sense to prioritize those conventions here.

@dmontagu I'm not sure what you mean by concern (1): How is this not already covered by the functionality of type checkers? Pydantic would return either Missing or X for "CanBeMissing" fields and None or X for "Optional" fields. This seems well within what type checkers are already doing on their own, no?

Yes, this is true, but the vast majority of existing pydantic code has been written to assume that missing Optional values translate to None, rather than some auxiliary type. And as I said above, this is idiomatic python for optional keyword arguments.

While it could certainly be type-safe (arguably more so than the current approach) to use a fundamentally different type to represent an unspecified value, it would add a large amount of boilerplate any time you didn't want to handle the cases differently, which I would argue is the case for most real applications.

As to (2), why not just not dump fields that weren't present at init in the first place, like other data serialization libraries do. If a field is missing, you just don't dump it to a dict or json, because if it wasn't in the incoming data (and you intentionally specified that you are OK with it being "missing") why would you put it in the output? The only time you would put it in the output is if you specified some default value, in which case you still wouldn't have a problem.

As I said above, this is possible now using exclude_unset=True. As far as I'm aware, outside of working with JSON, it isn't really conventional to remove "unset" fields from an unstructured representation of a class instance. I'd argue the use of the exclude_unset keyword argument is a fairly convenient compromise here.

@connebs
Contributor
connebs commented Mar 5, 2020

Thank you for pointing out exclude_unset as a possible solution. Unfortunately I am aware that it exists, but my issue is that it is very unergonomic and excessively verbose to have to dump the object or manually inspect an internal attr like model.__fields_set__ if you want to get at only the fields that were specified. You should not have to dump the object before doing anything useful with it.

I'm not sure "the approach used by pydantic is the approach used by mypy". Mypy doesn't concern itself with things that don't exist at all, so it's not really comparable. Mypy can only type check things that exist – pydantic had to make a choice about how to handle missing fields which aren't even there.

This is a bit different from idiomatic python as well, because in idiomatic python (at least in the past), the reason that unspecified optional kwargs defaulted to None was that there wasn't actually a concept of optional kwargs – there were just kwargs that you could pretend were optional if you specified their default value as None. But for most purposes, this wasn't too different from specifying any other value as the default for the kwarg.

I think the real reason for my confusion is that to my mind it doesn't make much sense for a default to be arbitrarily chosen as None when a field is Optional[X]. Why not choose a random (probably falsey) value of X as the default? If I specify a field as Optional[int] and don't specify the field in construction, where is the logic in loading/dumping it as None rather than calling it 0? As the programmer, I have given no preference for one or the other. If it's Optional[str], why not default to '' instead of None? With the idea that Optional[X] means None or X but not missing, this arbitrary decision disappears. And then if you wanted that behavior, you could do the pythonic thing of setting the default as None, i.e. with Optional[X] = None. I think that is overall much more pythonic than the current behaviour, no?

I will admit that I am using pydantic for JSON-centric parsing, but I think I have made it clear that this is also more of a general objection to conflation rather than a complaint that "this doesn't perfectly mirror JSON's behavior". I understand that this is a generic data library (builtin methods for dumping to JSON notwithstanding). I'm more looking at things like marshmallow (what I was using before in Python and which is very widely used), which is also not JSON-centric but easily allows for the behavior we are talking about (in fact it is the default).

@psippola
psippola commented Aug 31, 2020

How about this idea?

Create a singleton for a missing property, so that the property is allowed to be Missing but not None.

from pydantic import BaseModel
from typing import Union

class MissingType:
    def __repr__(self):
        """Just for pretty printing"""
        return 'Missing'

Missing = MissingType()


class Foo(BaseModel):
    count: int
    size: Union[float, MissingType] = Missing

    class Config:
        arbitrary_types_allowed = True

foo = Foo(count=5)
print(foo)  # count=5 size=Missing
print(foo.size is Missing)  # True

foo = Foo(count=5, size=Missing)
print(foo)  # count=5 size=Missing
print(foo.size is Missing)  # True

foo = Foo(count=5, size=None) #  none is not an allowed value (type=type_error.none.not_allowed)


Actually, this kind of Missing object could be substituted for _missing = object() in pydantic.main, so that users could import Missing from pydantic.

Even better, I think that it could be great if pydantic.Field had an explicit required parameter (in line with OpenAPI) so that each field could be left Missing if there is no default value and required==False, independent of the type hint. For example, if we want a field to be either None, float or not set at all, we could write

from pydantic import BaseModel, Field
from typing import Optional

class Foo(BaseModel):
    field: Optional[float] = Field(required=False)  # hypothetical `required` parameter

def check_foo(foo: Foo):
    if foo.field is Missing:
        # field is not set at all
        pass
    elif foo.field is None:
        # field is explicitly set to None
        pass
    else:
        # Do something with the field
        pass

foo = Foo()
check_foo(foo)

@yurikhan
Contributor

A better Missing will also override its __new__ to be a true singleton. Otherwise, deepcopying a structure with embedded Missings will probably create additional instances of MissingType which won’t pass the is Missing test.
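
A sketch of what that could look like (reusing the MissingType/Missing names from the previous comment):

import copy

class MissingType:
    _instance = None

    def __new__(cls):
        # always return the same object so `is Missing` checks survive copy/deepcopy
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return 'Missing'

Missing = MissingType()

assert copy.deepcopy({'size': Missing})['size'] is Missing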

@ar45
ar45 commented Feb 8, 2021
from pydantic import BaseModel
import inspect


def optional(*fields):
    def dec(_cls):
        for field in fields:
            _cls.__fields__[field].required = False
        return _cls

    if fields and inspect.isclass(fields[0]) and issubclass(fields[0], BaseModel):
        cls = fields[0]
        fields = cls.__fields__
        return dec(cls)
    return dec


class Book(BaseModel):
    author: str
    available: bool
    isbn: str


@optional
class BookUpdate(Book):
    pass


@optional('val1', 'val2')
class Model(BaseModel):
    val1: str
    val2: str
    captcha: str

@koiker
koiker commented May 29, 2021

I've come to this issue after reading the pydantic documentation and searching the internet for answers to the same question.
It would be great to include these examples and explanations in the documentation. If you wish, I can PR the doc changes.

@VianneyMI

I am having trouble understanding the optional decorator above. If we use the @ decorator syntax on a pydantic BaseModel schema, won't the condition - if fields and inspect.isclass(fields[0]) and issubclass(fields[0], BaseModel) - always be true?
What would happen if the condition is not satisfied? Returning dec? Wouldn't that substitute our class definition with a function?

@ar45
ar45 commented Aug 9, 2021

@VianneyMI when using it as

@optional() or @optional("field_name")

@sylann
sylann commented Aug 13, 2021

I'm writing APIs where I usually use patch requests instead of update requests.
Like @acnebs, I don't overwrite the whole object, I only update the given fields.

Problem is, some fields are nullable in the database, in which case the client apps should be able to pass null as a value.
But some fields are NOT, and for those null should not be allowed.

I tend to use body.dict(exclude_unset=True) in all "input" requests to avoid computing fields that were not given, and let the database models decide what the proper default values are.

But in this situation I cannot be sure frontend apps won't send a None value, and then I have to check it myself, which defeats the purpose of a validation library in my opinion.

How would you solve the use case I described above?

Notes:

  • I am not a big fan of validators (verbose, not always clear, often redundant)
  • I have experimented with several home-made decorators similar to what @ar45 did above, but I'm not comfortable with the idea of adding "black magic" on top of an external library.

@VianneyMI
VianneyMI commented Aug 24, 2021

@VianneyMI when using it as

@optional() or @optional("field_name")

Both, actually: isn't it that in both cases the first argument will be the decorated function anyway? @ar45

@kamilglod

@samuelcolvin so what's the final, recommended solution for such a use case? I read the whole discussion but it still looks like there is no single, final solution for partial objects that pydantic would support out of the box.

@sborovic
sborovic commented Dec 2, 2021

In reference to: #1223 (comment)
I really liked the idea of using a decorator for this purpose. I've made some changes so that it can recursively make fields of nested models optional as well (so that I can pass arbitrarily chosen fields of any depth inside a PATCH request).

import inspect

from pydantic import BaseModel


def optional(*fields, deep: bool = True):
    """
    Makes specified fields optional.
    If no fields are specified, makes all fields optional.
    To not recursively make all fields of nested models optional as well, pass deep=False
    """
    # Work is done inside optionalize
    def optionalize(_cls):
        for field in fields:
            subfield = _cls.__fields__[field]
            if deep and inspect.isclass(subfield.type_) and issubclass(subfield.type_, BaseModel):
                # Must pass through optional so that fields variable gets prepared
                optional(subfield.type_, deep=deep)
            subfield.required = False
        return _cls

    # Decorator (only used if parameters are passed to optional)
    def decorator(_cls):
        return optionalize(_cls)

    # If no parameters are passed to optional, return the result of optionalize (which is a class callable)
    if fields and inspect.isclass(fields[0]) and issubclass(fields[0], BaseModel):
        cls = fields[0]
        fields = cls.__fields__
        return optionalize(cls)
    # Else, return the generated decorator
    return decorator

Please do not hesitate to make suggestions for further improvement!

@kolypto
kolypto commented Dec 18, 2021

Here's an improved version of @ar45's decorator:

import pydantic as pd
from pydantic.utils import lenient_issubclass


def partial(*fields):
    """ Make the object "partial": i.e. mark all fields as "skippable"

    In Pydantic terms, this means that they're not nullable, but not required either.

    Example:

        @partial
        class User(pd.BaseModel):
            id: int

        # `id` can be skipped, but cannot be `None`
        User()
        User(id=1)

    Example:

        @partial('id')
        class User(pd.BaseModel):
            id: int
            login: str

        # `id` can be skipped, but not `login`
        User(login='johnwick')
        User(login='johnwick', id=1)
    """
    # Call pattern: @partial class Model(pd.BaseModel):
    if len(fields) == 1 and lenient_issubclass(fields[0], pd.BaseModel):
        Model = fields[0]
        field_names = ()
    # Call pattern: @partial('field_name') class Model(pd.BaseModel):
    else:
        Model = None
        field_names = fields

    # Decorator
    def decorator(Model: type[pd.BaseModel] = Model, field_names: frozenset[str] = frozenset(field_names)):
        # Iter fields, set `required=False`
        for field in Model.__fields__.values():
            # All fields, or specific named fields
            if not field_names or field.name in field_names:
                field.required = False

        # Exclude unset
        # Otherwise non-nullable fields would have `{'field': None}` which is unacceptable
        dict_orig = Model.dict
        def dict_excludes_unset(*args, exclude_unset: bool = None, **kwargs):
            exclude_unset = True
            return dict_orig(*args, **kwargs, exclude_unset=exclude_unset)
        Model.dict = dict_excludes_unset

        # Done
        return Model

    # If applied directly to a class (@partial), decorate it immediately;
    # otherwise (@partial('field', ...)) return the decorator itself
    if Model is not None:
        return decorator()
    return decorator

and a unit-test, if you wonder how it works:

import pydantic as pd
import pytest
from typing import Optional

def test_partial():
    # === Test: @partial() all
    @partial
    class UserPartial(pd.BaseModel):
        # Skippable, not nullable
        id: int
        # Skippable, nullable
        login: Optional[str]

    # Test: no arguments
    user = UserPartial()
    assert user.dict() == {}
    assert user.dict(exclude_unset=False) == {}  # cannot override

    # Test: skippable argument provided
    user = UserPartial(id=1)
    assert user.dict() == {'id': 1}

    # Test: optional argument provided
    user = UserPartial(login='qwerty')
    assert user.dict() == {'login': 'qwerty'}  # 'id' null skipped

    # Test: fails on None
    with pytest.raises(pd.ValidationError):
        UserPartial(id=None)

    # === Test: @partial() names
    @partial('id')
    class UserPartial(pd.BaseModel):
        # Skippable, not nullable
        id: int
        # Skippable, nullable
        login: Optional[str]

    # Test: no arguments
    user = UserPartial()
    assert user.dict() == {}

    # Test: skippable argument provided
    user = UserPartial(id=1)
    assert user.dict() == {'id': 1}

    # Test: optional argument provided
    user = UserPartial(login='qwerty')
    assert user.dict() == {'login': 'qwerty'}

@bbatliner
bbatliner commented Dec 20, 2021

Building on the work of @ar45, @sborovic, and @kolypto, I have a solution that enables "Partial" models, including recursive model fields, and does so in a thread-safe way without modifying other model classes unnecessarily (@sborovic's recursive optionalizing modifies other model classes globally, which was undesirable for me).

Changelog:

  • [2021-12-20] Initial comment
  • [2022-02-24] Assign the dict_exclude_method to the class as a bound method, rather than to the instance in __init__, so that it does not show in repr of the partial model instances
  • [2022-02-24] Check if a field has already had its model type optionalized to prevent "TemporaryPartialTemporaryPartial..." classes from being created
  • [2022-02-28] Transform kwargs that are themselves PartialModels to their dict() forms to avoid "None not allowed" validation errors
  • [2022-02-28] Cache "TemporaryPartial" models to preserve hash behavior of models created with partial sub-models
  • [2022-04-12] Support converting PartialModels in list and tuple subfields to their dict() forms to avoid "None not allowed" validation errors

Here is the metaclass that Partial models should use:

import inspect
import threading
from typing import Any, Dict, Tuple, Type

# Pydantic v1 internals; exact import locations may vary slightly between versions
from pydantic import BaseModel
from pydantic.fields import ModelField, UndefinedType
from pydantic.main import ModelMetaclass


class PartialModelMetaclass(ModelMetaclass):
    def __new__(
        meta: Type["PartialModelMetaclass"], *args: Any, **kwargs: Any
    ) -> "PartialModelMetaclass":
        cls = super(PartialModelMetaclass, meta).__new__(meta, *args, **kwargs)
        cls_init = cls.__init__
        # Because the class will be modified temporarily, need to lock __init__
        init_lock = threading.Lock()
        # To preserve identical hashes of temporary nested partial models,
        # only one instance of each temporary partial class can exist
        temporary_partial_classes: Dict[str, ModelMetaclass] = {}

        def __init__(self: BaseModel, *args: Any, **kwargs: Any) -> None:
            with init_lock:
                fields = self.__class__.__fields__
                fields_map: Dict[ModelField, Tuple[Any, bool]] = {}

                def optionalize(
                    fields: Dict[str, ModelField], *, restore: bool = False
                ) -> None:
                    for _, field in fields.items():
                        if not restore:
                            assert not isinstance(field.required, UndefinedType)
                            fields_map[field] = (field.type_, field.required)
                            field.required = False
                            if (
                                inspect.isclass(field.type_)
                                and issubclass(field.type_, BaseModel)
                                and not field.type_.__name__.startswith(
                                    "TemporaryPartial"
                                )
                            ):
                                # Assign a temporary type to optionalize to avoid
                                # modifying *other* classes
                                class_name = f"TemporaryPartial{field.type_.__name__}"
                                if class_name in temporary_partial_classes:
                                    field.type_ = temporary_partial_classes[class_name]
                                else:
                                    field.type_ = ModelMetaclass(
                                        class_name,
                                        (field.type_,),
                                        {},
                                    )
                                    temporary_partial_classes[class_name] = field.type_
                                field.populate_validators()
                                if field.sub_fields is not None:
                                    for sub_field in field.sub_fields:
                                        sub_field.type_ = field.type_
                                        sub_field.populate_validators()
                                optionalize(field.type_.__fields__)
                        else:
                            # No need to recursively de-optionalize once original types
                            # are restored
                            field.type_, field.required = fields_map[field]
                            if field.sub_fields is not None:
                                for sub_field in field.sub_fields:
                                    sub_field.type_ = field.type_

                # Make fields and fields of nested model types optional
                optionalize(fields)
                # Transform kwargs that are PartialModels to their dict() forms. This
                # will exclude `None` (see below) from the dictionary used to construct
                # the temporarily-partial model field, avoiding ValidationErrors of
                # type type_error.none.not_allowed.
                for kwarg, value in kwargs.items():
                    if value.__class__.__class__ is PartialModelMetaclass:
                        kwargs[kwarg] = value.dict()
                    elif isinstance(value, (tuple, list)):
                        kwargs[kwarg] = value.__class__(
                            v.dict()
                            if v.__class__.__class__ is PartialModelMetaclass
                            else v
                            for v in value
                        )
                # Validation is performed in __init__, for which all fields are now optional
                cls_init(self, *args, **kwargs)
                # Restore requiredness
                optionalize(fields, restore=True)

        setattr(cls, "__init__", __init__)

        # Exclude unset (`None`) from dict(), which isn't allowed in the schema
        # but will be the default for non-required fields. This enables
        # PartialModel(**PartialModel().dict()) to work correctly.
        cls_dict = cls.dict

        def dict_exclude_unset(
            self: BaseModel, *args: Any, exclude_unset: bool = None, **kwargs: Any
        ) -> Dict[str, Any]:
            return cls_dict(self, *args, **kwargs, exclude_unset=True)

        cls.dict = dict_exclude_unset

        return cls

The intended usage of the metaclass is to define subclasses of your "real" models (where requiredness is desired) that can be partial. For me, this makes sense when composing, say, configurations from multiple sources, where each source defines incomplete configuration, but when combined/merged, they should validate successfully.

An example:

class InnerConfig(BaseModel):
    foo: int
    bar: int

class Config(BaseModel):
    name: str
    inner: InnerConfig
    do_something: bool

class PartialConfig(Config, metaclass=PartialModelMetaclass):
    pass

Config(**{})  # will raise ValidationError
PartialConfig(**{})  # OK

InnerConfig(**{})  # will raise ValidationError!
# This is important, because it shows that simply declaring
# a partial model does not cause other models to become
# partial themselves!

# But, when constructing partial models, nested model fields *are* partial
config_src_1 = PartialConfig(**{ "name": "Example", "inner": { "foo": 5 } })
config_src_2 = PartialConfig(**{ "inner": { "bar": 2 }, "do_something": False })

# Composing partial models to validate a complete model
# You will need a deep dictionary merging function:
# e.g. https://stackoverflow.com/questions/7204805/how-to-merge-dictionaries-of-dictionaries
config = Config(**merge(config_src_1.dict(), config_src_2.dict()))  # OK

I also patched the mypy plugin so that it knows about the metaclass and won't complain about required keyword arguments for models created with the PartialModelMetaclass metaclass:

Expand for mypy plugin details

Since we can't patch the plugin itself without depending on a fork of pydantic, I created a new file in my project that is itself a plugin, extended from pydantic's plugin. IMPORTANT: You will have to modify the full name of the metaclass in this code!

# An extension to pydantic's mypy plugin to support the `PartialModelMetaclass`
# introduced to support "partial" models to be merged into a full model at some future time

from typing import List, Type
from mypy.nodes import Argument, RefExpr
from mypy.plugin import ClassDefContext, Plugin
from pydantic.mypy import PydanticModelField, PydanticModelTransformer, PydanticPlugin


class PartialPydanticModelTransformer(PydanticModelTransformer):
    def get_field_arguments(
        self,
        fields: List[PydanticModelField],
        typed: bool,
        force_all_optional: bool,
        use_alias: bool,
    ) -> List[Argument]:
        # Register all fields as optional if the model is declared with a metaclass of PartialModelMetaclass
        if self._ctx.cls.metaclass is not None:
            assert isinstance(self._ctx.cls.metaclass, RefExpr)
            if self._ctx.cls.metaclass.fullname == "FULL.NAME.OF.PartialModelMetaclass":
                return super().get_field_arguments(fields, typed, True, use_alias)
        return super().get_field_arguments(fields, typed, force_all_optional, use_alias)


class PartialPydanticPlugin(PydanticPlugin):
    def _pydantic_model_class_maker_callback(self, ctx: ClassDefContext) -> None:
        transformer = PartialPydanticModelTransformer(ctx, self.plugin_config)
        transformer.transform()


def plugin(version: str) -> Type[Plugin]:
    return PartialPydanticPlugin

Then, modify your mypy config to use this plugin!

@LouisAmon

There's a PR for this here:
#3179

@FyZzyss
Contributor
FyZzyss commented Jun 10, 2022

I disagree.

Since c can be None, e.g. when it's not provided. Therefore there's no harm in also allowing it to be None when it is provided - e.g. the a or b case.

I therefore continue to hold the opinion that "X is not required, but if it is supplied it may not be None" is not particularly common.

I'll wait to be proved wrong by 👍 on this issue or duplicate issues.

Since there are two workarounds (validators or a custom type), I'm not that interested in continuing this conversation or adding DisallowNone on the basis of opinion alone.

Let's wait to see if others agree with you.

I have a frequent case where the field is optional, but I don't want None. I would like to be able to fall back to the default value if None is sent for this field.
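
One way to get that behaviour today is a pre validator that swaps an explicit None for the field's default (a sketch with Pydantic v1; the model and field names are just examples):

from pydantic import BaseModel, validator
from pydantic.fields import ModelField

class Foo(BaseModel):
    size: float = 1.0

    @validator('size', pre=True)
    def none_falls_back_to_default(cls, v, field: ModelField):
        # if the client explicitly sends None, substitute the field's default instead
        return field.default if v is None else v

assert Foo().size == 1.0
assert Foo(size=None).size == 1.0
assert Foo(size=2.5).size == 2.5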

@mubtasimfuad
mubtasimfuad commented Jul 13, 2023

In Pydantic v2, the approach provided by @ar45 might not be applicable due to changes in the newer version. However, the following solution worked for me:

You can use the provided function as a decorator to make the fields of a child model optional.

import inspect
from typing import Optional

from pydantic import BaseModel, create_model

def optional(*fields):
    def dec(cls):
        fields_dict = {}
        for field in fields:
            field_info = cls.__annotations__.get(field)
            if field_info is not None:
                fields_dict[field] = (Optional[field_info], None)
        OptionalModel = create_model(cls.__name__, **fields_dict)
        OptionalModel.__module__ = cls.__module__

        return OptionalModel

    if fields and inspect.isclass(fields[0]) and issubclass(fields[0], BaseModel):
        cls = fields[0]
        fields = cls.__annotations__
        return dec(cls)

    return dec

@russell310
russell310 commented Sep 20, 2023

@mubtasimfuad
Here are some modifications of your snippet that worked for me:

import inspect
from typing import Optional

from pydantic import BaseModel, create_model


def optional(*fields):
    def dec(_cls):
        fields_dict = {}
        for field in fields:
            field_info = _cls.model_fields.get(field)
            if field_info is not None:
                fields_dict[field] = (Optional[field_info.annotation], None)

        OptionalModel = create_model(_cls.__name__, **fields_dict)
        OptionalModel.__module__ = _cls.__module__
        return OptionalModel

    if fields and inspect.isclass(fields[0]) and issubclass(fields[0], BaseModel):
        cls = fields[0]
        fields = cls.model_fields
        return dec(cls)
    return dec
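
A quick (untested) illustration of using the bare form of that decorator; note that create_model builds a fresh model here, so BookUpdate ends up as a new BaseModel rather than a subclass of Book:

from pydantic import BaseModel

class Book(BaseModel):
    author: str
    isbn: str

@optional
class BookUpdate(Book):
    pass

BookUpdate()                   # OK: every field is now optional, defaulting to None
BookUpdate(author='Jane Doe')  # OK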

sfoster1 added a commit to Opentrons/opentrons that referenced this issue Oct 17, 2023
Some combination of pydantic and fastapi (fastapi, I'm pretty sure, due
to things like pydantic/pydantic#1223 )
conflate "null" and "not present" in weird ways. This means that in the
health response pydantic class,
- If you create robot_serial in the proper way to have a pydantic field
that can be either a string or null but must always be passed in to the
constructor - annotation Optional[str] and do not provide a default -
then pydantic will say there's a field missing if you explicitly pass
in null (the bug this fixes, since this 500s the route)
- If you create robot_serial with a default value, which is _not
correct_ because there is no default, then something (probably fastapi)
will remove the _key_ from the model rather than issuing it with a
null, which is terrible, but is better than the other way I guess.

Updating fastapi might help with this maybe. I don't know.
@lyzohub
lyzohub commented May 11, 2024

My solution for partial update.

from pydantic import BaseModel

class _UNSET:
    def __bool__(self):
        return False

    def __repr__(self):
        return "UNSET"


UNSET = _UNSET()


class UpdatePayload(BaseModel):
    name: str = UNSET
    description: str = UNSET
>>> UpdatePayload(**{}).json(exclude_unset=True)
'{}'

>>> UpdatePayload(**{"name": "a"}).json(exclude_unset=True)
'{"name":"a"}'

>>> UpdatePayload(**{"name": "a"}).description
UNSET

>>> if not UpdatePayload(**{"name": "a"}).description:
...    print("Empty")
    
Empty

>>> UpdatePayload(**{"name": None}).json(exclude_unset=True)
Will raise ValidationError

@kamilglod
kamilglod commented May 13, 2024

@lyzohub but then mypy would complain about assigning the wrong type as a default:

Expression of type "_UNSET" cannot be assigned to declared type "str"  "_UNSET" is incompatible with "str"

@lyzohub
lyzohub commented May 13, 2024

@kamilglod Cast to the Any type to make the checks pass.

UNSET = typing.cast(typing.Any, _UNSET())

@ferreteleco

I disagree.

Since c can be None, e.g. when it's not provided. Therefore there's no harm in also allowing it to be None when it is provided - e.g. the a or b case.

I therefore continue to hold the opinion that "X is not required, but if it is supplied it may not be None" is not particularly common.

I'll wait to be proved wrong by 👍 on this issue or duplicate issues.

Since there are two workarounds (validators or a custom type), I'm not that interested in continuing this conversation or adding DisallowNone on the basis of opinion alone.

Let's wait to see if others agree with you.

I just ran into a use case that (I think) is not well covered by your reasoning:

The GeoJSON specification defines a "bbox" field in its objects that is optional. It is allowed to be missing, but if present, it has to be a list of float values of length 4 (at least). Although the validation can be achieved with a validator function, it is not practical, and allowing for optional values that cannot be null (None) if present would be nicer IMHO.
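
For the record, the bbox case can be expressed today roughly like this (Pydantic v1 syntax; the model name is illustrative):

from typing import Optional
from pydantic import BaseModel, conlist, validator

class GeoJSONObject(BaseModel):
    type: str
    bbox: Optional[conlist(float, min_items=4)] = None

    @validator('bbox')
    def bbox_not_null(cls, v):
        # bbox may be omitted entirely, but if the key is sent it must be a real list
        if v is None:
            raise ValueError('bbox may be omitted, but not null')
        return v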

@dschro-1993
from pydantic import BaseModel
import inspect


def optional(*fields):
    def dec(_cls):
        for field in fields:
            _cls.__fields__[field].required = False
        return _cls

    if fields and inspect.isclass(fields[0]) and issubclass(fields[0], BaseModel):
        cls = fields[0]
        fields = cls.__fields__
        return dec(cls)
    return dec


class Book(BaseModel):
    author: str
    available: bool
    isbn: str


@optional
class BookUpdate(Book):
    pass


@optional('val1', 'val2')
class Model(BaseModel):
    val1: str
    val2: str
    captcha: str

How would you do that in Pydantic v2? @ar45

@skaaptjop

I just ran into a use case that (I think) is not well covered by your reasoning:

GeoJSON specification defines a "bbox" field in its objects that is optional. It is allowed to be missing, but if present, it has to be a list of float values of length 4 (at least). Although the validation can be achieved with a validator function, it is not practical and allowing for optional values that cannot be null (None) if present would be nicer imho.

The JSON:API spec also does something similar.
Various nested fields may be omitted but if present they can be required to contain certain fields.

@ar45
ar45 commented Oct 7, 2024

My pydantic v2 version

import inspect
import typing
from typing import Optional, Annotated, Any
from typing import overload, Callable, Type

from pydantic import BaseModel
from pydantic import Field
from pydantic import create_model, BeforeValidator
from pydantic.fields import FieldInfo
from pydantic.json_schema import SkipJsonSchema
from pydantic_core import PydanticUndefined

T = typing.TypeVar('T')


@overload
def optional(*fields: str) -> Callable[[T], T]:
    ...


@overload
def optional(func: T) -> T:
    ...


def optional(*fields):
    def dec(_cls):
        fields_dict = {}

        for field in fields:
            field_info = _cls.model_fields.get(field)
            if field_info is not None:
                # if field_info.get_default() == PydanticUndefined:
                #     field_info.default = None
                if field_info.is_required():
                    new_field_info = FieldInfo.merge_field_infos(field_info, FieldInfo.from_field(None))
                    fields_dict[field] = (Optional[field_info.annotation], new_field_info)
        # create a new model only if any of the fields were modified
        if fields_dict:
            OptionalModel = create_model(_cls.__name__, __base__=_cls, **fields_dict) # noqa N806
            OptionalModel.__module__ = _cls.__module__
            return OptionalModel
        return _cls

    if fields and inspect.isclass(fields[0]) and issubclass(fields[0], BaseModel):
        cls = fields[0]
        fields = list(cls.model_fields.keys())
        return dec(cls)

    if not fields:
        return optional
    return dec


def exclude(*fields):
    def dec(_cls: Type[BaseModel]):

        fields_dict = {}

        for field in fields:
            field_info = _cls.model_fields.get(field)
            if field_info is not None:
                # if field_info.get_default() == PydanticUndefined:
                #     field_info.default = None
                field_info.exclude = True
                field_info.default = None
                fields_dict[field] = (SkipJsonSchema[Annotated[Any, BeforeValidator(lambda x: PydanticUndefined)]], Field(None, exclude=True, repr=False, validation_alias='____________a'))

        OptionalModel = create_model(_cls.__name__, __base__=_cls, **fields_dict) # noqa N806
        OptionalModel.__module__ = _cls.__module__
        return OptionalModel

    if not fields:
        return exclude
    return dec

@Zer0x00
Zer0x00 commented Oct 31, 2024

@ar45 Could there be an error in your code? I would have expected an exception to be thrown here, or at least that the None value would not be included in the model_dump:

from pydantic import BaseModel, Field

from test.lib import optional


@optional
class Test(BaseModel):
    name: str = Field(...)
    description: str = Field(...)


a = Test(name=None, description='description')

print(a.model_dump(exclude_unset=True))
>>> {'name': None, 'description': 'description'}
