Replies: 2 comments 1 reply
You might be interested in the […]. Or, using a wrap serializer together with the serialization context:

    from typing import Annotated, TypeVar

    from pydantic import BaseModel, SerializationInfo, SerializerFunctionWrapHandler, WrapSerializer


    def serializer_pii(value, handler: SerializerFunctionWrapHandler, info: SerializationInfo):
        # Mask the value only when the caller passes context={'hide_pii': True}.
        if info.context and info.context.get('hide_pii'):
            return '***'
        # Otherwise fall back to the field's normal serialization.
        return handler(value)


    T = TypeVar('T')
    PiiType = Annotated[T, WrapSerializer(serializer_pii)]


    class Model(BaseModel):
        a: PiiType[str]


    Model(a="test").model_dump()
    #> {'a': 'test'}
    Model(a="test").model_dump(context={'hide_pii': True})
    #> {'a': '***'}
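Since the question also asks about nested models, here is a minimal sketch of the same idea applied to a nested structure. The names `mask_pii`, `Pii`, `Address`, and `Person` are made up for illustration; the only assumption is that the serialization context is passed down to nested models, so a single `context={'hide_pii': True}` reaches every annotated field.

```python
from typing import Annotated, TypeVar

from pydantic import BaseModel, SerializationInfo, SerializerFunctionWrapHandler, WrapSerializer


def mask_pii(value, handler: SerializerFunctionWrapHandler, info: SerializationInfo):
    # Same logic as above: mask only when the context flag is set.
    if info.context and info.context.get('hide_pii'):
        return '***'
    return handler(value)


T = TypeVar('T')
Pii = Annotated[T, WrapSerializer(mask_pii)]  # hypothetical alias name


class Address(BaseModel):
    street: Pii[str]
    country: str


class Person(BaseModel):
    name: Pii[str]
    addresses: list[Address]


person = Person(name='Ada', addresses=[Address(street='1 Main St', country='UK')])

plain = person.model_dump()
masked = person.model_dump(context={'hide_pii': True})

print(plain)
print(masked)
```

Only the fields annotated with `Pii[...]` are affected; `country` is dumped unchanged in both cases.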
Thank you for this thread. It inspired me to come up with another solution. The short version is:

    from typing import Annotated

    from pydantic import (
        BaseModel,
        PlainSerializer,
        SecretStr,
    )

    # SecretStr masks the value by default; the serializer reveals it only in JSON mode.
    PIIStr = Annotated[
        SecretStr,
        PlainSerializer(
            lambda x: x.get_secret_value(),
            return_type=str,
            when_used="json",
        ),
    ]


    class PersonalInfo(BaseModel):
        name: PIIStr
        other_value: str


    personal_info = PersonalInfo(name="Frank-Mich", other_value="foo bar")

    print("raw:", personal_info)
    #> raw: name=SecretStr('**********') other_value='foo bar'
    print("default (python) model dump:", personal_info.model_dump())
    #> default (python) model dump: {'name': SecretStr('**********'), 'other_value': 'foo bar'}
    print("json model dump:", personal_info.model_dump(mode="json"))
    #> json model dump: {'name': 'Frank-Mich', 'other_value': 'foo bar'}

The major advantage is that it defaults to safety. Unless explicitly dumping as JSON, the PII remains hidden. Here is how an email variant can be defined:

    from typing import Annotated

    from pydantic import BeforeValidator, EmailStr, PlainSerializer, SecretStr

    PIIEmailStr = Annotated[
        SecretStr,
        # Validate as an email first; note that EmailStr._validate is a private method.
        BeforeValidator(lambda x: EmailStr._validate(x)),
        PlainSerializer(
            lambda x: x.get_secret_value(),
            return_type=str,
        ),
    ]

For more detailed explanations and examples in FastAPI, I made a blog post.
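To make the "defaults to safety" point a bit more concrete, here is a small sketch with a nested model. `Contact` and `Account` are hypothetical names; `PIIStr` is the alias from the short version above. It assumes `when_used="json"` also applies to `model_dump_json()`, not only to `model_dump(mode="json")`.

```python
import json
from typing import Annotated

from pydantic import BaseModel, PlainSerializer, SecretStr

PIIStr = Annotated[
    SecretStr,
    PlainSerializer(
        lambda x: x.get_secret_value(),
        return_type=str,
        when_used="json",
    ),
]


class Contact(BaseModel):  # hypothetical model for illustration
    name: PIIStr


class Account(BaseModel):  # hypothetical model for illustration
    owner: Contact
    plan: str


account = Account(owner=Contact(name="Frank-Mich"), plan="free")

# Formatting the field (e.g. in a log line) goes through SecretStr.__str__ and stays masked.
log_line = f"owner={account.owner.name}"

# The Python-mode dump keeps the SecretStr wrapper, even inside the nested model.
as_python = account.model_dump()

# Only the explicit JSON dump reveals the value.
decoded = json.loads(account.model_dump_json())

print(log_line)
print(as_python)
print(decoded)
```

So accidental exposure through logging or a plain `model_dump()` is avoided, and the value is revealed only where JSON output is requested explicitly.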
Is there an idiomatic way to mark Pydantic fields as containing PII so that, during serialization (on demand, e.g. with a context flag like `a.model_dump(hide_pii=True)`), these fields are automatically obfuscated/masked (recursively for nested models) without breaking OpenAPI schema generation (as with FastAPI)?

Background & What I've Tried:

A model like this one:

I tried creating a `PIIMixin` and attaching it like this:

The idea was to do something like `person.model_dump(hide_pii=True)`.

From here on, in the mixin I tried:

- I attached a `@model_serializer(mode="wrap")` to a PII mixin and marked fields as containing PII, which seems to work, but it breaks FastAPI's OpenAPI schema generation (no details about the schema due to a custom serializer). Something like this:
- Overriding `model_dump` worked for a flat model but fails for nested models since Pydantic doesn't seem to call the override recursively.

Is there a better, more Pydantic way of achieving this? I can easily hack my way through this, but what I'm looking for is a more robust and built-in or idiomatic approach.
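For readers who want to see the nested-model problem in isolation, here is a minimal sketch of it. This is not the code from the question (that code is not shown here); `Address` and `Person` are hypothetical, and the override masks unconditionally to keep the example short.

```python
from typing import Any

from pydantic import BaseModel


class Address(BaseModel):  # hypothetical model for illustration
    street: str

    def model_dump(self, **kwargs: Any) -> dict[str, Any]:
        # Python-level override: only runs when model_dump is called on this instance.
        data = super().model_dump(**kwargs)
        data['street'] = '***'
        return data


class Person(BaseModel):  # hypothetical model for illustration
    address: Address


address = Address(street='1 Main St')

# Called directly, the override is used.
direct = address.model_dump()

# Called on the parent, the nested model is serialized by the core serializer,
# so the override on Address is bypassed.
nested = Person(address=address).model_dump()

print(direct)
print(nested)
```

This is why an approach that hooks into serialization itself (a field-level serializer, for instance) tends to behave more consistently than overriding `model_dump`.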