Add pipeline API #9459
Conversation
Deploying pydantic-docs with Cloudflare Pages

| | |
| --- | --- |
| Latest commit: | 166df3d |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://b2819b73.pydantic-docs.pages.dev |
| Branch Preview URL: | https://transform.pydantic-docs.pages.dev |
CodSpeed Performance Report: merging #9459 will not alter performance.
Very cool API you've designed here. I enjoyed this review, and am looking forward to your thoughts re my organizational questions!
Nice. I like the new namespace changes, and thanks for switching over to annotations in the docs!
@sydney-runkle I think the main thing missing here is some tests and validating JSON schema?
Indeed. I think we need:

Other questions I thought about while working on tests:
Overall I like this, though the verbosity seems quite high. It would be nice to explore ways to encourage users to make reusable pipeline pieces and similar techniques to reduce verbosity.
docs/concepts/types.md
Outdated
```python
username: Annotated[str, parse(str).str.pattern(r'[a-z]+')]  # (3)!
password: Annotated[
    str,
    parse(str).transform(str.lower).predicate(lambda x: x != 'password')]  # (4)!
```
Two observations here:
1. It feels pretty verbose to have this pipeline inline. Should we encourage users to make it reusable?

   ```python
   validate_password = parse(str).transform(str.lower).predicate(lambda x: x != 'password')

   class User(BaseModel):
       password: Annotated[str, validate_password]
   ```

2. Is there a way to not need the `Annotated`? I can't see an obvious one, but I'd love for it to work somehow, e.g. `Pipeline[str, transform(str.lower).predicate(...)]`.
`Pipeline[str, transform(str.lower).predicate(...)]` would work if we just export `Pipeline = Annotated` (I think), but that's not much better? I would love for `foo: parse(str).transform(int)` to work and infer the output type as `int`, but that just doesn't work.
> Should we encourage users to make it reusable?
I agree, making examples reusable makes sense. I would suggest:
```python
# or call some password check/validation
Password = Annotated[str, parse(str).predicate(lambda x: x != 'password')]

class User(BaseModel):
    password: Password
```
docs/concepts/types.md
Outdated
5. Use the `|` or `&` operators to combine steps (like a logical OR or AND).
6. Calling `parse()` with no arguments implies `parse(<field type>)`. Use `parse(Any)` to accept any type.
7. For recursive types you can use `parse_defer` to reference the type itself before it's defined.
8. You can call `parse()` before or after other steps to do pre or post processing.
Can you explain what this means? I assume this refers to interactions with `BeforeValidator`. How do we implement wrap validators in this model?
The point is you can do `Annotated[int, parse(str).transform(str.strip).parse().transform(lambda x: x * 2)]`, which is a wrap validator since it does (1) validate as a string and strip whitespace, (2) parse as an integer and (3) multiply by 2.
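The wrap-like ordering described here can be mimicked with plain functions to see what the chain does step by step (a standalone sketch, not the pipeline implementation; `steps` and `run_steps` are made-up names):

```python
# Toy illustration of the chain above:
# (1) validate as str and strip, (2) re-parse as int, (3) multiply by 2.
steps = [
    lambda v: str(v).strip(),  # "pre" step: clean the raw input
    int,                       # re-parse into the target type
    lambda v: v * 2,           # "post" step: transform the parsed value
]

def run_steps(value):
    for step in steps:
        value = step(value)
    return value
```

For example, `run_steps("  21 ")` strips to `"21"`, parses to `21`, and doubles to `42`.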
Oh I see. `.parse()` like that was totally not obvious to me. Is it a significant ergonomic win to allow the argument to be inferred? Otherwise I might argue to make it required for now, to simplify.
Or write an explicit section which is like:

- `BeforeValidator` -> `parse(str).transform(bv).parse()`
- `AfterValidator` -> `parse().transform(av)`
- `WrapValidator` -> `parse(str).transform(bv).parse().transform(av)`
... @sydney-runkle and I had an interesting use case for a wrap validator the other day: it split the IANA timezone off a timestamp so that we could parse the timestamp in Rust and then reconstruct a tz-aware datetime. It would be interesting to see how to implement that in this API, as it's not obvious to me.
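For reference, here is one way that timezone-splitting use case could look as a plain wrap-style function (a sketch only; the `'<timestamp>|<IANA zone>'` input format and the function name are assumptions, not the format from the actual use case):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def parse_tz_timestamp(raw: str) -> datetime:
    """Split an assumed '<iso timestamp>|<IANA zone>' string, parse the
    naive timestamp (the part a fast Rust path would handle), then
    reattach the zone to produce a tz-aware datetime."""
    stamp, _, zone = raw.partition('|')
    dt = datetime.fromisoformat(stamp)
    return dt.replace(tzinfo=ZoneInfo(zone))
```

The open question in the thread is how to express this split-then-recombine shape as a linear pipeline.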
Could you elaborate on that example and we can try and see how it would work out?
> is it a significant ergonomic win to allow the argument to be inferred

I'd say so; consider if the type is `list[MyGeneric[dict[str, set[bool]]]]` or something...
> had an interesting use case for a wrap validator the other day which split IANA timezone off a timestamp so that we could do the timestamp in Rust and then reconstruct a tz-aware datetime
We could also add an explicit wrap validator: `parse(str).transform(lambda x, h: h(x.strip())).parse(int)`? I don't really like that because it breaks the linearity and typing.
```python
        """Pipe the result of one validation chain into another."""
        return _Pipeline([_PipelineAnd(self, other)])

    __and__ = then
```
I think I understand why this API was added — as long as the input and output types of a pipeline are the same, then it's basically equivalent to do them in order. However, I'll note that sequencing the items like this may end up being unintuitive, in particular if you expect to get an error for each failure in the case of multiple independent validators, rather than just the first failure. I understand it's hard to "fix" that given that transformations are possible though.
I take it that you mean that for `validate_as(int).gt(0) & validate_as(int).gt(1)` you'd expect `-1` to give two errors? Indeed it would only give one. I think that's reasonable behavior.

I also think you're saying that for the case where you have a chain of constraints you could error for all of them, e.g. `validate_as(int).gt(0).gt(1)`, and indeed that will happen if they're all one after another and are known constraints, but not if they're custom predicates or there's a transformation between them. And also not when you use `&`. I think that's okay.

The one improvement we could make is "collapsing" sequential constraints into one level, e.g. `validate_as(int).predicate(lambda x: x > 0).predicate(lambda x: x % 2 == 0)` could give both errors despite it being arbitrary user code. That's a reasonable future feature request that shouldn't be too hard to implement.
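The "collapsing" idea, running a run of sequential predicates against the same value and reporting every failure rather than stopping at the first, could be sketched like this (`check_all` is a hypothetical helper, not part of the PR):

```python
def check_all(value, predicates):
    """Run every predicate on the same value and collect all failures,
    instead of short-circuiting at the first one."""
    return [
        f"predicate #{i} failed for {value!r}"
        for i, pred in enumerate(predicates)
        if not pred(value)
    ]

# -1 fails both the > 0 check and the evenness check:
errs = check_all(-1, [lambda x: x > 0, lambda x: x % 2 == 0])
```

This only works when no transformation sits between the predicates, since every check must see the same value.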
Yeah, my point was that I might expect `validate_as(int).predicate(lambda x: x > 0) & validate_as(int).predicate(lambda x: x % 2 == 0)` to produce two errors. (Well, knowing it's doing chaining, I'm not even sure that's valid as written, since it repeats the `validate_as(int)`, but that's my intuition for how I'd expect to use `&`.) I think it's reasonable for `validate_as(int).predicate(lambda x: x > 0).predicate(lambda x: x % 2 == 0)` to produce just one, though.
My expectation would be the exact opposite: `&` is often a greedy (eager) operation, e.g. `False & 1 / 0` raises. So I don't think there's a valid general intuition here. If we could make it behave as you expect, I suspect there'd be complaints that it's unintuitive or that it's doing unnecessary work. I don't know why you'd expect `validate_as(int).predicate(lambda x: x > 0).predicate(lambda x: x % 2 == 0)` to produce just one error. Do you also expect `Field(min_length=10, pattern=...)` to produce a single error? In any case, given that we can't really change the behavior and that no users have complained yet, I suspect trying to determine what is most intuitive here is not a productive debate.
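The greediness point can be demonstrated directly: Python evaluates both operands of `&` before calling `__and__`, whereas `and` short-circuits (a standalone demo of Python semantics, unrelated to the pipeline internals):

```python
calls = []

def side_effect(label, result):
    """Record that this operand was evaluated, then return its value."""
    calls.append(label)
    return result

# `&` is eager: both operands are evaluated before bool.__and__ runs.
_ = side_effect('left', False) & side_effect('right', True)
eager_calls = list(calls)

calls.clear()
# `and` short-circuits: the right side never runs when the left is falsy.
_ = side_effect('left', False) and side_effect('right', True)
lazy_calls = list(calls)
```

Here `eager_calls` ends up as `['left', 'right']` while `lazy_calls` is just `['left']`, which is the distinction both intuitions in this thread hinge on.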
I'm late to the conversation, but I'd expect it to short-circuit the same way it would for an `if` expression or any other usage (that I know of) of `and`.
@grantmwilliams I think it works as you'd expect then, right?
@adriangb I've only tested it a bit, but it seems to work exactly as expected.
@sydney-runkle the changes to make |
😢 ugh, I'll take another look. I tried overloading, but the |
Co-authored-by: David Montague <35119617+dmontagu@users.noreply.github.com>
What's left to do here?

@sydney-runkle I have a couple of follow-up ideas, but I think we should merge this and iterate from there.

Awesome, going to merge then! Could you document those thoughts in an issue at some point for reference?
This PR introduces a new experimental API to do more type-safe advanced constraining, parsing and transformation of types.
More specifically, from the docs updates:
This PR introduces an experimental "pipeline" API that allows composing parsing (validation), constraints and transformations in a more type-safe manner than existing APIs. This API is subject to change or removal; we are looking for feedback and suggestions before making it a permanent part of Pydantic.
Generally, the pipeline API is used to define a sequence of steps to apply to incoming data during validation. The pipeline API is designed to be more type-safe and composable than the existing Pydantic API.
Each step in the pipeline can be:

You can see more detail in the docs here: https://transform.pydantic-docs.pages.dev/concepts/experimental/#importing-experimental-feature-warnings
Also, this PR is very much aligned with PEP 746, which introduces type checking for annotated metadata.
Fix #9522
^^ adding new experimental feature pattern docs to the version policy