PydanticAI Docs
## Introduction
In some use cases a single Agent will control an entire application or component,
but multiple agents can also interact to embody more complex workflows.
| **Component** | **Description** |
|---------------|-----------------|
| [System prompt(s)](#system-prompts) | A set of instructions for the LLM written by the developer. |
| [Function tool(s)](tools.md) | Functions that the LLM may call to get information while generating a response. |
| [Structured result type](results.md) | The structured datatype the LLM must return at the end of a run, if specified. |
| [Dependency type constraint](dependencies.md) | System prompt functions, tools, and result validators may all use dependencies when they're run. |
| [LLM model](api/models/base.md) | Optional default LLM model associated with the agent. Can also be specified when running the agent. |
| [Model Settings](#additional-configuration) | Optional default model settings to help fine tune requests. Can also be specified when running the agent. |
In typing terms, agents are generic in their dependency and result types, e.g., an agent which required dependencies of type `#!python Foobar` and returned results of type `#!python list[str]` would have type `Agent[Foobar, list[str]]`. In practice, you shouldn't need to care about this; it should just mean your IDE can tell you when you have the right type, and if you choose to use [static type checking](#static-type-checking) it should work well with PydanticAI.
```python {title="roulette_wheel.py"}
from pydantic_ai import Agent, RunContext
@roulette_agent.tool
async def roulette_wheel(ctx: RunContext[int], square: int) -> str: # (2)!
"""check if the square is a winner"""
return 'winner' if square == ctx.deps else 'loser'
!!! tip "Agents are designed for reuse, like FastAPI Apps"
    Agents are intended to be instantiated once (frequently as module globals) and reused throughout your application, similar to a small [FastAPI][fastapi.FastAPI] app or an [APIRouter][fastapi.APIRouter].
## Running Agents
```python {title="run_agent.py"}
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o')
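Agents can also be run asynchronously or as a stream. The following is a minimal sketch (the prompts are illustrative) using the documented `run` and `run_stream` methods:

```python
import asyncio

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')


async def main():
    # Awaitable counterpart of run_sync.
    result = await agent.run('What is the capital of France?')
    print(result.data)

    # Streamed run: get_data() waits for the complete, validated response.
    async with agent.run_stream('What is the capital of the UK?') as response:
        print(await response.get_data())


asyncio.run(main())
```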
You can also pass messages from previous runs to continue a conversation or provide
context, as described in [Messages and Chat History](message-history.md).
You can apply these settings by passing the `usage_limits` argument to the
`run{_sync,_stream}` functions.
Consider the following example, where we limit the number of response tokens:
```py
from pydantic_ai import Agent
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits
agent = Agent('claude-3-5-sonnet-latest')
result_sync = agent.run_sync(
    'What is the capital of Italy? Answer with just the city.',
    usage_limits=UsageLimits(response_tokens_limit=10),
)
print(result_sync.data)
#> Rome
print(result_sync.usage())
"""
Usage(requests=1, request_tokens=62, response_tokens=1, total_tokens=63, details=None)
"""

try:
    result_sync = agent.run_sync(
        'What is the capital of Italy? Answer with a paragraph.',
        usage_limits=UsageLimits(response_tokens_limit=10),
    )
except UsageLimitExceeded as e:
    print(e)
    #> Exceeded the response_tokens_limit of 10 (response_tokens=32)
```
```py
from typing_extensions import TypedDict

from pydantic_ai import Agent, ModelRetry
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits


class NeverResultType(TypedDict):
    """
    Never ever coerce data to this type.
    """

    never_use_this: str


agent = Agent(
    'claude-3-5-sonnet-latest',
    result_type=NeverResultType,
    system_prompt='Any time you get a response, call the `infinite_retry_tool` to produce another response.',
)
@agent.tool_plain(retries=5)  # (1)!
def infinite_retry_tool() -> int:
    raise ModelRetry('Please try again.')


try:
    result_sync = agent.run_sync(
        'Begin infinite retry loop!', usage_limits=UsageLimits(request_limit=3)  # (2)!
    )
except UsageLimitExceeded as e:
    print(e)
    #> The next request would exceed the request_limit of 3
```
1. This tool has the ability to retry 5 times before erroring, simulating a tool
that might get stuck in a loop.
2. This run will error after 3 requests, preventing the infinite tool calling.
!!! note
    This is especially relevant if you've registered a lot of tools; `request_limit` can be used to prevent the model from choosing to make too many of these calls.
```py
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

result_sync = agent.run_sync(
    'What is the capital of Italy?', model_settings={'temperature': 0.0}
)
print(result_sync.data)
#> Rome
```
```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

# First run
result1 = agent.run_sync('Who was Albert Einstein?')
print(result1.data)
#> Albert Einstein was a German-born theoretical physicist.

# Second run, passing previous messages
result2 = agent.run_sync(
    'What was his most famous equation?',
    message_history=result1.new_messages(),  # (1)!
)
print(result2.data)
#> Albert Einstein's most famous equation is (E = mc^2).
```

1. Continue the conversation; without `message_history` the model would not know who "his" was referring to.
PydanticAI is designed to work well with static type checkers, like mypy and
pyright.
That said, because PydanticAI uses Pydantic, and Pydantic uses type hints as the definition for schema and validation, some types (specifically type hints on parameters to tools, and the `result_type` arguments to [`Agent`][pydantic_ai.Agent]) are used at runtime.

We (the library developers) have messed up if type hints are confusing you more than helping you; if that happens, please create an [issue](https://github.com/pydantic/pydantic-ai/issues) explaining what's annoying you!
In particular, agents are generic in both the type of their dependencies and the
type of results they return, so you can use the type hints to ensure you're using
the right types.
```python {title="type_mistakes.py"}
from dataclasses import dataclass

from pydantic_ai import Agent, RunContext


@dataclass
class User:
    name: str


agent = Agent(
    'test',
    deps_type=User,  # (1)!
    result_type=bool,
)


@agent.system_prompt
def add_user_name(ctx: RunContext[str]) -> str:  # (2)!
    return f"The user's name is {ctx.deps}."


def foobar(x: bytes) -> None:
    pass


result = agent.run_sync('Does their name start with "A"?', deps=User('Anne'))
foobar(result.data)
```
```bash
➤ uv run mypy type_mistakes.py
type_mistakes.py:18: error: Argument 1 to "system_prompt" of "Agent" has
incompatible type "Callable[[RunContext[str]], str]"; expected
"Callable[[RunContext[User]], str]" [arg-type]
type_mistakes.py:28: error: Argument 1 to "foobar" has incompatible type "bool";
expected "bytes" [arg-type]
Found 2 errors in 1 file (checked 1 source file)
```
## System Prompts
System prompts might seem simple at first glance since they're just strings (or
sequences of strings that are concatenated), but crafting the right system prompt
is key to getting the model to behave as you want.
1. **Static system prompts**: These are known when writing the code and can be defined via the `system_prompt` parameter of the [`Agent` constructor][pydantic_ai.Agent.__init__].
2. **Dynamic system prompts**: These depend in some way on context that isn't known until runtime, and should be defined via functions decorated with [`@agent.system_prompt`][pydantic_ai.Agent.system_prompt].
You can add both to a single agent; they're appended in the order they're defined
at runtime.
```python {title="system_prompts.py"}
from datetime import date
agent = Agent(
'openai:gpt-4o',
deps_type=str, # (1)!
system_prompt="Use the customer's name while replying to them.", # (2)!
)
@agent.system_prompt # (3)!
def add_the_users_name(ctx: RunContext[str]) -> str:
return f"The user's name is {ctx.deps}."
@agent.system_prompt
def add_the_date() -> str: # (4)!
return f'The date is {date.today()}.'
Validation errors from both function tool parameter validation and [structured
result validation](results.md#structured-result-validation) can be passed back to
the model with a request to retry.
- The default retry count is **1** but can be altered for the [entire agent][pydantic_ai.Agent.__init__], a [specific tool][pydantic_ai.Agent.tool], or a [result validator][pydantic_ai.Agent.__init__].
- You can access the current retry count from within a tool or result validator via [`ctx.retry`][pydantic_ai.tools.RunContext].
Here's an example:
```python {title="tool_retry.py"}
from pydantic import BaseModel
class ChatResult(BaseModel):
user_id: int
message: str
agent = Agent(
'openai:gpt-4o',
deps_type=DatabaseConn,
result_type=ChatResult,
)
@agent.tool(retries=2)
def get_user_by_name(ctx: RunContext[DatabaseConn], name: str) -> int:
    """Get a user's ID from their full name."""
    print(name)
    #> John
    #> John Doe
    user_id = ctx.deps.users.get(name=name)
    if user_id is None:
        raise ModelRetry(
            f'No user found with name {name!r}, remember to provide their full name'
        )
    return user_id


result = agent.run_sync(
    'Send a message to John Doe asking for coffee next week', deps=DatabaseConn()
)
print(result.data)
"""
user_id=123 message='Hello John, would you be free for coffee sometime next week? Let me know what works for you!'
"""
```
## Model errors
If models behave unexpectedly (e.g., the retry limit is exceeded, or their API returns `503`), agent runs will raise [`UnexpectedModelBehavior`][pydantic_ai.exceptions.UnexpectedModelBehavior].
```python
from pydantic_ai import Agent, ModelRetry, UnexpectedModelBehavior, capture_run_messages

agent = Agent('openai:gpt-4o')


@agent.tool_plain
def calc_volume(size: int) -> int:  # (1)!
    if size == 42:
        return size**3
    else:
        raise ModelRetry('Please try again.')


with capture_run_messages() as messages:
    try:
        result = agent.run_sync('Please get me the volume of a box with size 6.')
    except UnexpectedModelBehavior as e:
        print('An error occurred:', e)
        #> An error occurred: Tool exceeded max retries count of 1
        print('messages:', messages)
    else:
        print(result.data)
```
!!! note
    If you call [`run`][pydantic_ai.Agent.run], [`run_sync`][pydantic_ai.Agent.run_sync], or [`run_stream`][pydantic_ai.Agent.run_stream] more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only.
------- ○ -------
contributing.md
```bash
git clone git@github.com:<your username>/pydantic-ai.git
cd pydantic-ai
```
```bash
pipx install uv pre-commit
```
```bash
make install
```
```bash
make help
```
To run code formatting, linting, static type checks, and tests with coverage report
generation, run:
```bash
make
```
* To add a new model with an extra dependency, that dependency needs > 500k monthly
downloads from PyPI consistently over 3 months or more
* To add a new model which uses another model's logic internally and has no extra dependencies, that model's GitHub org needs > 20k stars in total
* For any other model that's just a custom URL and API key, we're happy to add a
one-paragraph description with a link and instructions on the URL to use
* For any other model that requires more logic, we recommend you release your own Python package `pydantic-ai-xxx`, which depends on [`pydantic-ai-slim`](install.md#slim-install) and implements a model that inherits from our [`Model`][pydantic_ai.models.Model] ABC
------- ○ -------
dependencies.md
# Dependencies
PydanticAI uses a dependency injection system to provide data and services to your
agent's [system prompts](agents.md#system-prompts), [tools](tools.md) and [result
validators](results.md#result-validators-functions).
## Defining Dependencies
Dependencies can be any Python type. While in simple cases you might be able to pass a single object as a dependency (e.g. an HTTP connection), [dataclasses][] are generally a convenient container when your dependencies include multiple objects.
```python {title="unused_dependencies.py"}
from dataclasses import dataclass
import httpx
@dataclass
class MyDeps: # (1)!
api_key: str
http_client: httpx.AsyncClient
agent = Agent(
'openai:gpt-4o',
deps_type=MyDeps, # (2)!
)
## Accessing Dependencies
```python
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.AsyncClient


agent = Agent(
    'openai:gpt-4o',
    deps_type=MyDeps,
)


@agent.system_prompt  # (1)!
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:  # (2)!
    response = await ctx.deps.http_client.get(  # (3)!
        'https://example.com',
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},  # (4)!
    )
    response.raise_for_status()
    return f'Prompt: {response.text}'
```
If these functions are not coroutines (i.e. not defined with `async def`) they are called with [`run_in_executor`][asyncio.loop.run_in_executor] in a thread pool; it's therefore marginally preferable to use `async` methods where dependencies perform IO, although synchronous dependencies should work fine too.
!!! note "`run` vs. `run_sync` and Asynchronous vs. Synchronous dependencies"
Whether you use synchronous or asynchronous dependencies, is completely
independent of whether you use `run` or `run_sync` — `run_sync` is just a wrapper
around `run` and agents are always run in an async context.
```python {title="sync_dependencies.py"}
from dataclasses import dataclass
import httpx
@dataclass
class MyDeps:
api_key: str
http_client: httpx.Client # (1)!
agent = Agent(
'openai:gpt-4o',
deps_type=MyDeps,
)
@agent.system_prompt
def get_system_prompt(ctx: RunContext[MyDeps]) -> str: # (2)!
response = ctx.deps.http_client.get(
'https://example.com', headers={'Authorization': f'Bearer
{ctx.deps.api_key}'}
)
response.raise_for_status()
return f'Prompt: {response.text}'
async def main():
deps = MyDeps('foobar', httpx.Client())
result = await agent.run(
'Tell me a joke.',
deps=deps,
)
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
```
## Full Example

```python {title="full_example.py"}
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, ModelRetry, RunContext


@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.AsyncClient


agent = Agent(
    'openai:gpt-4o',
    deps_type=MyDeps,
)


@agent.system_prompt
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
    response = await ctx.deps.http_client.get('https://example.com')
    response.raise_for_status()
    return f'Prompt: {response.text}'


@agent.tool  # (1)!
async def get_joke_material(ctx: RunContext[MyDeps], subject: str) -> str:
    response = await ctx.deps.http_client.get(
        'https://example.com#jokes',
        params={'subject': subject},
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
    )
    response.raise_for_status()
    return response.text


@agent.result_validator  # (2)!
async def validate_result(ctx: RunContext[MyDeps], final_response: str) -> str:
    response = await ctx.deps.http_client.post(
        'https://example.com#validate',
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
        params={'query': final_response},
    )
    if response.status_code == 400:
        raise ModelRetry(f'invalid response: {response.text}')
    response.raise_for_status()
    return final_response
```
## Overriding Dependencies
While this can sometimes be done by calling the agent directly within unit tests,
we can also override dependencies
while calling application code which in turn calls the agent.
```python {title="joke_app.py"}
from dataclasses import dataclass
import httpx
@dataclass
class MyDeps:
api_key: str
http_client: httpx.AsyncClient
@joke_agent.system_prompt
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
return await ctx.deps.system_prompt_factory() # (2)!
1. Define a method on the dependency to make the system prompt easier to customise.
2. Call the system prompt factory from within the system prompt function.
3. Application code that calls the agent, in a real application this might be an
API endpoint.
4. Call the agent from within the application code, in a real application this call
might be deep within a call stack. Note `app_deps` here will NOT be used when deps
are overridden.
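To make notes 3 and 4 concrete, here's a minimal sketch of overriding these dependencies from a test. `application_code` is the hypothetical entry point those notes describe, and `TestMyDeps` simply stubs out the system prompt factory:

```python
from joke_app import MyDeps, joke_agent


class TestMyDeps(MyDeps):
    """Test double: avoids the real HTTP call made by system_prompt_factory."""

    async def system_prompt_factory(self) -> str:
        return 'test prompt'


async def test_application_code():
    test_deps = TestMyDeps('test_key', None)  # no real HTTP client is needed here
    with joke_agent.override(deps=test_deps):
        # call the application code that eventually runs joke_agent, e.g. the
        # hypothetical `application_code('Tell me a joke.')` from the notes above;
        # any run of joke_agent inside this block sees test_deps
        ...
```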
## Examples
- [Weather Agent](examples/weather-agent.md)
- [SQL Generation](examples/sql-gen.md)
- [RAG](examples/rag.md)
------- ○ -------
favicon.ico
------- ○ -------
help.md
# Getting Help
If you need help getting started with PydanticAI or with advanced usage, the
following sources may be useful.
## :simple-slack: Slack

Join the `#pydantic-ai` channel in the [Pydantic Slack][slack] to ask questions, get help, and chat about PydanticAI.

If you're on a [Logfire][logfire] Pro plan, you can also get a dedicated private slack collab channel with us.

[slack]: https://join.slack.com/t/pydanticlogfire/shared_invite/zt-2war8jrjq-w_nWG6ZX7Zm~gnzY7cXSog
[github-issues]: https://github.com/pydantic/pydantic-ai/issues
[logfire]: https://pydantic.dev/logfire
------- ○ -------
index.md
# Introduction {.hide}
--8<-- "docs/.partials/index-header.html"
Similarly, virtually every agent framework and LLM library in Python uses Pydantic,
yet when we began to use LLMs in [Pydantic Logfire](https://pydantic.dev/logfire),
we couldn't find anything that gave us the same feeling.
We built PydanticAI with one simple aim: to bring that FastAPI feeling to GenAI app
development.
## Why use PydanticAI
The exchange should be very short: PydanticAI will send the system prompt and the user query to the LLM, and the model will return a text response.
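As a rough sketch of that minimal exchange (the model name and prompts here are illustrative):

```python
from pydantic_ai import Agent

# A bare-bones agent: a model plus a static system prompt.
agent = Agent(
    'gemini-1.5-flash',
    system_prompt='Be concise, reply with one sentence.',
)

result = agent.run_sync('Where does "hello world" come from?')
print(result.data)
```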
Not very interesting yet, but we can easily add "tools", dynamic system prompts,
and structured responses to build more powerful agents.
Here is a concise example using PydanticAI to build a support agent for a bank:
```python {title="bank_support.py"}
from dataclasses import dataclass
@dataclass
class SupportDependencies: # (3)!
customer_id: int
db: DatabaseConn # (12)!
@support_agent.system_prompt # (5)!
async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str:
customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id)
return f"The customer's name is {customer_name!r}"
@support_agent.tool # (6)!
async def customer_balance(
ctx: RunContext[SupportDependencies], include_pending: bool
) -> float:
"""Returns the customer's current account balance.""" # (7)!
return await ctx.deps.db.customer_balance(
id=ctx.deps.customer_id,
include_pending=include_pending,
)
... # (11)!
To understand the flow of the above runs, we can watch the agent in action using
Pydantic Logfire.
To do this, we need to set up logfire, and add the following to our code:

```python
import logfire
logfire.configure() # (1)!
logfire.instrument_asyncpg() # (2)!
...
```
## Next Steps
------- ○ -------
install.md
# Installation
```bash
pip/uv-add pydantic-ai
```
This installs the `pydantic_ai` package, core dependencies, and libraries required
to use all the models
included in PydanticAI. If you want to use a specific model, you can install the
["slim"](#slim-install) version of PydanticAI.
```bash
pip/uv-add 'pydantic-ai[logfire]'
```
## Running Examples
We distribute the [`pydantic_ai_examples`](https://github.com/pydantic/pydantic-ai/tree/main/pydantic_ai_examples) directory as a separate PyPI package ([`pydantic-ai-examples`](https://pypi.org/project/pydantic-ai-examples/)) to make examples extremely easy to customize and run.
```bash
pip/uv-add 'pydantic-ai[examples]'
```
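Once the examples package is installed, you should be able to run any example as a module; for instance (assuming the `pydantic_model` example and a configured API key):

```bash
python -m pydantic_ai_examples.pydantic_model
```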
## Slim Install
If you know which model you're going to use and want to avoid installing
superfluous packages, you can use the
[`pydantic-ai-slim`](https://pypi.org/project/pydantic-ai-slim/) package.
For example, if you're using just [`OpenAIModel`]
[pydantic_ai.models.openai.OpenAIModel], you would run:
```bash
pip/uv-add 'pydantic-ai-slim[openai]'
```
You can also install dependencies for multiple models and use cases, for example:
```bash
pip/uv-add 'pydantic-ai-slim[openai,vertexai,logfire]'
```
------- ○ -------
logfire.md
Applications that use LLMs have some challenges that are well known and understood:
LLMs are **slow**, **unreliable** and **expensive**.
These applications also have some challenges that most developers have encountered
much less often: LLMs are **fickle** and **non-deterministic**. Subtle changes in a
prompt can completely change a model's performance, and there's no `EXPLAIN` query
you can run to understand why.
To build successful applications with LLMs, we need new tools to understand both
model performance, and the behavior of applications that rely on them.
LLM Observability tools that just let you understand how your model is performing
are useless: making API calls to an LLM is easy, it's building that into an
application that's hard.
## Pydantic Logfire
PydanticAI has built-in (but optional) support for Logfire via the [`logfire-api`](https://github.com/pydantic/logfire/tree/main/logfire-api) no-op package.
## Using Logfire
```bash
pip/uv-add 'pydantic-ai[logfire]'
```
```bash
py-cli logfire auth
```
```bash
py-cli logfire projects new
```
```python {title="adding_logfire.py"}
import logfire
logfire.configure()
```
The [logfire documentation](https://logfire.pydantic.dev/docs/) has more details on
how to use logfire, including how to instrument other libraries like Pydantic,
HTTPX and FastAPI.
Once you have logfire set up, there are two primary ways it can help you understand
your application:
* **Debugging** — Using the live view to see what's happening in your application
in real-time.
* **Monitoring** — Using SQL and dashboards to observe the behavior of your
application, Logfire is effectively a SQL database that stores information about
how your application is running.
### Debugging
To demonstrate how Logfire can let you visualise the flow of a PydanticAI run, here's the view you get from Logfire while running the [chat app example](examples/chat-app.md):
{{ video('a764aff5840534dc77eba7d028707bfa', 25) }}
We can also query data with SQL in Logfire to monitor the performance of an
application. Here's a real world example of using Logfire to monitor PydanticAI
runs inside Logfire itself:
------- ○ -------
message-history.md
After running an agent, you can access the messages exchanged during that run from
the `result` object.
Both [`RunResult`][pydantic_ai.result.RunResult] (returned by [`Agent.run`][pydantic_ai.Agent.run] and [`Agent.run_sync`][pydantic_ai.Agent.run_sync]) and [`StreamedRunResult`][pydantic_ai.result.StreamedRunResult] (returned by [`Agent.run_stream`][pydantic_ai.Agent.run_stream]) have methods to access the messages from a run: [`all_messages()`][pydantic_ai.result.RunResult.all_messages] and [`new_messages()`][pydantic_ai.result.RunResult.new_messages].

On `StreamedRunResult`, the final result message is only added to the run's messages once the stream has finished, i.e. once you've awaited one of the following coroutines:

* [`StreamedRunResult.stream()`][pydantic_ai.result.StreamedRunResult.stream]
* [`StreamedRunResult.stream_text()`][pydantic_ai.result.StreamedRunResult.stream_text]
* [`StreamedRunResult.stream_structured()`][pydantic_ai.result.StreamedRunResult.stream_structured]
* [`StreamedRunResult.get_data()`][pydantic_ai.result.StreamedRunResult.get_data]
**Note:** The final result message will NOT be added to result messages if you use [`.stream_text(delta=True)`][pydantic_ai.result.StreamedRunResult.stream_text] since in this case the result content is never built as one string.
If `message_history` is set and not empty, a new system prompt is not generated —
we assume the existing message history includes a system prompt.
```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')

result1 = agent.run_sync('Tell me a joke.')
print(result1.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.

result2 = agent.run_sync('Explain?', message_history=result1.new_messages())
print(result2.data)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.

print(result2.all_messages())
"""
[
ModelRequest(
parts=[
SystemPromptPart(
content='Be a helpful assistant.', part_kind='system-prompt'
),
UserPromptPart(
content='Tell me a joke.',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
),
],
kind='request',
),
ModelResponse(
parts=[
TextPart(
                content='Did you hear about the toothpaste scandal? They called it Colgate.',
part_kind='text',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
ModelRequest(
parts=[
UserPromptPart(
content='Explain?',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
)
],
kind='request',
),
ModelResponse(
parts=[
TextPart(
                content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.',
part_kind='text',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
]
"""
```
_(This example is complete, it can be run "as is")_
Since messages are defined by simple dataclasses, you can manually create and manipulate them, e.g. for testing.
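For example, here's a rough sketch of hand-building a short exchange to use as `message_history` (it assumes the default values of fields like `timestamp` and `part_kind` are acceptable):

```python
from pydantic_ai.messages import ModelRequest, ModelResponse, TextPart, UserPromptPart

# A hand-written exchange, e.g. to seed a conversation in a test.
history = [
    ModelRequest(parts=[UserPromptPart(content='Tell me a joke.')]),
    ModelResponse(parts=[TextPart(content='Why did the chicken cross the road?')]),
]
```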
The message format is independent of the model used, so you can use messages in
different agents, or the same agent with different models.
```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')

result1 = agent.run_sync('Tell me a joke.')
print(result1.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.

result2 = agent.run_sync(
    'Explain?', model='gemini-1.5-pro', message_history=result1.new_messages()
)
print(result2.data)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.
print(result2.all_messages())
"""
[
ModelRequest(
parts=[
SystemPromptPart(
content='Be a helpful assistant.', part_kind='system-prompt'
),
UserPromptPart(
content='Tell me a joke.',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
),
],
kind='request',
),
ModelResponse(
parts=[
TextPart(
                content='Did you hear about the toothpaste scandal? They called it Colgate.',
part_kind='text',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
ModelRequest(
parts=[
UserPromptPart(
content='Explain?',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
)
],
kind='request',
),
ModelResponse(
parts=[
TextPart(
                content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.',
part_kind='text',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
]
"""
```
## Examples
For a more complete example of using messages in conversations, see the [chat app]
(examples/chat-app.md) example.
------- ○ -------
models.md
PydanticAI is Model-agnostic and has built in support for the following model
providers:
* [OpenAI](#openai)
* [Anthropic](#anthropic)
* Gemini via two different APIs: [Generative Language API](#gemini) and [VertexAI
API](#gemini-via-vertexai)
* [Ollama](#ollama)
* [Groq](#groq)
* [Mistral](#mistral)
To use each model provider, you need to configure your local environment and make
sure you have the right packages installed.
## OpenAI
### Install
```bash
pip/uv-add 'pydantic-ai-slim[openai]'
```
### Configuration
Once you have the API key, you can set it as an environment variable:
```bash
export OPENAI_API_KEY='your-api-key'
```
```python {title="openai_model_by_name.py"}
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o')
...
```
```python {title="openai_model_init.py"}
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
model = OpenAIModel('gpt-4o')
agent = Agent(model)
...
```
If you don't want to or can't set the environment variable, you can pass it at runtime via the [`api_key` argument][pydantic_ai.models.openai.OpenAIModel.__init__]:

```python {title="openai_model_api_key.py"}
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel('gpt-4o', api_key='your-api-key')
agent = Agent(model)
...
```

```python {title="openai_model_base_url.py"}
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
```python {title="openai_model_base_url.py"}
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
model = OpenAIModel(
    'anthropic/claude-3.5-sonnet',
    base_url='https://openrouter.ai/api/v1',
    api_key='your-api-key',
)
agent = Agent(model)
...
```
You can also use `OpenAIModel` with Azure OpenAI by passing a configured `AsyncAzureOpenAI` client via the `openai_client` argument:

```python {title="azure_openai.py"}
from openai import AsyncAzureOpenAI

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

client = AsyncAzureOpenAI(
    azure_endpoint='...',
    api_version='2024-07-01-preview',
    api_key='your-api-key',
)

model = OpenAIModel('gpt-4o', openai_client=client)
agent = Agent(model)
...
```
## Anthropic
### Install
```bash
pip/uv-add 'pydantic-ai-slim[anthropic]'
```
### Configuration
[`AnthropicModelName`][pydantic_ai.models.anthropic.AnthropicModelName] contains a
list of available Anthropic models.
Once you have the API key, you can set it as an environment variable:
```bash
export ANTHROPIC_API_KEY='your-api-key'
```
```py title="anthropic_model_by_name.py"
from pydantic_ai import Agent
agent = Agent('claude-3-5-sonnet-latest')
...
```
```py title="anthropic_model_init.py"
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
model = AnthropicModel('claude-3-5-sonnet-latest')
agent = Agent(model)
...
```
If you don't want to or can't set the environment variable, you can pass it at runtime via the [`api_key` argument][pydantic_ai.models.anthropic.AnthropicModel.__init__]:

```py title="anthropic_model_api_key.py"
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel

model = AnthropicModel('claude-3-5-sonnet-latest', api_key='your-api-key')
agent = Agent(model)
...
```
## Gemini
If you want to run Gemini models in production, you should use the [VertexAI
API](#gemini-via-vertexai) described below.
### Install
### Configuration
Once you have the API key, you can set it as an environment variable:
```bash
export GEMINI_API_KEY=your-api-key
```
You can then use [`GeminiModel`][pydantic_ai.models.gemini.GeminiModel] by name:
```python {title="gemini_model_by_name.py"}
from pydantic_ai import Agent
agent = Agent('gemini-1.5-flash')
...
```
```python {title="gemini_model_init.py"}
from pydantic_ai import Agent
from pydantic_ai.models.gemini import GeminiModel
model = GeminiModel('gemini-1.5-flash')
agent = Agent(model)
...
```
If you don't want to or can't set the environment variable, you can pass it at runtime via the [`api_key` argument][pydantic_ai.models.gemini.GeminiModel.__init__]:

```python {title="gemini_model_api_key.py"}
from pydantic_ai import Agent
from pydantic_ai.models.gemini import GeminiModel

model = GeminiModel('gemini-1.5-flash', api_key='your-api-key')
agent = Agent(model)
...
```
## Gemini via VertexAI

### Install
```bash
pip/uv-add 'pydantic-ai-slim[vertexai]'
```
### Configuration
The big disadvantage is that for local development you may need to create and
configure a "service account", which I've found extremely painful to get right in
the past.
Whichever way you authenticate, you'll need to have VertexAI enabled in your GCP
account.
Luckily if you're running PydanticAI inside GCP, or you have the [`gcloud` CLI](https://cloud.google.com/sdk/gcloud) installed and configured, you should be able to use `VertexAIModel` without any additional setup.
```python {title="vertexai_application_default_credentials.py"}
from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel
model = VertexAIModel('gemini-1.5-flash')
agent = Agent(model)
...
```
Once you have the JSON file, you can use it thus:
```python {title="vertexai_service_account.py"}
from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel
model = VertexAIModel(
    'gemini-1.5-flash',
    service_account_file='path/to/service-account.json',
)
agent = Agent(model)
...
```
Whichever way you authenticate, you can specify which region requests will be sent to via the [`region` argument][pydantic_ai.models.vertexai.VertexAIModel.__init__]. Using a region close to your application can improve latency and might be important from a regulatory perspective.

```python {title="vertexai_region.py"}
from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel

model = VertexAIModel('gemini-1.5-flash', region='asia-east1')
agent = Agent(model)
...
```

## Ollama
## Ollama
### Install
```bash
pip/uv-add 'pydantic-ai-slim[openai]'
```
### Configuration
You must also ensure the Ollama server is running when trying to make requests to
it. For more information, please see the [Ollama
documentation](https://github.com/ollama/ollama/tree/main/docs).
For detailed setup and example, please see the [Ollama setup documentation](https://github.com/pydantic/pydantic-ai/blob/main/docs/api/models/ollama.md).
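As a rough sketch of using it once the server is running (this assumes you've already pulled a model locally, e.g. with `ollama pull llama3.2`, and that the `ollama:` prefix is used to select it, following the same `provider:model` convention as the other providers):

```python
from pydantic_ai import Agent

# Talks to the locally running Ollama server via its OpenAI-compatible API.
agent = Agent('ollama:llama3.2')
result = agent.run_sync('Where were the olympics held in 2012?')
print(result.data)
```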
## Groq
### Install
```bash
pip/uv-add 'pydantic-ai-slim[groq]'
```
### Configuration
Once you have the API key, you can set it as an environment variable:
```bash
export GROQ_API_KEY='your-api-key'
```
```python {title="groq_model_by_name.py"}
from pydantic_ai import Agent
agent = Agent('groq:llama-3.1-70b-versatile')
...
```
```python {title="groq_model_init.py"}
from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel
model = GroqModel('llama-3.1-70b-versatile')
agent = Agent(model)
...
```
If you don't want to or can't set the environment variable, you can pass it at runtime via the [`api_key` argument][pydantic_ai.models.groq.GroqModel.__init__]:

```python {title="groq_model_api_key.py"}
from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel

model = GroqModel('llama-3.1-70b-versatile', api_key='your-api-key')
agent = Agent(model)
...
```
## Mistral
### Install
```bash
pip/uv-add 'pydantic-ai-slim[mistral]'
```
### Configuration
[`NamedMistralModels`][pydantic_ai.models.mistral.NamedMistralModels] contains a
list of the most popular Mistral models.
Once you have the API key, you can set it as an environment variable:
```bash
export MISTRAL_API_KEY='your-api-key'
```
```python {title="mistral_model_by_name.py"}
from pydantic_ai import Agent
agent = Agent('mistral:mistral-large-latest')
...
```
```python {title="mistral_model_init.py"}
from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel
model = MistralModel('mistral-small-latest')
agent = Agent(model)
...
```
```python {title="mistral_model_api_key.py"}
from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel
To implement support for models not already supported, you will need to subclass
the [`Model`][pydantic_ai.models.Model] abstract base class.
This in turn will require you to implement the following other abstract base
classes:
* [`AgentModel`][pydantic_ai.models.AgentModel]
* [`StreamTextResponse`][pydantic_ai.models.StreamTextResponse]
* [`StreamStructuredResponse`][pydantic_ai.models.StreamStructuredResponse]
The best place to start is to review the source code for existing implementations, e.g. [`OpenAIModel`](https://github.com/pydantic/pydantic-ai/blob/main/pydantic_ai_slim/pydantic_ai/models/openai.py).
For details on when we'll accept contributions adding new models to PydanticAI, see
the [contributing guidelines](contributing.md#new-model-rules).
------- ○ -------
multi-agent-applications.md
# Multi-agent Applications
There are roughly four levels of complexity when building applications with PydanticAI:

1. Single agent workflows, which most of this documentation covers
2. [Agent delegation](#agent-delegation): agents using another agent via tools
3. [Programmatic agent hand-off](#programmatic-agent-hand-off): one agent runs, then application code calls another agent
4. Graph-based control flow: for the most complex cases, a state machine can control how multiple agents and arbitrary code interact
## Agent delegation
"Agent delegation" refers to the scenario where an agent delegates work to another
agent, then takes back control when the delegate agent (the agent called from
within a tool) finishes.
Since agents are stateless and designed to be global, you do not need to include
the agent itself in agent [dependencies](dependencies.md).
```python {title="agent_delegation_simple.py"}
from pydantic_ai import Agent, RunContext
from pydantic_ai.usage import UsageLimits
@joke_selection_agent.tool
async def joke_factory(ctx: RunContext[None], count: int) -> list[str]:
r = await joke_generation_agent.run( # (3)!
f'Please generate {count} jokes.',
usage=ctx.usage, # (4)!
)
return r.data # (5)!
result = joke_selection_agent.run_sync(
    'Tell me a joke.',
    usage_limits=UsageLimits(request_limit=5, total_tokens_limit=300),
)
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
print(result.usage())
"""
Usage(requests=3, request_tokens=204, response_tokens=24, total_tokens=228, details=None)
"""
```
The control flow for this example is pretty simple and can be summarised as
follows:
```mermaid
graph TD
START --> joke_selection_agent
joke_selection_agent --> joke_factory["joke_factory (tool)"]
joke_factory --> joke_generation_agent
joke_generation_agent --> joke_factory
joke_factory --> joke_selection_agent
joke_selection_agent --> END
```
Generally the delegate agent needs to either have the same [dependencies](dependencies.md) as the calling agent, or dependencies which are a subset of the calling agent's dependencies.
```python {title="agent_delegation_deps.py"}
from dataclasses import dataclass
import httpx
@dataclass
class ClientAndKey: # (1)!
http_client: httpx.AsyncClient
api_key: str
joke_selection_agent = Agent(
'openai:gpt-4o',
deps_type=ClientAndKey, # (2)!
system_prompt=(
'Use the `joke_factory` tool to generate some jokes on the given subject, '
'then choose the best. You must return just a single joke.'
),
)
joke_generation_agent = Agent(
'gemini-1.5-flash',
deps_type=ClientAndKey, # (4)!
result_type=list[str],
system_prompt=(
'Use the "get_jokes" tool to get some jokes on the given subject, '
'then extract each joke into a list.'
),
)
@joke_selection_agent.tool
async def joke_factory(ctx: RunContext[ClientAndKey], count: int) -> list[str]:
r = await joke_generation_agent.run(
f'Please generate {count} jokes.',
deps=ctx.deps, # (3)!
usage=ctx.usage,
)
return r.data
@joke_generation_agent.tool # (5)!
async def get_jokes(ctx: RunContext[ClientAndKey], count: int) -> str:
response = await ctx.deps.http_client.get(
'https://example.com',
params={'count': count},
headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
)
response.raise_for_status()
return response.text
This example shows how even a fairly simple agent delegation can lead to a complex
control flow:
```mermaid
graph TD
START --> joke_selection_agent
joke_selection_agent --> joke_factory["joke_factory (tool)"]
joke_factory --> joke_generation_agent
joke_generation_agent --> get_jokes["get_jokes (tool)"]
get_jokes --> http_request["HTTP request"]
http_request --> get_jokes
get_jokes --> joke_generation_agent
joke_generation_agent --> joke_factory
joke_factory --> joke_selection_agent
joke_selection_agent --> END
```
"Programmatic agent hand-off" refers to the scenario where multiple agents are
called in succession, with application code and/or a human in the loop responsible
for deciding which agent to call next.
Here we show two agents used in succession, the first to find a flight and the
second to extract the user's seat preference.
```python {title="programmatic_handoff.py"}
from typing import Literal, Union
class FlightDetails(BaseModel):
flight_number: str
class Failed(BaseModel):
"""Unable to find a satisfactory choice."""
@flight_search_agent.tool # (2)!
async def flight_search(
ctx: RunContext[None], origin: str, destination: str
) -> Union[FlightDetails, None]:
# in reality, this would call a flight search API or
# use a browser to scrape a flight search website
return FlightDetails(flight_number='AK456')
usage_limits = UsageLimits(request_limit=15) # (3)!
class SeatPreference(BaseModel):
row: int = Field(ge=1, le=30)
seat: Literal['A', 'B', 'C', 'D', 'E', 'F']
1. Define the first agent, which finds a flight. We use an explicit type annotation until [PEP-747](https://peps.python.org/pep-0747/) lands, see [structured results](results.md#structured-result-validation). We use a union as the result type so the model can communicate if it's unable to find a satisfactory choice; internally, each member of the union will be registered as a separate tool.
2. Define a tool on the agent to find a flight. In this simple case we could
dispense with the tool and just define the agent to return structured data, then
search for a flight, but in more complex scenarios the tool would be necessary.
3. Define usage limits for the entire app.
4. Define a function to find a flight, which asks the user for their preferences
and then calls the agent to find a flight.
5. As with `flight_search_agent` above, we use an explicit type annotation to
define the agent.
6. Define a function to find the user's seat preference, which asks the user for
their seat preference and then calls the agent to extract the seat preference.
7. Now that we've put our logic for running each agent into separate functions, our
main app becomes very simple.
```mermaid
graph TB
START --> ask_user_flight["ask user for flight"]
subgraph find_flight
flight_search_agent --> ask_user_flight
ask_user_flight --> flight_search_agent
end
subgraph find_seat
seat_preference_agent --> ask_user_seat
ask_user_seat --> seat_preference_agent
  end
```
## PydanticAI Graphs
## Examples
- [Flight booking](examples/flight-booking.md)
------- ○ -------
results.md
Both `RunResult` and `StreamedRunResult` are generic in the data they wrap, so
typing information about the data returned by the agent is preserved.
```python {title="olympics.py"}
from pydantic import BaseModel
class CityLocation(BaseModel):
city: str
country: str
Runs end when either a plain text response is received or the model calls a tool associated with one of the structured result types. We will add limits to make sure a run doesn't go on indefinitely, see [#70](https://github.com/pydantic/pydantic-ai/issues/70).
When the result type is `str`, or a union including `str`, plain text responses are
enabled on the model, and the raw text response from the model is used as the
response data.
If the result type is a union with multiple members (after removing `str` from the members), each member is registered as a separate tool with the model in order to reduce the complexity of the tool schemas and maximise the chances a model will respond correctly.
If the result type schema is not of type `"object"`, the result type is wrapped in
a single element object, so the schema of all tools registered with the model are
object schemas.
Structured results (like tools) use Pydantic to build the JSON schema used for the
tool, and to validate the data returned by the model.
When creating the agent we need to `# type: ignore` the `result_type` argument,
and add a type hint to tell type checkers about the type of the agent.
```python {title="box_or_error.py"}
from typing import Union
class Box(BaseModel):
width: int
height: int
depth: int
units: str
Here's an example of using a union return type, which registers multiple tools and wraps non-object schemas in an object:
```python {title="colors_or_sizes.py"}
from typing import Union
result = agent.run_sync('square size 10, circle size 20, triangle size 30')
print(result.data)
#> [10, 20, 30]
```
```python {title="sql_gen.py"}
from typing import Union
class Success(BaseModel):
sql_query: str
class InvalidRequest(BaseModel):
error_message: str
@agent.result_validator
async def validate_result(ctx: RunContext[DatabaseConn], result: Response) ->
Response:
if isinstance(result, InvalidRequest):
return result
try:
await ctx.deps.execute(f'EXPLAIN {result.sql_query}')
except QueryError as e:
raise ModelRetry(f'Invalid query: {e}') from e
else:
return result
result = agent.run_sync(
'get me uses who were last active yesterday.', deps=DatabaseConn()
)
print(result.data)
#> sql_query='SELECT * FROM users WHERE last_active::date = today() - interval 1
day'
```
## Streamed Results
We can also stream text as deltas rather than the entire text in each item:
```python {title="streamed_delta_hello_world.py"}
from pydantic_ai import Agent
agent = Agent('gemini-1.5-flash')
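A minimal sketch of delta streaming with this agent, using the documented `run_stream` and `stream_text(delta=True)` methods (the prompt is illustrative):

```python
import asyncio

from pydantic_ai import Agent

agent = Agent('gemini-1.5-flash')


async def main():
    async with agent.run_stream('Where does "hello world" come from?') as result:
        # stream_text(delta=True) yields each new chunk of text rather than
        # the accumulated text so far
        async for chunk in result.stream_text(delta=True):
            print(chunk)


asyncio.run(main())
```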
Not all types are supported with partial validation in Pydantic, see [pydantic/pydantic#10748](https://github.com/pydantic/pydantic/pull/10748); generally for model-like structures it's currently best to use `TypedDict`.
```python {title="streamed_user_profile.py"}
from datetime import date

from typing_extensions import TypedDict

from pydantic_ai import Agent


class UserProfile(TypedDict, total=False):
    name: str
    dob: date
    bio: str


agent = Agent(
    'openai:gpt-4o',
    result_type=UserProfile,
    system_prompt='Extract a user profile from the input',
)
```
1. [`stream_structured`][pydantic_ai.result.StreamedRunResult.stream_structured]
streams the data as [`ModelResponse`][pydantic_ai.messages.ModelResponse] objects,
thus iteration can't fail with a `ValidationError`.
2. [`validate_structured_result`][pydantic_ai.result.StreamedRunResult.validate_structured_result] validates the data, `allow_partial=True` enables pydantic's [`experimental_allow_partial` flag on `TypeAdapter`][pydantic.type_adapter.TypeAdapter.validate_json].
## Examples
- [Stream markdown](examples/stream-markdown.md)
- [Stream Whales](examples/stream-whales.md)
------- ○ -------
testing-evals.md
With PydanticAI and LLM integrations in general, there are two distinct kinds of
test:
1. **Unit tests** — tests of your application code, and whether it's behaving
correctly
2. **Evals** — tests of the LLM, and how good or bad its responses are
For the most part, these two kinds of tests have pretty separate goals and
considerations.
## Unit tests
Unit tests for PydanticAI code are just like unit tests for any other Python code.
Because for the most part they're nothing new, we have pretty well established
tools and patterns for writing and running these kinds of tests.
Unless you're really sure you know better, you'll probably want to follow roughly
this strategy:
The simplest and fastest way to exercise most of your application code is using
[`TestModel`][pydantic_ai.models.test.TestModel], this will (by default) call all
tools in the agent, then return either plain text or a structured response
depending on the return type of the agent.
The resulting data won't look pretty or relevant, but it should pass Pydantic's
validation in most cases.
If you want something more sophisticated, use [`FunctionModel`][pydantic_ai.models.function.FunctionModel] and write your own data generation logic.
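As a minimal sketch of the `TestModel` approach (the agent here is a stand-in for one of your own; the assertion is deliberately loose since `TestModel`'s generated data isn't meaningful):

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

# Illustrative agent; in a real test you'd import your application's agent.
my_agent = Agent('openai:gpt-4o', system_prompt='Be helpful.')


def test_my_agent():
    """No real LLM is called: TestModel generates cheap, schema-valid data."""
    with my_agent.override(model=TestModel()):
        result = my_agent.run_sync('Hello')
    assert result.data is not None
```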
```python {title="weather_app.py"}
import asyncio
from datetime import date
weather_agent = Agent(
'openai:gpt-4o',
deps_type=WeatherService,
system_prompt='Providing a weather forecast at the locations the user
provides.',
)
@weather_agent.tool
def weather_forecast(
ctx: RunContext[WeatherService], location: str, forecast_date: date
) -> str:
if forecast_date < date.today(): # (3)!
return ctx.deps.get_historic_weather(location, forecast_date)
else:
return ctx.deps.get_forecast(location, forecast_date)
**We want to test this code without having to mock certain objects or modify our
code so we can pass test objects in.**
The above tests are a great start, but careful readers will notice that the
`WeatherService.get_forecast` is never called since `TestModel` calls
`weather_forecast` with a date in the past.
```python {title="test_weather_app.py"}
import pytest

from pydantic_ai import models

pytestmark = pytest.mark.anyio  # run all tests in this module in an async context

models.ALLOW_MODEL_REQUESTS = False  # block accidental requests to real models
```
If you're writing lots of tests that all require the model to be overridden, you can use [pytest fixtures](https://docs.pytest.org/en/6.2.x/fixture.html) to override the model with [`TestModel`][pydantic_ai.models.test.TestModel] or [`FunctionModel`][pydantic_ai.models.function.FunctionModel] in a reusable way.
```python {title="tests.py"}
import pytest

from pydantic_ai.models.test import TestModel

from weather_app import weather_agent


@pytest.fixture
def override_weather_agent():
    with weather_agent.override(model=TestModel()):
        yield
```
## Evals
Evals are generally more like benchmarks than unit tests: they never "pass", although they do "fail"; you care mostly about how they change over time.

Since evals need to be run against the real model, they can be slow and expensive to run, and you generally won't want to run them in CI for every commit.
The hardest part of evals is measuring how well the model has performed.
In some cases (e.g. an agent to generate SQL) there are simple, easy to run tests
that can be used to measure performance (e.g. is the SQL valid? Does it return the
right results? Does it return just the right results?).
In other cases (e.g. an agent that gives advice on quitting smoking) it can be very
hard or impossible to make quantitative measures of performance — in the smoking
case you'd really need to run a double-blind trial over months, then wait 40 years
and observe health outcomes to know if changes to your prompt were an improvement.
There are a few different strategies you can use to measure performance:
* **End to end, self-contained tests** — like the SQL example, we can test the
final result of the agent near-instantly
* **Synthetic self-contained tests** — writing unit test style checks that the
output is as expected, checks like `#!python 'chewing gum' in response`, while
these checks might seem simplistic they can be helpful, one nice characteristic is
that it's easy to tell what's wrong when they fail
* **LLMs evaluating LLMs** — using another models, or even the same model with a
different prompt to evaluate the performance of the agent (like when the class
marks each other's homework because the teacher has a hangover), while the
downsides and complexities of this approach are obvious, some think it can be a
useful tool in the right circumstances
* **Evals in prod** — measuring the end results of the agent in production, then
creating a quantitative measure of performance, so you can easily measure changes
over time as you change the prompt or model used, [logfire](logfire.md) can be
extremely useful in this case since you can write a custom query to measure the
performance of your agent
Let's assume we have the following app for running SQL generated from a user prompt (this example omits a lot of details for brevity, see the [SQL gen](examples/sql-gen.md) example for more complete code):
```python {title="sql_app.py"}
import json
from pathlib import Path
from typing import Union
self.db = db
Database schema:
CREATE TABLE records (
...
);
@staticmethod
def format_example(example: dict[str, str]) -> str: # (3)!
return f"""\
<example>
<request>{example['request']}</request>
<sql>{example['sql']}</sql>
</example>
"""
sql_agent = Agent(
'gemini-1.5-flash',
deps_type=SqlSystemPrompt,
)
@sql_agent.system_prompt
async def system_prompt(ctx: RunContext[SqlSystemPrompt]) -> str:
return ctx.deps.build_prompt()
```json {title="examples.json"}
{
"examples": [
{
"request": "Show me all records",
"sql": "SELECT * FROM records;"
},
{
"request": "Show me all records from 2021",
"sql": "SELECT * FROM records WHERE date_trunc('year', date) = '2021-01-01';"
},
{
"request": "show me error records with the tag 'foobar'",
"sql": "SELECT * FROM records WHERE level = 'error' and 'foobar' =
ANY(tags);"
},
...
]
}
```
Now we want a way to quantify the success of the SQL generation so we can judge how
changes to the agent affect its performance.
We use 5-fold cross-validation to judge the performance of the agent using our
existing set of examples.
```python {title="sql_app_evals.py"}
import json
import statistics
from pathlib import Path
from itertools import chain
        # each return value that matches the expected value has a score of 5
        fold_score += 5 * len(set(agent_ids) & expected_ids)
scores.append(fold_score)
overall_score = statistics.mean(scores)
print(f'Overall score: {overall_score:0.2f}')
#> Overall score: 12.00
```
We can then change the prompt, the model, or the examples and see how the score
changes over time.
------- ○ -------
tools.md
# Function Tools
Function tools provide a mechanism for models to retrieve extra information to help
them generate a response.
They're useful when it is impractical or impossible to put all the context an agent
might need into the system prompt, or when you want to make agents' behavior more
deterministic or reliable by deferring some of the logic required to generate a
response to another (not necessarily AI-powered) tool.
The main semantic difference between PydanticAI Tools and RAG is that RAG is synonymous with vector search, while PydanticAI tools are more general-purpose.
(Note: we may add support for vector search functionality in the future,
particularly an API for generating embeddings. See
[#58](https://github.com/pydantic/pydantic-ai/issues/58))
```python {title="dice_game.py"}
import random
agent = Agent(
'gemini-1.5-flash', # (1)!
deps_type=str, # (2)!
system_prompt=(
"You're a dice game, you should roll the die and see if the number "
"you get back matches the user's guess. If so, tell them they're a winner.
"
"Use the player's name in the response."
),
)
@agent.tool_plain # (3)!
def roll_die() -> str:
"""Roll a six-sided die and return the result."""
return str(random.randint(1, 6))
@agent.tool # (4)!
def get_player_name(ctx: RunContext[str]) -> str:
"""Get the player's name."""
return ctx.deps
1. This is a pretty simple task, so we can use the fast and cheap Gemini flash
model.
2. We pass the user's name as the dependency, to keep things simple we use just the
name as a string as the dependency.
3. This tool doesn't need any context, it just returns a random number. You could
probably use a dynamic system prompt in this case.
4. This tool needs the player's name, so it uses `RunContext` to access
dependencies which are just the player's name in this case.
5. Run the agent, passing the player's name as the dependency.
Let's print the messages from that game to see what happened:
```python {title="dice_game_messages.py"}
from dice_game import dice_result
print(dice_result.all_messages())
"""
[
ModelRequest(
parts=[
SystemPromptPart(
content="You're a dice game, you should roll the die and see if the
number you get back matches the user's guess. If so, tell them they're a winner.
Use the player's name in the response.",
part_kind='system-prompt',
),
UserPromptPart(
content='My guess is 4',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
),
],
kind='request',
),
ModelResponse(
parts=[
ToolCallPart(
tool_name='roll_die',
args=ArgsDict(args_dict={}),
tool_call_id=None,
part_kind='tool-call',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
ModelRequest(
parts=[
ToolReturnPart(
tool_name='roll_die',
content='4',
tool_call_id=None,
timestamp=datetime.datetime(...),
part_kind='tool-return',
)
],
kind='request',
),
ModelResponse(
parts=[
ToolCallPart(
tool_name='get_player_name',
args=ArgsDict(args_dict={}),
tool_call_id=None,
part_kind='tool-call',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
ModelRequest(
parts=[
ToolReturnPart(
tool_name='get_player_name',
content='Anne',
tool_call_id=None,
timestamp=datetime.datetime(...),
part_kind='tool-return',
)
],
kind='request',
),
ModelResponse(
parts=[
TextPart(
content="Congratulations Anne, you guessed correctly! You're a
winner!",
part_kind='text',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
]
"""
```
```mermaid
sequenceDiagram
participant Agent
    participant LLM
```
As well as using the decorators, we can register tools via the `tools` argument to
the [`Agent` constructor][pydantic_ai.Agent.__init__]. This is useful when you want
to re-use tools, and can also give more fine-grained control over the tools.
```python {title="dice_game_tool_kwarg.py"}
import random
agent_a = Agent(
'gemini-1.5-flash',
deps_type=str,
tools=[roll_die, get_player_name], # (1)!
)
agent_b = Agent(
'gemini-1.5-flash',
deps_type=str,
tools=[ # (2)!
Tool(roll_die, takes_ctx=False),
Tool(get_player_name, takes_ctx=True),
],
)
dice_result = agent_b.run_sync('My guess is 4', deps='Anne')
print(dice_result.data)
#> Congratulations Anne, you guessed correctly! You're a winner!
```
1. The simplest way to register tools via the `Agent` constructor is to pass a list
of functions, the function signature is inspected to determine if the tool takes
[`RunContext`][pydantic_ai.tools.RunContext].
2. `agent_a` and `agent_b` are identical — but we can use [`Tool`]
[pydantic_ai.tools.Tool] to reuse tool definitions and give more fine-grained
control over how tools are defined, e.g. setting their name or description, or
using a custom [`prepare`](#tool-prepare) method.
As the name suggests, function tools use the model's "tools" or "functions" API to
let the model know what is available to call. Tools or functions are also used to
define the schema(s) for structured responses, thus a model might have access to
many tools, some of which call function tools while others end the run and return a
result.
Function parameters are extracted from the function signature, and all parameters
except `RunContext` are used to build the schema for that tool call.
Even better, PydanticAI extracts the docstring from functions and (thanks to
[griffe](https://mkdocstrings.github.io/griffe/)) extracts parameter descriptions
from the docstring and adds them to the schema.
[Griffe supports](https://mkdocstrings.github.io/griffe/reference/docstrings/#docstrings) extracting parameter descriptions from `google`, `numpy` and `sphinx` style docstrings, and PydanticAI will infer the format to use based on the docstring. We plan to add support in the future to explicitly set the style to use, and warn/error if not all parameters are documented; see [#59](https://github.com/pydantic/pydantic-ai/issues/59).
```python {title="tool_schema.py"}
from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse
from pydantic_ai.models.function import AgentInfo, FunctionModel
agent = Agent()
@agent.tool_plain
def foobar(a: int, b: str, c: dict[str, list[float]]) -> str:
"""Get me foobar.
Args:
a: apple pie
b: banana cake
c: carrot smoothie
"""
return f'{a} {b} {c}'
The return type of a tool can be anything which Pydantic can serialize to JSON. Some models (e.g. Gemini) support semi-structured return values, while others expect text (OpenAI), but all seem to be just as good at extracting meaning from the data. If a Python object is returned and the model expects a string, the value will be serialized to JSON.
```python {title="single_parameter_tool.py"}
from pydantic import BaseModel
agent = Agent()
class Foobar(BaseModel):
"""This is a Foobar"""
x: int
y: str
z: float = 3.14
@agent.tool_plain
def foobar(f: Foobar) -> str:
return str(f)
test_model = TestModel()
result = agent.run_sync('hello', model=test_model)
print(result.data)
#> {"foobar":"x=0 y='a' z=3.14"}
print(test_model.agent_model_function_tools)
"""
[
ToolDefinition(
name='foobar',
description='This is a Foobar',
parameters_json_schema={
'properties': {
'x': {'title': 'X', 'type': 'integer'},
'y': {'title': 'Y', 'type': 'string'},
'z': {'default': 3.14, 'title': 'Z', 'type': 'number'},
},
'required': ['x', 'y'],
'title': 'Foobar',
'type': 'object',
},
outer_typed_dict_key=None,
)
]
"""
```
Tools can optionally be defined with another function: `prepare`, which is called
at each step of a run to
customize the definition of the tool passed to the model, or omit the tool
completely from that step.
A `prepare` method can be registered via the `prepare` kwarg to any of the tool
registration mechanisms:
* [`@agent.tool`][pydantic_ai.Agent.tool] decorator
* [`@agent.tool_plain`][pydantic_ai.Agent.tool_plain] decorator
* [`Tool`][pydantic_ai.tools.Tool] dataclass
Here's a simple `prepare` method that only includes the tool if the value of the
dependency is `42`.
```python {title="tool_only_if_42.py"}
from typing import Union
agent = Agent('test')
@agent.tool(prepare=only_if_42)
def hitchhiker(ctx: RunContext[int], answer: str) -> str:
return f'{ctx.deps} {answer}'
result = agent.run_sync('testing...', deps=41)
print(result.data)
#> success (no tool calls)
result = agent.run_sync('testing...', deps=42)
print(result.data)
#> {"hitchhiker":"42 a"}
```
Here's a more complex example where we change the description of the `name` parameter based on the value of `deps`.

For the sake of variation, we create this tool using the [`Tool`][pydantic_ai.tools.Tool] dataclass.
```python {title="customize_name.py"}
from __future__ import annotations
```
------- ○ -------
troubleshooting.md
# Troubleshooting
Below are suggestions on how to fix some common errors you might encounter while
using PydanticAI. If the issue you're experiencing is not listed below or addressed
in the documentation, please feel free to ask in the [Pydantic Slack](help.md) or
create an issue on [GitHub](https://github.com/pydantic/pydantic-ai/issues).
This error is caused by conflicts between the event loop used by Jupyter notebook and the one used by PydanticAI. One way to manage these conflicts is to use [`nest-asyncio`](https://pypi.org/project/nest-asyncio/). Namely, before you execute any agent runs, do the following:
```python {test="skip"}
import nest_asyncio
nest_asyncio.apply()
```
Note: This fix also applies to Google Colab.
### `UserError: API key must be provided or set in the [MODEL]_API_KEY environment
variable`
If you're running into issues with setting the API key for your model, visit the
[Models](models.md) page to learn more about how to set an environment variable
and/or pass in an `api_key` argument.
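For example, assuming you're using OpenAI (the model name and key below are placeholders), you can pass the key directly to the model class rather than relying on the environment variable:
```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

# Pass the key explicitly instead of setting OPENAI_API_KEY.
model = OpenAIModel('gpt-4o', api_key='your-api-key')
agent = Agent(model)
```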
------- ○ -------
_worker.js
// cloudflare worker to build a warning if the docs are ahead of the latest release
// see https://developers.cloudflare.com/pages/functions/advanced-mode/
export default {
async fetch(request, env) {
const url = new URL(request.url)
if (url.pathname === '/version-warning.html') {
try {
const html = await versionWarning(request, env)
const headers = {
'Content-Type': 'text/plain',
'Cache-Control': 'max-age=2592000', // 30 days
}
return new Response(html, { headers })
} catch (e) {
console.error(e)
return new Response(
`Error getting ahead HTML: ${e}`,
{ status: 500, headers: {'Content-Type': 'text/plain'} }
)
}
} else {
return env.ASSETS.fetch(request)
}
},
}
if (ahead_by === 0) {
return `<div class="admonition success" style="margin: 0">
<p class="admonition-title">Version</p>
<p>Showing documentation for the latest release <a
href="${html_url}">${name}</a>.</p>
</div>`
}
------- ○ -------
.hooks\main.py
import re
import time
import urllib.parse
from pathlib import Path
def on_page_markdown(markdown: str, page: Page, config: Config, files: Files) ->
str:
"""Called on each file after it is read and before it is converted to HTML."""
markdown = replace_uv_python_run(markdown)
markdown = render_examples(markdown)
markdown = render_video(markdown)
return markdown
```bash
{prefix}{pip_base}{suffix}
```
=== "uv"
```bash
{prefix}{uv_base}{suffix}
```"""
return content
domain = 'https://customer-nmegqx24430okhaq.cloudflarestream.com'
poster = f'{domain}/{video_id}/thumbnails/thumbnail.jpg?time={time}&height=600'
return f"""
<div style="position: relative; padding-top: {padding_top}%;">
<iframe
src="{domain}/{video_id}/iframe?poster={urllib.parse.quote_plus(poster)}"
loading="lazy"
style="border: none; position: absolute; top: 0; left: 0; height: 100%; width:
100%;"
allow="accelerometer; gyroscope; autoplay; encrypted-media; picture-in-
picture;"
allowfullscreen="true"
></iframe>
</div>
"""
------- ○ -------
.overrides\main.html
{% extends "base.html" %}
{% block content %}
<div id="version-warning"></div>
<script>
fetch('/version-warning.html?v={{ build_timestamp }}').then(r => {
if (r.ok) {
r.text().then(text => { document.getElementById('version-
warning').innerHTML = text })
} else {
r.text().then(text => { console.error('failed to fetch ahead-
warning.html:', {r, text})})
}
})
</script>
{{ super() }}
{% endblock %}
------- ○ -------
.overrides\.icons\logfire\logo.svg
------- ○ -------
.partials\index-header.html
<div class="text-center">
<img class="index-header off-glb" src="./img/pydantic-ai-dark.svg#only-dark"
alt="PydanticAI">
</div>
<div class="text-center">
<img class="index-header off-glb" src="./img/pydantic-ai-light.svg#only-light"
alt="PydanticAI">
</div>
<p class="text-center">
<em>Agent Framework / shim to use Pydantic with LLMs</em>
</p>
<p class="text-center">
<a href="https://github.com/pydantic/pydantic-ai/actions/workflows/ci.yml?
query=branch%3Amain">
<img src="https://github.com/pydantic/pydantic-ai/actions/workflows/ci.yml/
badge.svg?event=push" alt="CI">
</a>
<a href="https://coverage-badge.samuelcolvin.workers.dev/redirect/pydantic/
pydantic-ai">
<img src="https://coverage-badge.samuelcolvin.workers.dev/pydantic/pydantic-
ai.svg" alt="Coverage">
</a>
<a href="https://pypi.python.org/pypi/pydantic-ai">
<img src="https://img.shields.io/pypi/v/pydantic-ai.svg" alt="PyPI">
</a>
<a href="https://github.com/pydantic/pydantic-ai">
<img src="https://img.shields.io/pypi/pyversions/pydantic-ai.svg"
alt="versions">
</a>
<a href="https://github.com/pydantic/pydantic-ai/blob/main/LICENSE">
<img src="https://img.shields.io/github/license/pydantic/pydantic-ai.svg"
alt="license">
</a>
</p>
<p class="text-emphasis">
PydanticAI is a Python agent framework designed to make it less painful to
build production grade applications with Generative AI.
</p>
------- ○ -------
api\agent.md
# `pydantic_ai.agent`
::: pydantic_ai.agent
------- ○ -------
api\exceptions.md
# `pydantic_ai.exceptions`
::: pydantic_ai.exceptions
------- ○ -------
api\messages.md
# `pydantic_ai.messages`
```mermaid
graph RL
SystemPromptPart(SystemPromptPart) --- ModelRequestPart
UserPromptPart(UserPromptPart) --- ModelRequestPart
ToolReturnPart(ToolReturnPart) --- ModelRequestPart
RetryPromptPart(RetryPromptPart) --- ModelRequestPart
TextPart(TextPart) --- ModelResponsePart
ToolCallPart(ToolCallPart) --- ModelResponsePart
ModelRequestPart("ModelRequestPart<br>(Union)") --- ModelRequest
ModelRequest("ModelRequest(parts=list[...])") --- ModelMessage
ModelResponsePart("ModelResponsePart<br>(Union)") --- ModelResponse
ModelResponse("ModelResponse(parts=list[...])") ---
ModelMessage("ModelMessage<br>(Union)")
```
::: pydantic_ai.messages
------- ○ -------
api\result.md
# `pydantic_ai.result`
::: pydantic_ai.result
options:
inherited_members: true
------- ○ -------
api\settings.md
# `pydantic_ai.settings`
::: pydantic_ai.settings
options:
inherited_members: true
members:
- ModelSettings
- UsageLimits
------- ○ -------
api\tools.md
# `pydantic_ai.tools`
::: pydantic_ai.tools
------- ○ -------
api\usage.md
# `pydantic_ai.usage`
::: pydantic_ai.usage
------- ○ -------
api\models\anthropic.md
# `pydantic_ai.models.anthropic`
## Setup
For details on how to set up authentication with this model, see [model
configuration for Anthropic](../../models.md#anthropic).
::: pydantic_ai.models.anthropic
------- ○ -------
api\models\base.md
# `pydantic_ai.models`
::: pydantic_ai.models
options:
members:
- KnownModelName
- Model
- AgentModel
- AbstractToolDefinition
- StreamTextResponse
- StreamStructuredResponse
- ALLOW_MODEL_REQUESTS
- check_allow_model_requests
- override_allow_model_requests
------- ○ -------
api\models\function.md
# `pydantic_ai.models.function`
[`FunctionModel`][pydantic_ai.models.function.FunctionModel] is similar to
[`TestModel`](test.md),
but allows greater control over the model's behavior.
Its primary use case is for more advanced unit testing than is possible with
`TestModel`.
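As a rough sketch (the `my_agent` and `model_logic` names and the test below are illustrative, not part of the API reference), a `FunctionModel` can be swapped in for an agent's real model inside a test:
```python
from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models.function import AgentInfo, FunctionModel

my_agent = Agent('openai:gpt-4o')


def model_logic(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    # Return a canned response instead of calling a real LLM.
    return ModelResponse(parts=[TextPart(content='hello world')])


def test_my_agent():
    with my_agent.override(model=FunctionModel(model_logic)):
        result = my_agent.run_sync('testing')
        assert result.data == 'hello world'
```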
::: pydantic_ai.models.function
------- ○ -------
api\models\gemini.md
# `pydantic_ai.models.gemini`
Despite these shortcomings, the Gemini model is actually quite powerful and very
fast.
## Setup
For details on how to set up authentication with this model, see [model
configuration for Gemini](../../models.md#gemini).
::: pydantic_ai.models.gemini
------- ○ -------
api\models\groq.md
# `pydantic_ai.models.groq`
## Setup
For details on how to set up authentication with this model, see [model
configuration for Groq](../../models.md#groq).
::: pydantic_ai.models.groq
------- ○ -------
api\models\mistral.md
# `pydantic_ai.models.mistral`
## Setup
For details on how to set up authentication with this model, see [model
configuration for Mistral](../../models.md#mistral).
::: pydantic_ai.models.mistral
------- ○ -------
api\models\ollama.md
# `pydantic_ai.models.ollama`
## Setup
For details on how to set up authentication with this model, see [model
configuration for Ollama](../../models.md#ollama).
With `ollama` installed, you can run the server with the model you want to use:
```bash {title="terminal-run-ollama"}
ollama run llama3.2
```
(this will pull the `llama3.2` model if you don't already have it downloaded)
```python {title="ollama_example.py"}
from pydantic import BaseModel
class CityLocation(BaseModel):
city: str
    country: str
```
```python {title="ollama_example_with_remote_server.py"}
from pydantic import BaseModel

from pydantic_ai.models.ollama import OllamaModel
ollama_model = OllamaModel(
model_name='qwen2.5-coder:7b', # (1)!
base_url='http://192.168.1.74:11434/v1', # (2)!
)
class CityLocation(BaseModel):
city: str
    country: str
```
::: pydantic_ai.models.ollama
------- ○ -------
api\models\openai.md
# `pydantic_ai.models.openai`
## Setup
For details on how to set up authentication with this model, see [model
configuration for OpenAI](../../models.md#openai).
::: pydantic_ai.models.openai
------- ○ -------
api\models\test.md
# `pydantic_ai.models.test`
::: pydantic_ai.models.test
------- ○ -------
api\models\vertexai.md
# `pydantic_ai.models.vertexai`
## Setup
For details on how to set up authentication with this model as well as a comparison
with the `generativelanguage.googleapis.com` API used by [`GeminiModel`]
[pydantic_ai.models.gemini.GeminiModel],
see [model configuration for Gemini via VertexAI](../../models.md#gemini-via-
vertexai).
## Example Usage
With the default google project already configured in your environment using
"application default credentials":
```python {title="vertex_example_env.py"}
from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel
model = VertexAIModel('gemini-1.5-flash')
agent = Agent(model)
result = agent.run_sync('Tell me a joke.')
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
```
```python {title="vertex_example_service_account.py"}
from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel
model = VertexAIModel(
'gemini-1.5-flash',
service_account_file='path/to/service-account.json',
)
agent = Agent(model)
result = agent.run_sync('Tell me a joke.')
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
```
::: pydantic_ai.models.vertexai
------- ○ -------
examples\bank-support.md
Small but complete example of using PydanticAI to build a support agent for a bank.
Demonstrates:
```bash
python/uv-run -m pydantic_ai_examples.bank_support
```
## Example Code
```python {title="bank_support.py"}
#! examples/pydantic_ai_examples/bank_support.py
```
------- ○ -------
examples\chat-app.md
Demonstrates:
This demonstrates storing chat history between requests and using it to give the
model context for new responses.
Most of the complex logic here is between `chat_app.py` which streams the response
to the browser,
and `chat_app.ts` which renders messages in the browser.
```bash
python/uv-run -m pydantic_ai_examples.chat_app
```
TODO screenshot.
## Example Code
```html {title="chat_app.html"}
#! examples/pydantic_ai_examples/chat_app.html
```
TypeScript to handle rendering the messages; to keep this simple (and at the risk of offending frontend developers), the TypeScript code is passed to the browser as plain text and transpiled in the browser.
```ts {title="chat_app.ts"}
#! examples/pydantic_ai_examples/chat_app.ts
```
------- ○ -------
examples\flight-booking.md
Example of a multi-agent flow where one agent delegates work to another, then hands
off control to a third agent.
Demonstrates:
* [agent delegation](../multi-agent-applications.md#agent-delegation)
* [programmatic agent hand-off](../multi-agent-applications.md#programmatic-agent-
hand-off)
* [usage limits](../agents.md#usage-limits)
In this scenario, a group of agents work together to find the best flight for a
user.
```mermaid
graph TD
START --> search_agent("search agent")
search_agent --> extraction_agent("extraction agent")
extraction_agent --> search_agent
search_agent --> human_confirm("human confirm")
human_confirm --> search_agent
search_agent --> FAILED
human_confirm --> find_seat_function("find seat function")
find_seat_function --> human_seat_choice("human seat choice")
human_seat_choice --> find_seat_agent("find seat agent")
find_seat_agent --> find_seat_function
find_seat_function --> buy_flights("buy flights")
buy_flights --> SUCCESS
```
## Example Code
```python {title="flight_booking.py"}
#! examples/pydantic_ai_examples/flight_booking.py
```
------- ○ -------
examples\index.md
# Examples
## Usage
These examples are distributed with `pydantic-ai` so you can run them either by
cloning the [pydantic-ai repo](https://github.com/pydantic/pydantic-ai) or by
simply installing `pydantic-ai` from PyPI with `pip` or `uv`.
Either way you'll need to install extra dependencies to run some examples: just install the `examples` optional dependency group.
If you've installed `pydantic-ai` via pip/uv, you can install the extra
dependencies with:
```bash
pip/uv-add 'pydantic-ai[examples]'
```
If you clone the repo, you should instead use `uv sync --extra examples` to install
extra dependencies.
These examples will need you to set up authentication with one or more of the LLMs,
see the [model configuration](../models.md) docs for details on how to do this.
TL;DR: in most cases you'll need to set one of the following environment variables:
=== "OpenAI"
```bash
export OPENAI_API_KEY=your-api-key
```
=== "Gemini"
```bash
export GEMINI_API_KEY=your-api-key
```
### Running Examples
To run the examples (this will work whether you installed `pydantic_ai`, or cloned
the repo), run:
```bash
python/uv-run -m pydantic_ai_examples.<example_module_name>
```
```bash
python/uv-run -m pydantic_ai_examples.pydantic_model
```
If you like one-liners and you're using uv, you can run a pydantic-ai example with
zero setup:
```bash
OPENAI_API_KEY='your-api-key' \
uv run --with 'pydantic-ai[examples]' \
-m pydantic_ai_examples.pydantic_model
```
---
You'll probably want to edit examples in addition to just running them. You can
copy the examples to a new directory with:
```bash
python/uv-run -m pydantic_ai_examples --copy-to examples/
```
------- ○ -------
examples\pydantic-model.md
# Pydantic Model
Simple example of using PydanticAI to construct a Pydantic model from a text input.
Demonstrates:
* [structured `result_type`](../results.md#structured-result-validation)
```bash
python/uv-run -m pydantic_ai_examples.pydantic_model
```
This example uses `openai:gpt-4o` by default, but it works well with other models, e.g. you can run it with Gemini using:
```bash
PYDANTIC_AI_MODEL=gemini-1.5-pro python/uv-run -m
pydantic_ai_examples.pydantic_model
```
## Example Code
```python {title="pydantic_model.py"}
#! examples/pydantic_ai_examples/pydantic_model.py
```
------- ○ -------
examples\rag.md
# RAG
RAG search example. This demo allows you to ask questions of the [logfire](https://pydantic.dev/logfire) documentation.
Demonstrates:
* [tools](../tools.md)
* [agent dependencies](../dependencies.md)
* RAG search
The logic for extracting sections from the markdown files, along with a JSON file of that data, is available in [this gist](https://gist.github.com/samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992).
```bash
mkdir postgres-data
docker run --rm \
-e POSTGRES_PASSWORD=postgres \
-p 54320:5432 \
-v `pwd`/postgres-data:/var/lib/postgresql/data \
pgvector/pgvector:pg17
```
(Note: building the database doesn't use PydanticAI right now; instead it uses the OpenAI SDK directly.)
```bash
python/uv-run -m pydantic_ai_examples.rag search "How do I configure logfire to
work with FastAPI?"
```
## Example Code
```python {title="rag.py"}
#! examples/pydantic_ai_examples/rag.py
```
------- ○ -------
examples\sql-gen.md
# SQL Generation
Example demonstrating how to use PydanticAI to generate SQL queries based on user
input.
Demonstrates:
```bash
docker run --rm -e POSTGRES_PASSWORD=postgres -p 54320:5432 postgres
```
_(we run postgres on port `54320` to avoid conflicts with any other postgres
instances you may have running)_
```bash
python/uv-run -m pydantic_ai_examples.sql_gen
```
```bash
python/uv-run -m pydantic_ai_examples.sql_gen "find me errors"
```
This example uses `gemini-1.5-flash` by default, since Gemini is good at single-shot queries of this kind.
## Example Code
```python {title="sql_gen.py"}
#! examples/pydantic_ai_examples/sql_gen.py
```
------- ○ -------
examples\stream-markdown.md
This example shows how to stream markdown from an agent, using the [`rich`]
(https://github.com/Textualize/rich) library to highlight the output in the
terminal.
It'll run the example with both OpenAI and Google Gemini models if the required
environment variables are set.
Demonstrates:
```bash
python/uv-run -m pydantic_ai_examples.stream_markdown
```
## Example Code
```python
#! examples/pydantic_ai_examples/stream_markdown.py
```
------- ○ -------
examples\stream-whales.md
Demonstrates:
This script streams structured responses from GPT-4 about whales, validates the
data
and displays it as a dynamic table using
[`rich`](https://github.com/Textualize/rich) as the data is received.
{{ video('53dd5e7664c20ae90ed90ae42f606bf3', 25) }}
## Example Code
```python {title="stream_whales.py"}
#! examples/pydantic_ai_examples/stream_whales.py
```
------- ○ -------
examples\weather-agent.md
Example of PydanticAI with multiple tools which the LLM needs to call in turn to
answer a question.
Demonstrates:
* [tools](../tools.md)
* [agent dependencies](../dependencies.md)
* [streaming text responses](../results.md#streaming-text)
In this case the idea is a "weather" agent: the user can ask for the weather in multiple locations; the agent will use the `get_lat_lng` tool to get the latitude and longitude of the locations, then use the `get_weather` tool to get the weather for those locations.
To run this example properly, you might want to add two extra API keys **(Note: if either key is missing, the code will fall back to dummy data, so they're not required)**:
```bash
python/uv-run -m pydantic_ai_examples.weather_agent
```
## Example Code
```python {title="pydantic_ai_examples/weather_agent.py"}
#! examples/pydantic_ai_examples/weather_agent.py
```
------- ○ -------
extra\tweaks.css
.hide {
display: none;
}
.text-center {
text-align: center;
}
img.index-header {
width: 70%;
max-width: 500px;
}
.pydantic-pink {
color: #FF007F;
}
.team-blue {
color: #0072CE;
}
.secure-green {
color: #00A86B;
}
.shapes-orange {
color: #FF7F32;
}
.puzzle-purple {
color: #652D90;
}
.wheel-gray {
color: #6E6E6E;
}
.vertical-middle {
vertical-align: middle;
}
.text-emphasis {
font-size: 1rem;
font-weight: 300;
font-style: italic;
}
#version-warning {
min-height: 120px;
margin-bottom: 10px;
}
.mermaid {
text-align: center;
}
------- ○ -------
img\logfire-monitoring-pydanticai.png
------- ○ -------
img\logfire-weather-agent.png
------- ○ -------
img\logo-white.svg
------- ○ -------
img\pydantic-ai-dark.svg
------- ○ -------
img\pydantic-ai-light.svg
------- ○ -------