Building Enterprise-Grade AI Agents with Pydantic AI: Outputting Structured Data
This guide demonstrates how to create AI agents using Pydantic AI that output structured data in the form of Pydantic models. This approach significantly improves reliability and predictability, making Pydantic AI ideal for enterprise-grade AI applications.
Why Structured Data Matters
In enterprise applications, working with reliable, predictable, and consistent data is paramount. When AI agents, especially those powered by LLMs, output unstructured text, it can introduce significant challenges. Pydantic AI, by enabling agents to directly output structured data, addresses these challenges head-on:
- Reliability: Strongly typed, validated data objects, defined by Pydantic models, are crucial for building robust enterprise applications: they provide a contract for the data your agents produce.
- Predictability: By enforcing structured output, Pydantic AI dramatically reduces unexpected errors and crashes that can arise from the often unpredictable nature of raw, unfiltered LLM outputs. You gain control over the shape and type of data your agents return.
- Data Integrity: Structured data ensures data consistency across your AI-driven systems. This consistency is vital for seamless downstream processing, analysis, and integration with other enterprise systems and databases.
Let's explore practical examples of how to build Pydantic AI agents that output structured data.
Example 1: Hello World (Simple Calculation)
This first example demonstrates a very basic agent that performs a simple calculation and outputs the result as a structured Pydantic model.
Import Libraries:
from pydantic import BaseModel
from pydantic_ai import Agent
Configure Logging (Optional):
For robust enterprise applications, detailed logging is essential. You can use an observability framework like Logfire for detailed tracking of your agent’s execution; full setup is beyond the scope of this guide, but it is a valuable consideration for production systems.
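As a minimal sketch of what that setup might look like, assuming the `logfire` package is installed and you have already authenticated against a Logfire project (the exact instrumentation hook can vary between versions, so verify it against the docs for your installed version):

```python
import logfire

# Send traces to Logfire; assumes prior `logfire auth` / project setup.
logfire.configure()

# Newer logfire releases ship a Pydantic AI instrumentation helper; older ones
# may need a different hook, so check your version's documentation.
logfire.instrument_pydantic_ai()
```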
Define the Output Model:
class Calculation(BaseModel):
    result: int
Here, we define a Pydantic model named `Calculation`. This model is incredibly simple, containing a single field, `result`, which is an integer. It will structure the output of our agent.
Create the Agent:
agent = Agent('openai:gpt-4-0613', result_type=Calculation)
This line initializes our Pydantic AI Agent. Crucially, we pass two arguments:
- `'openai:gpt-4-0613'`: This specifies the underlying LLM we want to use; in this case, `gpt-4-0613` from OpenAI. You can replace this with your preferred model name (e.g., a model from Gemini, Anthropic, or even a locally run model). Make sure you have the necessary API keys or configuration set up for your chosen model (see the sketch after this list).
- `result_type=Calculation`: This is the key to structured output! We explicitly tell the Agent that we expect it to return data that conforms to our `Calculation` Pydantic model. Pydantic AI will handle the process of guiding the LLM to produce output that can be validated against this model.
Run the Agent and Get the Result:
result = await agent.run("What is 100 + 300?")
print(result)
print(result.data)
- `await agent.run("What is 100 + 300?")`: This line executes the agent. We use `agent.run()` for asynchronous execution, passing the prompt “What is 100 + 300?”. Pydantic AI takes care of sending this prompt to the configured LLM.
- `print(result)`: This will print the full `RunResult` object, which contains metadata about the agent run in addition to the structured data.
- `print(result.data)`: This is where we access the structured output! `result.data` will contain an instance of our `Calculation` Pydantic model, populated with the LLM’s response and validated by Pydantic. In this case, it will be a `Calculation` object where `result` is likely set to `400`.
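Putting the pieces together, here is one way the whole example might run as a standalone script. This is a sketch that wraps the call in `asyncio.run()`; in a notebook you can simply `await agent.run(...)` as shown above, or use `agent.run_sync()` to avoid the event-loop boilerplate:

```python
import asyncio

from pydantic import BaseModel
from pydantic_ai import Agent


class Calculation(BaseModel):
    result: int


agent = Agent('openai:gpt-4-0613', result_type=Calculation)


async def main() -> None:
    result = await agent.run("What is 100 + 300?")
    print(result)       # the full run result, including metadata
    print(result.data)  # the validated Calculation instance, e.g. Calculation(result=400)


if __name__ == "__main__":
    asyncio.run(main())
```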
Example 2: Capital Cities (Multi-field Model)
Let’s move to a slightly more complex example. Here, we’ll create a Pydantic model with multiple fields to represent information about capital cities.
Import Libraries:
from pydantic import BaseModel
from pydantic_ai import Agent
Define the Output Model:
class Capital(BaseModel):
    name: str
    year_founded: int
    short_history: str
This time, our `Capital` Pydantic model has three fields:
- `name`: The name of the capital city (string).
- `year_founded`: The year the city was founded (integer).
- `short_history`: A brief historical description of the city (string).
Create the Agent:
agent = Agent('openai:gpt-4-0613', result_type=Capital)
We create the Agent in the same way as before, but now we set `result_type=Capital`, indicating that we expect structured output conforming to the `Capital` model.
Run the Agent:
result = await agent.run("What is the capital of the United States?")
print(result)
print(result.data)
We run the agent with the prompt “What is the capital of the United States?”. Pydantic AI, guided by `result_type=Capital`, will instruct the LLM to provide information that can be parsed and validated into a `Capital` object. `result.data` will then contain a `Capital` instance, with `name` set to “Washington, D.C.”, `year_founded` to 1790 (or thereabouts), and `short_history` containing a brief history of the city.
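Because `result.data` is an ordinary Pydantic model instance, you can work with its fields directly. A small sketch of what that access might look like:

```python
capital = result.data  # a validated Capital instance

print(capital.name)          # e.g. "Washington, D.C."
print(capital.year_founded)  # e.g. 1790
print(capital.short_history)

# Pydantic also makes it easy to serialize the result for downstream systems.
print(capital.model_dump_json(indent=2))
```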
Example 3: Invoice Parsing
Now let’s tackle a more practical enterprise use case: parsing unstructured invoice data and extracting it into a structured format.
Prepare Invoice Data:
For this example, we’ll assume you have a sample invoice available, perhaps in a text-based format like Markdown, or even as a raw text extract. Let’s imagine our `invoice.md` contains invoice information structured like this (or even less structured raw text):
## Invoice
**Invoice Number:** INV-2024-10-27
**Date Issued:** 2024-10-27
### Services Provided:
- Consulting Services - 10 hours
- Project Management - 5 hours
**Subtotal:** $1500.00
**Tax Rate:** 0.08
**Tax Amount:** $120.00
**Total Amount Due:** $1620.00
### Payment Instructions:
Bank Name: Example Bank
Account Number: 1234567890
Generate Pydantic Model (Optional - Using an LLM):
While you can manually define the Pydantic model for an invoice, for complex documents, you can leverage an LLM itself to help you generate the model structure! You could prompt an LLM like GPT-4 (or a similar model) with a sample invoice and ask it to “generate a Pydantic model in Python to represent this invoice data.” This can be a great starting point, which you can then refine.
Import Libraries:
from pydantic import BaseModel, Field
from typing import List, Dict
from pydantic_ai import Agent
Paste/Define the Pydantic Model:
class Invoice(BaseModel):
    invoice_number: str
    date_issued: str
    services_provided: List[str]
    subtotal: float
    tax_rate: float
    tax_amount: float
    total_amount_due: float
    payment_instructions: Dict[str, str]
Here’s our `Invoice` Pydantic model, designed to capture the key fields from an invoice:
- `invoice_number`: Invoice identifier (string).
- `date_issued`: Date of invoice issuance (string; for real-world applications you might want to use `datetime.date` with appropriate parsing, as in the sketch after this list).
- `services_provided`: A list of strings describing services rendered.
- `subtotal`: The subtotal amount (float).
- `tax_rate`: The tax rate applied (float).
- `tax_amount`: The calculated tax amount (float).
- `total_amount_due`: The final total amount due (float).
- `payment_instructions`: A dictionary to hold payment details (e.g., bank name, account number).
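As noted for `date_issued`, production code often wants stricter types. A possible variant (the `StrictInvoice` name and the constraints are illustrative assumptions, not part of the original example) that parses the date and rejects negative amounts:

```python
from datetime import date
from typing import Dict, List

from pydantic import BaseModel, Field


class StrictInvoice(BaseModel):
    invoice_number: str
    date_issued: date                    # Pydantic parses ISO strings like "2024-10-27"
    services_provided: List[str]
    subtotal: float = Field(ge=0)        # disallow negative amounts
    tax_rate: float = Field(ge=0, le=1)  # expressed as a fraction, e.g. 0.08
    tax_amount: float = Field(ge=0)
    total_amount_due: float = Field(ge=0)
    payment_instructions: Dict[str, str]
```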
Create the Agent:
agent = Agent('openai:gpt-4-0613', result_type=Invoice)
We set up our agent, specifying `result_type=Invoice`.
Run with Textual Prompt (First Run):
result = await agent.run("Parse the following invoice and extract the details:\n\n[Invoice Text Here]")  # Replace [Invoice Text Here] with the actual invoice text
print(result)
print(result.data)
Replace `[Invoice Text Here]` with the actual text content of your invoice (e.g., read from `invoice.md`). The prompt instructs the agent to parse the invoice. Pydantic AI will guide the LLM to process the invoice text and attempt to extract the information into the `Invoice` Pydantic model structure. `result.data` will then contain a validated `Invoice` object, if successful.
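A sketch of what that first run might look like end to end, assuming the sample invoice lives in a local `invoice.md` file:

```python
from pathlib import Path

invoice_text = Path("invoice.md").read_text()

result = await agent.run(
    f"Parse the following invoice and extract the details:\n\n{invoice_text}"
)

invoice = result.data            # a validated Invoice instance
print(invoice.invoice_number)    # e.g. "INV-2024-10-27"
print(invoice.total_amount_due)  # e.g. 1620.0
```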
Example 4: Resume Parsing
Let’s extend this to another common enterprise task: resume parsing.
Prepare Resume Data:
Similar to the invoice example, prepare a sample resume, perhaps in Markdown format (e.g., `resume.md`).
Generate Pydantic Model (Optional):
Again, consider using an LLM to assist in generating a starting Pydantic model structure based on a sample resume’s content.
Import Libraries:
from pydantic import BaseModel, Field
from typing import List, Optional
from pydantic_ai import Agent
Paste/Define the Pydantic Model:
class Experience(BaseModel):
    company: str
    position: str
    start_date: str
    end_date: str

class Education(BaseModel):
    institution_name: str
    degree: str
    start_date: str
    end_date: str

class Resume(BaseModel):
    full_name: str
    contact_information: str
    experience: List[Experience]
    education: List[Education]
    certifications: List[str]
    skills: List[str]
    summary: str = Field("")  # optional field with a default empty string
This example uses nested Pydantic models to represent the hierarchical structure of a resume:
- `Experience`: Model for representing a work experience entry (see the sketch after this list for making `end_date` optional).
- `Education`: Model for representing an education entry.
- `Resume`: The main `Resume` model, containing fields for:
  - `full_name`, `contact_information` (strings)
  - `experience`: A list of `Experience` models.
  - `education`: A list of `Education` models.
  - `certifications`, `skills`: Lists of strings.
  - `summary`: An optional summary field (string, using `Field("")` to set a default empty string).
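As mentioned in the `Experience` bullet above, one refinement is to make `end_date` optional so that current positions (with no end date listed on the resume) still validate; this also puts the `Optional` import to use. A sketch of that variant:

```python
from typing import Optional

from pydantic import BaseModel


class Experience(BaseModel):
    company: str
    position: str
    start_date: str
    end_date: Optional[str] = None  # None for a current position with no end date
```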
Create the Agent:
agent = Agent('openai:gpt-4-0613', result_type=Resume, system_prompt="You are an HR representative skilled at extracting information from resumes and CVs. Extract the key details into structured JSON format.")
Here, when creating the Agent, we not only set `result_type=Resume` but also provide a `system_prompt`. System prompts are crucial for guiding the LLM’s behavior. In this case, the system prompt sets the persona of the agent as an “HR representative” and instructs it to extract resume information into a “structured JSON format” (which Pydantic AI will then validate against our `Resume` model).
Run with Textual Prompt (First Run):
result = await agent.run("Parse the following resume:\n\n[Resume Text Here]")  # Replace [Resume Text Here] with the actual resume text
print(result)
print(result.data)
Replace `[Resume Text Here]` with the resume text. The agent will attempt to parse the resume and output a `Resume` Pydantic model instance in `result.data`.
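As with the invoice, a sketch of reading the resume from disk and then walking the nested structure:

```python
from pathlib import Path

resume_text = Path("resume.md").read_text()

result = await agent.run(f"Parse the following resume:\n\n{resume_text}")
resume = result.data  # a validated Resume instance

print(resume.full_name)
for job in resume.experience:
    print(f"{job.position} at {job.company} ({job.start_date} - {job.end_date})")
```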
Example 5: Real Estate Listing Parsing
For our final example, let’s consider parsing data from a real estate listing webpage. This demonstrates how Pydantic AI agents can interact with real-world web data.
Choose a Listing Page:
Select a real estate listing webpage from a site like homes.com, Redfin, or Zillow. You’ll need the URL of this page.
Generate Pydantic Model (Optional):
As before, consider using an LLM to help generate a Pydantic model structure that captures the key information you want to extract from real estate listings.
Import Libraries:
from pydantic import BaseModel, Field
from typing import List, Optional
from pydantic_ai import Agent
from playwright.async_api import async_playwright
from bs4 import BeautifulSoup
Here, we import additional libraries:
- `playwright.async_api`: Playwright is a powerful library for browser automation, allowing us to fetch the HTML content of webpages.
- `bs4.BeautifulSoup`: BeautifulSoup is a Python library for parsing HTML and XML; we’ll use it at the end of this example to trim the fetched page down to its visible text.
Paste/Define the Pydantic Model:
class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str

class PropertyFeatures(BaseModel):
    bedrooms: int
    bathrooms: float
    square_footage: int
    # Add other features as needed (e.g., lot_size, year_built, etc.)

class AdditionalInformation(BaseModel):
    price: str
    listing_agent: str
    last_updated: str

class Property(BaseModel):
    address: Address
    info: AdditionalInformation
    type: str
    mls_id: str
    features: PropertyFeatures
We define a set of nested Pydantic models to represent real estate property data:
- `Address`: Model for the property address.
- `PropertyFeatures`: Model for key property features (bedrooms, bathrooms, square footage, etc.; you can extend this with more features).
- `AdditionalInformation`: Model for listing-specific information (price, agent, last updated date).
- `Property`: The main `Property` model, combining `Address`, `AdditionalInformation`, and `PropertyFeatures`, and also including `type` (e.g., “House”, “Condo”) and `mls_id`.
Create the Agent:
agent = Agent('openai:gpt-4-0613', result_type=Property, system_prompt="You are a real estate data extraction specialist. Your task is to extract structured information from real estate listings.")
We create an Agent with `result_type=Property` and a system prompt that sets the agent’s role as a “real estate data extraction specialist.”
Run with Textual Prompt (First Run):
async with async_playwright() as p:
    browser = await p.chromium.launch()
    page = await browser.new_page()
    listing_url = "YOUR_REAL_ESTATE_LISTING_URL_HERE"  # Replace with a real listing URL
    await page.goto(listing_url)
    html_content = await page.content()
    await browser.close()

prompt_text = f"Extract details from this real estate listing:\n\n{html_content}"
agent_result = await agent.run(prompt_text)
print(agent_result)
print(agent_result.data)
In this example, we first use Playwright to:
- Launch a Chromium browser instance.
- Open a new page and navigate to `YOUR_REAL_ESTATE_LISTING_URL_HERE` (replace this with a real URL).
- Get the HTML content of the page using `page.content()`.
- Close the browser.

Then, we construct a `prompt_text` that includes the fetched HTML content and run the agent using `await agent.run(prompt_text)`. Pydantic AI will guide the LLM to parse the HTML and extract property information into the `Property` Pydantic model. `agent_result.data` will contain the structured `Property` object, if successful.
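Listing pages can be enormous, and sending raw HTML to the LLM is token-expensive. This is where the earlier `BeautifulSoup` import comes in: a sketch of stripping the page down to its visible text before building the prompt (the cleanup needed will vary by site):

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(html_content, "html.parser")

# Drop script/style noise so only the listing's visible text remains.
for tag in soup(["script", "style", "noscript"]):
    tag.decompose()
listing_text = soup.get_text(separator="\n", strip=True)

prompt_text = f"Extract details from this real estate listing:\n\n{listing_text}"
agent_result = await agent.run(prompt_text)
print(agent_result.data)
```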
Key Takeaways and Further Steps
Building AI agents with Pydantic AI to output structured data opens up a world of possibilities for enterprise applications. Here are some key takeaways and directions for further exploration:
- System Prompts are Powerful: Experiment extensively with different system prompts. Well-crafted system prompts are crucial for fine-tuning your agent’s behavior and ensuring it understands its task and output format.
- Robust Error Handling is Essential: Implement error handling using `try-except` blocks. Be prepared to handle cases where the LLM might not always produce output that perfectly conforms to your Pydantic model. Pydantic’s validation will raise `ValidationError` exceptions when the output doesn’t match the model, which you can catch and handle gracefully (see the sketch after this list).
- Integrate into Data Pipelines: These structured-data-outputting agents are ideal for integration into larger data pipelines and workflows. Use them to process data at scale, feed structured information into databases, trigger downstream processes, and build complex AI-driven systems.
- Model Choice Matters: The choice of the underlying LLM is critical for performance and accuracy. Thoroughly test with various models (GPT-4, Gemini Pro, Claude 3, etc.) to determine which best fulfills the specific needs of each project and use case.
- Security Best Practices: Always prioritize security. Never hardcode API keys directly into your code! Use environment variables or secure configuration management to handle API keys and sensitive credentials.
- Iterative Refinement is Key: The process of defining Pydantic models, crafting effective prompts, and achieving robust structured output is often iterative. Expect to refine your models, prompts, and agent logic based on the results you observe and the specific challenges of your data and tasks.
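As a sketch of the error-handling point above, reusing the invoice agent and `invoice_text` from Example 3 (the exact exception surfaced can depend on your pydantic-ai version, so treat this as a starting pattern rather than the definitive one):

```python
from pydantic import ValidationError

try:
    result = await agent.run(f"Parse the following invoice:\n\n{invoice_text}")
    invoice = result.data
except ValidationError as exc:
    # The model's output could not be coerced into the Invoice schema.
    print(f"Structured output failed validation: {exc}")
except Exception as exc:
    # Depending on the pydantic-ai version, other failures (retries exhausted,
    # unexpected model behavior, network errors) may surface as different exceptions.
    print(f"Agent run failed: {exc}")
```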
This comprehensive guide, with its detailed steps and practical examples, provides a solid foundation for building enterprise-grade AI agents that produce reliable and structured data using Pydantic AI. Remember to adapt these code examples to your specific use cases, data structures, and enterprise requirements. By leveraging the power of Pydantic AI and structured data output, you can unlock the true potential of AI agents in your organization.