Pydantic AI Part 3

Building Enterprise-Grade AI Agents with Pydantic AI: Outputting Structured Data

This guide demonstrates how to create AI agents using Pydantic AI that output structured data in the form of Pydantic models. This approach significantly improves reliability and predictability, making Pydantic AI ideal for enterprise-grade AI applications.

Why Structured Data Matters

In enterprise applications, dealing with data that is reliable, predictable, and consistent is paramount. When AI agents, especially those powered by LLMs, output unstructured text, it can introduce significant challenges. Pydantic AI, by enabling agents to directly output structured data, addresses these challenges head-on:

  • Reliability: Strongly-typed, validated data objects, defined by Pydantic models, are absolutely crucial for building robust enterprise applications. They provide a contract for the data your agents produce.
  • Predictability: By enforcing structured output, Pydantic AI dramatically reduces unexpected errors and crashes that can arise from the often unpredictable nature of raw, unfiltered LLM outputs. You gain control over the shape and type of data your agents return.
  • Data Integrity: Structured data ensures data consistency across your AI-driven systems. This consistency is vital for seamless downstream processing, analysis, and integration with other enterprise systems and databases.

Let's explore practical examples of how to build Pydantic AI agents that output structured data.

Example 1: Hello World (Simple Calculation)

This first example demonstrates a very basic agent that performs a simple calculation and outputs the result as a structured Pydantic model.

Import Libraries:

from pydantic import BaseModel
from pydantic_ai import Agent

Configure Logging (Optional):

For robust enterprise applications, detailed logging is essential. You can use a logging framework like Logfire for detailed tracking of your agent’s execution. (Note: Logfire implementation details are not shown in this extract, but it’s a valuable consideration for production systems).

Define the Output Model:

class Calculation(BaseModel):
    result: int

Here, we define a Pydantic model named Calculation. This model is incredibly simple, containing a single field: result, which is an integer. This model will structure the output of our agent.
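To make the idea of a data contract concrete, here is a minimal sketch of validating against Calculation by hand. It assumes Pydantic v2 (for model_validate_json), and the two JSON payloads are made-up stand-ins for text an LLM might return. Pydantic AI runs a comparable validation step for you, so this only illustrates the principle.

```python
from pydantic import BaseModel, ValidationError


class Calculation(BaseModel):
    result: int


# Hypothetical raw payloads, standing in for text an LLM might return.
good_payload = '{"result": 400}'
bad_payload = '{"result": "four hundred"}'

calculation = Calculation.model_validate_json(good_payload)
print(calculation.result)

try:
    Calculation.model_validate_json(bad_payload)
    rejected = False
except ValidationError as exc:
    # The exception records each offending field and the reason it failed.
    rejected = True
    print(exc.error_count(), "validation error(s)")
```

The second payload never becomes a Calculation object, which is the guarantee downstream code relies on.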

Create the Agent:

agent = Agent('openai:gpt-4-0613', result_type=Calculation)

This line initializes our Pydantic AI Agent. Crucially, we pass two arguments:

  • 'openai:gpt-4-0613': This specifies the underlying LLM model we want to use. In this case, we’re using gpt-4-0613 from OpenAI. You can replace this with your preferred model name (e.g., a model from Gemini, Anthropic, or even a locally run model). Make sure you have the necessary API keys or configurations set up for your chosen model.
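  • result_type=Calculation: This is the key to structured output. We tell the Agent that we expect it to return data that conforms to our Calculation Pydantic model, and Pydantic AI handles guiding the LLM to produce output that can be validated against it. (Newer Pydantic AI releases rename this parameter to output_type and the result attribute data to output; the examples here use the older names, so adjust them to match your installed version.)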
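One way to picture that guidance: a Pydantic model can be exported as a JSON Schema, which is the kind of description structured-output tooling typically passes to an LLM. The sketch below only prints the schema and assumes Pydantic v2. How Pydantic AI transmits it to a particular provider is not shown here.

```python
import json

from pydantic import BaseModel


class Calculation(BaseModel):
    result: int


# Export the model as a JSON Schema dictionary.
schema = Calculation.model_json_schema()
print(json.dumps(schema, indent=2))
```

The schema lists result as a required integer property, which is exactly the shape the validated output must have.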

Run the Agent and Get the Result:

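result = await agent.run("What is 100 + 300?")
print(result)
print(result.data)

await agent.run("What is 100 + 300?"): This line executes the agent. agent.run() is a coroutine, so it must be awaited; a bare top-level await like this works in a notebook or inside an async function, and agent.run_sync() is the synchronous equivalent for plain scripts. The prompt is passed as the first positional argument (the parameter is named user_prompt, so prompt=... would raise a TypeError). Pydantic AI takes care of sending this prompt to the configured LLM.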

print(result): This will print the full RunResult object, which contains metadata about the agent run in addition to the structured data.

print(result.data): This is where we access the structured output! result.data will contain an instance of our Calculation Pydantic model, populated with the LLM’s response, validated by Pydantic. In this case, it will be a Calculation object where result is likely set to 400.
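The top-level await in the snippet above assumes a notebook or another running event loop. In a regular script, the usual pattern is to wrap the call in an async function and start it with asyncio.run. The sketch below shows only that wrapping: run_agent_stub is a hypothetical stand-in for await agent.run(...), so the example runs without an API key or network access.

```python
import asyncio

from pydantic import BaseModel


class Calculation(BaseModel):
    result: int


async def run_agent_stub(prompt: str) -> Calculation:
    # Hypothetical stand-in for `await agent.run(prompt)`;
    # a real call would contact the configured LLM.
    await asyncio.sleep(0)
    return Calculation(result=400)


async def main() -> Calculation:
    return await run_agent_stub("What is 100 + 300?")


# asyncio.run creates an event loop, runs main(), and closes the loop.
calculation = asyncio.run(main())
print(calculation.result)
```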

Example 2: Capital Cities (Multi-field Model)

Let’s move to a slightly more complex example. Here, we’ll create a Pydantic model with multiple fields to represent information about capital cities.

Import Libraries:

from pydantic import BaseModel
from pydantic_ai import Agent

Define the Output Model:

class Capital(BaseModel):
    name: str
    year_founded: int
    short_history: str

This time, our Capital Pydantic model has three fields:

  • name: The name of the capital city (string).
  • year_founded: The year the city was founded (integer).
  • short_history: A brief historical description of the city (string).

Create the Agent:

agent = Agent('openai:gpt-4-0613', result_type=Capital)

We create the Agent in the same way as before, but now we set result_type=Capital, indicating that we expect structured output conforming to the Capital model.

Run the Agent:

result = await agent.run("What is the capital of the United States?")
print(result)
print(result.data)

We run the agent with the prompt “What is the capital of the United States?”. Pydantic AI, guided by the result_type=Capital, will instruct the LLM to provide information that can be parsed and validated into a Capital object. result.data will then contain a Capital instance, with fields like name set to “Washington, D.C.”, year_founded to 1790 (or thereabouts), and short_history containing a brief history of the city.
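Because the LLM fills in every field, it can help to attach descriptions and simple bounds to the model. The sketch below is one possible variation on Capital, assuming Pydantic v2. The descriptions and the year bounds are illustrative assumptions, not part of the original model. Field descriptions are included in the model's JSON Schema, so they can also serve as hints about what each field should contain.

```python
from pydantic import BaseModel, Field, ValidationError


class Capital(BaseModel):
    name: str = Field(description="Name of the capital city")
    # The bounds below are illustrative assumptions, chosen to catch obvious typos.
    year_founded: int = Field(ge=-5000, le=2100, description="Year the city was founded")
    short_history: str = Field(description="Two or three sentences of history")


capital = Capital(
    name="Washington, D.C.",
    year_founded=1790,
    short_history="Established by the Residence Act as the seat of the U.S. federal government.",
)
print(capital.year_founded)

try:
    # An implausible year (extra digit) violates the upper bound.
    Capital(name="Washington, D.C.", year_founded=17900, short_history="Typo in the year.")
    out_of_range_rejected = False
except ValidationError:
    out_of_range_rejected = True
print(out_of_range_rejected)
```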

Example 3: Invoice Parsing

Now let’s tackle a more practical enterprise use case: parsing unstructured invoice data and extracting it into a structured format.

Prepare Invoice Data:

For this example, we’ll assume you have a sample invoice available, perhaps in a text-based format like Markdown, or even as a raw text extract. Let’s imagine our invoice.md contains invoice information structured like this (or even less structured raw text):

## Invoice

**Invoice Number:** INV-2024-10-27
**Date Issued:** 2024-10-27

### Services Provided:
- Consulting Services - 10 hours
- Project Management - 5 hours

**Subtotal:** $1500.00
**Tax Rate:** 0.08
**Tax Amount:** $120.00
**Total Amount Due:** $1620.00

### Payment Instructions:
Bank Name: Example Bank
Account Number: 1234567890

Generate Pydantic Model (Optional - Using an LLM):

While you can manually define the Pydantic model for an invoice, for complex documents, you can leverage an LLM itself to help you generate the model structure! You could prompt an LLM like GPT-4 (or a similar model) with a sample invoice and ask it to “generate a Pydantic model in Python to represent this invoice data.” This can be a great starting point, which you can then refine.

Import Libraries:

from pydantic import BaseModel, Field
from typing import List, Dict
from pydantic_ai import Agent

Paste/Define the Pydantic Model:

class Invoice(BaseModel):
    invoice_number: str
    date_issued: str
    services_provided: List[str]
    subtotal: float
    tax_rate: float
    tax_amount: float
    total_amount_due: float
    payment_instructions: Dict[str, str]

Here’s our Invoice Pydantic model, designed to capture the key fields from an invoice:

  • invoice_number: Invoice identifier (string).
  • date_issued: Date of invoice issuance (string - you might want to use datetime.date with appropriate parsing for real-world applications).
  • services_provided: A list of strings describing services rendered.
  • subtotal: The subtotal amount (float).
  • tax_rate: The tax rate applied (float).
  • tax_amount: The calculated tax amount (float).
  • total_amount_due: The final total amount due (float).
  • payment_instructions: A dictionary to hold payment details (e.g., bank name, account number).
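As a sketch of the datetime.date suggestion above, the variant below swaps the string date for a real date and adds a cross-field check that the totals add up. CheckedInvoice and its totals_add_up validator are hypothetical names introduced for illustration. The code assumes Pydantic v2, and the values come from the sample invoice shown earlier.

```python
from datetime import date
from typing import Dict, List

from pydantic import BaseModel, model_validator


class CheckedInvoice(BaseModel):
    invoice_number: str
    date_issued: date  # ISO strings such as "2024-10-27" are parsed into a date
    services_provided: List[str]
    subtotal: float
    tax_rate: float
    tax_amount: float
    total_amount_due: float
    payment_instructions: Dict[str, str]

    @model_validator(mode="after")
    def totals_add_up(self) -> "CheckedInvoice":
        # Allow a one-cent tolerance for floating-point rounding.
        if abs(self.subtotal + self.tax_amount - self.total_amount_due) > 0.01:
            raise ValueError("subtotal + tax_amount does not equal total_amount_due")
        return self


invoice = CheckedInvoice.model_validate(
    {
        "invoice_number": "INV-2024-10-27",
        "date_issued": "2024-10-27",
        "services_provided": ["Consulting Services - 10 hours", "Project Management - 5 hours"],
        "subtotal": 1500.00,
        "tax_rate": 0.08,
        "tax_amount": 120.00,
        "total_amount_due": 1620.00,
        "payment_instructions": {"Bank Name": "Example Bank", "Account Number": "1234567890"},
    }
)
print(invoice.date_issued, invoice.total_amount_due)
```

A check like this catches extractions where the LLM misreads one of the amounts, since the arithmetic no longer balances.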

Create the Agent:

agent = Agent('openai:gpt-4-0613', result_type=Invoice)

We set up our agent, specifying result_type=Invoice.

Run with Textual Prompt (First Run):

result = await agent.run("Parse the following invoice and extract the details:\n\n[Invoice Text Here]")  # Replace [Invoice Text Here] with the actual invoice text
print(result)
print(result.data)

Replace [Invoice Text Here] with the actual text content of your invoice (e.g., read from invoice.md). The prompt instructs the agent to “parse the invoice”. Pydantic AI will guide the LLM to process the invoice text and attempt to extract the information into the Invoice Pydantic model structure. result.data will then contain a validated Invoice object, if successful.
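Reading the invoice from disk and building the prompt might look like the sketch below. build_invoice_prompt is a hypothetical helper, and the demo writes a one-line stand-in file into a temporary directory so the example runs on its own. In practice you would point it at your real invoice.md.

```python
import tempfile
from pathlib import Path


def build_invoice_prompt(invoice_path: Path) -> str:
    """Read the invoice file and embed its text in the extraction prompt."""
    invoice_text = invoice_path.read_text(encoding="utf-8")
    return f"Parse the following invoice and extract the details:\n\n{invoice_text}"


# Stand-in file for demonstration only; replace with the path to your invoice.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "invoice.md"
    sample_path.write_text("**Invoice Number:** INV-2024-10-27\n", encoding="utf-8")
    prompt_text = build_invoice_prompt(sample_path)

print(prompt_text)
```

The returned string can then be passed to agent.run as the prompt.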

Example 4: Resume Parsing

Let’s extend this to another common enterprise task: resume parsing.

Prepare Resume Data:

Similar to the invoice example, prepare a sample resume, perhaps in Markdown format (e.g., resume.md).

Generate Pydantic Model (Optional):

Again, consider using an LLM to assist in generating a starting Pydantic model structure based on a sample resume’s content.

Import Libraries:

from pydantic import BaseModel, Field
from typing import List, Optional
from pydantic_ai import Agent

Paste/Define the Pydantic Model:

class Experience(BaseModel):
    company: str
    position: str
    start_date: str
    end_date: str

class Education(BaseModel):
    institution_name: str
    degree: str
    start_date: str
    end_date: str

class Resume(BaseModel):
    full_name: str
    contact_information: str
    experience: List[Experience]
    education: List[Education]
    certifications: List[str]
    skills: List[str]
    summary: str = Field("")  # Optional field; defaults to an empty string

This example uses nested Pydantic models to represent the hierarchical structure of a resume:

  • Experience: Model for representing a work experience entry.
  • Education: Model for representing an education entry.
  • Resume: The main Resume model, containing fields for:
    • full_name, contact_information (strings)
    • experience: A list of Experience models.
    • education: A list of Education models.
    • certifications, skills: Lists of strings.
    • summary: An optional summary field (string, using Field("") to set a default empty string).
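The sketch below shows how nested validation behaves, using a trimmed-down version of the models and a made-up payload. It assumes Pydantic v2. Making end_date optional is a suggested variation rather than part of the original model, on the assumption that a current position has no end date.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class Experience(BaseModel):
    company: str
    position: str
    start_date: str
    end_date: Optional[str] = None  # Assumption: None means "current position"


class Resume(BaseModel):
    full_name: str
    experience: List[Experience]
    skills: List[str]
    summary: str = Field("")  # Falls back to an empty string when omitted


# Hypothetical payload, standing in for what the agent might extract.
resume = Resume.model_validate(
    {
        "full_name": "Jane Doe",
        "experience": [
            {"company": "Acme Corp", "position": "Data Engineer", "start_date": "2021-03"},
        ],
        "skills": ["Python", "SQL"],
    }
)

# Nested dictionaries are converted into Experience instances.
print(type(resume.experience[0]).__name__, repr(resume.summary))
```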

Create the Agent:

agent = Agent('openai:gpt-4-0613', result_type=Resume, system_prompt="You are an HR representative skilled at extracting information from resumes and CVs. Extract the key details into structured JSON format.")

Here, when creating the Agent, we not only set result_type=Resume but also provide a system_prompt. System prompts are crucial for guiding the LLM’s behavior. In this case, the system prompt sets the persona of the agent as an “HR representative” and instructs it to extract resume information into a “structured JSON format” (which Pydantic AI will then validate against our Resume model).

Run with Textual Prompt (First Run):

result = await agent.run("Parse the following resume:\n\n[Resume Text Here]")  # Replace [Resume Text Here] with the actual resume text
print(result)
print(result.data)

Replace [Resume Text Here] with the resume text. The agent will attempt to parse the resume and output a Resume Pydantic model instance in result.data.

Example 5: Real Estate Listing Parsing

For our final example, let’s consider parsing data from a real estate listing webpage. This demonstrates how Pydantic AI agents can interact with real-world web data.

Choose a Listing Page:

Select a real estate listing webpage from a site like homes.com, Redfin, or Zillow. You’ll need the URL of this page.

Generate Pydantic Model (Optional):

As before, consider using an LLM to help generate a Pydantic model structure that captures the key information you want to extract from real estate listings.

Import Libraries:

from pydantic import BaseModel, Field
from typing import List, Optional
from pydantic_ai import Agent
from playwright.async_api import async_playwright
from bs4 import BeautifulSoup

Here, we import additional libraries:

  • playwright.async_api: Playwright is a powerful library for browser automation, allowing us to fetch the HTML content of webpages.
  • bs4.BeautifulSoup: BeautifulSoup is a Python library for parsing HTML and XML.

Paste/Define the Pydantic Model:

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str

class PropertyFeatures(BaseModel):
    bedrooms: int
    bathrooms: float
    square_footage: int
    # Add other features as needed (e.g., lot_size, year_built, etc.)

class AdditionalInformation(BaseModel):
    price: str
    listing_agent: str
    last_updated: str

class Property(BaseModel):
    address: Address
    info: AdditionalInformation
    type: str
    mls_id: str
    features: PropertyFeatures

We define a set of nested Pydantic models to represent real estate property data:

  • Address: Model for the property address.
  • PropertyFeatures: Model for key property features (bedrooms, bathrooms, square footage, etc. - you can extend this with more features).
  • AdditionalInformation: Model for listing-specific information (price, agent, last updated date).
  • Property: The main Property model, combining Address, AdditionalInformation, PropertyFeatures, and also including type (e.g., “House”, “Condo”) and mls_id.

Create the Agent:

agent = Agent('openai:gpt-4-0613', result_type=Property, system_prompt="You are a real estate data extraction specialist. Your task is to extract structured information from real estate listings.")

We create an Agent with result_type=Property and a system prompt that sets the agent’s role as a “real estate data extraction specialist.”

Run with Textual Prompt (First Run):

async with async_playwright() as p:
    browser = await p.chromium.launch()
    page = await browser.new_page()
    listing_url = "YOUR_REAL_ESTATE_LISTING_URL_HERE" # Replace with a real listing URL
    await page.goto(listing_url)
    html_content = await page.content()
    await browser.close()

    prompt_text = f"Extract details from this real estate listing:\n\n{html_content}"
    agent_result = await agent.run(prompt_text)
    print(agent_result)
    print(agent_result.data)

In this example, we first use Playwright to:

  • Launch a Chromium browser instance.
  • Open a new page and navigate to YOUR_REAL_ESTATE_LISTING_URL_HERE (replace this with a real URL).
  • Get the HTML content of the page using page.content().
  • Close the browser.

Then, we construct a prompt_text that includes the fetched HTML content and run the agent using await agent.run(prompt_text), passing the prompt as the first positional argument. Pydantic AI will guide the LLM to parse the HTML and extract property information into the Property Pydantic model. agent_result.data will contain the structured Property object, if successful.
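The code above imports BeautifulSoup but sends the raw HTML straight to the LLM. Raw listing pages tend to be large and full of scripts and styling, so one option is to reduce the page to its visible text first. The sketch below is an assumption about how that step could look, not something the original example does. html_to_text is a hypothetical helper and the sample page is made up.

```python
from bs4 import BeautifulSoup


def html_to_text(html: str) -> str:
    """Drop non-content tags and return the visible text, one fragment per line."""
    soup = BeautifulSoup(html, "html.parser")
    # Remove tags whose contents are never shown to a reader.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()
    return soup.get_text(separator="\n", strip=True)


# Hypothetical miniature listing page, for demonstration only.
sample_html = """
<html><head><style>body { color: red; }</style></head>
<body>
  <h1>123 Main St, Springfield, IL 62701</h1>
  <script>console.log("tracking");</script>
  <p>3 beds, 2 baths, 1,500 sq ft</p>
</body></html>
"""

listing_text = html_to_text(sample_html)
print(listing_text)
```

With this in place, the prompt could embed html_to_text(html_content) instead of html_content, which keeps the prompt shorter and removes markup the model does not need.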

Key Takeaways and Further Steps

Building AI agents with Pydantic AI to output structured data opens up a world of possibilities for enterprise applications. Here are some key takeaways and directions for further exploration:

  • System Prompts are Powerful: Experiment extensively with different system prompts. Well-crafted system prompts are crucial for fine-tuning your agent’s behavior and ensuring it understands its task and output format.
  • Robust Error Handling is Essential: Implement error handling using try-except blocks. Be prepared for cases where the LLM does not produce output that conforms to your Pydantic model. Validation failures surface as exceptions you can catch and handle gracefully: pydantic.ValidationError when you validate data yourself, and, depending on the Pydantic AI version and its retry settings, an agent-level error such as UnexpectedModelBehavior once retries are exhausted.
  • Integrate into Data Pipelines: These structured-data-outputting agents are ideal for integration into larger data pipelines and workflows. Use them to process data at scale, feed structured information into databases, trigger downstream processes, and build complex AI-driven systems.
  • Model Choice Matters: The choice of the underlying LLM is critical for performance and accuracy. Thoroughly test with various models (GPT-4, Gemini Pro, Claude 3, etc.) to determine which best fulfills the specific needs of each project and use case.
  • Security Best Practices: Always prioritize security. Never hardcode API keys directly into your code! Use environment variables or secure configuration management to handle API keys and sensitive credentials.
  • Iterative Refinement is Key: The process of defining Pydantic models, crafting effective prompts, and achieving robust structured output is often iterative. Expect to refine your models, prompts, and agent logic based on the results you observe and the specific challenges of your data and tasks.
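The error-handling and security points above can be sketched as follows. parse_calculation is a hypothetical helper that validates a raw JSON payload directly with Pydantic v2, and the environment variable name OPENAI_API_KEY is an assumption based on the OpenAI model used in the examples. Failures raised by the agent itself would need their own except clause.

```python
import os
from typing import Optional

from pydantic import BaseModel, ValidationError


class Calculation(BaseModel):
    result: int


def parse_calculation(raw_json: str) -> Optional[Calculation]:
    """Return a validated Calculation, or None if the payload does not match."""
    try:
        return Calculation.model_validate_json(raw_json)
    except ValidationError as exc:
        # In a real pipeline, log exc.errors() and retry or route to manual review.
        print(f"Rejected payload: {exc.error_count()} error(s)")
        return None


# Read credentials from the environment rather than hardcoding them.
api_key_present = os.environ.get("OPENAI_API_KEY") is not None

parsed = parse_calculation('{"result": 400}')
rejected = parse_calculation('{"answer": 400}')  # Missing the required "result" field
print(parsed, rejected)
```

Returning None keeps the pipeline running on bad input, and the caller decides whether to retry, skip, or escalate.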

These examples provide a foundation for building enterprise-grade AI agents that produce reliable, structured data with Pydantic AI. Adapt the code to your own use cases, data structures, and enterprise requirements.
