How to Get Structured JSON Output from the Claude API

If you've ever tried to build a production AI application on top of Claude, you've run into this problem: Claude gives you a beautifully written paragraph when you need a clean JSON object your code can parse.

Unstructured text responses are fine for chatbots. But the moment you're building a data extraction pipeline, a classification system, or any backend integration, you need Claude to return data in a predictable, machine-readable format — every single time.

This guide covers every technique for getting structured JSON out of Claude's API, from basic prompting tricks to production-grade approaches using tool use and Pydantic schemas.

Why JSON Output From LLMs Is Hard (and Why Claude Makes It Easier)

By default, language models answer in natural language, not in data structures. Ask Claude "what is the sentiment of this review?" and it will say "The sentiment is positive, with the reviewer expressing enthusiasm about the product quality." That's great for a human but useless for a database insert.

The naive fix is to add "respond with JSON only" to your prompt. This works most of the time, but not every time. When it fails, Claude wraps the JSON in a markdown code block, adds a helpful explanation before it, or uses slightly different field names than you asked for.

For hobby projects, that's annoying. For production pipelines processing thousands of records, it's a showstopper.
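To make those failure modes concrete, here is a minimal sketch of the defensive parsing that prompt-only approaches tend to end up needing. The helper name recover_json is hypothetical, and it only covers the two cases described above (a markdown code block around the JSON, or an explanation before it); it does nothing about renamed fields.

```python
import json
import re

# Built with string multiplication so this example does not contain a literal fence
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)

def recover_json(reply: str) -> dict:
    """Best-effort extraction of the first JSON object in a free-text reply."""
    # Case 1: the JSON is wrapped in a markdown code block
    fenced = FENCE_RE.search(reply)
    candidate = fenced.group(1) if fenced else reply
    # Case 2: an explanation precedes the JSON, so skip ahead to the first "{"
    start = candidate.find("{")
    if start == -1:
        raise ValueError("no JSON object found in reply")
    # raw_decode parses one JSON value and ignores any trailing text
    obj, _ = json.JSONDecoder().raw_decode(candidate[start:])
    return obj
```

Every branch in this function is a workaround for output you did not ask for, which is the argument for the more structured methods that follow.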

Claude's API offers two practical mechanisms for structured output: tool use (function calling) and a JSON-only system prompt combined with response prefilling. Tool use is the more reliable of the two and the better default for production.

Method 1: Tool Use (The Production-Grade Approach)

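Anthropic's tool use feature was designed primarily for function calling, but it doubles as a structured-output mechanism: when you define a tool and force Claude to call it, the response arrives as a structured object shaped by your JSON schema rather than as free text. Adherence is very high in practice, but standard tool use guides the model rather than hard-guaranteeing conformance, so validate the result before you rely on it.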

Here's the core concept: you define a "fake" tool that represents the output format you want. Claude "calls" the tool with the data extracted from the input, and you capture the tool's arguments as your structured output.

Python Example: Extracting Product Data

import anthropic
import json

client = anthropic.Anthropic()

# Define the schema for your expected output
extract_product_tool = {
    "name": "extract_product",
    "description": "Extract structured product information from unstructured text",
    "input_schema": {
        "type": "object",
        "properties": {
            "product_name": {
                "type": "string",
                "description": "The name of the product"
            },
            "price": {
                "type": "number",
                "description": "Price in USD, as a float"
            },
            "category": {
                "type": "string",
                "enum": ["electronics", "clothing", "food", "other"],
                "description": "Product category"
            },
            "in_stock": {
                "type": "boolean",
                "description": "Whether the product is currently available"
            },
            "features": {
                "type": "array",
                "items": {"type": "string"},
                "description": "List of key product features"
            }
        },
        "required": ["product_name", "price", "category", "in_stock"]
    }
}

def extract_product_info(raw_text: str) -> dict:
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=1024,
        tools=[extract_product_tool],
        # Force Claude to use the tool
        tool_choice={"type": "tool", "name": "extract_product"},
        messages=[
            {
                "role": "user",
                "content": f"Extract the product information from this text:\n\n{raw_text}"
            }
        ]
    )
    
    # Locate the tool_use block by type instead of assuming its position
    tool_use_block = next(
        block for block in response.content if block.type == "tool_use"
    )
    return tool_use_block.input  # Already a dict; no json.loads() needed

# Test it
raw = """
The Sony WH-1000XM5 headphones are back in stock at $279.99. 
These wireless noise-cancelling cans feature 30-hour battery life, 
multipoint connection for two devices, and industry-leading ANC. 
They come in Black and Platinum Silver.
"""

product = extract_product_info(raw)
print(json.dumps(product, indent=2))

Output:

{
  "product_name": "Sony WH-1000XM5",
  "price": 279.99,
  "category": "electronics",
  "in_stock": true,
  "features": [
    "30-hour battery life",
    "Multipoint connection for two devices",
    "Industry-leading active noise cancellation",
    "Available in Black and Platinum Silver"
  ]
}

The key line is tool_choice={"type": "tool", "name": "extract_product"}. This tells Claude it must call that specific tool — no free-text response, no markdown wrapping, just a clean structured call.
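A cheap defensive check on the returned dict is still worth having before it reaches your database. The sketch below is a hand-rolled validator, not part of the Anthropic SDK: check_tool_input is a hypothetical helper that only understands flat "required", "type", and "enum" keywords, and a full JSON Schema validation library would cover far more.

```python
# Maps JSON Schema type names to the Python types they correspond to
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}

def check_tool_input(data: dict, input_schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks OK."""
    problems = []
    for key in input_schema.get("required", []):
        if key not in data:
            problems.append(f"missing required field: {key}")
    for key, spec in input_schema.get("properties", {}).items():
        if key not in data:
            continue
        value = data[key]
        type_name = spec.get("type")
        expected = JSON_TYPES.get(type_name) if isinstance(type_name, str) else None
        # bool is a subclass of int in Python, so reject it explicitly for "number"
        wrong_type = expected is not None and (
            not isinstance(value, expected)
            or (type_name == "number" and isinstance(value, bool))
        )
        if wrong_type:
            problems.append(f"{key}: expected {type_name}, got {type(value).__name__}")
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"{key}: {value!r} not in {spec['enum']}")
    return problems
```

With the tool defined earlier, you would call check_tool_input(product, extract_product_tool["input_schema"]) and log or retry when the list is not empty.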

TypeScript Example: Sentiment Classification

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const classifySentimentTool: Anthropic.Tool = {
  name: "classify_sentiment",
  description: "Classify the sentiment and key signals from a customer review",
  input_schema: {
    type: "object" as const,
    properties: {
      sentiment: {
        type: "string",
        enum: ["positive", "negative", "neutral", "mixed"],
      },
      confidence: {
        type: "number",
        description: "Confidence score from 0.0 to 1.0",
      },
      key_themes: {
        type: "array",
        items: { type: "string" },
        description: "Main topics mentioned in the review",
      },
      would_recommend: {
        type: "boolean",
        description: "Whether the reviewer would recommend the product",
      },
    },
    required: ["sentiment", "confidence", "key_themes", "would_recommend"],
  },
};

async function classifyReview(reviewText: string) {
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 512,
    tools: [classifySentimentTool],
    tool_choice: { type: "tool", name: "classify_sentiment" },
    messages: [
      {
        role: "user",
        content: `Classify this review: "${reviewText}"`,
      },
    ],
  });

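  // Locate the tool_use block by type instead of assuming its position
  const toolUse = response.content.find(
    (block): block is Anthropic.ToolUseBlock => block.type === "tool_use"
  );
  if (!toolUse) {
    throw new Error("Response contained no tool_use block");
  }
  return toolUse.input as {
    sentiment: string;
    confidence: number;
    key_themes: string[];
    would_recommend: boolean;
  };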
}

// Usage
const result = await classifyReview(
  "Great build quality but the software is buggy and battery life disappointed me after the first month."
);
console.log(result);
// { sentiment: 'mixed', confidence: 0.92, key_themes: ['build quality', 'software bugs', 'battery life'], would_recommend: false }

Method 2: Pydantic Integration (Python Power Users)

If you're building Python data pipelines, combining Claude's tool use with Pydantic models gives you full type safety and automatic validation. The pattern is: define a Pydantic model, convert it to a JSON schema, use it as your tool definition.

from pydantic import BaseModel, Field
from typing import Optional, List
import anthropic
import json

client = anthropic.Anthropic()

class JobPosting(BaseModel):
    job_title: str = Field(description="The job title or role name")
    company: str = Field(description="Company name")
    salary_min: Optional[float] = Field(None, description="Minimum salary in USD per year")
    salary_max: Optional[float] = Field(None, description="Maximum salary in USD per year")
    required_skills: List[str] = Field(default_factory=list, description="Required technical skills")
    experience_years: Optional[int] = Field(None, description="Minimum years of experience required")
    is_remote: bool = Field(description="Whether the role is fully remote")
    location: Optional[str] = Field(None, description="Office location if not remote")

def pydantic_to_tool(model: type[BaseModel], tool_name: str, description: str) -> dict:
    """Convert a Pydantic model to an Anthropic tool definition."""
    schema = model.model_json_schema()
    # The model's JSON schema goes under the tool's "input_schema" key
    return {
        "name": tool_name,
        "description": description,
        "input_schema": schema
    }

def extract_job_posting(raw_text: str) -> JobPosting:
    tool = pydantic_to_tool(
        JobPosting, 
        "extract_job_posting",
        "Extract structured job posting data from unstructured text"
    )
    
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=1024,
        tools=[tool],
        tool_choice={"type": "tool", "name": "extract_job_posting"},
        messages=[{"role": "user", "content": f"Extract job info:\n{raw_text}"}]
    )
    
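    # Locate the tool_use block by type instead of assuming its position
    tool_use_block = next(b for b in response.content if b.type == "tool_use")
    # model_validate raises pydantic.ValidationError if the data does not fit
    return JobPosting.model_validate(tool_use_block.input)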

# Test
job_text = """
We're hiring a Senior ML Engineer at Anthropic (San Francisco or Remote).
Competitive salary $180k-$240k DOE. You'll need 5+ years with Python, PyTorch, 
and distributed training. Experience with RLHF is a big plus.
"""

job = extract_job_posting(job_text)
print(job.model_dump_json(indent=2))
print(f"\nIs remote: {job.is_remote}")
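# Both salary fields are Optional, so guard against None before formatting
if job.salary_min is not None and job.salary_max is not None:
    print(f"Salary range: ${job.salary_min:,.0f} - ${job.salary_max:,.0f}")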

This approach validates the output at the Python layer — if Claude somehow returns a salary_min that isn't a number, Pydantic raises an error immediately rather than letting bad data propagate downstream.
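When validation does fail, a common response is to ask again a limited number of times before giving up. The sketch below shows one way to structure that, with the API call and the validator passed in as callables so the retry logic stays independent of the SDK. The name extract_with_retry is hypothetical; it relies on validation failures surfacing as ValueError, which holds for Pydantic v2's ValidationError.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

def extract_with_retry(
    call_model: Callable[[], dict],
    validate: Callable[[dict], T],
    max_attempts: int = 3,
) -> T:
    """Call the model until validate() accepts the output or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        raw = call_model()
        try:
            return validate(raw)
        except ValueError as exc:  # pydantic.ValidationError subclasses ValueError
            last_error = exc
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error
```

In the job posting example, call_model would be a small function that makes the API request and returns the tool input, and validate would be JobPosting.model_validate.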

Method 3: System Prompt JSON Mode (Simpler Cases)

For simpler use cases where you control the full prompt and don't need hard schema enforcement, a well-crafted system prompt plus prefilling the assistant response can reliably produce JSON:

import json
import anthropic

client = anthropic.Anthropic()

def ask_claude_for_json(prompt: str, schema_description: str) -> dict:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="""You are a data extraction API. You ALWAYS respond with valid JSON only.
Never include explanations, markdown formatting, or code blocks.
Return raw JSON that exactly matches the requested schema.""",
        messages=[
            {"role": "user", "content": f"{schema_description}\n\nInput: {prompt}"},
            # Prefilling the assistant turn with "{" steers Claude straight into the JSON object
            {"role": "assistant", "content": "{"}
        ]
    )
    
    # Prepend the "{" we used to prefill
    raw = "{" + response.content[0].text
    return json.loads(raw)

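The trick here is response prefilling: by starting the assistant turn with {, you prevent Claude from writing a preamble, so the reply continues as a JSON object. It does not guarantee that the completed object is valid JSON or matches your schema, so keep the json.loads() call inside error handling. Prefill support also varies by model, so confirm it is available for the model you call.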

When to use this: it works well for simple schemas and saves token overhead compared to tool use. It is risky for complex schemas or when schema compliance is critical.

When NOT to use this: high-volume production pipelines where malformed JSON would crash your application. Use tool use instead.
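If you do use prefilling, the parsing step deserves a little more care than a bare json.loads(). The sketch below is one hedged way to do it: parse_prefilled_json is a hypothetical helper that re-attaches the prefill, tolerates trailing commentary after the closing brace, and turns a truncated completion into a clear error.

```python
import json

def parse_prefilled_json(completion: str, prefill: str = "{") -> dict:
    """Rebuild and parse a JSON object from a prefilled completion."""
    text = prefill + completion
    try:
        # raw_decode stops at the end of the first JSON value, so any
        # commentary after the closing brace is ignored
        obj, _ = json.JSONDecoder().raw_decode(text)
    except json.JSONDecodeError as exc:
        # Typical cause: output cut off by max_tokens, leaving an unclosed object
        raise ValueError(f"completion is not valid JSON: {exc.msg}") from exc
    return obj
```

This only confirms that the text is valid JSON. It says nothing about whether the fields match what you asked for, which is why tool use plus validation remains the safer choice when compliance matters.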

Handling Nested Schemas and Arrays

Tool use handles nested objects and arrays natively. Here's a more complex example — extracting a multi-section document:

extract_report_tool = {
    "name": "extract_financial_report",
    "description": "Extract key metrics from a financial report",
    "input_schema": {
        "type": "object",
        "properties": {
            "company": {"type": "string"},
            "fiscal_period": {"type": "string"},
            "revenue": {
                "type": "object",
                "properties": {
                    "amount": {"type": "number"},
                    "currency": {"type": "string"},
                    "yoy_change_pct": {"type": "number"}
                },
                "required": ["amount", "currency"]
            },
            "segments": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "revenue": {"type": "number"},
                        "growth_pct": {"type": "number"}
                    },
                    "required": ["name", "revenue"]
                }
            },
            "guidance": {
                "type": "object",
                "properties": {
                    "next_quarter_low": {"type": "number"},
                    "next_quarter_high": {"type": "number"}
                }
            }
        },
        "required": ["company", "fiscal_period", "revenue"]
    }
}

Claude handles nested objects and arrays like these well. The schema is passed to Anthropic's API as part of the tool definition and steers generation; deep or intricate structures are still worth validating in your own code.
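With nested results, structural validity is only half the story: the numbers should also make sense together. The sketch below shows a simple cross-field sanity check for the report schema above. check_report and its 1% tolerance are assumptions for illustration, and real reports may legitimately have segments that do not sum to total revenue, so treat the output as warnings rather than errors.

```python
def check_report(report: dict, tolerance_pct: float = 1.0) -> list[str]:
    """Sanity-check a nested extraction result; returns a list of warnings."""
    warnings = []
    total = report["revenue"]["amount"]      # required by the schema
    segments = report.get("segments", [])    # optional in the schema
    if segments and total:
        segment_sum = sum(segment["revenue"] for segment in segments)
        drift_pct = abs(segment_sum - total) / abs(total) * 100
        if drift_pct > tolerance_pct:
            warnings.append(
                f"segments sum to {segment_sum}, but total revenue is {total}"
            )
    guidance = report.get("guidance") or {}
    low = guidance.get("next_quarter_low")
    high = guidance.get("next_quarter_high")
    if low is not None and high is not None and low > high:
        warnings.append("guidance low exceeds guidance high")
    return warnings
```

Checks like this catch extraction mistakes that a schema cannot express, such as a segment figure reported in different units from the total.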

Batch Processing at Scale

When processing large datasets, use Anthropic's Message Batches API to process requests asynchronously, ease rate-limit pressure, and reduce costs by up to 50%:

import anthropic
from anthropic.types.message_create_params import MessageCreateParamsNonStreaming
from anthropic.types.messages.batch_create_params import Request

client = anthropic.Anthropic()

def build_extraction_request(record_id: str, text: str) -> Request:
    return {
        "custom_id": record_id,
        "params": MessageCreateParamsNonStreaming(
            model="claude-sonnet-4-6",  # Use Sonnet for batch — cheaper
            max_tokens=512,
            tools=[extract_product_tool],
            tool_choice={"type": "tool", "name": "extract_product"},
            messages=[{"role": "user", "content": f"Extract product info: {text}"}]
        )
    }

# Build the batch (Anthropic documents a cap of 100,000 requests or 256 MB per batch)
records = [
    ("prod_001", "Nike Air Max 90 in White, $120, in stock..."),
    ("prod_002", "Adidas Ultraboost 22, $180, limited sizes..."),
    # ... thousands more
]

# Avoid shadowing the built-in id() by naming the loop variable record_id
batch_requests = [build_extraction_request(record_id, text) for record_id, text in records]

# Submit batch
batch = client.messages.batches.create(requests=batch_requests)
print(f"Batch submitted: {batch.id}")
# Poll batch.processing_status until 'ended'

Batch processing is the right tool when you're running structured extraction on large corpora — product catalogs, document archives, review datasets.
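Two bits of housekeeping help before submitting a large job: splitting the request list into batches that stay under the size limit, and confirming that every custom_id is unique, since results are matched back to records by that ID. The helpers below are hypothetical and use only the standard library; the chunk size you pass is whatever limit applies to your account.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator

def chunked(requests: Iterable[dict], size: int) -> Iterator[list]:
    """Yield successive lists of at most `size` requests."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(requests)
    # islice pulls the next `size` items; an empty list ends the loop
    while chunk := list(islice(iterator, size)):
        yield chunk

def duplicate_ids(requests: Iterable[dict]) -> list:
    """Return the custom_id values that appear more than once."""
    counts = Counter(request["custom_id"] for request in requests)
    return [custom_id for custom_id, n in counts.items() if n > 1]
```

You would then loop over chunked(batch_requests, size) and call client.messages.batches.create(requests=chunk) once per chunk, keeping each returned batch ID for polling.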

Choosing the Right Model

Use Case	Recommended Model	Reason
Complex multi-field extraction	claude-opus-4-6	Best schema adherence on ambiguous data
High-volume classification	claude-sonnet-4-6	Lower cost per token; accurate on clear, unambiguous schemas
Simple key-value extraction	claude-haiku-4-5	Fastest and cheapest for trivial schemas
Batch processing at scale	claude-sonnet-4-6	Cost-performance sweet spot

For schemas with more than 10 fields or conditional logic, test on Opus first before switching to Sonnet for cost optimization.

Common Pitfalls and How to Avoid Them

1. Optional fields causing missing keys

If you leave fields out of required but still expect them in every response, Claude may omit them when it can't find the data. Either mark them required and allow null in the field's type so that Claude can return an explicit null, or handle missing keys explicitly in your code.
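Both options can be sketched briefly. The schema fragment below uses the standard JSON Schema form for a nullable number, and fill_missing is a hypothetical helper for the second option that guarantees every expected key exists in the result.

```python
# Assumed schema fragment: the field is required but may be null,
# so "not found" becomes an explicit null instead of a missing key.
NULLABLE_PRICE = {
    "type": ["number", "null"],
    "description": "Price in USD, or null if the text does not state one",
}

def fill_missing(data: dict, expected_keys: list[str]) -> dict:
    """Return a copy of data in which every expected key exists (None if omitted)."""
    filled = dict.fromkeys(expected_keys)  # every expected key starts as None
    filled.update(data)                    # extracted values overwrite the defaults
    return filled
```

Downstream code can then read filled["features"] without a KeyError and treat None as "not present in the source text".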

2. Enum mismatches

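If the source text says "positive" but your enum only allows "POSITIVE", Claude may echo the casing from the text instead of the enum value. Keep enum values in one consistent case and normalize after extraction, for example with result["sentiment"].lower() when your enum is lowercase.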
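A slightly more general version of that normalization maps whatever came back onto the allowed list and fails loudly when nothing matches. normalize_enum is a hypothetical helper, shown as a sketch:

```python
def normalize_enum(value: str, allowed: list[str]) -> str:
    """Map value onto the allowed list, ignoring case and surrounding whitespace."""
    lookup = {option.casefold(): option for option in allowed}
    try:
        return lookup[value.strip().casefold()]
    except KeyError:
        raise ValueError(f"{value!r} is not one of {allowed}") from None
```

Because it returns the canonical spelling from your own list, it works the same whether your enum is uppercase, lowercase, or mixed.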

3. Number strings vs numbers

Prices like "$29.99" sometimes come back as the string "29.99" instead of the float 29.99. Add a Pydantic validator or coerce types after extraction.
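A small coercion function handles the common cases. The sketch below assumes US-style formatting (comma as thousands separator, period as decimal point); coerce_price is a hypothetical name, and other locales would need different handling.

```python
import re

def coerce_price(value) -> float:
    """Accept 29.99, "29.99", "$29.99" or "$1,299.00" and return a float."""
    if isinstance(value, bool):
        raise TypeError("booleans are not prices")
    if isinstance(value, (int, float)):
        return float(value)
    # Drop currency symbols, thousands separators and any other stray characters
    cleaned = re.sub(r"[^\d.\-]", "", value)
    return float(cleaned)  # raises ValueError if nothing numeric is left
```

The same function can be wired into a Pydantic field validator if you are using the Pydantic pattern described earlier.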

4. Schema too large

Schemas with 30+ fields consume significant token budget and can degrade extraction accuracy. Break large schemas into focused sub-extractions and join the results.
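One way to break up a large schema is to group its properties by theme, generate a smaller schema per group, run one extraction per group, and merge the resulting dicts. The helpers below are a hypothetical sketch of the splitting and merging steps only; the grouping itself is a design decision you make by hand.

```python
def split_schema(input_schema: dict, groups: dict[str, list[str]]) -> dict[str, dict]:
    """Build one smaller object schema per named group of property names."""
    properties = input_schema["properties"]
    required = set(input_schema.get("required", []))
    return {
        group: {
            "type": "object",
            "properties": {name: properties[name] for name in names},
            # Carry over "required" only for the fields in this group
            "required": [name for name in names if name in required],
        }
        for group, names in groups.items()
    }

def merge_results(partials: list[dict]) -> dict:
    """Combine per-group extraction results into one record."""
    merged = {}
    for partial in partials:
        merged.update(partial)
    return merged
```

Each sub-schema becomes the input_schema of its own tool, so every call sees a short, focused definition.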

5. Not using tool_choice: {type: "tool"}

If you provide tools but don't force their use with tool_choice, Claude may decide to answer in free text. Always set tool_choice when structured output is required.
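It also helps to fail loudly when a response does not contain the tool call you expected, rather than indexing into the content list and hoping. The guard below is a sketch: get_tool_input is a hypothetical helper that reads the stop_reason and content attributes of a response, and the SimpleNamespace object is only a stand-in for a real SDK response so the example runs without an API key.

```python
from types import SimpleNamespace

def get_tool_input(response, tool_name: str) -> dict:
    """Return the input of the named tool call, or raise if there is none."""
    if response.stop_reason == "max_tokens":
        raise RuntimeError("response hit max_tokens; tool input may be incomplete")
    for block in response.content:
        if block.type == "tool_use" and block.name == tool_name:
            return block.input
    raise RuntimeError(
        f"no {tool_name!r} tool call; stop_reason={response.stop_reason}"
    )

# Stand-in for a free-text reply, which is what you may get without tool_choice
free_text_reply = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="The sentiment is positive.")],
)
```

Calling get_tool_input(free_text_reply, "classify_sentiment") raises a RuntimeError, which is easier to debug than an AttributeError on a text block.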

Key Takeaways

Tool use is the production standard: forced tool calls return schema-shaped data with no parsing hacks, and a validation step catches the rare miss
Pydantic + tool use is the cleanest pattern for Python data pipelines, giving you type safety on top of schema enforcement
Response prefilling ({"role": "assistant", "content": "{"}) is a lightweight fallback for simple schemas
tool_choice: {type: "tool", name: "..."} is mandatory — without it, Claude may answer in free text
Use Sonnet for volume, Opus for complex extraction, and check Anthropic's current pricing for the exact cost difference

Next Steps

Getting reliable structured output from Claude is the foundation for building serious AI-powered applications: data pipelines, document processing systems, classification engines, and more.

If you're preparing for the Claude Certified Architect (CCA-F) exam, structured output patterns appear in the Integration Patterns and Agentic Design sections. Knowing when to use tool use vs. response prefilling vs. prompt-only approaches is a core exam topic.

Practice your Claude API skills:

Take a free CCA-F practice quiz on AI for Anything — covers tool use, multi-turn conversations, and production patterns
Read our Claude API Production Best Practices guide — rate limiting, error handling, and cost optimization

For more advanced patterns, the Anthropic documentation on tool use is the authoritative reference.
