Claude API with Python: Complete Tutorial with Real-World Examples

If you've been using Claude through the web interface and want to integrate it into your own applications, Python is the fastest path forward. The Anthropic Python SDK is well-documented, actively maintained, and takes minutes to set up — but most tutorials stop at "hello world."

This guide goes further. You'll learn streaming, multi-turn conversations, tool use (function calling), error handling, and cost-efficient patterns used in production apps. By the end, you'll have everything you need to build real Claude-powered features.

What You'll Need

Before writing a single line of code:

An Anthropic API key — get one at console.anthropic.com

Python 3.8+ installed on your machine

Basic Python familiarity — this is a tutorial, not an intro to Python

Set your API key as an environment variable (never hardcode it):

export ANTHROPIC_API_KEY="sk-ant-your-key-here"

Or if you're using a .env file with python-dotenv:

pip install python-dotenv anthropic

from dotenv import load_dotenv
load_dotenv()
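
A missing key can otherwise go unnoticed until a request fails. As a minimal sketch, a startup check could fail fast instead. `require_api_key` is a hypothetical helper name used here for illustration, not part of the SDK:

```python
import os


def require_api_key(var_name: str = "ANTHROPIC_API_KEY") -> str:
    """Return the API key from the environment or fail with a clear message.

    Hypothetical helper for illustration; the SDK reads the variable itself.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it or add it to your .env file."
        )
    return key
```

Call it once at startup, after `load_dotenv()`, so configuration problems show up before any user traffic does.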

Setting Up the Anthropic Python SDK

Install the official SDK:

pip install anthropic

That's it. No heavy dependencies, no complex configuration. The SDK handles authentication automatically by reading ANTHROPIC_API_KEY from your environment.

Your First API Call

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain what an API is in two sentences."}
    ]
)

print(message.content[0].text)

Output:

An API (Application Programming Interface) is a set of rules and protocols 
that allows different software applications to communicate with each other. 
It acts as a contract between two systems, defining how requests should be 
made and what kind of responses to expect.

Understanding the Response Object

The message object contains more than just the text:

print(message.model)          # "claude-sonnet-4-6"
print(message.stop_reason)    # "end_turn"
print(message.usage.input_tokens)   # tokens you sent
print(message.usage.output_tokens)  # tokens in the response

Track usage carefully — it's how you calculate your API costs.
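
To make the cost calculation concrete, here is a small sketch that turns the two usage counters into a dollar estimate. The `Pricing` class, the `estimate_cost` function, and the rates are all assumptions for illustration; check the current pricing page for the real numbers for your model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per million tokens. The numbers used below are placeholders."""
    input_per_mtok: float
    output_per_mtok: float


def estimate_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Convert token counts from message.usage into an estimated USD cost."""
    return (
        input_tokens / 1_000_000 * pricing.input_per_mtok
        + output_tokens / 1_000_000 * pricing.output_per_mtok
    )


# Placeholder rates for illustration only, not authoritative pricing.
example_pricing = Pricing(input_per_mtok=3.00, output_per_mtok=15.00)

# With a real response you would pass message.usage.input_tokens
# and message.usage.output_tokens instead of these sample counts.
cost = estimate_cost(input_tokens=1_200, output_tokens=350, pricing=example_pricing)
print(f"Estimated cost: ${cost:.6f}")
```

Logging this per request makes it easy to spot which features or prompts drive spend.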

System Prompts: Shaping Claude's Behavior

A system prompt is the most direct way to control how Claude responds. It sets context, persona, and constraints for the whole conversation, provided you send it with every request.

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    system="You are a senior Python engineer reviewing code for production readiness. \
Be direct, specific, and prioritize security and performance issues first.",
    messages=[
        {"role": "user", "content": "Review this: x = input('Enter password: ')"}
    ]
)

print(message.content[0].text)

Good system prompts are:

Specific about role and expertise level — not just "you are a helpful assistant"
Clear about output format — "respond in bullet points", "use markdown headers"
Bounded in scope — tell Claude what to focus on and what to ignore
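
One way to keep those three properties from being forgotten is to assemble the prompt from explicit parts. This is only a sketch: `build_system_prompt` and its parameter names are hypothetical, and the wording of the template is one choice among many.

```python
from typing import Sequence


def build_system_prompt(
    role: str,
    output_format: str,
    focus: Sequence[str],
    ignore: Sequence[str] = (),
) -> str:
    """Assemble a system prompt that states role, output format, and scope."""
    lines = [
        f"You are {role}.",
        f"Output format: {output_format}.",
        "Focus on: " + "; ".join(focus) + ".",
    ]
    if ignore:
        # Spell out what is out of scope so the reply stays bounded.
        lines.append("Do not comment on: " + "; ".join(ignore) + ".")
    return "\n".join(lines)


prompt = build_system_prompt(
    role="a senior Python engineer reviewing code for production readiness",
    output_format="bullet points, most severe issue first",
    focus=["security", "performance"],
    ignore=["code style"],
)
print(prompt)
```

The resulting string can be passed as the `system` argument exactly like the hand-written prompt above.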

Multi-Turn Conversations

Unlike single-shot queries, real applications need conversational memory. The API is stateless, so you manage this yourself by building the messages array:

import anthropic

client = anthropic.Anthropic()

def chat(conversation_history, user_message):
    """Send a message and get a response, maintaining conversation history."""
    conversation_history.append({
        "role": "user",
        "content": user_message
    })
    
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="You are a Python tutor helping beginners learn programming.",
        messages=conversation_history
    )
    
    assistant_message = response.content[0].text
    conversation_history.append({
        "role": "assistant",
        "content": assistant_message
    })
    
    return assistant_message, conversation_history

# Usage
history = []
reply, history = chat(history, "What is a list in Python?")
print(reply)

reply, history = chat(history, "How is it different from a tuple?")
print(reply)  # Claude remembers the previous context

Key pattern: You own the conversation history. Pass it with every request. This gives you full control over context window usage — you can summarize old messages, drop irrelevant turns, or persist history to a database.
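
As a sketch of the "drop old turns" option, the helper below keeps only the most recent messages. `trim_history` and `max_messages` are hypothetical names, and the approach assumes plain-text history like the one `chat()` builds; conversations containing tool calls would need trimming that keeps each tool call paired with its result.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(history: List[Message], max_messages: int = 20) -> List[Message]:
    """Return the most recent messages, starting on a user turn."""
    if max_messages <= 0:
        return []
    recent = history[-max_messages:]
    # Slicing can leave an assistant reply at the front; drop it so the
    # conversation still opens with a user message, as in the examples above.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent


history = [
    {"role": "user", "content": "What is a list in Python?"},
    {"role": "assistant", "content": "A list is an ordered, mutable collection."},
    {"role": "user", "content": "How is it different from a tuple?"},
    {"role": "assistant", "content": "Tuples are immutable."},
    {"role": "user", "content": "When should I use each?"},
]
print(len(trim_history(history, max_messages=3)))
```

Trimming trades memory for cost: Claude no longer sees the dropped turns, so summarizing them into a single message is worth considering when older context still matters.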

Streaming Responses

For any user-facing app, streaming is essential. Users see text as it is generated instead of waiting for the full reply, so responses feel instant.

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Write a Python function to parse CSV files with error handling."}
    ]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    
    # Get final message after streaming completes
    final_message = stream.get_final_message()
    print(f"\n\nTokens used: {final_message.usage.input_tokens} in / {final_message.usage.output_tokens} out")

Async Streaming (FastAPI / async apps)

If you're building a web API, use the async client:

import asyncio
import anthropic

async def stream_response(user_prompt: str):
    client = anthropic.AsyncAnthropic()
    
    async with client.messages.stream(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": user_prompt}]
    ) as stream:
        async for text in stream.text_stream:
            yield text  # yield to your FastAPI streaming response

# In a FastAPI endpoint:
# from fastapi.responses import StreamingResponse
# return StreamingResponse(stream_response(prompt), media_type="text/plain")
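
An async generator like `stream_response` can be consumed anywhere with `async for`, which is handy for tests or for logging the full reply. The sketch below uses a hypothetical `fake_stream` stand-in so it runs without an API key; `collect` is likewise an illustrative name.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_stream(chunks: List[str]) -> AsyncIterator[str]:
    """Stand-in for stream_response(): yields text chunks without calling the API."""
    for chunk in chunks:
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield chunk


async def collect(stream: AsyncIterator[str]) -> str:
    """Consume an async text stream and join the chunks into one string."""
    parts = []
    async for text in stream:
        parts.append(text)
    return "".join(parts)


full_text = asyncio.run(collect(fake_stream(["Hello", ", ", "world"])))
print(full_text)
```

Swapping `fake_stream(...)` for `stream_response(prompt)` gives the same result against the real API.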

Tool Use (Function Calling)

Tool use lets Claude call functions you define — the foundation for building agents, data pipelines, and automated workflows.

Here's a practical example: a weather assistant that can call a weather API.

import anthropic
import json

client = anthropic.Anthropic()

# Define the tools Claude can use
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city. Returns temperature, conditions, and humidity.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city name, e.g. 'San Francisco'"
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature units"
                }
            },
            "required": ["city"]
        }
    }
]

def get_weather(city: str, units: str = "celsius") -> dict:
    """Simulate a weather API call."""
    # In production, call a real weather API here
    return {
        "city": city,
        "temperature": 22,
        "units": units,
        "conditions": "Partly cloudy",
        "humidity": 65
    }

def run_weather_agent(user_message: str):
    messages = [{"role": "user", "content": user_message}]
    
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )
        
        # If Claude wants to use a tool
        if response.stop_reason == "tool_use":
            # Add Claude's response to history
            messages.append({"role": "assistant", "content": response.content})
            
            # Process each tool call
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    # Execute the function
                    result = get_weather(**block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": json.dumps(result)
                    })
            
            # Add tool results to history
            messages.append({"role": "user", "content": tool_results})
        
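        # Claude has finished
        elif response.stop_reason == "end_turn":
            return response.content[0].text
        else:
            # Any other stop reason (e.g. "max_tokens"): fail loudly instead of returning None
            raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")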

# Usage
answer = run_weather_agent("What's the weather like in Tokyo and should I bring an umbrella?")
print(answer)

This pattern — loop until end_turn, execute tools when stop_reason == "tool_use" — is the backbone of every Claude agent.
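
The loop above sends every tool call to `get_weather`, which only works while there is a single tool. A possible extension, sketched here under assumptions, is to look up the handler by `block.name` and report failures back to Claude through the `is_error` field of the tool result. `TOOL_REGISTRY` and `execute_tool_call` are hypothetical names, and the `SimpleNamespace` object merely stands in for a real `tool_use` block.

```python
import json
from types import SimpleNamespace
from typing import Any, Callable, Dict


def get_weather(city: str, units: str = "celsius") -> dict:
    """Same simulated weather lookup as in the example above."""
    return {"city": city, "temperature": 22, "units": units}


# Map tool names (as declared in the tools list) to Python callables.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def execute_tool_call(block: Any) -> dict:
    """Run one tool_use block and return a tool_result block for the next request."""
    result = {"type": "tool_result", "tool_use_id": block.id}
    handler = TOOL_REGISTRY.get(block.name)
    if handler is None:
        result.update(content=f"Unknown tool: {block.name}", is_error=True)
        return result
    try:
        result["content"] = json.dumps(handler(**block.input))
    except Exception as exc:
        # Report the failure to Claude instead of crashing the agent loop.
        result.update(content=f"{type(exc).__name__}: {exc}", is_error=True)
    return result


# Stand-in for a tool_use block from response.content (illustration only).
fake_block = SimpleNamespace(
    type="tool_use", id="toolu_demo", name="get_weather", input={"city": "Tokyo"}
)
print(execute_tool_call(fake_block))
```

Inside the agent loop, the body of the `if block.type == "tool_use":` branch would then reduce to `tool_results.append(execute_tool_call(block))`.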

Real-World Project: Document Summarizer

Let's build something practical: a script that reads a text file and generates a structured summary with key points, action items, and a TL;DR.

import anthropic
from pathlib import Path

client = anthropic.Anthropic()

SUMMARIZER_SYSTEM = """You are a document analyst. When given a document, respond with:

## TL;DR
[2-3 sentence summary]

## Key Points
- [Point 1]
- [Point 2]
- [Point 3]

## Action Items
- [Actionable item if any, otherwise "None identified"]

## Sentiment
[Positive/Neutral/Negative and why in one sentence]

Always use this exact structure. Be concise."""

def summarize_document(file_path: str) -> dict:
    """Summarize a text document using Claude."""
    path = Path(file_path)
    
    if not path.exists():
        raise FileNotFoundError(f"File not found: {file_path}")
    
    content = path.read_text(encoding="utf-8")
    
    # Trim if document is too long (rough token estimate: 1 token ≈ 4 chars)
    max_chars = 180_000  # ~45K tokens, safe for claude-sonnet-4-6
    if len(content) > max_chars:
        content = content[:max_chars] + "\n\n[Document truncated for length]"
    
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system=SUMMARIZER_SYSTEM,
        messages=[
            {
                "role": "user",
                "content": f"Please summarize this document:\n\n{content}"
            }
        ]
    )
    
    return {
        "summary": response.content[0].text,
        "input_tokens": response.usage.input_tokens,
        "output_tokens": response.usage.output_tokens,
        "file": path.name
    }

# Usage
result = summarize_document("meeting_notes.txt")
print(result["summary"])
print(f"\nTokens: {result['input_tokens']} in / {result['output_tokens']} out")
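
Because the system prompt fixes the section headings, the returned markdown can be split into fields for storage or display. The parser below is a sketch that assumes Claude followed the `## Heading` structure; `parse_summary` is a hypothetical helper and the sample text is made up for illustration.

```python
from typing import Dict, List


def parse_summary(markdown_text: str) -> Dict[str, str]:
    """Split a '## Heading' formatted summary into {heading: body} pairs."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in markdown_text.splitlines():
        if line.startswith("## "):
            # A new section begins; later lines belong to it.
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


# Made-up summary text in the format requested by SUMMARIZER_SYSTEM.
sample = """## TL;DR
The team agreed on a launch date.

## Key Points
- Launch is set for March
- Budget approved

## Action Items
- None identified

## Sentiment
Positive: the plan was approved."""

parsed = parse_summary(sample)
print(parsed["Key Points"])
```

In real use, check that the expected headings are present before relying on them, since the model may occasionally deviate from the requested format.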

Error Handling in Production

The API can fail. Your application needs to handle it gracefully.

import anthropic
import time

client = anthropic.Anthropic()

def resilient_completion(prompt: str, retries: int = 3) -> str:
    """API call with retry logic for rate limits and transient errors."""
    
    for attempt in range(retries):
        try:
            response = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=512,
                messages=[{"role": "user", "content": prompt}]
            )
            return response.content[0].text
        
        except anthropic.RateLimitError:
            if attempt < retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff: 1s, then 2s (4s, 8s, ... with more retries)
                print(f"Rate limited. Waiting {wait_time}s before retry {attempt + 1}...")
                time.sleep(wait_time)
            else:
                raise
        
        except anthropic.APITimeoutError:
            if attempt < retries - 1:
                print(f"Timeout on attempt {attempt + 1}. Retrying...")
                time.sleep(1)
            else:
                raise
        
        except anthropic.AuthenticationError:
            raise  # Don't retry auth errors — the key is wrong
        
        except anthropic.BadRequestError as e:
            print(f"Bad request: {e}")
            raise  # Don't retry bad requests — fix the input

Common errors you'll encounter:

Error	Cause	Fix
`AuthenticationError`	Invalid API key	Check `ANTHROPIC_API_KEY` env var
`RateLimitError`	Too many requests	Exponential backoff + retry
`APITimeoutError`	Request took too long	Lower `max_tokens`, retry
`BadRequestError`	Invalid message format	Check message structure
`OverloadedError`	Anthropic servers busy	Retry with backoff
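
The "exponential backoff + retry" fix in the table can be factored into a reusable wrapper. The sketch below adds random jitter so that many clients do not retry at the same instant; `retry_with_backoff` and its parameters are hypothetical names, and the `sleep` argument exists only to make the function easy to test.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    retryable: Tuple[Type[BaseException], ...],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying retryable exceptions with exponential backoff plus jitter."""
    for attempt in range(retries):
        try:
            return func()
        except retryable:
            if attempt == retries - 1:
                raise  # out of attempts: surface the original error
            # The base delay doubles each attempt; jitter spreads out simultaneous retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise ValueError("retries must be at least 1")
```

With the SDK, `retryable` would be something like `(anthropic.RateLimitError, anthropic.APITimeoutError)` and `func` a lambda wrapping `client.messages.create(...)`. Note that the SDK client also retries some failures on its own (configurable through its `max_retries` option), so a wrapper like this stacks on top of that behavior.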

Cost Optimization Patterns

Claude is priced per token. At scale, these patterns matter:

1. Cache system prompts with Prompt Caching

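If you send the same long system prompt repeatedly, enable caching: cached tokens that are read back on later calls are billed at a steep discount (up to 90% off the normal input price), while the first call that writes the cache costs slightly more than usual.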

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "Your very long system prompt here...",
            "cache_control": {"type": "ephemeral"}  # Cache this block
        }
    ],
    messages=[{"role": "user", "content": user_message}]
)
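
To confirm that caching is actually taking effect, inspect the cache counters on the response's usage object. The sketch below computes the share of input tokens served from cache. `cache_hit_ratio` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins with made-up numbers rather than real API responses.

```python
from types import SimpleNamespace
from typing import Any


def cache_hit_ratio(usage: Any) -> float:
    """Fraction of input tokens served from cache, based on a response's usage object."""
    read = getattr(usage, "cache_read_input_tokens", 0) or 0
    written = getattr(usage, "cache_creation_input_tokens", 0) or 0
    uncached = getattr(usage, "input_tokens", 0) or 0
    total = read + written + uncached
    return read / total if total else 0.0


# Stand-in usage objects: the first call writes the cache, the second reads it.
first_call = SimpleNamespace(
    input_tokens=40, cache_creation_input_tokens=2000, cache_read_input_tokens=0
)
second_call = SimpleNamespace(
    input_tokens=40, cache_creation_input_tokens=0, cache_read_input_tokens=2000
)
print(f"{cache_hit_ratio(first_call):.0%} then {cache_hit_ratio(second_call):.0%}")
```

A ratio that stays at zero across repeated calls usually means the cached block is changing between requests or is too short to qualify for caching.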

2. Use the right model for the task

Task	Recommended Model	Why
Simple classification, extraction	`claude-haiku-4-5`	Lowest cost; roughly 3x cheaper than Sonnet at list prices
General development, writing	`claude-sonnet-4-6`	Best price/performance
Complex reasoning, architecture	`claude-opus-4-6`	Max capability
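
In code, the table above can become a simple lookup so that each call site names a task type instead of a model. This is a sketch: the task labels, `MODEL_BY_TASK`, and `pick_model` are hypothetical, while the model names are the ones used in this tutorial.

```python
# Hypothetical task labels mapped to the models from the table above.
MODEL_BY_TASK = {
    "classification": "claude-haiku-4-5",
    "extraction": "claude-haiku-4-5",
    "development": "claude-sonnet-4-6",
    "writing": "claude-sonnet-4-6",
    "reasoning": "claude-opus-4-6",
    "architecture": "claude-opus-4-6",
}
DEFAULT_MODEL = "claude-sonnet-4-6"


def pick_model(task_type: str) -> str:
    """Return the model for a task type, falling back to the general-purpose default."""
    return MODEL_BY_TASK.get(task_type.strip().lower(), DEFAULT_MODEL)


print(pick_model("Extraction"))
print(pick_model("something else"))
```

Keeping the mapping in one place also makes it a one-line change to move a task to a cheaper or stronger model after measuring quality.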

3. Set realistic max_tokens

max_tokens is a ceiling, not a target. Setting it to 4096 when you need 200 words costs nothing extra, because you're billed only for the tokens actually generated. A well-calibrated ceiling still matters: it prevents runaway generations.
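
A ceiling that is too low cuts replies off mid-sentence, and the response reports this through its stop reason. The check below is a minimal sketch. `hit_token_ceiling` is a hypothetical helper, and the `SimpleNamespace` objects stand in for real responses.

```python
from types import SimpleNamespace
from typing import Any


def hit_token_ceiling(response: Any) -> bool:
    """True when generation stopped because max_tokens was reached."""
    return getattr(response, "stop_reason", None) == "max_tokens"


# Stand-in responses for illustration; real ones come from client.messages.create().
complete = SimpleNamespace(stop_reason="end_turn")
truncated = SimpleNamespace(stop_reason="max_tokens")

for name, response in [("complete", complete), ("truncated", truncated)]:
    if hit_token_ceiling(response):
        print(f"{name}: reply was cut off, consider raising max_tokens")
    else:
        print(f"{name}: reply finished normally")
```

Counting how often this fires in production is a practical way to calibrate the ceiling for each feature.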

Key Takeaways

Install with pip install anthropic — the SDK handles auth automatically from your environment variable
System prompts are your primary control lever — invest time in writing good ones
Manage conversation history yourself — pass the full messages array every time
Use streaming for any UI — it transforms the user experience
Tool use is the gateway to agents — loop until stop_reason == "end_turn", execute tools in between
Handle errors with exponential backoff — especially RateLimitError and APITimeoutError
Cache repeated system prompts — saves 80-90% on those tokens at scale

Next Steps

If you want to go deeper on the Claude API, check out these resources on AI for Anything:

Claude API Prompt Caching Guide — cut costs on repeated context
Claude Tool Use & Function Calling Tutorial — build production agents
How to Build a Chatbot with Claude API — full chatbot walkthrough


Claude API with Python: Complete Tutorial with Real-World Examples (2026)