# How to Build a Claude-Powered CLI Tool in Python

Most Claude API tutorials stop at "send a message, get a reply." That's fine for demos — but if you want to actually ship something useful, you need to know how to build a proper command-line tool: one that streams responses, remembers conversation context, calls external tools, reads from files, and installs cleanly with pip install.

This tutorial walks you through building a Claude-powered CLI from scratch — a developer assistant called ask that you can invoke from your terminal. By the end, you'll have a fully functional tool you can package and distribute.

What you'll build: A CLI tool called ask that:

- Streams Claude responses token-by-token
- Maintains multi-turn conversation context within a session
- Reads files from disk (so you can run `ask -f main.py "what does this do?"`)
- Supports system prompt configuration via a config file
- Can be installed globally with `pip install`

## Prerequisites and Setup

You need Python 3.10+, an Anthropic API key, and basic familiarity with Python.

```bash
# Create a new project
mkdir claude-cli && cd claude-cli
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate

# Install dependencies
pip install anthropic click rich python-dotenv
```

- `anthropic`: official Python SDK for the Claude API
- `click`: CLI framework that handles argument parsing and subcommands
- `rich`: terminal formatting (markdown rendering, spinners)
- `python-dotenv`: loads the API key from a `.env` file

Create a .env file:

ANTHROPIC_API_KEY=sk-ant-...
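
If the key is missing, reading `os.environ["ANTHROPIC_API_KEY"]` fails with a bare `KeyError`. One possible guard, as a minimal sketch: `get_api_key` and its error wording are hypothetical, not part of the Anthropic SDK.

```python
import os


def get_api_key(env=None) -> str:
    """Return the Anthropic API key or raise a readable error.

    Hypothetical helper for this tutorial; `env` defaults to os.environ
    and is a parameter only so the function can be tried in isolation.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set. Add it to .env or export it in your shell."
        )
    return key
```

Call it after `load_dotenv()` so values from `.env` are already in the environment.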

## Project Structure

```
claude-cli/
├── ask/
│   ├── __init__.py
│   ├── cli.py          # Main CLI entry point
│   ├── client.py       # Anthropic client wrapper
│   └── config.py       # Config file handling
├── pyproject.toml
└── .env
```

This structure keeps things clean and makes packaging straightforward later.

## Building the Core Client

Start with ask/client.py — the wrapper around the Anthropic SDK that handles streaming and conversation history:

```python
# ask/client.py
import os
from anthropic import Anthropic
from dotenv import load_dotenv

load_dotenv()

DEFAULT_MODEL = "claude-sonnet-4-6"
DEFAULT_SYSTEM = "You are a helpful developer assistant. Be concise and practical."

class ClaudeClient:
    def __init__(self, system_prompt: str = DEFAULT_SYSTEM):
        self.client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
        self.system = system_prompt
        self.history: list[dict] = []

    def ask(self, user_message: str) -> str:
        """Send a message and stream the response, maintaining history."""
        self.history.append({"role": "user", "content": user_message})

        full_response = ""

        with self.client.messages.stream(
            model=DEFAULT_MODEL,
            max_tokens=4096,
            system=self.system,
            messages=self.history,
        ) as stream:
            for text in stream.text_stream:
                print(text, end="", flush=True)
                full_response += text

        print()  # newline after stream ends
        self.history.append({"role": "assistant", "content": full_response})
        return full_response

    def clear_history(self):
        """Reset conversation context."""
        self.history = []
```

Key details here:

- `stream.text_stream` yields text chunks as they arrive, which gives you fast incremental output instead of waiting for the full response
- `self.history` accumulates turns so Claude has full context within a session
- The system prompt sets Claude's behavior for every turn in the session
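
Because `self.history` grows with every turn, a long chat session sends more tokens on each request. One way to bound that is sketched here, assuming plain text turns and that dropping the oldest turns is acceptable. `trim_history` and `max_turns` are hypothetical names, not part of the SDK.

```python
from __future__ import annotations


def trim_history(history: list[dict], max_turns: int = 10) -> list[dict]:
    """Keep at most the last `max_turns` user/assistant pairs.

    Assumes history alternates user, assistant, user, ... as in ClaudeClient.
    """
    if max_turns <= 0:
        return []
    trimmed = history[-2 * max_turns:]
    # The conversation sent to the API must open with a user message
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed
```

A session that uses tool calls would need extra care here, since a `tool_result` message must stay paired with the `tool_use` turn that requested it.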

## The CLI Entry Point

Now build the actual CLI in ask/cli.py:

```python
# ask/cli.py
import sys
from pathlib import Path

import click
from rich.console import Console

from .client import ClaudeClient
from .config import load_config

console = Console()


class AskGroup(click.Group):
    """Route anything that is not a registered subcommand to `query`.

    A variadic PROMPT argument on the group itself would swallow subcommand
    names, so the prompt lives on a default subcommand instead.
    """

    def parse_args(self, ctx, args):
        if not args or (args[0] not in self.commands and args[0] != "--help"):
            args = ["query", *args]
        return super().parse_args(ctx, args)


@click.group(cls=AskGroup)
def ask():
    """Ask Claude anything from your terminal."""


@ask.command()
@click.argument("prompt", nargs=-1)
@click.option("-f", "--file", "filepath", type=click.Path(exists=True),
              help="Include a file as context")
@click.option("-s", "--system", default=None,
              help="Override system prompt for this session")
@click.option("--chat", is_flag=True, default=False,
              help="Start an interactive multi-turn chat session")
def query(prompt, filepath, system, chat):
    """Send PROMPT to Claude (the default command)."""
    settings = load_config()
    client = ClaudeClient(system_prompt=system or settings.get("system_prompt"))

    if chat:
        _run_chat_session(client)
        return

    parts = []
    if filepath:
        content = Path(filepath).read_text()
        parts.append(f"File: {filepath}\n\n```\n{content}\n```")
    # Piped input (cat error.log | ask "...") is included as extra context
    if not sys.stdin.isatty():
        piped = sys.stdin.read().strip()
        if piped:
            parts.append(piped)
    if prompt:
        parts.append(" ".join(prompt))

    user_input = "\n\n".join(parts)
    if not user_input.strip():
        click.echo('Usage: ask "your question" or ask --chat')
        sys.exit(1)

    client.ask(user_input)


def _run_chat_session(client: ClaudeClient):
    """Interactive multi-turn chat loop."""
    console.print("[bold green]Claude chat session started.[/bold green] "
                  "Type [bold]exit[/bold] or Ctrl+C to quit.\n")
    while True:
        try:
            user_input = click.prompt("You", prompt_suffix=" > ")
        except (EOFError, KeyboardInterrupt, click.Abort):
            # click.prompt turns Ctrl+C / Ctrl+D into click.Abort
            console.print("\n[dim]Session ended.[/dim]")
            break

        if user_input.lower() in ("exit", "quit", "q"):
            break
        if user_input.lower() == "/clear":
            client.clear_history()
            console.print("[dim]History cleared.[/dim]")
            continue

        console.print("[bold cyan]Claude:[/bold cyan] ", end="")
        client.ask(user_input)
        print()
```


This gives you three usage modes:
1. **Single query:** `ask "explain decorators in Python"`
2. **With file context:** `ask -f app.py "find potential bugs"`
3. **Interactive chat:** `ask --chat`

The pipe support (`cat error.log | ask "what's wrong here?"`) is free because we check `sys.stdin.isatty()`.
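
The input assembly is easier to unit-test when pulled into a pure function that never touches the terminal. A minimal sketch, assuming file context, piped text, and the question are simply joined with blank lines; `build_user_input` and its parameters are hypothetical names.

```python
from __future__ import annotations


def build_user_input(prompt: str, file_name: str | None = None,
                     file_text: str | None = None,
                     piped_text: str | None = None) -> str:
    """Join file context, piped input, and the question into one message."""
    fence = "`" * 3  # built at runtime to avoid a literal fence in this example
    parts = []
    if file_name and file_text is not None:
        parts.append(f"File: {file_name}\n\n{fence}\n{file_text}\n{fence}")
    if piped_text and piped_text.strip():
        parts.append(piped_text.strip())
    if prompt.strip():
        parts.append(prompt.strip())
    return "\n\n".join(parts)
```

An empty return value means there is nothing to send, which is the case where the CLI should print its usage message.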

---

## Configuration File Support

Power users want to set a default system prompt, preferred model, or other settings. Handle this with `ask/config.py`:

```python
# ask/config.py
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".config" / "ask" / "config.json"

DEFAULT_CONFIG = {
    "system_prompt": "You are a helpful developer assistant. Be concise and practical.",
    "model": "claude-sonnet-4-6",
    "max_tokens": 4096,
}


def load_config() -> dict:
    """Return the defaults, overridden by any values in the user's config file."""
    if CONFIG_PATH.exists():
        with open(CONFIG_PATH) as f:
            return {**DEFAULT_CONFIG, **json.load(f)}
    # Return a copy so callers cannot mutate the defaults
    return dict(DEFAULT_CONFIG)


def save_config(updates: dict):
    """Merge updates into the current config and write it to disk."""
    CONFIG_PATH.parent.mkdir(parents=True, exist_ok=True)
    current = load_config()
    current.update(updates)
    with open(CONFIG_PATH, "w") as f:
        json.dump(current, f, indent=2)
```
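
Merging the file over the defaults means a config file only needs the keys it changes. A small illustration, using a temporary file in place of `~/.config/ask/config.json`. `load_config_from` is a hypothetical variant of `load_config` that takes the path as a parameter so it can be tried in isolation.

```python
import json
import tempfile
from pathlib import Path

DEFAULTS = {"system_prompt": "Be concise.", "model": "claude-sonnet-4-6", "max_tokens": 4096}


def load_config_from(path: Path, defaults: dict = DEFAULTS) -> dict:
    """Return defaults overridden by the JSON object stored at `path`, if any."""
    if path.exists():
        return {**defaults, **json.loads(path.read_text())}
    return dict(defaults)


with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"
    missing = load_config_from(cfg_path)   # no file yet: defaults only
    cfg_path.write_text(json.dumps({"max_tokens": 1024}))
    merged = load_config_from(cfg_path)    # the file overrides a single key
```

Later keys win in a dict merge, so values from the file replace the defaults and untouched keys fall through.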


Add a `config` subcommand to the CLI so users can set their preferences without editing JSON manually:

```python
# Add to ask/cli.py
from .config import load_config, save_config


@ask.command()
@click.option("--system", help="Set default system prompt")
@click.option("--model", help="Set default model")
def config(system, model):
    """Configure default settings."""
    updates = {}
    if system:
        updates["system_prompt"] = system
    if model:
        updates["model"] = model

    if updates:
        save_config(updates)
        console.print(f"[green]Config updated:[/green] {updates}")
    else:
        # No options given: show the current config
        console.print_json(data=load_config())
```


Usage:

```bash
ask config --system "You are a senior Python engineer. Always include type hints."
ask config  # view current config
```


---

## Adding Tool Use: Run Shell Commands

Here's where Claude CLI tools get genuinely powerful. With tool use, Claude can propose a shell command and your tool executes it:

```python
# ask/tools.py
import subprocess

TOOLS = [
    {
        "name": "run_shell_command",
        "description": "Execute a shell command and return its output. Use for git, file operations, or running tests.",
        "input_schema": {
            "type": "object",
            "properties": {
                "command": {
                    "type": "string",
                    "description": "The shell command to run",
                }
            },
            "required": ["command"],
        },
    }
]


def run_shell_command(command: str) -> str:
    """Execute command and return stdout + stderr."""
    try:
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=30
        )
    except subprocess.TimeoutExpired:
        return "(command timed out after 30s)"
    output = result.stdout + result.stderr
    return output.strip() or "(no output)"
```
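
`run_shell_command` executes whatever string it is given, so a check before execution is worth having. A rough sketch of an allowlist test follows. `ALLOWED_PROGRAMS` and `is_allowed` are hypothetical names, and the program list is only an example. This narrows what can run but is not a sandbox.

```python
import shlex

ALLOWED_PROGRAMS = {"git", "ls", "cat", "pytest"}  # example allowlist, adjust to taste


def is_allowed(command: str, allowed=ALLOWED_PROGRAMS) -> bool:
    """Accept a single allowlisted program with no shell operators."""
    # Reject chaining, pipes, redirects, and substitutions outright
    if any(ch in command for ch in ";|&><`$\n"):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    return bool(argv) and argv[0] in allowed
```

A command that passes this check can be run as `subprocess.run(shlex.split(command), ...)` without `shell=True`, which avoids shell interpretation entirely.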


Then add an `ask_with_tools()` method to `ClaudeClient`. Claude requests `run_shell_command` when it needs to check something, and the loop feeds each result back until Claude produces a final answer:

```python
# ask/client.py (add the import, then the method inside ClaudeClient)
from .tools import TOOLS, run_shell_command


class ClaudeClient:
    # ... __init__, ask, clear_history as before ...

    def ask_with_tools(self, user_message: str) -> str:
        """Send a message, running any tools Claude requests until it answers."""
        self.history.append({"role": "user", "content": user_message})

        while True:
            response = self.client.messages.create(
                model=DEFAULT_MODEL,
                max_tokens=4096,
                system=self.system,
                tools=TOOLS,
                messages=self.history,
            )
            self.history.append({"role": "assistant", "content": response.content})

            if response.stop_reason != "tool_use":
                # end_turn, max_tokens, etc.: return the text we received
                return "".join(b.text for b in response.content if b.type == "text")

            # Run each requested tool and send the results back
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = run_shell_command(block.input["command"])
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })
            self.history.append({"role": "user", "content": tool_results})
```
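
The loop above always calls `run_shell_command`, which is fine while it is the only tool. Once more tools are registered, each `tool_use` block's `name` has to select the handler. A sketch of one way to do that: `TOOL_HANDLERS`, `execute_tool`, and the stand-in `word_count` tool are hypothetical names for illustration.

```python
from __future__ import annotations

from typing import Callable


def word_count(text: str) -> str:
    """Stand-in tool used for illustration; returns its result as a string."""
    return str(len(text.split()))


# Maps the tool name Claude sends back to the function that implements it
TOOL_HANDLERS: dict[str, Callable[..., str]] = {"word_count": word_count}


def execute_tool(name: str, tool_input: dict) -> dict:
    """Run the named tool and report failures instead of raising."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return {"content": f"Unknown tool: {name}", "is_error": True}
    try:
        return {"content": handler(**tool_input), "is_error": False}
    except Exception as exc:
        return {"content": f"{type(exc).__name__}: {exc}", "is_error": True}
```

The returned keys can be merged into the `tool_result` block next to `type` and `tool_use_id`, so Claude sees a failed call as an error it can react to rather than the CLI crashing.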


> **Safety note:** Always show the user what command will run before executing, or restrict to a safe allowlist. Never run tool use against untrusted input without validation.

---

## Packaging for Distribution

Make the tool installable with `pip install ask-claude`. Add `pyproject.toml`:

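```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "ask-claude"
version = "0.1.0"
description = "A Claude-powered CLI developer assistant"
requires-python = ">=3.10"
dependencies = [
    "anthropic>=0.30.0",
    "click>=8.1.0",
    "rich>=13.0.0",
    "python-dotenv>=1.0.0",
]

[project.scripts]
ask = "ask.cli:ask"

# The package directory (ask) differs from the project name (ask-claude),
# so tell hatchling explicitly which package to ship.
[tool.hatch.build.targets.wheel]
packages = ["ask"]
```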


The `[project.scripts]` section is the key part — it creates the `ask` command globally when someone installs your package.
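
The value `"ask.cli:ask"` reads as `module:callable`. The generated launcher imports the module and calls that attribute. The standard library can parse the same string, which is a quick way to check how a spec will be interpreted. This is an illustration only, not something the tool itself needs.

```python
from importlib.metadata import EntryPoint

# Same name/value pair as the [project.scripts] table
ep = EntryPoint(name="ask", value="ask.cli:ask", group="console_scripts")

module_name = ep.module   # module the launcher imports
attr_name = ep.attr       # callable it invokes
```

Here the launcher would import `ask.cli` and call its `ask` object, which is the Click entry point.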

```bash
# Build and install locally
pip install -e .

# Now you can use it anywhere:
ask "what is a Python context manager?"
ask -f requirements.txt "are there any security vulnerabilities?"
ask --chat
```


To publish to PyPI:

```bash
pip install build twine
python -m build
twine upload dist/*
```


---

## Production Tips

**1. Handle API errors gracefully:**

```python
from anthropic import APIConnectionError, APIStatusError, RateLimitError

try:
    response = client.ask(prompt)
except RateLimitError:
    console.print("[red]Rate limit hit. Wait 60s and retry.[/red]")
except APIConnectionError:
    console.print("[red]Connection failed. Check your network.[/red]")
except APIStatusError as e:
    # APIStatusError carries the HTTP status code; the APIError base class does not
    console.print(f"[red]API error {e.status_code}: {e.message}[/red]")
```
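
Instead of only reporting a transient failure, the call can be retried with exponential backoff. A generic sketch, not tied to the SDK: `with_retries` and its parameters are hypothetical, and the exception type to retry on is passed in by the caller. The SDK client also accepts a `max_retries` argument, so a wrapper like this is mainly useful when you want control over the delays.

```python
import time


def with_retries(call, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `call()`; on a `retry_on` exception wait base_delay * 2**n, then retry.

    The final failure is re-raised so the caller can still report it.
    """
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Usage would look like `with_retries(lambda: client.ask(prompt), RateLimitError)`. The `sleep` parameter exists so tests can record delays instead of waiting.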

**2. Add a `--model` flag** for power users who want to switch between `claude-haiku-4-5` (fast, cheap) and `claude-opus-4-6` (most capable) on the fly.

**3. Cache responses** for repeated queries using a simple SQLite store keyed by the prompt hash. This saves API costs when debugging.

**4. Stream to a pager** for long outputs by piping to `less -R` when stdout is a terminal and the response exceeds N lines.
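
The caching tip could look roughly like this. It is a minimal sketch, assuming the key should cover the model and system prompt as well as the prompt, since any of them changes the answer. `ResponseCache` and its table layout are hypothetical.

```python
import hashlib
import sqlite3


class ResponseCache:
    """Tiny SQLite-backed cache keyed by a hash of (model, system, prompt)."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS responses "
            "(key TEXT PRIMARY KEY, response TEXT NOT NULL)"
        )

    @staticmethod
    def _key(model: str, system: str, prompt: str) -> str:
        raw = "\x00".join((model, system, prompt)).encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def get(self, model: str, system: str, prompt: str):
        """Return the cached response, or None on a miss."""
        row = self.db.execute(
            "SELECT response FROM responses WHERE key = ?",
            (self._key(model, system, prompt),),
        ).fetchone()
        return row[0] if row else None

    def put(self, model: str, system: str, prompt: str, response: str) -> None:
        with self.db:  # commits on success
            self.db.execute(
                "INSERT OR REPLACE INTO responses VALUES (?, ?)",
                (self._key(model, system, prompt), response),
            )
```

Pass a file path instead of `":memory:"` to keep the cache between runs. Caching only makes sense for single queries, not for chat sessions where the history changes every turn.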

## Key Takeaways

- Use `messages.stream()` for token-by-token streaming; it makes the tool feel instant
- Store `self.history` as a list of messages to maintain multi-turn context within a session
- The `[project.scripts]` entry in `pyproject.toml` is what makes `pip install` create a global command
- Tool use lets Claude actually do things (run commands, read files) rather than just describe them
- Always validate and sandbox tool inputs; never pass raw user or model-generated input directly to `subprocess`

## What to Build Next

This foundation supports a wide range of specialist CLI tools:

- A code reviewer (`review -f src/ --pr-style`)
- A commit message generator (`git diff | ask "write a commit message"`)
- A test generator (`ask -f utils.py "write pytest tests"`)
- A SQL query assistant connected to your local database schema

