Dev.to
5/12/2026

AI agent audit finds most
Original: I Audited My AI Agents and Found That Most of Their Reasoning Wasn’t Observable
Short summary
The author audited a personal AI agent platform and found that only 13–17% of high-volume agent decisions were captured in Langfuse traces. Root cause: the agents ran with the LANGFUSE_ENABLED environment variable set to false by default. The article includes a runnable SQL audit schema and a TypeScript gating code pattern for identifying observability gaps in your own agent systems.
- High-volume agents (ARIA: 31K decisions) showed 17% Langfuse coverage vs. 100% for low-volume agents
- Gap traced to the LANGFUSE_ENABLED env var defaulting to false, which routed LLM calls through a no-op path
- Provides a SQL audit query and a code pattern to detect similar observability blind spots
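
The article's own gating code is not reproduced in this summary. As a rough illustration of how a default-off flag can silently route calls through a no-op path, here is a minimal TypeScript sketch. Only the `LANGFUSE_ENABLED` variable name comes from the summary; `Tracer`, `makeTracer`, `isTracingEnabled`, and `coveragePercent` are hypothetical names, and an in-memory array stands in for a real trace backend rather than any actual Langfuse SDK call.

```typescript
// Hypothetical sketch: these names are assumptions, not the article's code.
type Env = Record<string, string | undefined>;

interface Tracer {
  readonly enabled: boolean;
  record(agent: string, decisionId: string): void;
}

// Anything other than an explicit "true" counts as disabled,
// which mirrors a flag that defaults to false when unset.
function isTracingEnabled(env: Env): boolean {
  return (env["LANGFUSE_ENABLED"] ?? "false").toLowerCase() === "true";
}

// Returns a no-op tracer when the flag is off, so callers never notice
// that their decisions are not being recorded.
function makeTracer(env: Env, sink: string[]): Tracer {
  if (!isTracingEnabled(env)) {
    return { enabled: false, record: () => {} };
  }
  return {
    enabled: true,
    record: (agent, decisionId) => {
      sink.push(`${agent}:${decisionId}`);
    },
  };
}

// Coverage as a percentage: traced decisions over total decisions.
function coveragePercent(totalDecisions: number, tracedDecisions: number): number {
  if (totalDecisions === 0) return 0;
  return (tracedDecisions * 100) / totalDecisions;
}
```

Exposing `enabled` on the tracer is one possible design choice: a startup check or audit job can then report which agents are running on the no-op path instead of discovering the gap later from missing traces.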
Generated with AI, which can make mistakes.