Dev.to
5/12/2026

I built an AI phone receptionist in 3 weeks. Here's what nobody tells you.
Short summary
The author built an AI phone receptionist in 3 weeks using Twilio, the OpenAI Realtime API, and LiveKit. The critical breakthrough was switching to speech-to-speech, which reduced perceived latency from 1.5 seconds to 350 ms. The post covers real-world failures (hallucinations, interruptions), architectural solutions (voice activity detection, structured business logic via function-calling tools), and hard-won lessons on user trust and escalation.
- The OpenAI Realtime API was the critical breakthrough for getting latency under the ~600 ms threshold above which calls feel robotic
- Handle interruptions via voice activity detection; escalate risky calls (pricing disputes, angry callers) rather than attempting to resolve them
- Tell callers up front that they are talking to an AI; inject business knowledge through structured prompts, function-calling tools, and explicit escalation rules
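
The escalation rules and function-calling tools mentioned above could look roughly like the sketch below. It is not the author's implementation: the trigger phrases, the `escalate_to_human` tool name, and the `check_escalation` helper are all hypothetical, and the tool definition only follows the general JSON Schema style used for function calling, so the exact fields your voice API expects may differ.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical trigger phrases. The summary only names the two categories
# (pricing disputes, angry callers); the phrases themselves are assumptions.
ESCALATION_KEYWORDS = {
    "pricing_dispute": ("overcharged", "refund", "wrong price", "dispute"),
    "angry_caller": ("ridiculous", "unacceptable", "speak to a human", "manager"),
}

# Hypothetical tool definition in JSON Schema style, so the model can request
# a handoff explicitly instead of trying to resolve a risky call itself.
ESCALATE_TOOL = {
    "type": "function",
    "name": "escalate_to_human",
    "description": "Hand the call to a staff member instead of resolving it.",
    "parameters": {
        "type": "object",
        "properties": {
            "reason": {"type": "string", "enum": sorted(ESCALATION_KEYWORDS)},
            "summary": {"type": "string"},
        },
        "required": ["reason", "summary"],
    },
}


@dataclass
class Decision:
    escalate: bool
    reason: Optional[str] = None


def check_escalation(transcript: str) -> Decision:
    """Rule-based backstop: escalate when any trigger phrase appears."""
    text = transcript.lower()
    for reason, phrases in ESCALATION_KEYWORDS.items():
        if any(phrase in text for phrase in phrases):
            return Decision(True, reason)
    return Decision(False)
```

Running `check_escalation` on each caller turn gives a deterministic safety net alongside the model's own tool calls, so a risky call can still be handed off if the model never invokes the tool.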
Generated with AI, which can make mistakes.



