Dev.to
6/11/2026

I built an AI chat over my CV on a zero-pound inference budget
Short summary
Built ask.hiten.dev, a streaming CV chatbot with zero inference cost, by chaining free-tier LLM providers (Groq → OpenRouter → NVIDIA → Cerebras) with automatic failover so that a rate limit on one provider does not take the chat down. It normalizes streamed output in code to fix model quirks, whitelists facts to prevent hallucinations, and caches responses to fixed prompts in memory. Deployed on an Oracle free-tier VM with Astro SSR.
- Built a zero-cost CV chatbot by chaining 4 free LLM providers with automatic failover for reliability
- Normalizes model output in code (quotes, dashes) and whitelists facts to prevent hallucinations
- Caches fixed prompts in memory; validated with Playwright tests; deployed on a free Oracle VM
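
The failover chain described above can be sketched roughly as follows. This is a minimal illustration and not the author's code: the `Provider` type, the `completeWithFailover` function, and the stub providers are hypothetical names, and real providers would make streaming HTTP calls rather than return a fixed string.

```typescript
// Hypothetical provider shape: takes a prompt, resolves to the reply text.
type Provider = {
  name: string;
  complete: (prompt: string) => Promise<string>;
};

// Try each provider in order and return the first successful reply.
// Any thrown error (for example a rate limit) moves on to the next provider.
async function completeWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<{ provider: string; text: string }> {
  const failures: string[] = [];
  for (const p of providers) {
    try {
      const text = await p.complete(prompt);
      return { provider: p.name, text };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${p.name}: ${reason}`);
    }
  }
  // Every provider failed; surface all reasons in one error.
  throw new Error(`All providers failed: ${failures.join("; ")}`);
}

// Stub providers standing in for real API clients (assumed behaviour).
const chain: Provider[] = [
  {
    name: "groq",
    complete: async () => {
      throw new Error("429 rate limited");
    },
  },
  { name: "openrouter", complete: async (prompt) => `echo: ${prompt}` },
  { name: "nvidia", complete: async (prompt) => `echo: ${prompt}` },
  { name: "cerebras", complete: async (prompt) => `echo: ${prompt}` },
];
```

With these stubs, `completeWithFailover(chain, "hi")` skips the rate-limited first provider and resolves with the second provider's reply. Ordering the array by preference keeps the fallback policy in one place.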
Generated with AI, which can make mistakes.