Dev.to
6/11/2026
I built an AI chat over my CV on a zero-pound inference budget

Short summary

Built ask.hiten.dev, a streaming CV chatbot with zero inference costs, by chaining free-tier LLM providers (Groq → OpenRouter → NVIDIA → Cerebras) with automatic failover, so a rate limit on one provider falls through to the next. It normalizes the streamed output in code to fix model quirks such as inconsistent quotes and dashes, whitelists facts to prevent hallucinations, and uses in-memory caching for fixed prompts. Validated with Playwright tests and deployed with Astro SSR on an Oracle free-tier VM.
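The failover chain described above can be sketched roughly as follows. This is a minimal illustration, not the author's code: the `Provider` interface and `completeWithFailover` function are hypothetical names, and each provider is assumed to be wrapped behind a single `complete` call that rejects when the provider is rate limited or otherwise unavailable.

```typescript
// Hypothetical shape: each free-tier provider is wrapped behind the same call signature.
interface Provider {
  name: string;
  complete: (prompt: string) => Promise<string>;
}

// Try providers in order and return the first successful answer.
// Any rejection (rate limit, timeout, server error) moves on to the next provider.
async function completeWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<{ provider: string; text: string }> {
  const failures: string[] = [];
  for (const p of providers) {
    try {
      const text = await p.complete(prompt);
      return { provider: p.name, text };
    } catch (err) {
      // Record why this provider failed, then fall through to the next one.
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${p.name}: ${reason}`);
    }
  }
  // Only reached when every provider in the chain has failed.
  throw new Error(`All providers failed: ${failures.join("; ")}`);
}
```

In this sketch the order of the array is the priority order, so a chain like Groq → OpenRouter → NVIDIA → Cerebras is just a four-element list. A real streaming version would also need to decide what to do when a provider fails partway through a response, which this example does not cover.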

  • Built a zero-cost CV chatbot by chaining four free LLM providers with automatic failover for reliability
  • Normalizes model output in code (quotes, dashes) and whitelists facts to prevent hallucinations
  • Caches fixed prompts in memory; validated with Playwright tests; deployed on a free Oracle VM
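As an illustration of normalizing quotes and dashes in code, a per-chunk normalizer might look like the sketch below. The summary does not say which characters are rewritten or in which direction, so the mapping here (typographic quotes and en/em dashes to plain ASCII) is an assumption, and `normalizeChunk` is a hypothetical name.

```typescript
// Assumed mapping: typographic punctuation is replaced with plain ASCII equivalents.
const REPLACEMENTS: Array<[RegExp, string]> = [
  [/[\u2018\u2019]/g, "'"], // curly single quotes
  [/[\u201C\u201D]/g, '"'], // curly double quotes
  [/[\u2013\u2014]/g, "-"], // en and em dashes
];

// Normalize one streamed chunk. Each target is a single UTF-16 code unit,
// so a replacement can never be split across two chunks.
function normalizeChunk(chunk: string): string {
  let out = chunk;
  for (const [pattern, replacement] of REPLACEMENTS) {
    out = out.replace(pattern, replacement);
  }
  return out;
}
```

Because the function is stateless, it can be applied to every chunk as it arrives, whichever provider produced it, so the output looks the same regardless of where the chain ended up.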

Generated with AI, which can make mistakes.
