Dev.to
5/13/2026

I Built an Offline AI Career Advisor Using Gemma 4 — Here's Exactly How It Works
Short summary
A developer built GuidanceOS, an offline AI career advisor that runs Gemma 4 on a Kaggle T4 GPU with no internet access at inference time. The system uses TF-IDF indexing for fast job and course matching, then chains multi-agent prompts through Gemma to analyze skills, generate personalized recommendations, and compute applicant tracking system (ATS) scores. The technical walkthrough covers quantization tricks, GPU memory management, and agent orchestration patterns that apply to many domains.
- Gemma 4 runs fully offline, so career guidance is generated at inference time with no API calls
- TF-IDF indexing provides fast, deterministic matching against 123K+ job postings and 6K+ courses
- Multi-agent orchestration (Skills Analyzer → Job Matcher → Course Recommender) runs sequentially on constrained GPU memory
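
The two ideas above, deterministic TF-IDF retrieval followed by a sequential agent chain, can be sketched in a few lines of plain Python. This is an illustrative sketch, not the GuidanceOS code: the function names, the prompts, and the smoothed IDF formula are all assumptions, and `generate` is a hypothetical callable standing in for whatever local Gemma inference call the real system uses.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (an assumption; the article's tokenizer is not specified).
    return text.lower().replace(",", " ").split()


def vectorize(tokens, idf):
    # Build an L2-normalized TF-IDF vector as a sparse dict, ignoring unknown terms.
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(docs):
    # Return (idf, vectors): a smoothed IDF table and one TF-IDF vector per document.
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return idf, [vectorize(toks, idf) for toks in tokenized]


def top_k(query, idf, vectors, k=2):
    # Rank documents by cosine similarity; ties break on the lower index,
    # so the same query always yields the same ranking.
    q = vectorize(tokenize(query), idf)
    scored = [
        (sum(w * vec.get(t, 0.0) for t, w in q.items()), i)
        for i, vec in enumerate(vectors)
    ]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for score, i in scored[:k] if score > 0]


def run_pipeline(profile, jobs, courses, generate, k=2):
    # Sequential chain: Skills Analyzer -> Job Matcher -> Course Recommender.
    # Each stage finishes before the next starts, so only one prompt is in flight.
    job_idf, job_vecs = build_index(jobs)
    course_idf, course_vecs = build_index(courses)

    # Stage 1: Skills Analyzer (model call).
    skills = generate(f"List the key skills in this profile:\n{profile}")

    # Stage 2: Job Matcher (retrieval first, then a model call over the short list).
    matched_jobs = [jobs[i] for i in top_k(skills, job_idf, job_vecs, k)]
    job_notes = generate(f"Skills: {skills}\nJobs: {matched_jobs}\nExplain the fit.")

    # Stage 3: Course Recommender (same pattern against the course index).
    matched_courses = [courses[i] for i in top_k(skills, course_idf, course_vecs, k)]
    course_notes = generate(
        f"Skills: {skills}\nCourses: {matched_courses}\nSuggest a learning plan."
    )

    return {
        "skills": skills,
        "jobs": matched_jobs,
        "job_notes": job_notes,
        "courses": matched_courses,
        "course_notes": course_notes,
    }
```

To try it, pass any function that maps a prompt string to a response string as `generate`. Retrieval narrows the candidates before the model sees them, which keeps each prompt short; on a memory-constrained GPU that is the main reason to put the deterministic index ahead of the model calls.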
Generated with AI, which can make mistakes.