Dev.to
5/8/2026

Your PDFs Never Leave Your Pocket: Building a 100% Offline RAG App with Gemma 4 + LiteRT-LM
Short summary
PocketSage is a fully offline Android RAG application that lets users chat with PDFs entirely on-device, using Gemma 4 with LiteRT-LM for inference. It addresses a critical GDPR compliance problem: no cloud processing means no data processor role, no DPA, and a one-week legal review instead of six months. The app is built with an MVVM architecture, Room embeddings, and streaming LLM inference in approximately 500 lines of Kotlin.
- Offline RAG app eliminates cloud data transfer, solving GDPR compliance for regulated industries
- Runs Gemma 4 natively on Android with streaming token responses and 2-3 second first-token latency
- Clean Modern Android Development architecture (MVVM, Hilt, Compose) provides reusable pattern for ML inference
Generated with AI, which can make mistakes.