Dev.to
5/8/2026

Sanctum Machina: Gemma 4 in your pocket, forever
Short summary
Sanctum Machina is an open-source Android app that runs Google's Gemma 4 LLM locally on mid-range phones at competitive inference speeds. The app offers persistent multi-chat history, multimodal input (text, image, audio), granular inference control, and model warm-loading that eliminates cold-start delays. The author plans to explore tool use and agent modes with FunctionGemma, and to investigate speedups from Multi-Token Prediction.
- Running Gemma 4 locally on Android with usable speeds and zero internet dependency
- Features persistent history, multimodal input, and detailed inference tuning
- Open-source project with roadmap for tools, agent modes, and performance optimizations



