Dev.to
5/14/2026
Gemma 4 26B on v6e-4 Turbo-Stable Benchmark


Short summary

Google's Gemma 4 26B MoE model on TPU v6e-4 is reported as production-ready: it passed all 144 concurrency test points without a failure. Latency at the 2K context boundary fell 114x, to about 1.15s, and peak throughput reached 467,825 tokens/sec. Cold-start time dropped from 24 minutes to under 10 seconds through persistent JAX caching.
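As a quick sanity check on these figures, the arithmetic below derives what they imply. It is a sketch under one assumption the summary does not state outright: that the 114x figure and the ~1.15s figure describe the same 2K-context measurement. The variable names are illustrative only.

```python
# Figures quoted in the summary.
latency_speedup = 114          # reported latency reduction factor
latency_now_s = 1.15           # reported latency at the 2K context boundary

# Assumption: both figures refer to the same measurement, so the
# earlier latency is simply the product of the two.
implied_prior_latency_s = round(latency_speedup * latency_now_s, 1)

# Cold start: 24 minutes before, "under 10 seconds" after.
cold_start_before_s = 24 * 60
cold_start_after_upper_bound_s = 10

# Because 10 s is an upper bound, this ratio is a lower bound on the speedup.
min_cold_start_speedup = cold_start_before_s / cold_start_after_upper_bound_s

print(f"Implied prior latency: {implied_prior_latency_s} s")
print(f"Cold-start speedup: at least {min_cold_start_speedup:.0f}x")
```

Under that assumption, the earlier 2K-context latency would have been a little over two minutes, and the cold-start improvement is at least 144x.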

  • Gemma 4 26B achieves 100% stability across all 144 concurrency tests (1-2048 users)
  • 114x latency improvement at the 2K context boundary (consistent ~1.15s); peak throughput of 467,825 tokens/sec
  • Cold-start reduced from 24 min to <10s using persistent JAX cache
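The summary does not show how the benchmark configured its cache, so the following is only a minimal sketch of how JAX's persistent compilation cache is typically enabled. The cache path is a hypothetical placeholder, and the two threshold settings are an assumption about what a cold-start-sensitive deployment might want, not the article's actual configuration.

```python
import jax

# Hypothetical cache location; use any writable path that survives restarts.
jax.config.update("jax_compilation_cache_dir", "/tmp/jax_cache")

# Assumption: cache every compiled program, however quick or small,
# so a restarted process can skip recompilation entirely.
jax.config.update("jax_persistent_cache_min_compile_time_secs", 0)
jax.config.update("jax_persistent_cache_min_entry_size_bytes", -1)
```

With these options set before the first compilation, programs compiled through `jax.jit` are written to that directory and reused by later processes, which is the general mechanism behind a cold-start drop of this kind.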

Generated with AI, which can make mistakes.
