Airbnb Engineering
4/21/2026

Building a fault-tolerant metrics storage system at Airbnb
Short summary
Airbnb engineered an internally operated metrics storage system handling 50 million samples per second and 1.3 billion active time series. Shuffle sharding isolates tenant workloads to prevent cascading failures, and a consolidated control plane automates tenant onboarding and configuration management. The system maintains sub-30-second p99 query latency while supporting 10,000 dashboards and 500,000 alerts.
- Airbnb built an internal metrics system handling 50M samples/sec and 1.3B active time series, replacing a vendor solution
- Shuffle sharding isolates tenant workloads and prevents a single-tenant DDoS from cascading across the cluster
- An automated control plane eliminated manual bottlenecks in tenant onboarding and configuration management
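
For readers unfamiliar with shuffle sharding, the idea can be sketched in a few lines: each tenant is deterministically assigned a small pseudo-random subset of nodes, so an overloaded tenant can only affect its own subset and two tenants rarely share every node. This is a minimal illustration and not Airbnb's implementation; the function name `shuffle_shard`, the `ingester-N` node names, and the shard size of 4 are all assumptions made for the example.

```python
import hashlib
import random


def shuffle_shard(tenant_id: str, nodes: list[str], shard_size: int) -> list[str]:
    """Pick a deterministic pseudo-random subset of nodes for one tenant.

    Hypothetical helper for illustration only.
    """
    if not 0 < shard_size <= len(nodes):
        raise ValueError("shard_size must be between 1 and len(nodes)")
    # Seed a private RNG from a stable hash of the tenant ID so the same
    # tenant maps to the same shard on every call and in every process.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    # Sort the node list first so the result does not depend on input order.
    return sorted(rng.sample(sorted(nodes), shard_size))


# Assumed cluster of 12 nodes with 4 nodes per tenant.
nodes = [f"ingester-{i}" for i in range(12)]
shard_a = shuffle_shard("tenant-a", nodes, 4)
shard_b = shuffle_shard("tenant-b", nodes, 4)

# Nodes both tenants share; a noisy tenant-a can only degrade these for tenant-b.
overlap = set(shard_a) & set(shard_b)
print(shard_a, shard_b, overlap)
```

With 12 nodes and shards of 4 there are 495 possible shards, so the chance that two tenants land on exactly the same four nodes is small, and it shrinks quickly as the cluster grows.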