Consistent Caching Mechanism in Titus Gateway
Article Summary
Netflix's Titus container platform hit a wall: their singleton leader couldn't handle the API query load. Here's how they scaled horizontally without breaking consistency guarantees.
Netflix engineers Tomasz Bak and Fabio Kung share how they redesigned Titus Gateway's architecture. They moved from a single leader handling all queries to a distributed caching system that maintains strict read consistency across multiple gateway nodes.
Key Takeaways
- Median latency increased slightly, but the 99th percentile dropped 90% (292ms to 30ms)
- System now handles 8K queries per second, versus the previous 4K limit before latency collapse
- A custom keep-alive protocol ensures clients never see stale data, regardless of which gateway serves them
- Cache synchronization adds an average 4ms delay, using monotonic timestamps
- Horizontal scaling was achieved without changing the API contract or forcing client migration
Netflix doubled its query capacity while improving tail latencies by 90%, through consistent caching that guarantees read-your-write semantics across distributed gateway nodes.
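To make the read-your-write guarantee concrete, here is a minimal sketch of the idea: a gateway serves a query from its local cache only once the cache has caught up to the timestamp of the client's last acknowledged write. All names and the API shape are hypothetical; this is not Titus's actual implementation.

```python
import threading

class CachedView:
    """Illustrative model of a gateway's local cache. A read carrying a
    client's last-write timestamp blocks until the replication stream has
    advanced at least that far, so the client never observes stale data."""

    def __init__(self):
        self._cond = threading.Condition()
        self._cache_ts = 0   # timestamp of the last event applied locally
        self._data = {}

    def apply_event(self, ts, key, value):
        # Called as events arrive from the leader's replication stream.
        with self._cond:
            self._data[key] = value
            self._cache_ts = ts
            self._cond.notify_all()

    def read(self, key, min_ts, timeout=1.0):
        # Block until the local cache is at least as fresh as min_ts,
        # which is what yields read-your-write semantics.
        with self._cond:
            if not self._cond.wait_for(lambda: self._cache_ts >= min_ts,
                                       timeout=timeout):
                raise TimeoutError("cache lagging behind requested timestamp")
            return self._data.get(key)
```

A client that just wrote at timestamp `t` would issue `read(key, min_ts=t)`; any gateway node can answer once its cache watermark reaches `t`.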
About This Article
The Titus Job Coordinator, a singleton leader process, became a bottleneck when handling all query requests across Netflix's container platform. Response latencies spiked dangerously as load approached 4.5K queries per second.
Tomasz Bak and Fabio Kung built a keep-alive synchronization protocol using high-resolution monotonic timestamps. This lets Titus Gateway nodes maintain consistent local caches without polling the leader for every request.
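A protocol like this needs timestamps that are strictly increasing, so that "the cache has caught up to timestamp t" is unambiguous. A minimal sketch of such a timestamp source follows; it is an illustration of the general technique, not Titus's actual clock implementation.

```python
import time

class MonotonicStamper:
    """Hypothetical high-resolution timestamp source: combines the OS
    monotonic clock with a tie-breaker so every event gets a strictly
    increasing value, even if two events land in the same nanosecond."""

    def __init__(self):
        self._last = 0

    def next(self):
        now = time.monotonic_ns()
        # Never go backwards, and never hand out the same value twice.
        self._last = max(self._last + 1, now)
        return self._last
```

Each event the leader emits would carry one of these stamps, and gateway caches track the highest stamp they have applied.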
The system now scales linearly to 8K queries per second. The 80th percentile latency dropped from 336ms to 22ms. Dummy messages inserted into event streams guarantee keep-alive acknowledgment within a bounded 2ms interval.