Pareto Principle in Action: Boosting Performance with Smart Caching
Article Summary
CRED Engineering faced a nightmare: caching every query combination would require 45,768+ TB of storage. Here's how they got it down to 135 MB.
Sri Harsha from CRED's engineering team breaks down their card offers caching challenge. With user-specific data, dynamic filters, and pagination, naive caching would have consumed thousands of terabytes and crashed their systems.
Key Takeaways
- Sorted card queries to reduce combinations from 131 billion to 32,767
- Multi-layered cache stores identifiers separately from full offer records
- 85% of data now served from cache with P75 latency under 1ms
- ElasticSearch traffic dropped 80%, with the in-process Caffeine cache using only 1% of memory
- Limited caching to users with 4 or fewer cards, covering 70%+ of users, via the Pareto Principle
By applying smart constraints and multi-layer caching, CRED achieved 85% cache hit rate and cut ElasticSearch traffic by 80% while using minimal memory.
About This Article
CRED's card offers service had a scaling problem. With 5 banks and 3 networks, there are 15 card types, and once filter ordering and pagination were factored in, users could generate roughly 1.31×10¹¹ distinct query combinations. Brute-force caching of all of them would have required over 45,768 TB of storage, which wasn't practical.
Sri Harsha's team sorted the card queries so that every ordering of the same filter set maps to one cache key, cutting the combinations down to 32,767. They then limited caching to users with 4 or fewer cards, a group that covers over 70% of RBI-issued cards. Finally, they built a multi-layered cache that stores offer identifiers in one layer and the full offer records in another, so each record is cached once rather than duplicated per query.
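The sorting step can be sketched as key canonicalization: sort (and de-duplicate) a user's card filters before building the cache key, so every ordering of the same filter set collapses to one entry. With 15 card types (5 banks × 3 networks), that leaves at most 2¹⁵ − 1 = 32,767 non-empty filter sets. The class and filter strings below are hypothetical, not from CRED's codebase:

```java
import java.util.List;
import java.util.TreeSet;

public class CacheKeys {
    // Hypothetical helper: build a canonical cache key from a user's card filters.
    // TreeSet sorts and de-duplicates, so ["HDFC:VISA", "AXIS:MC"] and
    // ["AXIS:MC", "HDFC:VISA"] both produce the key "AXIS:MC|HDFC:VISA".
    public static String canonicalKey(List<String> cardFilters) {
        return String.join("|", new TreeSet<>(cardFilters));
    }
}
```

Because the key no longer depends on filter order, only on which filters are present, the cache key space shrinks from every ordered sequence to every unordered subset.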
The cache hit-to-miss ratio reached 90:10, and P75 latency dropped below 1ms. ElasticSearch query traffic fell by 80%, while the in-process Caffeine cache consumed 1% or less of memory.