Pareto Principle in Action: Boosting Performance with Smart Caching
Article Summary
CRED Engineering faced a nightmare: caching every query combination would require 45,768+ TB of storage. Here's how they got it down to 135 MB.
Sri Harsha from CRED's engineering team breaks down their card offers caching challenge. With user-specific data, dynamic filters, and pagination, naive caching would have consumed thousands of terabytes and crashed their systems.
Key Takeaways
- Sorted card queries to reduce combinations from 131 billion to 32,767
- Multi-layered cache stores identifiers separately from full offer records
- 85% of data now served from cache with P75 latency under 1ms
- ElasticSearch traffic dropped 80%, with the in-process Caffeine cache using only 1% of memory
- Limited caching to users with 4 or fewer cards, covering 70%+ of users, via the Pareto Principle
By applying smart constraints and multi-layer caching, CRED achieved 85% cache hit rate and cut ElasticSearch traffic by 80% while using minimal memory.
About This Article
CRED's card offers service had a scaling problem. With 5 banks and 3 networks, there are 15 card types, and once filter ordering and pagination were factored in, users could generate roughly 1.31×10¹¹ distinct query combinations. Brute-force caching of all of them would have required over 45,768 TB of storage, which wasn't practical.
Sri Harsha's team sorted the card queries so that every ordering of the same filter set maps to one cache key, cutting the combinations down to 32,767. They then limited caching to users with 4 or fewer cards, a group that covers over 70% of RBI-issued cards. Finally, they built a multi-layered cache that stores offer identifiers in one layer and the full offer records in another, so each record is cached once rather than duplicated per query.
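The sorting step can be sketched as key canonicalization: sort (and de-duplicate) a user's card filters before building the cache key, so every ordering of the same filter set collapses to one entry. With 15 card types (5 banks × 3 networks), that leaves at most 2¹⁵ − 1 = 32,767 non-empty filter sets. The class and filter strings below are hypothetical, not from CRED's codebase:

```java
import java.util.List;
import java.util.TreeSet;

public class CacheKeys {
    // Hypothetical helper: build a canonical cache key from a user's card filters.
    // TreeSet sorts and de-duplicates, so ["HDFC:VISA", "AXIS:MC"] and
    // ["AXIS:MC", "HDFC:VISA"] both produce the key "AXIS:MC|HDFC:VISA".
    public static String canonicalKey(List<String> cardFilters) {
        return String.join("|", new TreeSet<>(cardFilters));
    }
}
```

Because the key no longer depends on filter order, only on which filters are present, the cache key space shrinks from every ordered sequence to every unordered subset.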
The cache hit-to-miss ratio reached 90:10, and P75 latency dropped below 1ms. ElasticSearch query traffic fell by 80%, while the in-process Caffeine cache consumed 1% or less of memory.