Solving the Problem of One Billion Computations
Article Summary
Skyscanner needed to store 1.6 billion weights to rank hotel prices. Their in-memory approach couldn't scale, so they turned to AWS.
When Skyscanner added granular partner ranking (by market, device, and hotel), they faced a massive data challenge. Their algorithm needed to compute and store weights for every partner across every combination, updated daily and consumed in real time.
Key Takeaways
- 1.6 billion weights stored across partner, market, device, and hotel combinations
- DynamoDB chosen for key-value storage with auto-scaling across three AWS regions
- An average search resolves 50 price parities across 15 hotels in real time
- Auto-scaling EC2 groups handle traffic spikes without manual intervention
Skyscanner solved their billion-computation problem by moving from in-memory storage to a distributed AWS architecture with DynamoDB and auto-scaling APIs.
About This Article
Skyscanner needed to compute hotel ranking weights across different partners, markets, devices, and hotels. Each combination generated about 150 bytes of data, and the total volume was too large to keep in memory.
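A quick back-of-envelope calculation, using only the figures from the article, shows why an in-memory approach breaks down:

```python
# Estimate total weight-store size from the article's numbers:
# 1.6 billion weights at roughly 150 bytes per combination.
NUM_WEIGHTS = 1_600_000_000
BYTES_PER_WEIGHT = 150  # approximate record size per combination

total_bytes = NUM_WEIGHTS * BYTES_PER_WEIGHT
total_gb = total_bytes / 1e9

print(f"~{total_gb:.0f} GB")  # ~240 GB, far beyond a single node's RAM
```

At roughly 240 GB, the full weight set cannot sit comfortably in the memory of a single service instance, which is what pushed the design toward an external key-value store.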
Skyscanner stored the weights in Amazon DynamoDB, a distributed key-value store. A lightweight API ran on auto-scaling EC2 instances across three regions: eu-west-1, ap-southeast-1, and ap-northeast-1. A daily batch job recalculated and updated all the weights.
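The key-value access pattern can be sketched as follows. Note that the composite key layout, table name, and attribute names are assumptions for illustration, not Skyscanner's actual schema:

```python
# Minimal sketch of a DynamoDB lookup for one ranking weight.
# Key layout and names are hypothetical.

def weight_key(partner: str, market: str, device: str, hotel_id: int) -> dict:
    """Build a composite key: partition on partner#market, sort on
    device#hotel, so items spread evenly across DynamoDB partitions."""
    return {
        "pk": f"{partner}#{market}",
        "sk": f"{device}#{hotel_id}",
    }

def get_weight(table, partner: str, market: str, device: str, hotel_id: int):
    """Fetch one weight. `table` is a boto3 DynamoDB Table, e.g.
    boto3.resource("dynamodb", region_name="eu-west-1").Table("ranking-weights")."""
    resp = table.get_item(Key=weight_key(partner, market, device, hotel_id))
    item = resp.get("Item")
    return item["weight"] if item else None
```

Keeping the key a pure function of the ranking dimensions means the daily batch writer and the real-time API agree on addressing without any coordination beyond the schema itself.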
The system could consume weights in real time. For a typical search with 15 hotels, it resolved about 50 price parities. It also scaled automatically when traffic increased, so no one had to manually add servers.
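Hands-off scaling of this kind is typically configured with a target-tracking policy on the Auto Scaling group; a sketch of what such a policy might look like (group name, policy name, and CPU target are placeholders, not Skyscanner's actual configuration):

```shell
# Illustrative target-tracking policy: add or remove API instances
# to hold average CPU near 50%. Names and target are assumptions.
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name weights-api-asg \
  --policy-name cpu-target-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 50.0
  }'
```

With target tracking, AWS adjusts capacity in both directions automatically, which is what removes the need for anyone to add servers by hand during traffic spikes.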