Skyscanner | Toni Marques | Feb 23, 2016

Solving the Problem of One Billion Computations

Article Summary

Skyscanner needed to store 1.6 billion weights to rank hotel prices. Their in-memory approach couldn't scale, so they turned to AWS.

When Skyscanner added granular partner ranking (by market, device, and hotel), they faced a massive data challenge. Their algorithm needed to compute and store weights for every partner across every combination, updated daily and consumed in real time.

Key Takeaways

Critical Insight

Skyscanner solved their billion-computation problem by moving from in-memory storage to a distributed AWS architecture with DynamoDB and auto-scaling APIs.

The article explains why price-parity sorting required such complex infrastructure, and how daily batch updates ran without impacting live traffic.

About This Article

Problem

Skyscanner needed to compute hotel ranking weights across different partners, markets, devices, and hotels. Each combination generated about 150 bytes of data, and the total volume was too large to keep in memory.
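The scale can be checked with quick arithmetic (the ~240 GB total is derived from the article's figures, not stated in it):

```python
# Back-of-the-envelope check of the storage volume, derived from the
# numbers in the article (1.6 billion weights at ~150 bytes each).
weights = 1_600_000_000      # 1.6 billion weight entries
bytes_per_weight = 150       # ~150 bytes per combination

total_bytes = weights * bytes_per_weight
total_gb = total_bytes / 10**9

print(f"{total_gb:.0f} GB")  # prints "240 GB": far beyond a single node's RAM
```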

Solution

Skyscanner stored the weights in Amazon DynamoDB, a distributed key-value store. A lightweight API ran on auto-scaling EC2 instances across three AWS regions: eu-west-1, ap-southeast-1, and ap-northeast-1. A daily batch job recalculated and updated all the weights.
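A minimal sketch of such a key-value layout, assuming a composite key over the four dimensions (the key format, helper names, and weight values are illustrative assumptions, not Skyscanner's actual schema; a plain dict stands in for the DynamoDB table):

```python
# Illustrative sketch: a dict stands in for the DynamoDB table, and the
# composite-key format is an assumption, not Skyscanner's actual schema.

def weight_key(partner: str, market: str, device: str, hotel_id: int) -> str:
    """Build a composite key for one (partner, market, device, hotel) combination."""
    return f"{partner}#{market}#{device}#{hotel_id}"

# In production this would be a DynamoDB table; here, an in-memory stand-in.
table = {
    weight_key("partnerA", "UK", "mobile", 42): 0.87,
    weight_key("partnerB", "UK", "mobile", 42): 0.91,
}

def get_weight(partner: str, market: str, device: str, hotel_id: int,
               default: float = 0.5) -> float:
    """Fetch one weight at query time; a daily batch job overwrites the values."""
    return table.get(weight_key(partner, market, device, hotel_id), default)

print(get_weight("partnerB", "UK", "mobile", 42))  # prints 0.91
```

A key-value lookup like this stays O(1) per combination regardless of the 1.6-billion-entry total, which is what makes the daily-rewrite, real-time-read pattern workable.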

Impact

The system consumed weights in real time: a typical search of 15 hotels resolved about 50 price parities. It also scaled automatically with traffic, so no one had to add servers manually.
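When several partners quote the same price for a hotel (a price parity), something must break the tie, and the weights presumably supply that tiebreak. A hedged sketch of how a weight-based tiebreak could work (the function, offer tuples, and weight values are invented for illustration):

```python
def rank_offers(offers, weights):
    """Sort (partner, price) offers by price ascending; ties (price
    parities) are broken by partner weight, higher weight first."""
    return sorted(offers, key=lambda o: (o[1], -weights.get(o[0], 0.0)))

weights = {"partnerA": 0.87, "partnerB": 0.91}
offers = [("partnerA", 120.0), ("partnerB", 120.0), ("partnerC", 115.0)]

print(rank_offers(offers, weights))
# partnerC first (cheapest); partnerB beats partnerA at the 120.0 parity
```

With roughly 50 parities per 15-hotel search, each tiebreak is just a few weight lookups, so the latency cost sits in the key-value reads rather than the sort itself.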