Badoo Marko Kevac Aug 7, 2017

Map Performance Comparison and Judy Data Structure Alternatives

Article Summary

Marko Kevac from Bumble's Platform team benchmarked 6 associative array implementations to replace their aging Judy data structure. The winner? Not what you'd expect when memory matters.

Bumble's C/C++ services handle hundreds of thousands of requests per second with hundreds of gigabytes in memory. Their team tested hash tables and trees from Google, C++ STL, and others against their longtime Judy implementation, measuring both speed and memory overhead with 10 million random lookups.

Key Takeaways

Critical Insight

spp emerged as the top Judy replacement, balancing performance with reasonable memory overhead for high-throughput services.

The team discovered why the standard hash function beat specialized alternatives, and it reveals something surprising about how modern implementations optimize for common use cases.

About This Article

Problem

Badoo's Judy implementation hadn't been updated in years, so the Platform team needed to figure out if newer options could handle the same workload. They were processing hundreds of thousands of requests per second while keeping hundreds of gigabytes in memory.

Solution

Marko Kevac tested six hash table options: Judy, std::unordered_map, google::sparse_hash_map, google::dense_hash_map, and spp::sparse_hash_map. He used 32-bit integer keys and 64-bit pointer values, running 10 million random lookups on an Intel i7-6700K with GCC 6.3.0 and -O3 optimization.

Impact

spp::sparse_hash_map worked well as a Judy replacement. It matched Judy's random-get performance while using only 1 bit of memory overhead per entry. Badoo built a C++ wrapper around it and started planning production tests on their high-demand services.