Dream11 May 19, 2022

Automated Performance Testing with Torque

Article Summary

Dream11 handled 5.5 million concurrent users and 80 million requests per minute during IPL 2020. Their secret? A custom performance testing framework called Torque.

Dream11's engineering team built Torque to automate performance testing across 100+ microservices written in Java, Scala, and Node and backed by several different databases. The framework replaced their scattered setup of JMeter, Rundeck, and shell scripts with a unified solution.

Key Takeaways

Critical Insight

Torque enabled Dream11 to run 1,500+ load test iterations, benchmark 50 critical services at 5x their normal scale, and get through IPL peak traffic with zero scale-related issues in production.

The article explains how they used Redis locks and AWS Lambda to distribute unique test data across multiple load generators, so that no two generators replay the same records.
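The summary does not spell out the mechanism, so what follows is only a minimal sketch of one way a lock can keep load generators from picking up the same test data. It assumes the data has already been split into numbered chunks, and it uses an in-process `threading.Lock` as a stand-in for a Redis-backed distributed lock so the example runs without any external service. The names `ChunkAllocator`, `claim`, and `run_generators` are hypothetical and are not taken from Torque.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ChunkAllocator:
    """Hands out each data chunk index to exactly one caller.

    Assumption: in a real multi-host setup the lock and the counter would
    live in a shared store such as Redis; threading.Lock only simulates it.
    """

    def __init__(self, total_chunks):
        self._total = total_chunks
        self._next = 0                 # next unclaimed chunk index
        self._lock = threading.Lock()  # stand-in for a distributed lock

    def claim(self):
        """Return an unclaimed chunk index, or None when all are taken."""
        with self._lock:
            if self._next >= self._total:
                return None
            chunk = self._next
            self._next += 1
            return chunk


def run_generators(total_chunks, generators):
    """Simulate several load generators racing to claim chunks."""
    allocator = ChunkAllocator(total_chunks)

    def worker(_):
        claimed = []
        while True:
            chunk = allocator.claim()
            if chunk is None:
                return claimed
            claimed.append(chunk)

    with ThreadPoolExecutor(max_workers=generators) as pool:
        return list(pool.map(worker, range(generators)))


# Four simulated generators share twenty chunks; no chunk is claimed twice.
claims = run_generators(total_chunks=20, generators=4)
```

Because the check and the increment happen while the lock is held, two generators can never receive the same chunk index, whichever one gets there first. How many chunks each generator ends up with depends on scheduling and is not fixed.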

About This Article

Problem

Dream11 had to test over 100 microservices built with different technologies like Java, Scala, Node, MySQL, Cassandra, and Aerospike. Traffic patterns shifted dramatically before and after matches started, and managing gigabytes of test data across all these services became unwieldy.

Solution

Dream11's engineering team built Torque, a framework that pulls together Gatling for load generation, Scala for automation, Redis for distributed locking, AWS S3 and Lambda to split data, and Apache Spark to process large datasets. Jenkins orchestrates the whole thing.
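To make the data-splitting step concrete, here is a small sketch of dividing a test dataset into disjoint, near-equal parts, one per load generator. It works on an in-memory list rather than on S3 objects, and the function name `split_records` and the sample `user-N` identifiers are assumptions made for illustration; the summary does not describe how Torque's Lambda function actually partitions the data.

```python
def split_records(records, parts):
    """Split records into `parts` disjoint, contiguous, near-equal slices."""
    if parts <= 0:
        raise ValueError("parts must be positive")
    base, extra = divmod(len(records), parts)
    slices = []
    start = 0
    for i in range(parts):
        # The first `extra` slices take one additional record each.
        size = base + (1 if i < extra else 0)
        slices.append(records[start:start + size])
        start += size
    return slices


# Hypothetical test data: ten user IDs shared among three load generators.
user_ids = [f"user-{n}" for n in range(10)]
slices = split_records(user_ids, 3)
```

With ten records and three parts, the slices hold 4, 3, and 3 records, and concatenating them gives back the original list, so every record is used exactly once.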

Impact

Torque let Dream11 run over 1,500 load test iterations and test 50 critical services at 5x their normal scale. During IPL peak traffic, when the system handled 80 million requests per minute, they saw zero scale-related issues in production.