Measuring Product Impact Without A/B Testing: How Discord Used the Synthetic Control Method for Voice Messages
Article Summary
Discord wanted to measure the impact of Voice Messages, but a user-level A/B test would have broken the feature: because of network effects, users in the control group could not receive voice messages sent by users in the treatment group.
Discord's data science team shares how they used the Synthetic Control Method to evaluate Voice Messages when traditional experimentation wasn't possible. The article walks through their methodology, from problem identification to implementation and results.
Key Takeaways
- Synthetic controls compare one treated country to a weighted mix of untreated countries
- Brazil was the treatment group; its synthetic control was 50% Argentina, 30% Uruguay, 20% Chile
- Method controls for observable and unobservable differences better than geo tests
- Results showed clear engagement increase after Voice Messages launched in Brazil
- Approach works when randomization is impossible or sacrifices too much precision
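To make the "weighted mix" idea concrete, here is a minimal sketch of how donor weights could be chosen: search over non-negative weights that sum to 1 and keep the combination whose pre-launch series best matches the treated country. Everything in it is an assumption for illustration: the engagement numbers are made up, the function names are hypothetical, and a coarse grid search stands in for the constrained optimizer a real synthetic control package would use.

```python
from itertools import product

def mspe(actual, predicted):
    """Mean squared prediction error between two equal-length series."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def synthetic_series(weights, donors):
    """Weighted average of the donor series, one value per time period."""
    periods = len(donors[0])
    return [sum(w * d[t] for w, d in zip(weights, donors)) for t in range(periods)]

def fit_weights_grid(treated_pre, donors_pre, steps=10):
    """Try every weight vector on a grid (non-negative, summing to 1)
    and return the one with the lowest pre-launch MSPE."""
    best_weights, best_error = None, float("inf")
    for combo in product(range(steps + 1), repeat=len(donors_pre) - 1):
        remainder = steps - sum(combo)
        if remainder < 0:
            continue  # weights would sum to more than 1
        weights = tuple(c / steps for c in combo + (remainder,))
        error = mspe(treated_pre, synthetic_series(weights, donors_pre))
        if error < best_error:
            best_weights, best_error = weights, error
    return best_weights, best_error

# Hypothetical pre-launch engagement series (illustrative numbers only).
argentina = [10.0, 12.0, 11.0, 13.0, 14.0, 13.0]
uruguay = [20.0, 19.0, 21.0, 22.0, 20.0, 23.0]
chile = [30.0, 33.0, 31.0, 35.0, 34.0, 36.0]
brazil_pre = [17.0, 18.3, 18.0, 20.1, 19.8, 20.6]

weights, error = fit_weights_grid(brazil_pre, [argentina, uruguay, chile])
print(weights, error)
```

The toy Brazil series was built as exactly 50% Argentina, 30% Uruguay, and 20% Chile, so the search recovers those weights. Real data will never fit this cleanly, which is why the quality of the pre-launch fit has to be checked.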
Discord measured the impact of Voice Messages using synthetic controls, showing that the method can work for features whose network effects break traditional A/B tests.
About This Article
Discord's testing platform didn't support cluster randomization. The team had to pick between running a flawed user-level A/B test or doing geo-testing, which mixes country differences with treatment effects and introduces omitted variable bias.
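The difference between user-level and cluster-level randomization can be sketched in a few lines. This is an illustration only, not Discord's platform: the user names, cluster labels, salt, and `assign_arm` helper are all hypothetical. The point is that hashing the cluster identifier, rather than the user identifier, puts everyone who talks to each other in the same arm.

```python
import hashlib

def assign_arm(unit_id: str, salt: str = "voice-messages-demo") -> str:
    """Deterministically map a randomization unit to an experiment arm."""
    digest = hashlib.sha256(f"{salt}:{unit_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 2 == 0 else "control"

# Hypothetical users and the cluster (e.g. friend group) each belongs to.
users = [
    ("alice", "cluster-1"),
    ("bob", "cluster-1"),
    ("carol", "cluster-2"),
    ("dave", "cluster-2"),
]

# User-level: each user is randomized independently, so friends can be split
# across arms and a treated sender may message a control recipient.
user_level = {user: assign_arm(user) for user, _ in users}

# Cluster-level: the cluster is the unit, so members always share an arm.
cluster_level = {user: assign_arm(cluster) for user, cluster in users}

print(user_level)
print(cluster_level)
```

With few clusters, cluster-level assignment leaves far fewer independent units to compare, which is part of the precision trade-off the article describes.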
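Alec Brevé and Angela Ambroz used the Synth package in R and the SyntheticControlMethods package in Python to build a weighted counterfactual. They checked how well it fit by comparing the Mean Squared Prediction Error (MSPE) before and after the feature rolled out: a low pre-launch MSPE means the synthetic control tracks Brazil closely, so a much larger post-launch MSPE points to a change that coincided with the launch.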
The synthetic control method showed a clear positive gap between Brazil's actual engagement and what its synthetic counterfactual predicted after Voice Messages launched. This gave the team confidence in the feature's value without the precision loss that comes with traditional geo-testing.
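As a rough sketch of the MSPE check and the post-launch gap described above, the snippet below computes both on made-up numbers. The series, the `launch_index`, and the helper name are assumptions for illustration and do not come from Discord's data.

```python
def mspe(actual, predicted):
    """Mean squared prediction error between two equal-length series."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical engagement series; the feature launches at index 4.
actual = [17.0, 18.0, 18.0, 20.0, 22.0, 23.0, 24.0]
synthetic = [17.0, 18.5, 18.0, 19.5, 20.0, 20.0, 21.0]
launch_index = 4

# Pre-launch MSPE: how closely the synthetic control tracks the treated unit.
pre_mspe = mspe(actual[:launch_index], synthetic[:launch_index])

# Post-launch MSPE: how far the two series drift apart after the launch.
post_mspe = mspe(actual[launch_index:], synthetic[launch_index:])

# A ratio well above 1 means the divergence is large relative to the fit error.
mspe_ratio = post_mspe / pre_mspe

# Average post-launch gap: a simple estimate of the effect size and its sign.
gaps = [a - s for a, s in zip(actual[launch_index:], synthetic[launch_index:])]
average_gap = sum(gaps) / len(gaps)

print(pre_mspe, post_mspe, mspe_ratio, average_gap)
```

MSPE squares the errors, so it says nothing about direction. The sign of the average gap is what indicates whether engagement ran above or below the counterfactual.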