How Pinterest Cut Android Testing CI Build Times by 36%+
Article Summary
Pinterest's Android CI builds were bleeding 9 minutes per run because one slow test shard held everything hostage. Their fix? Stop counting tests and start counting seconds.
Pinterest's Test Tools team rebuilt their entire Android E2E testing infrastructure from scratch. They ditched Firebase Test Lab for a custom solution called PinTestLab running on EC2 bare-metal instances, then tackled the real bottleneck: wildly unbalanced test shards that made developers wait for the slowest runner every single time.
Key Takeaways
- Cut total build time by 9 minutes (36% faster) using runtime-aware sharding
- Reduced slowest shard from 863s to 392s — a 55% improvement
- Built PinTestLab on c7i.metal-24xl bare-metal instances after x86 VMs proved too slow
- Used greedy LPT algorithm with historical data instead of equal test counts
- Narrowed the gap between the fastest and slowest shard from 597 seconds to 130 seconds
By switching from count-based to time-based test sharding with historical runtime data, Pinterest cut Android CI feedback time by 36% and more than halved the runtime of the slowest shard.
About This Article
Firebase Test Lab's setup took 5-6 minutes and ate up over half of each build's total time. On top of that, the infrastructure was unstable, causing 1-2 outages per week that lasted 3-4 hours and blocked all code merges.
Pinterest built PinTestLab with a greedy Longest Processing Time (LPT) algorithm that sorts tests by historical runtime, longest first, and assigns each one to the emulator with the smallest total assigned runtime so far, meaning the one expected to finish first. The system draws on Metro's historical test data to make these assignments.
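The greedy LPT idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Pinterest's implementation: the function name `lpt_shards`, the test names, and the runtimes are all hypothetical, and a real system would also need a fallback for tests with no recorded history.

```python
import heapq


def lpt_shards(runtimes, num_shards):
    """Greedy LPT: take tests longest-first and give each one to the
    shard with the smallest accumulated runtime so far.

    runtimes: dict mapping test name -> historical runtime in seconds
    Returns (shards, loads): the test names per shard and the total
    assigned seconds per shard.
    """
    # Min-heap of (assigned_seconds, shard_index); the root is always
    # the shard that would currently finish first.
    heap = [(0.0, i) for i in range(num_shards)]
    heapq.heapify(heap)
    shards = [[] for _ in range(num_shards)]

    # Sort by runtime descending; break ties by name so the result is
    # deterministic.
    ordered = sorted(runtimes.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, seconds in ordered:
        load, idx = heapq.heappop(heap)
        shards[idx].append(name)
        heapq.heappush(heap, (load + seconds, idx))

    loads = [0.0] * num_shards
    for load, idx in heap:
        loads[idx] = load
    return shards, loads


# Hypothetical historical runtimes, in seconds.
history = {
    "LoginTest": 120,
    "FeedTest": 80,
    "SearchTest": 70,
    "PinCreateTest": 60,
    "BoardTest": 40,
    "ProfileTest": 30,
}
shards, loads = lpt_shards(history, 3)
for tests, load in zip(shards, loads):
    print(f"{load:6.1f}s  {tests}")
```

With these made-up numbers the three shards end up at 150, 120, and 130 seconds. Splitting the same six tests two per shard in dictionary order would give 200, 130, and 70 seconds, which shows how equal counts can hide a large gap in runtime.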
Average shard time fell from 400.1 seconds to 303 seconds, a 24.3% improvement. The gap between the fastest and slowest emulators shrank from 597 seconds to 130 seconds, which made CI feedback more predictable.
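The reported percentages follow directly from the before and after figures. The short sketch below reproduces the arithmetic; the helper names `pct_improvement` and `shard_stats` are hypothetical and exist only for this illustration.

```python
def pct_improvement(before, after):
    """Relative reduction from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


def shard_stats(shard_seconds):
    """Summarize per-shard runtimes: average, slowest, and the gap
    between the slowest and fastest shard."""
    return {
        "average": sum(shard_seconds) / len(shard_seconds),
        "slowest": max(shard_seconds),
        "spread": max(shard_seconds) - min(shard_seconds),
    }


# Figures quoted in the article.
print(round(pct_improvement(400.1, 303), 1))  # average shard time
print(round(pct_improvement(863, 392), 1))    # slowest shard
```

The first value rounds to 24.3% and the second to 54.6%, which the article reports as 55%. Because a build finishes only when its slowest shard does, the slowest-shard figure tracks developer wait time more closely than the average.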