Pinterest George Kandalaft Nov 10, 2025

How Pinterest Cut Android Testing CI Build Times by 36%+

Article Summary

Pinterest's Android CI builds were bleeding 9 minutes per run because one slow test shard held everything hostage. Their fix? Stop counting tests and start counting seconds.

Pinterest's Test Tools team rebuilt their entire Android E2E testing infrastructure from scratch. They ditched Firebase Test Lab for a custom solution called PinTestLab running on EC2 bare-metal instances, then tackled the real bottleneck: wildly unbalanced test shards that made developers wait for the slowest runner every single time.

Key Takeaways

Critical Insight

By switching from count-based to time-based test sharding with historical runtime data, Pinterest cut Android CI feedback time by 36% and nearly eliminated tail latency.

They also explored an on-demand, SQS-based sharding approach that could dynamically rebalance work mid-run, though they have not shipped it yet.
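To make the on-demand idea concrete, here is a minimal sketch of pull-based sharding, in which each worker takes the next test from a shared queue as soon as it is free. It is an illustration only: Python's in-process `queue.Queue` stands in for SQS, and `run_on_demand`, `run_test`, and the test names are hypothetical, not taken from Pinterest's implementation.

```python
import queue
import threading
from typing import Callable, Dict, List


def run_on_demand(
    tests: List[str],
    num_workers: int,
    run_test: Callable[[str], None],
) -> Dict[int, List[str]]:
    """Let each worker pull the next test whenever it becomes free."""
    # In-process stand-in for a shared message queue such as SQS (assumption).
    work: "queue.Queue[str]" = queue.Queue()
    for test in tests:
        work.put(test)

    # Each worker appends only to its own list, so no extra locking is needed.
    executed: Dict[int, List[str]] = {w: [] for w in range(num_workers)}

    def worker(worker_id: int) -> None:
        while True:
            try:
                test = work.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            run_test(test)
            executed[worker_id].append(test)

    threads = [
        threading.Thread(target=worker, args=(w,)) for w in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return executed


# Hypothetical test names; the no-op callable stands in for a real test run.
demo_tests = ["LoginTest", "FeedScrollTest", "PinCreateTest", "SearchTest"]
demo_result = run_on_demand(demo_tests, num_workers=2, run_test=lambda name: None)
```

Because assignment happens at run time, a worker stuck on a slow test simply pulls fewer tests, so the balance does not depend on how accurate the historical estimates are. The cost is the extra queue infrastructure and coordination during the run.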

About This Article

Problem

Firebase Test Lab's setup took 5-6 minutes and ate up over half of each build's total time. On top of that, the infrastructure was unstable, causing 1-2 outages per week that lasted 3-4 hours and blocked all code merges.

Solution

Pinterest built PinTestLab with a greedy Longest Processing Time (LPT) algorithm that sorts tests by historical runtime, longest first, and assigns each one to whichever emulator currently has the least accumulated work and would therefore finish first. The system taps into Metro's rich historical test data to make these assignments.
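The greedy assignment described above can be sketched in a few lines. This is a generic LPT illustration, not Pinterest's code: the function names, the test names, and the runtimes in `history` are all made up for the example.

```python
import heapq
from typing import Dict, List, Tuple


def lpt_shard(runtimes: Dict[str, float], num_shards: int) -> List[List[str]]:
    """Greedy LPT: take tests longest first, give each to the least-loaded shard."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")

    # Min-heap of (accumulated seconds, shard index); the top entry is the
    # shard that would currently finish first.
    heap: List[Tuple[float, int]] = [(0.0, i) for i in range(num_shards)]
    heapq.heapify(heap)
    shards: List[List[str]] = [[] for _ in range(num_shards)]

    # Sort by descending runtime; break ties by name so the plan is deterministic.
    ordered = sorted(runtimes.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, seconds in ordered:
        load, idx = heapq.heappop(heap)
        shards[idx].append(name)
        heapq.heappush(heap, (load + seconds, idx))
    return shards


def shard_loads(shards: List[List[str]], runtimes: Dict[str, float]) -> List[float]:
    """Total expected seconds per shard."""
    return [sum(runtimes[test] for test in shard) for shard in shards]


# Hypothetical historical runtimes in seconds (not real Pinterest data).
history = {
    "LoginTest": 120.0,
    "FeedScrollTest": 90.0,
    "PinCreateTest": 60.0,
    "SearchTest": 45.0,
    "ProfileTest": 30.0,
    "SettingsTest": 15.0,
}
plan = lpt_shard(history, num_shards=2)
loads = shard_loads(plan, history)  # [180.0, 180.0]
```

In this toy input both shards end up with 180 seconds of work even though the individual tests range from 15 to 120 seconds. A count-based split would give each shard three tests regardless of how long they run.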

Impact

Average shard time fell from 400.1 seconds to 303 seconds, a 24.3% improvement. The gap between the fastest and slowest emulators shrank from 597 seconds to 130 seconds, which made CI feedback more predictable.
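As a quick check of the arithmetic, the percentages follow directly from the reported before and after values. The helper name `pct_reduction` is just for illustration, and the figure for the fastest-to-slowest gap is derived here rather than quoted from the article.

```python
def pct_reduction(before: float, after: float) -> float:
    """Relative reduction from before to after, as a percentage."""
    return (before - after) / before * 100


# Values reported in the summary above, in seconds.
avg_drop = round(pct_reduction(400.1, 303.0), 1)  # 24.3 (average shard time)
gap_drop = round(pct_reduction(597.0, 130.0), 1)  # 78.2 (fastest-to-slowest gap)
```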