Better Android Testing at Airbnb (Part 6)
Article Summary
Airbnb's Android tests were failing randomly. The culprit? Everything from cached drawables to delayed runnables creating unpredictable test behavior.
In Part 6 of their testing series, Airbnb's Eli Hart reveals the hidden sources of test flakiness that plague screenshot and interaction testing. With tests running in unpredictable order via Flank, even small state leaks compound into major reliability issues.
Key Takeaways
- Forcing the drawable cache to clear after each screenshot eliminates pixel-variation flakiness
- Custom wrapper functions around postDelayed and other async code enable deterministic test execution
- Mocked date framework ensures JodaTime calls return consistent values across test runs
- Disabled RecyclerView prefetching prevents non-deterministic view layout during screenshots
- Centralized ImageView architecture allows synchronous local asset injection instead of network loads
Airbnb achieved reliable Android testing by systematically eliminating flakiness sources at the framework level, from shared preferences to WebView mocking.
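As an illustration of the wrapper idea for delayed runnables, here is a minimal plain-Java sketch. It is not Airbnb's code: the names `DelayedExecutor` and `TestDelayedExecutor` are hypothetical, and the assumption is that app code calls the wrapper instead of posting delayed work to a `Handler` directly, so a test can swap in an implementation that never waits on a real clock.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical wrapper that app code would call instead of Handler.postDelayed. */
interface DelayedExecutor {
    void postDelayed(Runnable task, long delayMillis);
}

/**
 * Test implementation (an assumption for illustration): tasks are queued and
 * only run when the test asks, ordered by delay, with no real waiting.
 * A production implementation would delegate to an Android Handler.
 */
final class TestDelayedExecutor implements DelayedExecutor {
    private static final class Entry {
        final Runnable task;
        final long delayMillis;

        Entry(Runnable task, long delayMillis) {
            this.task = task;
            this.delayMillis = delayMillis;
        }
    }

    private final List<Entry> pending = new ArrayList<>();

    @Override
    public void postDelayed(Runnable task, long delayMillis) {
        pending.add(new Entry(task, delayMillis));
    }

    /** Runs every queued task, including tasks that are queued while draining. */
    public void runAll() {
        while (!pending.isEmpty()) {
            List<Entry> batch = new ArrayList<>(pending);
            pending.clear();
            // List.sort is stable, so tasks with equal delays keep posting order.
            batch.sort(Comparator.comparingLong((Entry e) -> e.delayMillis));
            for (Entry entry : batch) {
                entry.task.run();
            }
        }
    }

    public int pendingCount() {
        return pending.size();
    }
}
```

In a test, the harness would call `runAll()` before taking a screenshot, so the UI state no longer depends on how much wall-clock time has passed.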
About This Article
Airbnb's test framework couldn't use Android Test Orchestrator because it made tests run seven times slower. Engineers had to manually clear shared state between tests and deal with memory leaks that appeared randomly across test shards.
Eli Hart's team built centralized wrapper APIs for async operations, such as an execute function for RxJava requests, and injected test coroutine scopes. This lets the framework either block on asynchronous code or detect, in a predictable way, when it has finished.
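The value of a centralized async wrapper can be sketched in plain Java. This is a simplified stand-in using `CompletableFuture`, not the RxJava or coroutine code the article describes, and `TrackedAsync`, `execute`, and `awaitIdle` are hypothetical names. The point is that when every async operation goes through one entry point, a test harness can ask whether anything is still in flight.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical single entry point for async work, so tests can detect idleness. */
final class TrackedAsync {
    private final Executor executor;
    private final List<CompletableFuture<?>> inFlight = new ArrayList<>();

    TrackedAsync(Executor executor) {
        this.executor = executor;
    }

    /** Starts work on the executor and remembers it until it completes. */
    <T> CompletableFuture<T> execute(Supplier<T> work) {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(work, executor);
        synchronized (inFlight) {
            inFlight.add(future);
        }
        return future;
    }

    /** Blocks until no tracked work remains; returns false on timeout or interrupt. */
    boolean awaitIdle(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            List<CompletableFuture<?>> snapshot;
            synchronized (inFlight) {
                inFlight.removeIf(CompletableFuture::isDone);
                if (inFlight.isEmpty()) {
                    return true;
                }
                snapshot = new ArrayList<>(inFlight);
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            try {
                CompletableFuture.allOf(snapshot.toArray(new CompletableFuture<?>[0]))
                        .get(remainingNanos, TimeUnit.NANOSECONDS);
            } catch (ExecutionException e) {
                // A failed operation still counts as finished; loop to re-check the rest.
            } catch (TimeoutException e) {
                return false;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A screenshot or interaction test would call `awaitIdle` before asserting on the screen. Code that bypasses the wrapper stays invisible to the tracker, which is why the wrapper needs to be the only path for async work.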
Test shards now run in three minutes or less, and LeakCanary leak detection flags memory leaks before they accumulate. This stopped the out-of-memory errors that used to crash test processes after many tests had run in a row.