Instagram May 2, 2023

Building an Open Source Carefree Android Disk Cache

Article Summary

Instagram's disk cache was causing more crashes than any other component in its Android app. The culprit? A well-intentioned open source library that made error handling a developer nightmare.

Instagram's engineering team rebuilt its Android disk caching system from scratch after DiskLruCache's complex exception handling led to frequent NullPointerExceptions (NPEs) and IOExceptions. Engineer Jimmy Zhang shares how the team designed IgDiskCache to be truly "carefree" for developers.

Key Takeaways

Critical Insight

By rethinking cache design around the principle that a cache can always say "I don't have this," Instagram sharply reduced what had been its top crash category and made disk caching simple.

The new cache's built-in checks actually helped Instagram discover hidden race conditions that were nearly impossible to detect before.

About This Article

Problem

DiskLruCache forced developers to manually handle Editor and OutputStream objects through nested try-catch blocks. When cleanup went wrong, incomplete cached files would corrupt the entire cache.
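
To make the cleanup burden concrete, here is a minimal sketch of the kind of nesting that paragraph describes. The Editor interface below is a hypothetical stand-in written for this illustration, not the real DiskLruCache class, and the method names are assumptions.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Illustration only: a hand-written stand-in for an editor-style cache API. */
final class EditorPatternSketch {

    /** Hypothetical editor handle; every step of a write can fail. */
    interface Editor {
        OutputStream newOutputStream() throws IOException;
        void commit() throws IOException;
        void abort();
    }

    /** Writes data through the editor and returns true only if the commit succeeded. */
    static boolean write(Editor editor, byte[] data) {
        if (editor == null) {
            return false; // forgetting this null check is one way to hit an NPE
        }
        OutputStream out = null;
        boolean committed = false;
        try {
            out = editor.newOutputStream();
            out.write(data);
            out.close();
            out = null; // closed successfully, nothing left to clean up
            editor.commit();
            committed = true;
        } catch (IOException e) {
            // Swallow the error; cleanup happens in the finally block.
        } finally {
            if (out != null) {
                try {
                    out.close();
                } catch (IOException ignored) {
                    // A second failure while closing; nothing more to do.
                }
            }
            if (!committed) {
                editor.abort(); // skipping this would leave a partial entry behind
            }
        }
        return committed;
    }
}
```

Every caller has to repeat this shape correctly, which is where a missed abort or null check can slip in.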

Solution

Instagram built IgDiskCache with a simpler architecture. They removed the Editor and Snapshot layers and added OptionalStream to handle IOExceptions automatically. This meant developers no longer had to manage resources manually.
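
The "carefree" idea can be sketched in plain Java as follows. This is not the IgDiskCache API: the class name, the use of java.util.Optional in place of OptionalStream, and the temp-file-then-rename step are all assumptions made for illustration. Keys are assumed to be simple file names.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

/** Hypothetical sketch of a cache that reports failures as misses instead of throwing. */
final class CarefreeCacheSketch {
    private final File dir;

    CarefreeCacheSketch(File dir) {
        this.dir = dir;
        dir.mkdirs(); // if this fails, later reads and writes simply report a miss
    }

    /** Returns the cached bytes, or empty on a miss or any read failure. */
    Optional<byte[]> get(String key) {
        try {
            return Optional.of(Files.readAllBytes(new File(dir, key).toPath()));
        } catch (IOException | RuntimeException e) {
            return Optional.empty(); // "I don't have this"
        }
    }

    /** Stores the bytes; returns false instead of throwing when the write fails. */
    boolean put(String key, byte[] data) {
        File tmp = new File(dir, key + ".tmp");
        File entry = new File(dir, key);
        try {
            // Write to a temporary file first so a half-written entry is never visible under the key.
            Files.write(tmp.toPath(), data);
            Files.move(tmp.toPath(), entry.toPath(), StandardCopyOption.REPLACE_EXISTING);
            return true;
        } catch (IOException | RuntimeException e) {
            tmp.delete(); // best-effort cleanup of the partial file
            return false;
        }
    }
}
```

A caller only checks whether a value is present and otherwise falls back to the original data source; no try-catch or stream cleanup is needed at the call site.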

Impact

After IgDiskCache launched in production, Instagram saw cache-related crashes drop significantly. These crashes had been at the top of their crash list for over a year. The built-in thread checking also stopped inefficient disk IO from running on the main thread.
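
A thread check of the kind mentioned above might look like the following sketch. The class and method names are hypothetical, and the UI thread is passed in explicitly so the example runs outside Android; on Android the comparison would typically be made against the main Looper's thread.

```java
/** Hypothetical sketch of a guard that rejects disk IO on the UI thread. */
final class ThreadCheckSketch {
    private final Thread uiThread;

    ThreadCheckSketch(Thread uiThread) {
        this.uiThread = uiThread;
    }

    /** Throws if the caller is running on the thread registered as the UI thread. */
    void assertOffUiThread() {
        if (Thread.currentThread() == uiThread) {
            throw new IllegalStateException("disk IO attempted on the UI thread");
        }
    }

    /** Stand-in for a cache read; the guard runs before any disk work would start. */
    String load(String key) {
        assertOffUiThread();
        return "value-for-" + key; // placeholder result, no real disk access here
    }
}
```

Failing fast like this turns a silent performance problem into an error that shows up during development.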