Solving Native Memory Leaks in Mobile Apps
Article Summary
Sid Rathi from Expedia Group debugged a production nightmare: KStream apps dying in an endless OOM loop. The culprit? A single unclosed iterator eating native memory.
Expedia engineers faced a vicious cycle where Kafka Streams applications kept crashing from out-of-memory errors. New instances would spin up, rebalance, then fail again within minutes. The leak wasn't in heap memory, making it exceptionally hard to diagnose.
Key Takeaways
- Scaling horizontally made the problem worse: memory jumped from 5 GB to 10 GB in 50 minutes
- RocksDB state store iterators were never closed, leaking native memory via OS page cache
- The fix was simple: wrap the iterator in a try-with-resources block so it auto-closes after use
- Docker memory usage stabilized completely after deploying the one-line fix
An unclosed RocksDB iterator caused unbounded native memory growth that horizontal scaling couldn't solve; proper resource management with try-with-resources fixed it permanently.
About This Article
Sid Rathi's KStream application kept crashing with OOM errors in production. Docker host memory was growing over time even though the JVM heap stayed constant, which pointed to a native memory leak outside the heap.
Rathi used YourKit profiler to compare memory snapshots taken at different times. He found that RocksDB state store iterators were never being closed. This caused native memory to grow without limit as the OS page cache buffered compressed data blocks.
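The leaky pattern can be sketched with a simplified stand-in for a store iterator. Kafka Streams' `KeyValueIterator` is both an `Iterator` and `AutoCloseable`; the `NativeBackedIterator` class and its `isClosed` flag below are illustrative assumptions, not Expedia's actual code:

```java
import java.util.Iterator;
import java.util.List;

// Simplified stand-in for a RocksDB-backed store iterator: it is both an
// Iterator and AutoCloseable, and it holds native resources until close().
class NativeBackedIterator implements Iterator<String>, AutoCloseable {
    private final Iterator<String> delegate;
    private boolean closed = false;

    NativeBackedIterator(List<String> data) { this.delegate = data.iterator(); }

    @Override public boolean hasNext() { return delegate.hasNext(); }
    @Override public String next()     { return delegate.next(); }
    @Override public void close()      { closed = true; } // would free native memory
    public boolean isClosed()          { return closed; }
}

public class LeakDemo {
    public static void main(String[] args) {
        // Anti-pattern: open the store iterator, consume it, never close it.
        NativeBackedIterator it = new NativeBackedIterator(List.of("a", "b", "c"));
        while (it.hasNext()) {
            it.next(); // process each key-value pair
        }
        // close() was never called, so the native resources behind the
        // iterator stay pinned; with RocksDB this accumulates without bound.
        System.out.println("iterator closed? " + it.isClosed()); // prints false
    }
}
```

Because the leak lives behind the JVM heap, a heap dump looks healthy; only the process's resident memory keeps growing, which matches what the profiling snapshots revealed.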
After the iterator was wrapped in a try-with-resources block to auto-close it, Docker memory usage stayed flat over time. The application stopped losing containers to OOM errors, which ended the rebalancing cycle that had been causing problems.
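The fix can be sketched the same way: try-with-resources guarantees that `close()` runs when the block exits, even if iteration throws. `CloseableIterator` below is a hypothetical stand-in for Kafka's `KeyValueIterator`, not the actual Expedia code:

```java
import java.util.Iterator;
import java.util.List;

// Simplified stand-in for a RocksDB-backed iterator that must be closed
// to release native memory (Kafka's KeyValueIterator is AutoCloseable too).
class CloseableIterator implements Iterator<String>, AutoCloseable {
    private final Iterator<String> delegate;
    private boolean closed = false;

    CloseableIterator(List<String> data) { this.delegate = data.iterator(); }

    @Override public boolean hasNext() { return delegate.hasNext(); }
    @Override public String next()     { return delegate.next(); }
    @Override public void close()      { closed = true; } // frees native resources
    public boolean isClosed()          { return closed; }
}

public class FixDemo {
    public static void main(String[] args) {
        CloseableIterator captured;
        // The fix: try-with-resources auto-closes the iterator on exit,
        // even if the loop body throws, so native memory is always released.
        try (CloseableIterator it = new CloseableIterator(List.of("a", "b"))) {
            captured = it;
            while (it.hasNext()) {
                it.next(); // process each key-value pair
            }
        }
        System.out.println("iterator closed? " + captured.isClosed()); // prints true
    }
}
```

This is why the article calls it a one-line fix: the iteration logic is unchanged, only the acquisition of the iterator moves into the try-with-resources header.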