LinkedIn Swapnil Ghike Apr 8, 2014

Garbage Collection Optimization for High-Throughput and Low-Latency Java Applications

Article Summary

LinkedIn Engineering cut their tail latency by 75% through systematic garbage collection tuning. Here's their playbook for high-performance Java apps.

Swapnil Ghike shares how LinkedIn's team optimized GC settings for their next-generation feed data platform serving thousands of requests per second. The article breaks down their methodical approach from baseline measurements to production-ready configuration.

Key Takeaways

Critical Insight

LinkedIn achieved 40-60ms GC pauses every 3 seconds and 60ms p99.9 latency through data-driven tuning of ParNew/CMS settings.

The article reveals why their team deliberately avoided mlock despite its performance benefits.

About This Article

Problem

LinkedIn's feed platform had a garbage collection problem. Young generation pauses were hitting 80ms, and old generation collections were triggering unpredictably. The root cause was long-lived cached objects accumulating in a 32GB heap, with the CMS initiation threshold set at 70% of old generation occupancy.
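The baseline described above maps to a JVM configuration along these lines. This is a hedged reconstruction: the summary names only the 32GB heap, the ParNew/CMS collector pair, and the 70% CMS threshold; the jar name is a placeholder.

```shell
# Hypothetical baseline flags reconstructing the setup described above; the
# summary names only the 32GB fixed heap and the 70% CMS initiation threshold.
java -Xms32g -Xmx32g \
     -XX:+UseParNewGC -XX:+UseConcMarkSweepGC \
     -XX:CMSInitiatingOccupancyFraction=70 \
     -XX:+UseCMSInitiatingOccupancyOnly \
     -jar feed-service.jar   # placeholder application jar
```

`-XX:+UseCMSInitiatingOccupancyOnly` pins CMS to the 70% threshold rather than letting the JVM adjust it adaptively, which matches the fixed-threshold behavior the summary describes.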

Solution

Swapnil Ghike's team dug into verbose GC logs using Naarad and gclogviewer to spot patterns. They grew the heap to 40GB and tuned card table scanning by setting ParGCCardsPerStrideChunk to 32768, which improved how evenly card-scanning work was distributed across GC worker threads.
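A sketch of the tuned configuration implied by the changes above. The logging path and jar name are placeholders; note that ParGCCardsPerStrideChunk is a diagnostic HotSpot flag and must be unlocked with -XX:+UnlockDiagnosticVMOptions before it can be set.

```shell
# Hypothetical tuned flags matching the changes described above: heap grown
# to 40GB, and card-table scanning widened to 32768 cards per stride chunk.
java -Xms40g -Xmx40g \
     -XX:+UseParNewGC -XX:+UseConcMarkSweepGC \
     -XX:+UnlockDiagnosticVMOptions \
     -XX:ParGCCardsPerStrideChunk=32768 \
     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -Xloggc:/var/log/app/gc.log \
     -jar feed-service.jar   # placeholder application jar
```

The GC log written via -Xloggc is what tools like Naarad and gclogviewer consume, so the same flags that apply the tuning also produce the data needed to verify it.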

Impact

The new settings cut young generation pauses to 40-60ms, occurring roughly every three seconds instead of constantly. Old generation collections dropped to about once per hour. The platform could now serve thousands of requests per second while staying within its latency targets.