Performance Bottlenecks in Go Apps: Summary & Key Takeaways

Article Summary

Grab's engineering team discovered their Go apps were mysteriously throttling at 1.94 CPU cores but flying at 2 cores. The culprit? A sneaky interaction between Kubernetes VPA and GOMAXPROCS.

Grab's real-time data platform team (Coban) runs stream processing pipelines on Kubernetes with vertical pod autoscaling. While debugging consumer lag issues on their SinktoS3 pipeline, they uncovered a critical performance trap affecting Go applications.

Key Takeaways

GOMAXPROCS is an integer, so a fractional CPU allocation is rounded down: 1.94 cores yields just 1 core of usable parallelism
VPA scaled the pods down from 2.5 to 1.94 cores, causing a 50% drop in CPU usage and massive consumer lag
Raising the minimum to 2 cores restored 95% CPU utilization and cleared the backlog immediately
VPA v0.13 now supports integer-only CPU recommendations for Kubernetes 1.25+
Autoscaling can't solve everything: manual intervention is still needed for optimal Go performance

Critical Insight

A 0.06-core difference (1.94 vs. 2.0) caused catastrophic performance degradation because GOMAXPROCS only takes integer values, so the fractional part of an allocation is lost.

The article includes detailed tables showing exactly when VPA recommendations will throttle your Go apps and when they'll thrive.

Related Articles

How We Optimized Concurrency Using Node.js at Skeelo

Skeelo tunes Node.js concurrency to keep their app humming along.

Skeelo • Sep 19, 2023

Why xHE-AAC is being embraced at Meta

Meta adopts xHE-AAC to pump up audio quality across their apps.

Meta • Apr 11, 2023

6 Lessons Learned from Optimizing the Performance of a Node.js Service

Klarna shares six big lessons from speeding up their Node.js service.

Klarna • Nov 15, 2022

How Removing Caching Improved Mobile Performance by 25%

Klarna ditched caching and somehow boosted mobile speed by 25%.

Klarna • Jul 19, 2022
