Grab Apr 7, 2022

Profile-Guided Optimisation

Article Summary

Grab's engineering team just unlocked 30% memory savings and 38% storage reduction with a simple compiler flag. Here's how Profile-Guided Optimization (PGO) delivered massive gains with minimal code changes.

The Grab AI Platform team experimented with PGO, a technique supported since Go 1.20 in which CPU profiles collected from production guide the compiler's optimization decisions. They tested it across multiple services, including their open-source TalariaDB time-series database, to measure real-world impact.

Key Takeaways

Critical Insight

PGO delivered 10-30% resource savings on high-throughput services with just a compiler flag change, but the gains depend heavily on service characteristics and profiling approach.

The team discovered why their first PGO attempt actually increased CPU usage, and what monorepo teams need before they can adopt this optimization.

About This Article

Problem

Grab's engineering team wanted to squeeze more performance out of their Go applications beyond what the compiler already provided. They knew profile-guided optimization could help, but needed a way to figure out which services would actually benefit before investing engineering time in the effort.

Solution

The team set up PGO by collecting 6-minute CPU profiles from production services using pprof, then rebuilt their Docker images with the `-pgo=./talaria.pgo` flag passed to `go build`. This approach let them apply profiles to both main binaries and Go plugins in services like TalariaDB.

Impact

TalariaDB saw real improvements, but applying the same approach to Catwalk yielded only a 5% CPU gain: PGO's impact varies with how each service is built and what it spends its cycles on. Their monorepo setup also needed additional build-pipeline support before PGO could be adopted across the board.
