Accelerating on-device ML on Meta’s apps with ExecuTorch
Article Summary
Meta just shared how it migrated the on-device ML workloads serving billions of daily users to a new framework, reporting substantial performance gains.
Meta's PyTorch Edge team rolled out ExecuTorch across Instagram, WhatsApp, Messenger, and Facebook over the past year. This open-source framework replaces the company's previous mobile ML stack, running AI models directly on users' devices rather than on servers.
Key Takeaways
- Instagram Cutouts runs significantly faster with ExecuTorch, boosting daily active users
- WhatsApp slashed model load time and inference time for bandwidth estimation
- Messenger moved server models on-device to enable end-to-end encryption
- Facebook's SceneX model shows performance gains across all device tiers
- Built with Arm, Apple, and Qualcomm for cross-platform compatibility
Critical Insight
ExecuTorch delivered faster inference, lower latency, and stronger privacy across Meta's apps, while enabling features, such as E2EE in Messenger, that weren't previously feasible with server-side models.