Bringing Native AI to Mobile Apps with ExecuTorch: Part 1 (iOS)
Article Summary
Jakub Chmura from Software Mansion shows how to run AI models directly on iOS devices without a single API call. No cloud dependency, no latency, just native performance.
This tutorial walks through implementing ExecuTorch, PyTorch's framework for on-device AI, in iOS apps. Software Mansion demonstrates the complete workflow from model export to running style transfer models using Apple's Neural Engine via CoreML backend.
Key Takeaways
- ExecuTorch exports PyTorch models to .pte executables for mobile deployment
- CoreML backend leverages the Apple Neural Engine for high-performance inference
- Implementation requires Objective-C++ with Module class and tensor preprocessing
- Style transfer model runs at 640x640 resolution entirely on-device
- Part 1 of a series covering iOS; Android implementation coming next
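The tensor preprocessing mentioned above can be sketched in plain Python with numpy. The function name, the [0, 1] scaling, and the NCHW layout are illustrative assumptions (the actual model may expect different normalization), not ExecuTorch API:

```python
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    """Turn a 640x640 RGB image (HWC, uint8) into a 1x3x640x640 float tensor."""
    x = image.astype(np.float32) / 255.0  # scale pixel values to [0, 1] -- assumed normalization
    x = np.transpose(x, (2, 0, 1))        # HWC -> CHW, the layout most vision models expect
    return x[np.newaxis, ...]             # add a batch dimension

# Stand-in for a camera frame already resized to the model's 640x640 input.
img = np.zeros((640, 640, 3), dtype=np.uint8)
out = preprocess(img)
print(out.shape)  # (1, 3, 640, 640)
```

On iOS the equivalent conversion happens in Objective-C++ before the tensor is handed to the Module class, but the arithmetic is the same.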
Developers can now run sophisticated AI models like style transfer completely on-device using ExecuTorch, eliminating API costs and latency while leveraging native hardware acceleration.
About This Article
Integrating AI models into iOS apps used to be a struggle. The process required building the ExecuTorch libraries from source, linking multiple frameworks such as CoreML.framework and Accelerate.framework, and setting linker flags such as -all_load before anything worked properly.
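For reference, the manual setup described above corresponds roughly to an .xcconfig fragment like the one below. This is a sketch of the old workflow, not a required configuration, and the exact flags depend on the ExecuTorch version:

```
// Sketch of the manual linker setup (made unnecessary by the
// pre-built xcframeworks). OTHER_LDFLAGS is the build setting
// behind Xcode's "Other Linker Flags".
OTHER_LDFLAGS = -all_load -framework CoreML -framework Accelerate
```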
Jakub Chmura's team created pre-built coreml_backend.xcframework and executorch.xcframework bundles. They also documented the full Xcode project setup, so developers no longer need to compile ExecuTorch from source themselves.
Developers can now deploy style transfer models at 640x640 resolution directly on-device using Apple's Neural Engine. This cuts down latency and API costs while keeping performance high through CoreML's optimized backend.