Google Aug 22, 2025

Gemini Nano with On-Device ML Kit and GenAI APIs

Article Summary

Caren Chang, Joanna Huang, and Chengji Yan from Google explain how Gemini Nano v3 reaches 940 tokens/second while keeping quality consistent across devices. The approach relies on LoRA adapters and rigorous evals behind the scenes.

Google has launched Gemini Nano v3 on Pixel 10 devices, accessible through ML Kit GenAI APIs. The team explains how it keeps quality consistent as the underlying model is upgraded: it runs evaluation pipelines across languages and trains feature-specific LoRA adapters on top of the base model.
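To make the adapter idea concrete, here is a minimal sketch of how a LoRA adapter sits on top of a frozen base weight. It is not Google's implementation: the feature names, matrix sizes, rank, and scaling values are all hypothetical, and plain Python lists stand in for real tensors. The point it illustrates is that each feature gets its own small low-rank update while the base model weights stay untouched.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

class LoRAAdapter:
    """Low-rank update delta_W = (alpha / rank) * B @ A for one feature."""

    def __init__(self, d_out, d_in, rank, alpha, seed):
        rng = random.Random(seed)
        # A starts with small random values, B starts at zero, so a fresh
        # adapter leaves the base model's output unchanged.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def delta(self, x):
        """Contribution of the adapter for input vector x."""
        return [self.scale * y for y in matvec(self.B, matvec(self.A, x))]

def forward(base_weight, adapter, x):
    """Frozen base projection plus an optional feature-specific adapter."""
    base = matvec(base_weight, x)
    if adapter is None:
        return base
    return [b + d for b, d in zip(base, adapter.delta(x))]

# Shared, frozen base weight (hypothetical 2x3 layer).
base_weight = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]

# One adapter per feature; the feature names are illustrative assumptions.
adapters = {
    "summarization": LoRAAdapter(d_out=2, d_in=3, rank=1, alpha=2.0, seed=0),
    "proofreading": LoRAAdapter(d_out=2, d_in=3, rank=1, alpha=2.0, seed=1),
}

x = [1.0, 2.0, 3.0]
base_out = forward(base_weight, None, x)
untrained_out = forward(base_weight, adapters["summarization"], x)

# Stand-in for training: only the adapter's B matrix changes.
adapters["summarization"].B = [[0.5], [-0.5]]
tuned_out = forward(base_weight, adapters["summarization"], x)
```

Because only the small `A` and `B` matrices are trained per feature, a new base model can ship with retrained adapters while each feature's behavior is tuned and evaluated separately.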

Key Takeaways

Critical Insight

Google's GenAI APIs now deliver 84% faster prefix processing with Gemini Nano v3, and adapter training is used so that developers get consistent results across model upgrades.

The article reveals specific benchmarking data comparing nano-v2 and nano-v3 performance that shows where the real speed gains are coming from.
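As a rough illustration of what "84% faster" means in practice, the short calculation below assumes that the 940 tokens/second figure is the nano-v3 prefix speed and that "84% faster" means the new throughput is 1.84 times the old one. Both are assumptions; this summary does not state the nano-v2 figure, and the 1000-token prompt is a hypothetical example.

```python
def implied_baseline(new_rate, percent_faster):
    """Old throughput implied by a new rate that is `percent_faster` percent higher."""
    return new_rate / (1.0 + percent_faster / 100.0)

def seconds_for_prompt(prompt_tokens, tokens_per_second):
    """Time to process a prompt prefix at a given throughput."""
    return prompt_tokens / tokens_per_second

v3_rate = 940.0                              # tokens/second, from the summary
v2_rate = implied_baseline(v3_rate, 84.0)    # assumed interpretation of "84% faster"

prompt_tokens = 1000                         # hypothetical prompt length
t_v3 = seconds_for_prompt(prompt_tokens, v3_rate)
t_v2 = seconds_for_prompt(prompt_tokens, v2_rate)

# Throughput rises by 84%, but the wait time falls by a smaller percentage.
time_saved_pct = (1.0 - t_v3 / t_v2) * 100.0

print(f"implied nano-v2 rate: {v2_rate:.0f} tokens/s")
print(f"prefix time: {t_v2:.2f} s -> {t_v3:.2f} s ({time_saved_pct:.1f}% less waiting)")
```

Under these assumptions, an 84% throughput gain corresponds to a bit under half the prefix processing time, which is why throughput and latency improvements should not be quoted interchangeably.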
