We recently launched Gemma 4, our most successful open models to date. Since then, they’ve been downloaded more than 150 million times, and we’ve been expanding the family’s capabilities. We introduced Multi-Token Prediction (MTP) to speed up inference, and recently released the 12B Unified model and Quantization-Aware Training (QAT) checkpoints. Released under an Apache 2.0 license, Gemma 4 gives developers and organizations the flexibility to fine-tune and deploy models across a variety of environments, from edge devices to local workstations.
Many developers are sharing what they’ve created with Gemma 4, showcasing how the models’ capabilities translate into real-world applications. Here are three highlights of what people and companies are building.
Build low-latency, on-device apps.
The team at the app-building company HubX used Gemma 4 to build BetterSpeak, an offline AI English tutoring platform. BetterSpeak uses the edge-optimized Gemma 4 E2B (effective 2B parameters) model as the reasoning engine for its on-device pipeline, enabling private, low-latency tutoring without the need for an internet connection.
To overcome mobile hardware constraints, HubX deployed the 4-bit quantized version of the model released by Google. This version handles tasks like grammar explanations and progress tracking across multiple languages. By leveraging Gemma 4’s native audio input capabilities, the app supports direct speech-to-speech learning, reducing costs while ensuring user privacy by processing all voice and text data entirely on-device.
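To give a rough sense of why 4-bit quantization matters on mobile hardware, here is a minimal back-of-the-envelope sketch in Python. It is not HubX's or Google's method: the function name `estimate_weight_bytes` is hypothetical, and the 2-billion-parameter figure is an illustrative assumption taken from the "effective 2B" label, which may not match the number of weights actually stored in the checkpoint. The estimate also ignores quantization scales, activations, and the KV cache.

```python
def estimate_weight_bytes(num_params: int, bits_per_weight: int) -> int:
    """Return the bytes needed to store num_params weights at a given bit width.

    Hypothetical helper for illustration only. It counts raw weight storage
    and ignores quantization metadata, activations, and runtime overhead.
    """
    if num_params < 0 or bits_per_weight <= 0:
        raise ValueError("num_params must be >= 0 and bits_per_weight must be > 0")
    total_bits = num_params * bits_per_weight
    return (total_bits + 7) // 8  # round up to whole bytes


# Assumption: 2 billion weights, used purely as an illustrative figure.
ASSUMED_PARAMS = 2_000_000_000

for bits in (16, 8, 4):
    gigabytes = estimate_weight_bytes(ASSUMED_PARAMS, bits) / 1e9
    print(f"{bits:>2}-bit weights: ~{gigabytes:.1f} GB")
```

Under these assumptions, 4-bit weights take about a quarter of the storage of 16-bit weights, which is the kind of saving that can make an on-device pipeline practical on a phone.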
