Today, we’re introducing Gemma 4 12B, our newest model designed to bring agentic multimodal intelligence directly to laptops. Bridging the gap between our edge-friendly E4B and our more advanced 26B Mixture of Experts (MoE), Gemma 4 12B packs powerful capabilities into a reduced memory footprint. It is also our first mid-sized model to feature native audio inputs.
Thanks to the developer community, Gemma 4 models have now crossed 150 million downloads. You’ve built everything from wearable robotic arms for physical assistance to enterprise-grade AI security. We’re excited to see what you build with this latest addition.
Here’s an overview of what makes Gemma 4 12B unique:
- Novel unified architecture: No multimodal encoders. The vision and audio inputs flow directly into the LLM backbone.
- Advanced reasoning: Benchmark performance nearing our 26B model, unlocking powerful multi-step reasoning and agentic workflows.
- Laptop ready: Small enough to run locally with just 16GB of VRAM or unified memory.
- Open and accessible: Released under an Apache 2.0 license with support across the developer ecosystem.
- Drafter-ready: Gemma 4 12B comes equipped with Multi-Token Prediction (MTP) drafters to reduce latency.
Together, these features bring advanced multimodal capabilities to everyday hardware without sacrificing speed or reasoning. Let’s now take a closer look at how Gemma 4 12B achieves this.
Run state-of-the-art agents locally
Gemma 4 12B delivers performance nearing our larger 26B MoE model on standard benchmarks, but at less than half the total memory footprint. Small enough to run locally on consumer laptops with 16GB of RAM, it unlocks powerful multimodal and agentic experiences right on your machine.
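To get a rough feel for the 16GB figure, here is a back-of-envelope sketch of how much memory the weights alone might need at different precisions. It assumes "12B" means roughly 12 billion parameters, and the bit widths are illustrative choices rather than the formats the model actually ships in. It also ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


def fits_in_budget(num_params: float, bits_per_param: int, budget_gb: float) -> bool:
    """True if the weights alone fit within the given memory budget."""
    return weight_memory_gb(num_params, bits_per_param) <= budget_gb


# Assumption: "12B" is read as about 12 billion parameters.
ASSUMED_PARAMS = 12e9
# The 16GB budget comes from the announcement text.
BUDGET_GB = 16

# Illustrative precisions only; the post does not say which one is used.
for bits in (16, 8, 4):
    size = weight_memory_gb(ASSUMED_PARAMS, bits)
    verdict = "fits" if fits_in_budget(ASSUMED_PARAMS, bits, BUDGET_GB) else "does not fit"
    print(f"{bits}-bit weights: ~{size:.0f} GB, {verdict} in {BUDGET_GB} GB")
```

Under these assumptions, 16-bit weights alone would exceed a 16GB budget while 8-bit and 4-bit weights would fit, which suggests that some form of reduced precision matters for laptop-class hardware.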
