Google launches Gemma 3: A series of open-source AI models that run on a single GPU or TPU

Google has launched Gemma 3, its latest series of open-source AI models, aiming to make AI development more efficient and accessible. The models come in four sizes (1 billion, 4 billion, 12 billion, and 27 billion parameters) and are designed to run on a single GPU or TPU. This makes them practical for a wide range of devices, from laptops to powerful workstations.
Alphabet CEO Sundar Pichai announced the launch in a March 12, 2025 post on X, saying, “Gemma 3 is here! Our new open models are incredibly efficient – the largest 27B model runs on just one H100 GPU. You’d need at least 10x the compute to get similar performance from other models.”
Gemma 3: Open Models With Single-GPU Efficiency and Multimodal Power
In a blog post, Google described Gemma 3 as a collection of advanced, lightweight open models built on the same research and technology behind its Gemini 2.0 models. The search giant called Gemma 3 its most portable and responsibly developed open models to date, setting a new standard for accessibility and performance.

This chart ranks AI models by Chatbot Arena Elo scores; higher scores (top numbers) indicate greater user preference. Dots show estimated NVIDIA H100 GPU requirements. Gemma 3 27B ranks highly, requiring only a single GPU despite others needing up to 32.
What Makes Gemma 3 Stand Out
- Flexible Model Sizes: Whether you’re working on mobile apps or enterprise software, there’s a Gemma 3 model that fits your needs.
- Multilingual Support: Gemma 3 handles over 140 languages, with ready-to-use support for 35 of them. This opens up broader possibilities for global projects.
- Works Across Text, Images, and Videos: These models aren’t limited to just text. They can process images and short videos, which is handy for building more interactive applications.
- Extended Context Window: With the ability to process up to 128,000 tokens, Gemma 3 can take on complex tasks like summarizing large documents or analyzing extensive data sets.
- Strong Performance: The 27B model scored 1338 on LMArena, outperforming larger competitors. Google calls it the “world’s best single-accelerator model.”
- Privacy and Local Processing: By enabling on-device AI processing, Gemma 3 helps protect user data and reduces dependency on cloud services.
- Automated Task Handling: Features like function calling and structured output simplify the creation of dynamic and responsive applications.
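The function-calling pattern behind that last bullet can be sketched in a few lines: the model emits a structured JSON "tool call," and application code parses it and dispatches to a real function. This is a minimal illustration of the pattern, not Gemma 3's actual API; the model call is stubbed out, and the tool name and schema are hypothetical.

```python
import json

def get_weather(city: str) -> str:
    # Hypothetical tool the model is allowed to invoke.
    return f"Sunny in {city}"

# Registry mapping tool names the model may emit to real functions.
TOOLS = {"get_weather": get_weather}

def fake_model(prompt: str) -> str:
    # Stand-in for a Gemma 3 call that produces structured output.
    # A real call would return JSON like this from the model itself.
    return json.dumps({"tool": "get_weather", "arguments": {"city": "Paris"}})

def run_turn(prompt: str) -> str:
    # Parse the model's structured output and dispatch the named tool.
    call = json.loads(fake_model(prompt))
    return TOOLS[call["tool"]](**call["arguments"])
```

In a real application, the stub would be replaced by an actual model call and the returned tool result would typically be fed back to the model for a final natural-language answer.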
Keeping AI Use Safe
Google has also introduced ShieldGemma 2, an image safety classifier that sorts content into three categories: dangerous content, sexually explicit material, and violence. It’s a step toward helping developers keep their platforms safer for users.
Easy Access for Developers
Developers can start using Gemma 3 through platforms like Google AI Studio, Hugging Face, and Kaggle. The models work smoothly with popular tools and frameworks such as Hugging Face Transformers, Ollama, JAX, Keras, PyTorch, Google AI Edge, and more, offering plenty of flexibility for different projects.
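As one concrete route, a locally served model can be queried over Ollama's REST API using only the Python standard library. This is a minimal sketch, assuming Ollama is installed, `ollama serve` is running, and the `gemma3:4b` tag has been pulled; the request body follows Ollama's `/api/generate` endpoint.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "gemma3:4b") -> dict:
    # JSON body for Ollama's /api/generate endpoint; stream=False
    # returns the whole completion in one response object.
    return {"model": model, "prompt": prompt, "stream": False}

def ask_gemma(prompt: str) -> str:
    # POST the request and pull the generated text from the response.
    body = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server):
#   print(ask_gemma("Summarize Gemma 3 in one sentence."))
```

Swapping `model` for `gemma3:12b` or `gemma3:27b` selects a larger variant, hardware permitting.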
Why It Matters
Gemma 3 isn’t just about making AI faster or smarter. It’s about lowering the barriers so more people can build useful AI applications. Whether you’re developing for mobile or desktop, Gemma 3 brings new possibilities to the table.
For anyone interested in the technical details, Google has shared a full report outlining the architecture and benchmarks of Gemma 3.