Products & Toolsgooglegemma 4macosai edge

Google Brings AI Edge Gallery To macOS

|June 4, 2026|By LDS Team

6.6

Relevance Score

Google Brings AI Edge Gallery To macOS — Photo: 9to5mac.com · rights & takedowns

Google released the Google AI Edge Gallery for macOS, letting Mac users run its Gemma models locally, alongside a new Gemma 4 12B model and the on-device Google AI Edge Eloquent dictation app, per Google's developer blog and 9to5mac. Google describes Gemma 4 12B as designed to bring agentic, multimodal intelligence to laptops, running on machines with about 16GB of RAM and handling text, vision, and audio. 9to5mac reports the macOS Gallery currently exposes five Gemma builds, including Gemma-4-12B-it and several Gemma-3n variants, and contrasts it with runtimes like Ollama and LM Studio that allow installing a wider set of third-party models. Google also extended its LiteRT-LM CLI with a serve command that creates a local, OpenAI-compatible endpoint for fully on-device agents and tools.

What happened

Google released the Google AI Edge Gallery for macOS, enabling local execution of its Gemma models on Macs, and introduced a new Gemma 4 12B model plus the on-device Google AI Edge Eloquent dictation app, per Google's developer blog and 9to5mac. Google describes Gemma-4-12B-it as designed to bring agentic, multimodal intelligence to laptops, running on machines with about 16GB of RAM and handling text, vision, and audio. 9to5mac reports the macOS Gallery exposes five Gemma builds: Gemma-4-12B-it, Gemma-4-E2B-it, Gemma-4-E4B-it, Gemma-3n-E2B-it, and Gemma-3n-E4B-it.

Technical details

Per Google's blog, the Gallery can generate and run Python locally for tasks such as data analysis and charting, and Eloquent adds on-device voice editing powered by Gemma 4 12B. Google also extended its LiteRT-LM CLI with a serve command that creates a local, OpenAI-compatible endpoint, letting standard tools and SDKs point at an on-device model.

Editorial analysis

Class B analysis: local models trade raw scale for on-device availability, lower latency, and reduced cloud dependency. Running a 12B-class model on a laptop typically depends on sufficient memory and on accelerator support such as Apple silicon, so practical performance varies by machine and quantization. A curated, vendor-supplied catalog differs from open runtimes like Ollama and LM Studio, which let users install a wider range of third-party models.

What to watch

•Whether the Gallery expands beyond the initial five Gemma builds.
•Real-world throughput and quality of Gemma-4-12B-it on Macs versus cloud-hosted models.
•Interoperability between LiteRT-LM endpoints and existing local runtimes and agent frameworks.

Key Points

1Google's AI Edge Gallery now runs Gemma models locally on macOS, with a new Gemma 4 12B multimodal model targeting laptops with about 16GB of RAM.
2The macOS Gallery ships a curated set of five Gemma builds, unlike open runtimes such as Ollama and LM Studio that allow third-party models.
3A new LiteRT-LM serve command exposes a local OpenAI-compatible endpoint, easing fully on-device, privacy-preserving agent workflows.

Scoring Rationale

A notable product release that makes Google's Gemma family, including a new multimodal 12B model, runnable on-device on macOS, which matters for offline, low-latency, and privacy-preserving workflows. It is an incremental local-AI advance rather than a frontier-model milestone, placing it in the notable tier.

MoreGoogle AI news

Sources

Public references used for this report.

2 sources

developers.googleblog.comBringing Gemma 4 12B to your Laptop: Unlocking Local, Agentic Workflows with Google AI Edge

9to5mac.comGoogle AI Edge Gallery launches on macOS, letting Mac users run Gemma models locally

Practice with real Ad Tech data

90 SQL & Python problems · 15 industry datasets

Used by DS/ML engineers at top companies

Active Search Campaigns by BudgetEasy

High CPC Clicks & Poor Landing PagesMedium

Campaign ROAS by Attribution ModelHard

250 free problems · No credit card

See all Ad Tech problems