
The release includes Google AI Edge Gallery for macOS, where developers can use Gemma 4 12B to generate and run scripts for tasks such as data analysis. Google also said its Eloquent voice dictation and editing app now runs fully on-device on macOS, with support for local transcription and voice-driven text editing.
Google has also expanded LiteRT-LM, its lightweight command-line tool for running language models locally, with a new serve command. The company said this allows the CLI to act as a local LLM server and lets developers connect Gemma 4 12B to standard tools, SDKs, and frameworks through a local endpoint.
“Your data stays on your device while maintaining reliable responsiveness, utility, and cost efficiency,” the company said in the blog post.
The announcement comes as enterprises are looking beyond large, general-purpose models for some AI workloads. Gartner predicted that by 2027, organizations will use small, task-specific AI models at least three times more than general-purpose large language models, citing demand for more contextualized and cost-effective AI systems.

