
Google has added an Agentic Vision capability to its Gemini 3 Flash model, which the company says combines visual reasoning with code execution to ground answers in visual evidence, fundamentally changing how AI models process images.
Introduced January 27, Agentic Vision is available via the Gemini API in the Google AI Studio development tool and in Vertex AI, as well as in the Gemini app.
Agentic Vision in Gemini Flash turns image understanding from a static act into an agentic process, Google said. By combining visual reasoning and code execution, the model formulates plans to zoom in, inspect, and manipulate images step by step. Until now, multimodal models have typically processed the world in a single, static glance. If they missed a small detail, such as a serial number or a distant sign, they were forced to guess, Google said. By contrast, Agentic Vision makes image understanding an active investigation, introducing an agentic “think, act, observe” loop into image tasks, the company said.
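For developers, a minimal sketch of how this might be invoked through the Gemini API is shown below, using the google-genai Python SDK with the code-execution tool enabled so the model can crop and re-inspect an image before answering. The model ID "gemini-3-flash", the image file, and the prompt are illustrative assumptions, not confirmed details from Google's announcement.

    # Sketch: ask Gemini to inspect an image with code execution enabled,
    # letting it run a "think, act, observe" loop over the picture.
    # Assumptions: model ID "gemini-3-flash" and the sample image path.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key="YOUR_API_KEY")

    with open("warehouse_shelf.jpg", "rb") as f:  # assumed sample image
        image_bytes = f.read()

    response = client.models.generate_content(
        model="gemini-3-flash",  # assumed model ID
        contents=[
            types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
            "Read the serial number on the smallest label in this photo.",
        ],
        # Enabling the code-execution tool lets the model write and run
        # code to zoom into and manipulate the image step by step.
        config=types.GenerateContentConfig(
            tools=[types.Tool(code_execution=types.ToolCodeExecution())],
        ),
    )

    print(response.text)

In this sketch, the returned response would reflect the model's final answer after any intermediate zoom-and-inspect steps it chose to run.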

