Google I/O 2026: Gemini Omni unveiled as Google’s ‘can do anything from any input’ world model
At its annual I/O developer conference, Google announced Gemini Omni, its next-generation AI model, which can generate and edit highly realistic video from virtually any form of media input.
Touted by the tech giant as the “next big step” in AI, Gemini Omni is designed as a comprehensive world model. Unlike previous text-to-video platforms, such as Google’s own Veo, Omni is a true multimodal system built on the company’s core Gemini architecture. It uses advanced reasoning and simulated physics to interpret text, images, and existing video clips simultaneously, producing consistent, sophisticated cinematic results grounded in real-world logic.
