Google’s Gemini Omni could change how we create and edit video entirely

Google has introduced Gemini Omni, a new multimodal AI model designed to transform video creation and editing using advanced generative capabilities. Building on its earlier Nano Banana image tools, Omni extends similar functionality to video, all...

By Dhruv Mohan, ET Online | May 20, 2026, 12.39 AM IST

Google has officially introduced Gemini Omni, a new multimodal AI model focused on video generation and editing. The announcement follows weeks of speculation around the project, with references to “Omni” reportedly appearing for some users inside the Gemini dashboard just two days before launch.

Omni builds on the work Google started with Nano Banana last year, which brought Gemini-powered image generation and editing tools to users. While Nano Banana focused primarily on still images, Gemini Omni expands those capabilities into video and broader multimodal generation.

The first release in the lineup is Gemini Omni Flash, which is rolling out starting today across the Gemini app, Google Flow and YouTube Shorts.

A major part of Omni’s pitch is conversational video editing. Instead of using traditional editing tools, users can modify clips with natural language prompts. Google says successive edits can build on one another while keeping characters, environments and motion consistent across scenes.

The company demonstrated examples where users could alter the action inside existing videos, transform environments or add entirely new visual elements. In one showcased prompt, a person touches a mirror, causing it to ripple like liquid while their arm gradually transforms into reflective mirror material.

Google is also positioning Omni as more than just a visual generation model. According to the company, the system has been trained to better understand physical interactions such as gravity, motion and fluid dynamics, allowing generated scenes to behave more realistically. Gemini’s broader knowledge model is also being integrated to help generate explainers and context-aware visuals rather than purely aesthetic clips.

Omni supports multiple input types, including text, images, video and voice references. Users can combine references to guide the style, motion or structure of the final clip. Audio support is currently limited to voice references, with broader audio input capabilities expected later.

Another feature being introduced is AI-generated Avatars, allowing users to create digital versions of themselves for video generation using their own voice. Google says broader speech editing capabilities are still being tested before a wider rollout.

As with its other generative AI products, Google says all videos created with Gemini Omni will include SynthID watermarking for content verification and AI transparency.

Gemini Omni Flash is launching first for Google AI Plus, Pro and Ultra subscribers globally through the Gemini app and Google Flow. The company is also bringing the technology to YouTube Shorts and the YouTube Create app at no additional cost starting this week.

Google says developer and enterprise API access will arrive in the coming weeks.
