Gemini Omni AI Video Generator — Video, Character & Audio

Prompt (up to 20,000 characters)

Reference Images (up to 7)

JPG, PNG, WebP • Max 30MB

Up to 7 image slots total (video uses 2, each character ID uses 1)

Source Video (optional)

MP4, MOV, WebM • Max 100MB

Audio ID (optional)

Character IDs (optional)

Resolution

Duration

Estimated cost: $2.25

Aspect Ratio

Seed (optional)

Required credits: 225

Result

Video generation usually takes 1–3 minutes. Please don't close this page while it runs.

Showcase

Official Google Gemini Omni Demos

These clips are from Google's I/O 2026 Gemini Omni showcase — conversational video editing, reference-guided style, and voice-driven generation in one model family.

Transform Worlds Through Conversation

Start from real footage and reshape the environment step by step — bubble structures become structural foam while the scene stays coherent across edits.

Why Gemini Omni

Google's Multimodal Video Creation Model

Gemini Omni Flash combines Gemini reasoning with generative video — text, image, video, character, and voice inputs in one API family.

Multimodal Video Generation

Generate 4–10 second clips at 720P, 1080P, or 4K from prompts plus up to 7 reference images, 1 source video, 1 audio ID, and 3 character IDs.

Conversational Video Editing

Upload source footage and transform scenes with natural language — change environments, actions, camera angles, and effects while keeping coherence.

Reusable Character Assets

Create stable character IDs from a portrait and description. Reuse them across Gemini Omni video generations for consistent identity.

Custom Voice Profiles

Generate reusable audio IDs from preset voices plus optional voice descriptions and sample dialogue.

Real-World Scene Logic

Gemini Omni connects visual creation with physics, narrative, and context — outputs feel intentional, not random.

Transparent Pay-As-You-Go Pricing

Pricing is 3× the Kie.ai list rate. Video starts at 135 credits (4 seconds at 720P); a character ID costs 20 credits and an audio ID costs 15 credits.

How to Use

Create in 3 Steps

Use the Video, Character, and Audio tabs in one studio

Build Reusable Assets

Open the Character tab to create a character ID from a portrait, or the Audio tab to create an audio ID from a preset voice and description.

Compose Your Video Prompt

In the Video tab, describe the scene, add reference images or source video, and paste character/audio IDs as needed.

Generate & Download

Pick resolution, duration, and aspect ratio, then generate. Credits are charged upfront; failed video tasks are refunded automatically.
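The charge-then-refund behaviour described above can be sketched in a few lines. This is an illustration only: the function name `settle` and the idea of a simple integer balance are assumptions, not the platform's actual billing code.

```python
def settle(balance: int, cost: int, succeeded: bool) -> int:
    """Return the credit balance after one video task (hypothetical helper)."""
    if cost > balance:
        raise ValueError("insufficient credits")
    balance -= cost          # credits are charged upfront
    if not succeeded:
        balance += cost      # a failed video task is refunded in full
    return balance
```

With a 500-credit balance and a 225-credit job, a successful run leaves 275 credits and a failed run leaves the balance at 500.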

FAQ

Gemini Omni FAQ

Common questions about Gemini Omni on our platform

How is video pricing calculated?

Without a source video, 720P and 1080P cost 135, 180, 225, or 270 credits for 4, 6, 8, or 10 seconds respectively; 4K costs 315, 360, 405, or 450 credits. With a source video, each generation costs 360 credits (720P/1080P) or 540 credits (4K). All prices are 3× the Kie.ai list price in USD (100 credits = $1).
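For quick estimates, the price list above can be written as a small lookup. The credit values, the 100-credits-per-dollar rate, and the 3× factor come from this FAQ; the function and constant names are illustrative assumptions and not part of any official API.

```python
# Credit prices as quoted in the FAQ; names are hypothetical.
CREDITS_NO_SOURCE = {
    "720P": {4: 135, 6: 180, 8: 225, 10: 270},
    "1080P": {4: 135, 6: 180, 8: 225, 10: 270},
    "4K": {4: 315, 6: 360, 8: 405, 10: 450},
}
CREDITS_WITH_SOURCE = {"720P": 360, "1080P": 360, "4K": 540}
CREDITS_PER_USD = 100
MARKUP = 3


def video_credits(resolution, duration_s=None, has_source_video=False):
    """Credit price for one video generation."""
    if has_source_video:
        # Per-generation price; the model decides the output duration.
        return CREDITS_WITH_SOURCE[resolution]
    return CREDITS_NO_SOURCE[resolution][duration_s]


def credits_to_usd(credits):
    """Convert credits to US dollars at 100 credits = $1."""
    return credits / CREDITS_PER_USD


def implied_list_usd(credits):
    """Upstream list price implied by the stated 3x factor."""
    return credits_to_usd(credits) / MARKUP
```

For example, `video_credits("1080P", 8)` gives 225 credits, and `credits_to_usd(225)` gives 2.25, which matches the estimate shown in the generator form.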

How much do Character and Audio cost?

Creating one reusable character ID costs 20 credits, and creating one reusable voice profile (audio ID) costs 15 credits. Both are fixed per-asset fees at 3× the Kie.ai upstream cost.

What inputs does Gemini Omni Video support?

A prompt is required. Optional inputs are reference images (up to 7 slots in total), one trimmed source video (uses 2 of those slots), up to 3 character IDs (1 slot each), and one audio ID.
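The slot accounting above is simple arithmetic, sketched here as a helper. The limits (7 slots, 2 for a source video, 1 per character ID, at most 3 character IDs) are taken from this page; the function name and the choice to raise `ValueError` are assumptions for illustration.

```python
MAX_SLOTS = 7            # total image slots per generation
SOURCE_VIDEO_SLOTS = 2   # a source video takes two slots
MAX_CHARACTER_IDS = 3    # each character ID takes one slot


def remaining_image_slots(has_source_video: bool, character_ids: int) -> int:
    """Slots still free for reference images (hypothetical helper)."""
    if not 0 <= character_ids <= MAX_CHARACTER_IDS:
        raise ValueError("character_ids must be between 0 and 3")
    used = (SOURCE_VIDEO_SLOTS if has_source_video else 0) + character_ids
    return MAX_SLOTS - used
```

With a source video and three character IDs attached, two slots remain for reference images; with neither, all seven are available.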

Can I edit existing footage?

Yes. Upload a source video in the Video tab. Output duration is determined automatically by the model; duration controls are disabled when video input is present.

What resolutions and durations are available?

Resolutions: 720P, 1080P, or 4K. Durations: 4, 6, 8, or 10 seconds when no source video is uploaded. Aspect ratios: 16:9 or 9:16.
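A settings check based on the options listed above might look like the following sketch. The allowed values come from this FAQ, and the rule that duration is not user-selectable with a source video comes from the previous answer. The function name and error messages are hypothetical, and it is an assumption that the aspect ratio choice still applies when a source video is present.

```python
RESOLUTIONS = {"720P", "1080P", "4K"}
DURATIONS_S = {4, 6, 8, 10}
ASPECT_RATIOS = {"16:9", "9:16"}


def validate_settings(resolution, aspect_ratio, duration_s=None,
                      has_source_video=False):
    """Return a list of problems; an empty list means the settings are valid."""
    errors = []
    if resolution.upper() not in RESOLUTIONS:
        errors.append(f"unsupported resolution: {resolution}")
    if aspect_ratio not in ASPECT_RATIOS:
        errors.append(f"unsupported aspect ratio: {aspect_ratio}")
    if has_source_video:
        # Duration is chosen by the model when a source video is uploaded.
        if duration_s is not None:
            errors.append("duration cannot be set with a source video")
    elif duration_s not in DURATIONS_S:
        errors.append(f"unsupported duration: {duration_s}")
    return errors
```

Calling `validate_settings("1080p", "16:9", 8)` returns an empty list, while an unsupported combination such as `validate_settings("480p", "1:1", 5)` returns one message per invalid field.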

How long does generation take?

Most video jobs complete within 1–3 minutes. Character and audio asset creation is usually faster and returns IDs immediately.

Get Started

Start Creating with Gemini Omni