Veo 3.1 by Google

Veo 3.1 AI Video Generator

Veo 3.1 is Google's AI video model for text-to-video and image-to-video generation. It supports up to three reference images, first- or last-frame guidance, native audio, and 4-, 6-, or 8-second clips at 720p, 1080p, or 4K.

Native audio with image-led direction

3 reference images per generation

4-8s clip duration options

24fps output frame rate

What Veo 3.1 can do

Key features of Veo 3.1

Veo 3.1 is built for short-form scenes that need better shot direction, tighter visual anchoring, and sound inside the first generation.

Dual input modes

Generate from text alone or from a source image, then layer up to three reference images when a shot needs tighter visual direction.

Reference images

Anchor style and subject details with as many as three reference images instead of relying on one image or prompt wording alone.

Frame controls

Guide the opening or closing beat with first-frame and last-frame inputs when a scene needs a precise visual start, finish, or transition.

Camera language

Prompt shot scale, lens feel, and movement directly so dolly moves, aerial sweeps, and close framing read more like planned cinematography.

Video extension

Extend previously generated clips when one good take needs more runway instead of rebuilding the sequence from the first frame.

Native audio

Render video with dialogue, sound effects, and ambience in the same generation pass instead of sending clips to a separate audio workflow.

Veo 3.1 video showcase

These sample directions illustrate the kinds of short scenes Veo 3.1 fits best: dialogue, vertical portrait motion, and cinematic teaser beats with direct camera language.

How to use Veo 3.1 on Alici

How to Create AI Videos with Veo 3.1

Veo 3.1 is best when the brief is short, the visual references are intentional, and the scene needs audio, pacing, and framing to land together.

Open Veo 3.1

Upload your prompt assets

Start with a text brief or a source image, then add up to 3 reference images if the shot needs stronger subject or style control. Reference-image jobs require an 8-second duration in Veo 3.1.

Describe the scene

Write the action, setting, camera movement, and sound cues in one prompt. Veo responds best when the prompt names subject action, shot framing, lens feel, and any dialogue or ambient audio expectations.
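
As a rough illustration of the prompt structure described above, the sketch below assembles subject action, setting, framing, camera movement, lens feel, dialogue, and ambient audio into one prompt string. The `ShotBrief` class, the `build_prompt` helper, and the label wording ("Camera:", "Lens:", and so on) are hypothetical and are not part of Veo or Alici; treat this as one possible way to keep briefs consistent, not as a required format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShotBrief:
    """Hypothetical container for the prompt elements named in this step."""
    subject_action: str
    setting: str
    framing: str
    camera_movement: str
    lens_feel: Optional[str] = None
    dialogue: Optional[str] = None
    ambient_audio: Optional[str] = None


def build_prompt(brief: ShotBrief) -> str:
    """Join the brief into one prompt string, skipping empty optional parts."""
    parts = [
        f"{brief.framing} of {brief.subject_action} in {brief.setting}.",
        f"Camera: {brief.camera_movement}.",
    ]
    if brief.lens_feel:
        parts.append(f"Lens: {brief.lens_feel}.")
    if brief.dialogue:
        parts.append(f'Dialogue: "{brief.dialogue}"')
    if brief.ambient_audio:
        parts.append(f"Ambient audio: {brief.ambient_audio}.")
    return " ".join(parts)


# Illustrative brief; the scene details are made up for this example.
example = ShotBrief(
    subject_action="a barista pouring latte art",
    setting="a sunlit cafe",
    framing="Close-up",
    camera_movement="slow dolly in",
    lens_feel="shallow depth of field",
    dialogue="Here you go.",
    ambient_audio="quiet cafe chatter",
)
print(build_prompt(example))
```

Keeping the optional parts optional means a text-only brief without dialogue or sound cues still produces a clean single-paragraph prompt.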

Generate and refine

Choose 16:9 or 9:16 output, then render a 4-, 6-, or 8-second clip at 720p, 1080p, or 4K; 1080p and 4K are limited to 8-second clips. Extend successful 720p generations when the first take needs another beat.

Featured creators

Top Veo 3.1 Creators on Instagram

These creators are the clearest fit for Veo 3.1 style workflows: realistic UGC, dialogue-led scenes, and identity-stable short clips.

Miko

@Mho_23 · Workflow Veo 3.1 Creator

48 posts

Our Insight: Uses health-confessional UGC as the core hook, with pacing and scene clarity that make the clip a strong fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

Aria Cruz | Influencer AI

@soy_aria_cruz · Workflow Veo 3.1 Creator

1174 posts

Our Insight: The closest matching clip is "soy aria cruz: Gladiator Corridor Scene", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

Raine

@raine_traveller · Workflow Veo 3.1 Creator

66 posts

Our Insight: The closest matching clip is "raine.traveller: Edens Fate The Mistake Wrong Address Scene", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

Karol Życzkowski

@dreamweaver_ai_pl · Workflow Veo 3.1 Creator

117 posts

Our Insight: The closest matching clip is "dreamweaver ai pl: Breakfast Horror Teeth Reveal", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

Salma

@salmaaboukarr · Workflow Veo 3.1 Creator

20 posts

Our Insight: The closest matching clip is "Salmaaboukarr: Facial Expressions Lip Sync", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

Sara Shakeel

@sarashakeel · Workflow Veo 3.1 Creator

272 posts

Our Insight: The closest matching clip is "sarashakeel: Crystalline Big Cat Fantasy Montage", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

Katsukokoiso | AI visual artist

@katsukokoiso_ai · Workflow Veo 3.1 Creator

76 posts

Our Insight: The closest matching clip is "katsukokoiso.: Steve Aoki Easy Surreal", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

Mona Lisa & Friends

@monalisa_and_friends · Workflow Veo 3.1 Creator

101 posts

Our Insight: The closest matching clip is "Girl With Pearl Earring Dance", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

Tim Tadder

@timtadder · Workflow Veo 3.1 Creator

95 posts

Our Insight: The closest matching clip is "timtadder: Red Veil Mirror Water Fashion", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

Project Alice ΛI 🍄 Sharon Saar | AI & Design

@sharonsaar_design_ai · Workflow Veo 3.1 Creator

39 posts

Our Insight: The closest matching clip is "project alice: Surreal Rainbow Slide Aesthetic", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

Rio

@rioaigc · Workflow Veo 3.1 Creator

41 posts

Our Insight: Builds transformation-driven character reveals with cleaner progression beats, making the clip a strong fit for identity-stable fantasy scenes and short cinematic evolution shots.

Night Wolf

@nightwolf_ai · Workflow Veo 3.1 Creator

10 posts

Our Insight: The closest matching clip is "nightwolf: Cinematic Car Narrative InVideo Breakdown", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

aiproductionstudios

@aiproductionstudios · Workflow Veo 3.1 Creator

50 posts

Our Insight: The closest matching clip is "aiproductionstudios: Fashion Campaign", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

🍥 Timmy 🍥

@ixitimmyixi · Workflow Veo 3.1 Creator

8 posts

Our Insight: The closest matching clip is "IXITimmyIXI: Underwater Fireball", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

Kelly Boesch

@kelly_boesch_ai_art · Workflow Veo 3.1 Creator

38 posts

Our Insight: The closest matching clip is "kelly boesch ai art: Arctic Steampunk Explorer", with a clear fit for UGC realism, identity-stable people shots, and dialogue-ready short cinematic scenes.

Technical specifications

Veo 3.1 technical specifications

Veo 3.1 supports short clip generation with native audio, two aspect ratios, three output lengths, and a reference-image workflow that is limited to 8-second jobs.

Input specifications

Prompts: text-to-video or image-to-video generation
Reference images: up to 3 style or subject references
Frame guidance: first-frame and last-frame control
Source media: one image input for image-to-video jobs
Extension input: previously generated Veo clip for continuation

Output specifications

Resolution: 720p, 1080p, and 4K
Duration: 4, 6, or 8 seconds per clip
Frame rate: 24fps
Aspect ratios: 16:9 and 9:16
Audio: video with native generated sound

Workflow support

Reference-image jobs: 8-second clips only
1080p and 4K jobs: 8-second clips only
Extension workflow: supported on 720p generations
Model status: preview on Google's Gemini API
Use on Alici: google_veo_31_pro_h
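
The constraints listed above interact, so it can help to check a planned job before submitting it. The sketch below encodes only the rules stated in these specifications. The `validate_job` function, its parameter names, and the message strings are hypothetical and do not correspond to any Alici or Gemini API call. It also assumes that, for extension jobs, `resolution` refers to the clip being extended.

```python
ALLOWED_DURATIONS = {4, 6, 8}
ALLOWED_RESOLUTIONS = {"720p", "1080p", "4K"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16"}
MAX_REFERENCE_IMAGES = 3


def validate_job(duration_s, resolution, aspect_ratio,
                 reference_images=0, is_extension=False):
    """Return a list of rule violations; an empty list means the settings
    fit the rules listed in the specifications above."""
    problems = []
    if duration_s not in ALLOWED_DURATIONS:
        problems.append("duration must be 4, 6, or 8 seconds")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append("resolution must be 720p, 1080p, or 4K")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append("aspect ratio must be 16:9 or 9:16")
    if not 0 <= reference_images <= MAX_REFERENCE_IMAGES:
        problems.append("use between 0 and 3 reference images")
    # Reference-image jobs and the higher resolutions are 8-second only.
    if reference_images > 0 and duration_s != 8:
        problems.append("reference-image jobs require 8-second clips")
    if resolution in {"1080p", "4K"} and duration_s != 8:
        problems.append("1080p and 4K require 8-second clips")
    # Assumption: for extensions, `resolution` is that of the source clip.
    if is_extension and resolution != "720p":
        problems.append("extension is supported on 720p generations only")
    return problems


print(validate_job(8, "4K", "16:9", reference_images=3))
print(validate_job(6, "1080p", "9:16"))
```

Returning a list of messages rather than raising on the first failure lets a form or script show every conflicting setting at once.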

Built for these workflows

Who uses Veo 3.1: use cases by industry

Veo 3.1 fits teams that need short, directed scenes with audio and better visual anchoring than prompt-only video models.

AI Video for Social Media Marketing

Create launch hooks, short dialogue scenes, and creator-style vertical edits with native audio so the first output is closer to a publishable social cut.

AI Video for E-Commerce & Product Marketing

Turn one product image and a tight shot brief into short demos, unboxing mood clips, and premium reveal loops for PDPs, ads, and launch pages.

AI Video for Advertising & Brand Content

Prototype campaign scenes with camera direction, sound cues, and controlled endpoints before the team commits to live-action production or post-heavy animation.

AI Video for Education & Training

Build short explainers, process visuals, and narrated lesson moments that need direct framing instructions and clear audio without a full studio setup.

Your content, your rights

Rights language should stay clear on launch pages, especially when teams plan social campaigns, commercial use, and client projects.

All videos generated with Veo 3.1 on Alici are intended to be watermark-free and available for commercial use, including client projects, subject to applicable laws, platform policies, and rights review before publication.

Veo 3.1 vs other AI video models

Feature	Veo 3.1	Seedance 2.0	Kling 3.0	Runway Gen-4.5
Multi-modal inputs	Text-to-video or image-to-video, plus up to 3 reference images	Text with image, video, and audio assets in one generation	Text, image, video, and audio inputs across video workflows	Text-to-video or image-to-video with image reference support
Audio generation	Native audio generated in the same video output	Built-in audio generation with lip-sync support	Native audio, dialogue, and lip-sync in one workflow	Lip-sync and AI voice handled through separate Runway tools
Reference control	Up to 3 reference images plus first or last frame inputs	Tagged image, video, and audio references inside one prompt	Image, video, and voice references with storyboard control	Image references guide identity and style direction
Consistency target	Subject appearance and shot direction anchored across one 8-second scene	Frame-level identity, prop, and styling continuity	Character, wardrobe, and voice continuity across up to 6 cuts	Character and location continuity across Gen-4 referenced shots
Editing workflow	Extend prior generations and constrain openings or endings	Extend, replace, insert, and revise with continuity	Extend clips, apply motion brush, and steer camera paths	Extend shots, upscale exports, and pair with audio tools

Veo 3.1 differs from Seedance 2.0, Kling 3.0, and Runway Gen-4.5 by pairing native audio with up to three reference images and frame-specific control in one model. That combination makes its 8-second shot format especially useful when the opening image, the ending beat, and the sound all need to align from the start.

FAQ

Everything you need to know about Veo 3.1

What is Veo 3.1?

Veo 3.1 is Google's AI video generation model for text-to-video and image-to-video work. It creates short clips with native audio, supports up to three reference images, and adds first-frame, last-frame, and extension workflows so creators can guide both the visual direction and how a scene opens or resolves.

What inputs does Veo 3.1 support?

Veo 3.1 supports text prompts, image-to-video generation, and up to three reference images for style or subject guidance. It also supports first-frame and last-frame inputs for frame-specific generation, plus extension from a previously generated Veo clip when the goal is to continue an existing shot instead of starting over.

Does Veo 3.1 generate audio?

Veo 3.1 generates video with native audio, which means dialogue, ambience, and sound cues can arrive in the same output rather than being added afterward in a second tool. That matters most for creators who need to review pacing, spoken lines, and shot timing together instead of judging a silent draft.

How do reference images work in Veo 3.1?

Reference images let Veo 3.1 borrow visual direction from as many as three uploaded stills. In practice, creators use them to hold onto character styling, product appearance, or a particular aesthetic while the prompt controls action, camera language, and sound. Under the model's current rules, reference-image jobs require 8-second generations.

What video quality and length does Veo 3.1 support?

Veo 3.1 supports 4-, 6-, and 8-second clips at 24 frames per second in 16:9 or 9:16 layouts. It can generate 720p, 1080p, or 4K output, but 1080p and 4K are limited to 8-second generations. Video extension is currently limited to 720p outputs.
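
For planning edits around these lengths, the nominal frame count of each clip follows directly from the duration and the 24fps frame rate. The short sketch below works through that arithmetic. The `frame_count` helper is hypothetical, and the result assumes an exact 24fps output with no extra leading or trailing frames.

```python
FPS = 24  # output frame rate stated in the specifications


def frame_count(duration_s: int) -> int:
    """Nominal number of frames in a clip of the given length."""
    return duration_s * FPS


for seconds in (4, 6, 8):
    print(f"{seconds}s clip: {frame_count(seconds)} frames")
# 4s clip: 96 frames
# 6s clip: 144 frames
# 8s clip: 192 frames
```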

Can Veo 3.1 extend or edit existing videos?

Veo 3.1 supports extending videos that were previously generated with Veo, which is useful when a shot already works and simply needs one more beat. It also supports first-frame and last-frame control, so creators can constrain how a scene begins or ends instead of relying on prompt wording alone.

How do I use Veo 3.1 on Alici?

To use Veo 3.1 on Alici, prepare a clear shot brief, decide whether the scene should start from text or from an image, then gather any reference stills that the shot needs. In Alici, the workflow centers on uploading those assets, setting duration and aspect ratio, and iterating on the scene direction.

How is Veo 3.1 different from Seedance 2.0, Kling 3.0, or Runway Gen-4.5?

Veo 3.1 stands apart by combining native audio output with up to three reference images and first-frame or last-frame control in one model. Seedance 2.0 covers more asset types in one generation, Kling 3.0 leans harder into multimodal storyboard workflows, and Runway Gen-4.5 stays strongest around image-led identity control plus adjacent audio tools.

How much will Veo 3.1 cost on Alici?

Alici pricing for Veo 3.1 is not listed on this landing page, which covers specs, workflows, and comparisons. Check the live generation page for current pricing, which will need to reflect Veo's higher-cost modes such as 1080p or 4K video with native audio.

Do I need video editing experience to use Veo 3.1?

Video editing experience is helpful, but it is not a requirement for getting useful first outputs from Veo 3.1. The more important skill is describing subject action, framing, and sound clearly. Editors will get more mileage from the tool, but marketers, educators, and creators can still work productively from references and prompts.

Can I use Veo 3.1 videos for commercial work?

Commercial use decisions should follow Alici's launch terms plus the rights review attached to your specific project. The landing page assumes a watermark-free workflow for client projects and marketing output, but teams still need to check brand permissions, likeness rights, and any applicable Google or Alici policy updates before publishing the final asset.

Is Veo 3.1 available now on Alici?

Yes. Veo 3.1 is available on Alici through the video generation workflow. This landing page explains where it fits best, how its reference-image and audio controls work, and how it compares with nearby models before you start a generation.

When will Veo 3.1 be available?

Veo 3.1 is already accessible on Alici through the video generator route. If workflow settings, pricing, or supported modes change over time, the live generation page is the source of truth for the current setup.

Start generating

Use Veo 3.1 on Alici

Open the Veo 3.1 workflow on Alici to start generating with text prompts, image inputs, reference images, and native audio in one route.
