Veo 3.1 Guide: How to Use It and Build Long Videos (with Pricing)

Veo 3.1: Google's AI Video Generation Model
Veo 3.1 is Google's latest model for generating video clips from text or images, released on October 15, 2025. What sets it apart is that it doesn't just make a moving image — it produces realistic video with synchronized audio (dialogue, ambient sound, and effects) in a single pass. This version focused on improving image-to-video, cinematic control, and keeping characters consistent across shots.
This practical guide covers what's new, where to use it, how to build a long, connected video (the question everyone asks), and ends with a detailed pricing breakdown.
What's New in Veo 3.1?
Beyond better image and audio quality, this version added three important practical capabilities:
- Reference images (Ingredients to video): upload up to 3 reference images of a character, product, or scene and the model adheres to them — the key to character consistency or product identity across multiple shots.
- Scene extension: generate new clips that connect to the end of the previous one (each new clip builds on the final second of the one before), giving you a longer video — a minute or more — with visual and audio continuity.
- First and last frame: define a starting image and an ending image, and the model generates the smooth transition between them, complete with audio.
Where Can You Use Veo 3.1?
Veo 3.1 is available through four routes — pick based on your needs and technical level:
- The Gemini app: the easiest way to try it — write a description and get a clip. Good for individuals and quick experiments.
- Flow: Google's AI filmmaking tool, where you build, manage, and assemble scenes into a complete video (requires a Google AI Pro or Ultra subscription). This is the practical place to build long videos.
- Gemini API (via Google AI Studio): to connect generation to your own software and products with code.
- Vertex AI: the enterprise route on Google Cloud, for large-scale production with enterprise controls.
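For the Gemini API route, a single text-to-video request can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the google-genai SDK (`pip install google-genai`), an API key available in the environment, and the model id `veo-3.1-generate-preview`, all of which should be checked against the official docs listed under Sources. The `generate_clip` helper and its `client` parameter are illustrative names, not part of the SDK.

```python
import time

# Assumed model id; model names change, so confirm it in the current docs.
MODEL_ID = "veo-3.1-generate-preview"


def generate_clip(prompt, out_path, client=None, poll_seconds=10):
    """Request one clip, wait for the long-running operation, save the file."""
    if client is None:
        # Imported lazily so the sketch can be read and tested without the SDK.
        from google import genai
        client = genai.Client()  # picks up the API key from the environment

    operation = client.models.generate_videos(model=MODEL_ID, prompt=prompt)

    # Video generation is asynchronous: poll until the operation finishes.
    while not operation.done:
        time.sleep(poll_seconds)
        operation = client.operations.get(operation)

    video = operation.response.generated_videos[0].video
    client.files.download(file=video)  # fetch the bytes from the service
    video.save(out_path)               # write them to a local file
    return out_path
```

A call such as `generate_clip("A modern coffee shop at sunset with warm lighting", "shop.mp4")` would then block until the clip is ready. Passing the client in as a parameter is a design choice that keeps the polling logic testable without network access.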
How to Use It, Step by Step
The basics are simple: describe what you want precisely and the model generates the clip. The sharper the description, the closer the result is to what you had in mind:
- Set the scene: subject, place, time, and lighting (e.g., "a modern coffee shop at sunset with warm lighting").
- Describe motion and camera angle: "a slow push-in shot," "the camera orbits the product."
- State style and audio: cinematic, realistic, with soft music or ambient sound.
- Start from an image (optional): turning your product photo into video gives higher brand consistency.
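The checklist above can be turned into a small prompt template, so every shot gets the same ingredients in the same order. The sketch below is only an illustration: `build_prompt` and its parameter names are hypothetical, and Veo does not require any particular prompt structure.

```python
def build_prompt(scene, motion, style, audio=None):
    """Join the prompt ingredients into one description, skipping empty parts."""
    parts = [scene, motion, style, audio]
    cleaned = [p.strip().rstrip(".") for p in parts if p and p.strip()]
    if not cleaned:
        raise ValueError("at least one prompt ingredient is required")
    return ". ".join(cleaned) + "."


prompt = build_prompt(
    scene="A modern coffee shop at sunset with warm lighting",
    motion="A slow push-in shot toward the counter",
    style="Cinematic, realistic",
    audio="Soft music and ambient cafe sound",
)
print(prompt)
```

Keeping scene, motion, style, and audio as separate fields makes it easy to change one ingredient per iteration and see what it does to the result.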
How Do You Build a Long Video?
A single clip is usually short (about 8 seconds), but Veo 3.1 lets you go beyond that by building a connected chain. The practical method:
- Plan the scenes first: write a short script and a shot list (storyboard) before generating — it saves a lot of time and cost.
- Lock characters and identity: use reference images (up to 3) so the same hero or product stays consistent in every shot.
- Generate the first clip, then extend it with Scene extension to continue the action consistently, or use a shot's last frame as the first frame of the next for a seamless transition.
- Assemble the scenes in Flow: arrange shots in one sequence and tune the pacing and audio.
- Export the final video.
Technically via the API: you can extend a Veo-generated video by 7 seconds at a time, up to 20 times. Starting from an 8-second clip, that works out to 8 + 20 × 7 = 148 seconds in total (141 seconds is the length after 19 extensions). Inside Flow, assembly is easier and visual, with no code.
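The extension arithmetic can be checked with a few lines of Python. The constants come from the figures quoted above (an 8-second first clip, 7 seconds per extension, at most 20 extensions); the function names are illustrative only.

```python
BASE_CLIP_SECONDS = 8   # length of the first generated clip
EXTENSION_SECONDS = 7   # seconds added by each extension
MAX_EXTENSIONS = 20     # extension limit quoted above


def total_seconds(extensions):
    """Total length after extending the first clip the given number of times."""
    if not 0 <= extensions <= MAX_EXTENSIONS:
        raise ValueError(f"extensions must be between 0 and {MAX_EXTENSIONS}")
    return BASE_CLIP_SECONDS + EXTENSION_SECONDS * extensions


def extensions_needed(target_seconds):
    """Smallest number of extensions that reaches the target length."""
    extra = max(0, target_seconds - BASE_CLIP_SECONDS)
    needed = -(-extra // EXTENSION_SECONDS)  # ceiling division
    if needed > MAX_EXTENSIONS:
        raise ValueError("target is longer than the extension limit allows")
    return needed
```

With these figures, a one-minute video needs 8 extensions (64 seconds), 19 extensions give 141 seconds, and the full 20 give 148 seconds.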
Practical Steps in Flow: Reference Images, Extension, and Assembly
These are the three most-asked-about features, and they all live inside Flow. Here are the exact steps as documented in Google's official Flow help pages:
1) Lock a character or product (Ingredients):
- Open your project in Flow, click the model name in the prompt box, then choose Video > Ingredients.
- Add your reference images: drag and drop them, type @ to pick an asset already saved in the project, or click Add under the prompt to upload from your device.
- In the prompt, describe how the ingredient should be used, e.g. "the same man walking down a foggy street." For best results, use a plain or segmented background.
- Faster route for a recurring character: define it once as a Character, then call it in any shot by typing @CharacterName.
2) Scene extension (Extend) — where to find it:
- It only works on Veo-generated clips. Open the project and click the clip you want to make longer.
- At the bottom, click Extend, describe how the action should continue, then click Generate.
- You can repeat the extension several times to lengthen the scene. Note: an extended clip can't take other edits such as insert, remove, or camera moves.
3) Assemble scenes in Scenebuilder:
- Hover over the clip you want to add, click More, then Add to Scene.
- Once your clips are added: reorder them by dragging, trim the start and end of each with the handles, preview the whole sequence, then download the final scene.
- To add an earlier version of a clip you edited, first save it from the History panel with Save to Project, then add it to the scene.
Tips for Better Results
- Iterate cheaply, then spend on the final render: brainstorm in the Gemini app, then render the final version in Flow or the API.
- Polished short shots beat one long shaky take: three 20-second shots are often better than a single continuous one-minute take.
- Be specific: vague prompts produce random results; specify lighting, mood, and camera angle.
The value isn't "type a sentence, get a film" — it's good planning, then building connected scenes with the consistency and extension tools.
Pricing
Google sells access to Veo 3.1 through two entirely different doors:
1) Monthly subscription (easiest, no code):
- Google AI Pro: about $20/month, includes Veo in the Gemini app and access to Flow with limited credits. Good for individuals, small businesses, and serious experimentation.
- Google AI Ultra: about $250/month, the highest Flow credits and broadest access, for professionals and heavy use.
- (There is also a lower "Google AI Plus" tier that is cheaper but usually does not include Flow.)
2) Pay-per-second via the API / Vertex AI (for products and large-scale production):
- Veo 3.1: about $0.40 per second of video.
- Veo 3.1 Fast (faster and cheaper): about $0.15 per second.
- Example: an 8-second clip costs about $3.20 on the standard model and about $1.20 on Fast.
- Google states that Veo 3.1 is priced the same as Veo 3. Prices may differ for 4K resolution or by region.
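To budget a project on the pay-per-second route, the arithmetic above generalizes to any clip length. The sketch below uses the approximate prices quoted in this section; `clip_cost` and the tier names `"standard"` and `"fast"` are illustrative labels, not API identifiers.

```python
from decimal import Decimal

# Approximate per-second prices quoted above, in US dollars.
PRICE_PER_SECOND = {
    "standard": Decimal("0.40"),
    "fast": Decimal("0.15"),
}


def clip_cost(seconds, model="standard"):
    """Approximate cost in US dollars for the given number of video seconds."""
    return PRICE_PER_SECOND[model] * seconds


for model in PRICE_PER_SECOND:
    # Cost of one 8-second clip and of 60 seconds of footage per tier.
    print(model, clip_cost(8, model), clip_cost(60, model))
```

`Decimal` is used instead of floats so the totals come out as exact currency amounts: 3.20 and 24.00 dollars on the standard model, 1.20 and 9.00 on Fast. Every regenerated take is billed too, so a realistic budget multiplies these figures by the number of attempts per shot.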
Note: prices are in US dollars and may differ in Saudi Arabia depending on currency and access route, and they change from time to time — verify on Google's official pages before relying on them (links in "Sources").
Practical Uses for Businesses
For a business owner, the value is in cutting the cost and time of video production: short ads for social platforms, product demos, explainer clips, real-estate content, and training material — all now possible in days instead of weeks and at a far lower budget than traditional production.
Origami's Role
At Origami we're a technology company that integrates models like Veo 3.1 into your company's workflow: from crafting the prompt and a consistent visual identity, to connecting generation to your software via the API and automating content production at scale, all with cost under control. The goal is publish-ready results, not just experiments.
Sources
- Google Developers Blog — Veo 3.1 launch: developers.googleblog.com
- Veo in the Gemini API docs (extension and frames): ai.google.dev/gemini-api/docs/video
- Gemini API pricing: ai.google.dev/gemini-api/docs/pricing
- Google AI plans (Pro / Ultra): one.google.com/about/google-ai-plans
- Note: features and prices change quickly; the official sources above are the reference.
Frequently Asked Questions
How long can a Veo 3.1 video be?
A single clip is usually short (about 8 seconds), but with Scene extension you can reach a minute or more. Via the API you can extend by 7 seconds at a time, up to 20 times; starting from an 8-second clip, that is 8 + 20 × 7 = 148 seconds in total.
How do I build a long, connected video?
Plan the scenes, lock characters with reference images, then generate the first clip and use Scene extension or the last frame to connect shots, and assemble them in Flow into one sequence with consistent audio.
How much does Veo 3.1 cost?
Two doors: a monthly subscription (Google AI Pro about $20, or Ultra about $250), which is the easiest with no code; or pay-per-second via the API (about $0.40 per second, and $0.15 for the Fast model) for large-scale production. Prices can change and may differ by region.
Does Veo 3.1 generate audio with the video?
Yes. It generates native synchronized audio including dialogue, effects, and ambient sound within the same generation pass — one of this version's standout features.
