AI POV Video Maker
Create first-person scenario videos that put the viewer inside the scene instead of watching from the outside. This page collects POV prompt formats built for immersive setups, role-play hooks, and short-form scenario storytelling.
INVARIANTS TO LOCK
- Vertical 9:16 tutorial Reel about creating POV animal camera videos with AI.
- Presenter is a young adult man in a black hoodie and backwards black cap, speaking directly into a microphone against a dark background with subtle red edge lighting.
- Tutorial uses stacked screen captures from ChatGPT, OpenArt, Nano Banana 2, and Kling 3.0 above or alongside the presenter.
- Core example is a macro wildlife scene involving an ant at an ant nest entrance, with a tiny mounted micro camera harnessed to the ant.
- Tone is educational, platform-native, and workflow-specific.

SHOTLIST
1. [00:00-00:06] Open on viral-looking POV animal thumbnails with bold text like MAKE POV ANIMAL VIDEOS, then cut to the presenter introducing the idea.
2. [00:06-00:14] Show ChatGPT on screen and a cursor selecting the concept for the animal POV video.
3. [00:14-00:22] Display a simple list titled Choose Your Animal, with options like ant, termite, field mouse, and more, while the presenter explains the setup.
4. [00:22-00:31] Cut to OpenArt or Nano Banana 2 image generation UI with a long prompt describing a macro wildlife researcher placing a tiny camera on an ant worker near the nest entrance.
5. [00:31-00:40] Show the prompt being pasted and image generation initiated, reinforcing that the first frame is built from a strong macro reference image.
6. [00:40-00:52] Transition into Kling 3.0 frame-to-video UI, with start/end frame logic selected, demonstrating how the macro ant POV concept is turned into motion.

STYLE BIBLE
Visual style: creator tutorial with high-contrast UI overlays and clear step-by-step logic.
Camera signature: static talking-head presenter, combined with screen recordings, cursor highlights, and bold subtitles.
Lighting signature: presenter in dark studio with subtle red side light; UI screens bright and high-contrast.
Grade signature: neutral tech UI whites and blacks, magenta accents inside OpenArt, dark interface cards, clear subtitle emphasis.
Speech style: concise workflow explanation, confident and beginner-friendly.

MASTER PROMPT
GLOBAL LOCK: Create a vertical tutorial Reel showing how to make POV animal camera videos with AI. Keep a young male presenter in a black hoodie and backwards cap speaking directly to camera from a dark setup with subtle red rim light. Layer the tutorial with clear screen recordings from ChatGPT, OpenArt using Nano Banana 2, and Kling 3.0. The example concept should be a hyper-real macro wildlife scene where a tiny camera is mounted on an ant near the entrance to its nest. The workflow should feel concrete, repeatable, and easy to follow.
[00:00-00:06] Open on a collage of viral POV animal thumbnails with bold text stating how to make POV animal videos. Cut immediately to the presenter introducing the idea.
[00:06-00:13] Show ChatGPT on screen while the presenter explains that the first step is using AI to develop the concept and prompt structure for the animal POV shot.
[00:13-00:20] Display a choose your animal list, with the cursor selecting ant. Keep the presenter visible below, continuing the explanation.
[00:20-00:31] Move into OpenArt or Nano Banana 2’s create image interface. Paste a detailed macro prompt describing a researcher attaching a tiny micro camera harness to an ant at the edge of an ant nest, with realistic soil, pebbles, plant stems, and documentary-style realism.
[00:31-00:40] Emphasize that this prompt creates the reference frame. Show generation controls, image mode, and the final macro nature prompt inside the interface.
[00:40-00:52] Switch to Kling 3.0 frame-to-video UI. Select start/end frame workflow and demonstrate how the ant POV concept is animated into a moving clip. End with the sense that the pipeline can be reused for any tiny animal camera idea.

NEGATIVE PROMPT
Do not turn the macro wildlife concept into fantasy creature art or generic insect imagery. Avoid weak scale cues, blurry UI, unreadable prompts, unrealistic ant anatomy, or sloppy tutorial pacing. The appeal depends on making the animal POV feel grounded and reproducible.

SPEECH PACK
[00:00-00:14] Speaker A. Meaning: this tutorial shows how to make viral POV animal videos with AI. Delivery: direct, helpful, energetic.
TAKE_A: “Here is how to make POV animal camera videos with AI.”
TAKE_B: “This format is going viral, and the workflow is actually very simple.”
TAKE_C: “If you want those wild tiny-animal POV videos, this is the setup.”
[00:14-00:31] Speaker A. Meaning: choose the animal and generate the base image prompt. Delivery: step-by-step.
TAKE_A: “First pick your animal, then generate the base image with a strong macro wildlife prompt.”
TAKE_B: “The trick is treating it like wildlife photography, not fantasy art.”
TAKE_C: “Your first frame has to sell the realism before you animate anything.”
[00:31-00:52] Speaker A. Meaning: use Kling to animate the frame into the final POV shot. Delivery: practical close.
TAKE_A: “Then bring that frame into Kling, set the motion, and turn it into the final POV clip.”
TAKE_B: “Once the image is strong, the video step becomes way easier.”
TAKE_C: “This same workflow works for ants, beetles, mice, and almost any tiny-animal POV concept.”
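The timed segments in the shotlist and master prompt above can be checked mechanically before they are pasted into a generator. The sketch below is a minimal, hypothetical helper: the names `parse_segments`, `find_gaps`, and `boundary_mismatch` are illustrative and not part of any tool named on this page. It extracts `[MM:SS-MM:SS]` markers, confirms the segments run back to back, and lists cut points that appear in only one of the two timelines.

```python
import re

# Matches markers such as [00:06-00:13] or [00:38-00:42.2].
SEGMENT = re.compile(r"\[(\d{2}):(\d{2}(?:\.\d+)?)-(\d{2}):(\d{2}(?:\.\d+)?)\]")


def parse_segments(text):
    """Return (start, end) pairs in seconds for each [MM:SS-MM:SS] marker."""
    return [
        (int(m1) * 60 + float(s1), int(m2) * 60 + float(s2))
        for m1, s1, m2, s2 in SEGMENT.findall(text)
    ]


def find_gaps(segments):
    """Return indexes of segments that do not start where the previous one ended."""
    return [
        i for i in range(1, len(segments))
        if segments[i][0] != segments[i - 1][1]
    ]


def boundary_mismatch(a, b):
    """Return sorted cut points (seconds) present in only one of two timelines."""
    cuts_a = {t for seg in a for t in seg}
    cuts_b = {t for seg in b for t in seg}
    return sorted(cuts_a ^ cuts_b)


# Timecodes copied from the SHOTLIST and MASTER PROMPT above (descriptions elided).
shotlist = parse_segments(
    "[00:00-00:06] [00:06-00:14] [00:14-00:22] [00:22-00:31] [00:31-00:40] [00:40-00:52]"
)
master = parse_segments(
    "[00:00-00:06] [00:06-00:13] [00:13-00:20] [00:20-00:31] [00:31-00:40] [00:40-00:52]"
)

print(find_gaps(master))
print(boundary_mismatch(shotlist, master))
```

Both timelines are gap-free, but the shotlist and master prompt place two cuts at different times (14 s and 22 s versus 13 s and 20 s), which is worth reconciling before generation.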
INVARIANTS TO LOCK
- Vertical 9:16 first-person selfie-style fantasy skit.
- Same young adult white male creator with light skin, slim build, side-swept brown hair, wearing a plain black long-sleeve shirt at first, holding the camera himself.
- Story premise is explicit: “POV: You used AI to travel back to medieval times.”
- Environment progresses from an outdoor castle battlefield to a candlelit stone interior chamber, then to royal close-ups and armor-room reveal.
- Tone mixes humor, wish-fulfillment fantasy, and creator-native AI demo energy.

SHOTLIST
1. [00:00-00:05] Selfie-style walk through a muddy medieval battlefield outside a stone castle with riders, soldiers, and smoke in the background. On-screen text states the POV premise.
2. [00:05-00:10] Cut into a warm stone chamber where the creator stands beside a princess in a dark ornate gown; candles and medieval decor surround them.
3. [00:10-00:14] Romantic payoff: the princess leans in and kisses him while he stays in surprised selfie mode.
4. [00:14-00:17] Hard cut to an intense king close-up wearing a gold crown and embroidered royal robes, implying social or political consequence.
5. [00:17-00:20] Final reveal: the creator now appears as a prince or knight in silver chest armor, still holding the camera in the medieval room.

STYLE BIBLE
Visual style: comedic time-travel fantasy POV with AI-generated realism.
Camera signature: handheld selfie lens, slight fisheye or wide-angle feel, direct eye contact, forward walking motion.
Lighting signature: cool overcast daylight on the battlefield; warm candlelight inside the chamber; dramatic frontal detail on the king portrait.
Grade signature: earthy medieval palette, smoky grays outdoors, amber stone interiors, metallic highlights on armor.
Speech style: either silent text-led skit or short surprised reactions; playful, fast, and meme-ready.

MASTER PROMPT
GLOBAL LOCK: Create a vertical 9:16 POV fantasy Reel where a young adult white male creator appears to travel back to medieval times using AI. Keep it in handheld selfie perspective with a wide-angle phone-camera feel. Start outdoors in front of a large gray stone castle with smoke, mounted riders, scattered soldiers, and a muddy field. Keep the creator in a plain black long-sleeve shirt at first. Then move into a warm candlelit medieval chamber where he meets a princess in an elegant dark embroidered gown. End by escalating the fantasy into royal consequences and a prince-like armor reveal. Keep the tone playful, wish-fulfillment, and highly shareable.
[00:00-00:05] The creator walks selfie-style through a medieval battlefield in front of a castle, smiling in disbelief. Horses, guards, and smoke drift behind him. Overlay text says: POV: You used AI to travel back to medieval times.
[00:05-00:09] Cut inside a stone chamber lit by candles and window light. A beautiful princess in a dark gown stands beside him, holding onto his arm while he films himself, still amazed by the situation.
[00:09-00:13] The princess pulls him in and kisses him on the cheek or lips while the creator keeps the camera up in awkward, excited selfie mode. The moment should feel romantic but comedic.
[00:13-00:16] Hard cut to an older king in ornate robes and a gold crown staring directly into camera with stern suspicion, as if he has discovered what happened.
[00:16-00:20] Final reveal: the creator now wears silver medieval armor and films himself in the same chamber, implying he successfully became a prince or knight. End on triumphant, funny surprise.

NEGATIVE PROMPT
Do not modernize the wardrobe, architecture, or props. Avoid plastic-looking armor, fake castle textures, weak crowd staging, flat candlelight, or inconsistent face identity between the battlefield, romance, and armor shots. Keep the selfie perspective believable and the medieval environment grounded.

SPEECH PACK
[00:00-00:10] Speaker A or text-led POV. Meaning: you used AI to time-travel into medieval fantasy and suddenly landed inside a royal story. Delivery: amused, surprised.
TAKE_A: “POV: you used AI to travel back to medieval times.”
TAKE_B: “Imagine using AI and ending up inside a medieval kingdom.”
TAKE_C: “This is what AI time travel to the Middle Ages looks like.”
[00:10-00:20] Speaker A or text-led payoff. Meaning: you meet the princess and become the prince. Delivery: playful wish-fulfillment.
TAKE_A: “You meet the princess, kiss her, and somehow become the prince.”
TAKE_B: “One minute you are time traveling, the next minute you are in royal armor.”
TAKE_C: “Comment AI if you want the guide for this fantasy POV workflow.”
GLOBAL LOCK: A vertical 9:16 creator tutorial reel teaching how to make first-person time-travel vlogs with AI. The lower half of the video holds a young male creator speaking directly to camera in a dark studio with red side lighting, black hoodie or jacket, and a backward cap. The upper half alternates between social-proof examples, smartphone search screens, browser pages, prompt-writing documents, and final generated historical selfie videos. The core output style is a realistic vlog shot where a modern creator appears to be filming himself inside major historical moments such as Viking England, the Wild West, or D-Day. The entire reel should feel practical and system-driven, built for viewers who want repeatable viral history content.

[00:00-00:12] Open on two successful example clips above the speaker: one where a young woman appears to selfie-vlog among Vikings in England in 865 AD, and another where she appears in a Wild West town in 1880. Both examples should look like genuine first-person historical vlogs with modern camera behavior but era-correct surroundings. View counts or social-proof markers should be visible to show that this content format already works.
[00:12-00:28] Move into the workflow entry step through a smartphone UI. Show a phone search screen with “Time Travel” typed in, then a Google-like result page for “Higgsfield AI.” The creator below explains the process in clear terms, making the tutorial feel accessible. The emphasis is on how surprisingly simple the setup is once the right tools are known.
[00:28-00:46] Show prompt-building and script-generation stages. Display a prompt document or text page labeled for text-to-video prompts, with entries for historical scenarios like landing craft before a beach assault or other era-specific vlog scripts. The interface should feel like a practical creator workflow rather than a polished marketing demo. The point is that the output begins with scripting the right first-person historical situation.
[00:46-01:01] End on a dramatic finished example where the creator appears to be selfie-vlogging during a World War II beach landing, with smoke, soldiers, landing craft, and battlefield chaos behind him. Overlay a small thumbnail or packaging element suggesting how the final video can be turned into a clickable social or YouTube asset. The result should feel both absurd and convincing: modern vlog behavior dropped into a massive historical event.

NEGATIVE PROMPT: static history painting look, third-person documentary framing, no selfie perspective, bland phone UI, generic prompts, inconsistent main character face, casual modern backgrounds, low-detail crowds, weak historical setting, no social-proof packaging.

SHOT PROMPTS: Viking time-travel selfie vlog; Wild West selfie vlog; phone search Time Travel; Higgsfield AI search result; ChatGPT prompt document; text-to-video historical script; D-Day beach selfie vlog; viral history series tutorial.

SPEECH PACK: One male speaker only. Tone is practical and energetic, emphasizing simplicity, virality, and repeatability. Stress “time travel vlogs,” “Higgsfield AI,” “ChatGPT prompts,” and the historical selfie angle.
GLOBAL LOCK:
- Format: vertical 9:16 short-form tutorial reel, creator-education pacing, black background UI inserts, high contrast social video polish.
- Keep one consistent male creator for all talking-head shots: young adult male, light skin, black backwards baseball cap, black hoodie/jacket, seated at desk, direct-to-camera framing, confident tutorial delivery.
- Keep one consistent demo subject inside the generated example image/video: a plush panda lying on a worn circular rug in a dim rustic room with warm overhead spotlight, scattered objects around the floor, soft moody shadows.
- No character drift, no costume drift, no sudden age changes, no extra presenters, no unrelated cutaways.

SHOT TIMELINE:
[00:00-00:03] Talking-head intro. Creator sits centered against dark background and speaks straight to camera with energetic tutorial tone. Large editorial text overlays summarize the hook: make cinematic scenes from your phone. Insert fast teaser flashes of social posts showing the panda image/video result and yellow headline blocks.
[00:03-00:06] Phone close-up UI. Vertical smartphone screen fills frame. A circularly framed panda image appears inside a social-style composition. Overlaid kinetic words emphasize the concept of turning a phone photo into a scene. Screen recording aesthetic should remain crisp and legible.
[00:06-00:09] Back to talking head. Creator gestures lightly while saying the workflow starts by opening the app. Tight chest-up framing, direct eye contact, subtle head movement, clean synced speech.
[00:09-00:12] Phone settings interface. User taps through app menu and settings-like pages to reach AI generation tools. Interface is dark mode, minimal, modern, with distinct list items and icons.
[00:12-00:16] Prompt-building section on phone. Search field, model selection, and text-entry screens appear. User searches for GPT/prompt helper style tools, selects options, and opens a text area. On-screen rhythm should clearly communicate “build the prompt first.”
[00:16-00:20] Text drafting flow on phone. Long paragraph prompt appears in a dark text box. User chooses/copies prompt text, then taps through action buttons. Highlight the exact motions: choose, copy, click, and go. The UI should feel like a real mobile workflow, not abstract fake panels.
[00:20-00:24] Model/generation interface. User pastes the prompt into an AI image/video generation tool, selects the correct model or preset, and taps generate. Show dark-mode tool UI with image prompt area, buttons, and tabs.
[00:24-00:28] Example asset preview returns. The panda scene appears again as a generated image/video preview. The phone screen cycles from prompt entry to generated result. Add supporting overlay words that reinforce the logic of generating the scene from a single photo.
[00:28-00:32] Phone-to-output transition. The generated panda shot becomes larger and more immersive, as if stepping out of the interface into the final cinematic frame. Keep the panda, rug, spotlight, and room layout consistent with the reference image.
[00:32-00:35] Talking-head recap. Creator returns on camera and explains the final step or CTA. He maintains same wardrobe and setup, speaking with persuasive, practical creator-teacher energy.
[00:35-00:39] Final CTA and social proof. Talking-head remains center frame while comment-style overlays and platform UI elements appear below, suggesting engagement and repeatability. End on a clean, punchy tutorial finish.

VISUAL STYLE:
- Social tutorial reel, fast but readable editing.
- Mix talking-head shots with direct phone-screen recordings.
- Dark UI, white text, occasional high-contrast yellow hook text.
- Clean mobile creator aesthetic with authentic app interaction.

CAMERA AND EDITING:
- Talking-head: locked tripod or subtle digital push-in.
- Phone segments: full-screen mobile capture with smooth taps and transitions.
- Fast snap cuts between explanation, interface, and result.
- Keep chronological clarity so the viewer can follow the workflow in order.

SPEECH PACK:
- Spoken language: English.
- Creator voice: young male creator educator, confident, concise, practical, slightly hyped but not cheesy.
- Delivery style: short tutorial phrases, clear CTA emphasis, social-video pacing.
- Lip sync must stay natural and tightly aligned during talking-head shots.

NEGATIVE PROMPT:
- No extra hands floating over the phone.
- No unreadable UI gibberish replacing app text.
- No switching creator identity between talking-head shots.
- No panda changing species, color, pose logic, or room layout between preview and final output.
- No random additional animals or fantasy objects appearing in the room.
- No horizontal framing, no cinematic letterboxing, no documentary cutaways.
- No blurred phone screens, broken typography, or unusable interface text.
GLOBAL LOCK: A vertical 9:16 social ad-style creator demo designed to prove that AI UGC can now pass as authentic iPhone footage. The central actor is a cheerful blonde woman in her early 20s with fair skin, shoulder-length wavy hair, small hoop earrings, olive-green ribbed tank top, white high-waisted pants, and a light shoulder bag. Keep her identity, smile, casual tourist energy, arm-extended selfie angle, and natural handheld iPhone aesthetic consistent across all lifestyle clips. The first half should feel like genuine travel and theme-park UGC: overcast daylight, slight motion blur, imperfect framing, crowd-filled backgrounds, spontaneous laughter, and everyday phone-camera exposure. The second half shifts into a clean white Arcads AI interface showing cards, actor tools, Sora 2 labels, gesture options, prompt inputs, remix states, and a call-to-action for prompt access. On-screen headline text in the lifestyle clips reads: POV: You can’t tell if it’s AI anymore. End with a direct response CTA to comment REAL for the UGC prompt collection.

[00:00-00:04] Open with selfie-style footage of the blonde creator smiling wide in front of a pink fairytale-style castle in a crowded theme park. The phone is held slightly above eye level at arm’s length. The frame should feel casually captured, with moving tourists behind her and overcast sky overhead. Large centered text reads: POV: You can’t tell if it’s AI anymore.
[00:04-00:08] Cut through more candid UGC travel moments with the same woman: riding a colorful amusement ride, laughing into the camera, and turning while walking outdoors. Preserve authentic micro-shake, natural skin texture, and the kind of framing a real creator would capture on an iPhone.
[00:08-00:11] Show another lifestyle beat of her eating a Mickey-shaped pastry or cookie while seated on a bench, still in selfie framing. Keep the wardrobe, hair, and tourist-day vibe consistent. The realism should come from ordinary action, not polished commercial blocking.
[00:11-00:14] Transition into the Arcads AI product section. Use red and black title cards or quick branded blocks to separate the proof montage from the tool reveal. Introduce the idea that this realism is generated inside Arcads AI.
[00:14-00:18] Display the Arcads interface on a bright white background with rectangular tool cards. Surface labels related to gestures, Sora 2, and actor tooling. The UI should look crisp and modern while the pacing stays social-first and easy to scan.
[00:18-00:22] Zoom into the workflow: show an actor selection area, a prompt box, and a remix or generation state tied to a female actor reference. Make it clear that the creator is using Sora 2 actor capabilities and gesture control to produce believable UGC motion.
[00:22-00:25] End with a conversion frame in the product UI, hinting at a prompt collection or one-credit generation CTA. Reinforce that viewers can get the realism-focused prompt pack by commenting REAL. The tone is confident, practical, and creator-oriented.

NEGATIVE PROMPT: avoid polished commercial perfection, studio lighting, or hyper-smooth camera movement in the UGC clips; no uncanny blinking, rubbery mouth motion, warped arms from selfie perspective, or inconsistent hair length; keep realistic crowd density, natural overcast color, and phone-camera exposure; avoid theme-park background melting, extra fingers on the phone hand, broken earrings, or outfit drift; in the UI section avoid unreadable labels, malformed buttons, or random tool names not aligned with Arcads, Sora 2, actor, gesture, and prompt workflow language.
GLOBAL LOCK: A vertical 9:16 tutorial Reel, approximately 55 seconds, teaching creators how to make first-person “time travel vlog” videos with AI. The format alternates between three visual layers: (1) viral sample clips styled as selfie vlogs recorded inside different historical eras, with a modern creator holding the camera at arm’s length and speaking to viewers while standing inside convincing period environments such as ancient Pompeii, plague-era London, or ancient Egypt; (2) a talking-head male host in a black cap and dark jacket/hoodie, centered against a dark background with red-magenta accent lighting and studio microphone visible; and (3) screen recordings of research prompts, AI tool menus, model selection screens, and generation dashboards showing how the workflow is assembled. The historical vlog examples should feel UGC and immediate: casual arm-extended framing, reactive facial expression, period background detail, and humorous or surprising captions like “I tried to warn the people of Pompeii” or “I visited London during the Black Death.” The tutorial tone is direct, tactical, and creator-friendly.

[00:00-00:05] Open with two or more viral time-travel vlog examples stacked above the host. Show a woman filming herself in ancient Pompeii and another person filming themselves in plague-era London, each with caption-style text embedded in the sample. The host below introduces the concept with bold text like TIMETRAVEL VLOG and immediately frames it as a repeatable AI content format.
[00:05-00:12] Continue with more sample cards, including an ancient Egypt selfie-vlog shot near the pyramids with humorous “POV: I time-traveled…” captioning. Keep the host visible below, speaking quickly while the audience sees the end result before the process.
[00:12-00:18] Transition to research/prompt structure. Show a white text document or GPT-style planning screen listing inputs such as historical event, era, location, future scenario, or fictional world. The text promises outputs like cinematic text-to-image prompts, text-to-video prompts, spoken vlog dialogue, and background action. The host explains that you first decide the time/place/event.
[00:18-00:24] Show additional workflow pages or prompt-planning screens that suggest using a custom GPT or research agent to generate the historical setup, dialogue, and shot instructions. The host remains steady, centered, and instructional, while the UI reinforces that the process is systematic.
[00:24-00:31] Move into the image-generation stage. Show a dark creative workspace with model selection (for example Seedream 5.0 Lite or other image tools), “Create Image” style tabs, visual reference upload zones, and prompt boxes. The host explains that you create still images first before moving to video.
[00:31-00:38] Cut through tool chain screens implying additional steps: OpenArt or similar image creation, ElevenLabs for voice-over, and CapCut or editing steps. The important point is that the final time-travel vlog is modular: script, image, animation, voice, edit. Keep the visuals practical rather than abstract.
[00:38-00:46] Show video-generation dashboards where historical selfie frames are converted into short clips. Example thumbnails display the host inside sandy excavation scenes or period streets, with output duration settings visible. The host explains that you use short durations and build several clips for one vlog.
[00:46-00:55] End by returning to the strongest time-travel examples while the host summarizes the workflow. The final feeling should be that anyone can choose an era, generate a selfie-style historical perspective, add voiceover, and turn it into a serialized creator format.

NEGATIVE PROMPT: avoid polished cinematic third-person shots, avoid generic documentary history footage, avoid inconsistent protagonist identity between clips, avoid modern background objects leaking into historical scenes, avoid unreadable UI panels, avoid lifeless history tableaux, avoid missing selfie-arm framing, avoid flat educational tone, avoid overlong text blocks on screen, and avoid making the workflow feel more complex than the creator can reproduce.
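The advice in the prompt above to use short durations and build several clips for one vlog amounts to a small planning step. As a rough sketch, assuming a hypothetical per-clip cap (the 8-second value below is a placeholder, not a limit stated by any tool on this page), a target runtime can be split into equal clips, each of which then passes through the script, image, animation, voice, and edit stages the prompt names. The helper names `plan_clips` and `build_plan` are illustrative.

```python
import math

# Stage order taken from the prompt: script, image, animation, voice, edit.
STAGES = ("script", "image", "animation", "voice", "edit")


def plan_clips(total_seconds, max_clip_seconds):
    """Split a target runtime into equal clips no longer than max_clip_seconds."""
    if total_seconds <= 0 or max_clip_seconds <= 0:
        raise ValueError("durations must be positive")
    count = math.ceil(total_seconds / max_clip_seconds)
    return [total_seconds / count] * count


def build_plan(era, total_seconds, max_clip_seconds):
    """Return one checklist entry per clip, each listing the stages still to do."""
    return [
        {"era": era, "clip": i + 1, "seconds": seconds, "todo": list(STAGES)}
        for i, seconds in enumerate(plan_clips(total_seconds, max_clip_seconds))
    ]


# Hypothetical example: a 30-second Pompeii vlog with an assumed 8-second clip cap.
plan = build_plan("ancient Pompeii", 30, 8)
for entry in plan:
    print(entry["clip"], entry["seconds"], entry["todo"][0])
```

Splitting evenly keeps every clip under the cap without leaving a very short final clip, which tends to be easier to pace in the edit.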
GLOBAL LOCK: Create a vertical tutorial reel about generating extreme FPV-style AI camera motion using Kling 2.6 inside Freepik Spaces. The entire video uses a dark, product-demo presentation shell with the creator visible in a rounded talking-head panel near the lower portion of frame. The creator is an apparent white adult man in his late 20s to late 30s with light skin, dark beard, blue baseball cap, white T-shirt, average build, and direct-to-camera tutorial energy. Above and around him, large demonstration panels cycle through fast cinematic examples and workflow screens. The visual language should feel like startup product education crossed with high-energy AI filmmaking: dark UI panels, sharp white headings, subtle glow accents, structured JSON prompt blocks, step cards, and occasional bold CTA typography. The motion signature of the demo clips must feel aggressively dynamic: low gliding FPV runs, underwater rushes, fantasy city fly-throughs, car-chase style streaking movement, aerial dives, and rapid forward camera travel. Lighting changes by example clip but the interface wrapper stays consistent. Speech is continuous through most of the reel with one primary male speaker, clear close-mic narration, confident tutorial cadence, medium-fast pace, high intelligibility, and phrase boundaries that often line up with visual changes. [00:00-00:04] Open with a striking FPV-style demo clip occupying the top half or top panel of the vertical frame: a fast low glide through a warm interior or restaurant-like space with tables, surfaces, and dramatic motion blur. White text referencing text-to-video JSON prompts appears near the top or center. The creator remains visible in a rounded lower talking-head box, gesturing with one hand while explaining that the workflow uses Kling 2.6 to create insane FPV-style camera motion. Lips are fully visible in the lower panel, lip_sync_strictness high, and the cut should land on a strong opening phrase. 
[00:04-00:08] Keep the same presentation shell while the top demo changes into another dynamic motion example, still emphasizing low-angle speed and aggressive camera path control. Show structured prompt text blocks below the main demo area, making it clear that the motion is driven by deliberate prompt engineering rather than random generation. The creator speaks continuously, with crisp articulation and subtle head movement, reinforcing that the workflow is text-to-video and motion-first. [00:08-00:12] Shift the main example to an underwater rush toward a large shark or fast-moving aquatic subject. The top panel should feel cold, blue, and immersive, with a strong forward movement signature and heavy depth cueing. The creator in the lower box explains that this workflow is for high-energy scenes, not static compositions. Maintain dry mic sound and clean sync. [00:12-00:16] Transition into a darker tunnel, chase, or enclosed-motion example in the top panel. The camera races through space with pronounced blur and strong vanishing-point pull. Prompt text remains visible in a structured block. The creator’s delivery becomes slightly more emphatic here, matching the sense of speed, but still remains clear and tutorial rather than dramatic acting. [00:16-00:22] Move into fantasy-scale flying examples. Show the camera soaring through arches, over rooftops, or across a monumental cityscape with dramatic elevation shifts. Preserve the creator in the lower rounded box as a stable identity anchor. The speech meaning here should explain that the same workflow can be used for cinematic movement, action-heavy fly-throughs, and bigger spectacle shots. Phrase endings should align to the scene changes. [00:22-00:27] Hard cut into the Freepik Spaces interface. The screen now emphasizes a dark product UI with cards, menu states, and a visual workflow board. “FREEPIK” branding or interface elements should be readable. 
The creator keeps talking in the lower panel, now transitioning from proof to process: where to click, what workflow to open, and how the system is structured. [00:27-00:33] Show a “Free Workflow” board or similar layout, then move into a “Step 1” card. The card should explain the first step in the process, such as feeding the model a structured motion prompt or setting up a text-to-video JSON block. The creator remains lower frame, speaking directly to camera with practical tone. The UI is dark, clean, and node-based or card-based, with white text and subtle blue accent glow. [00:33-00:38] Advance through “Step 2” and a larger JSON-style prompt block. Make the prompt area visually dense and technical, with multi-line structured text that feels like a reusable template. The creator continues narrating that the movement is engineered via prompt design, not guessed. Keep lip sync consistent and the lower talking-head framing unchanged. [00:38-00:42.2] Move to “Step 3” or a later workflow state, then return briefly to dramatic FPV examples: fast car-light streaks, aerial dives, ring-like or portal-like movement, and fantasy-style motion beats. The reel should now feel like a complete loop between examples and process, proving both result quality and reproducibility. End on a bold “Comment AI” CTA integrated into the dark interface shell while the creator remains visible in the lower panel, asking viewers to comment for the workflow link. 
NEGATIVE PROMPT: static slideshow tutorial, missing rounded talking-head panel, different male presenter identity, wrong wardrobe color, no blue cap, no beard, weak lip sync, robotic narration, over-reverbed voice, cluttered unreadable UI, pastel app interface, missing JSON prompt blocks, missing Freepik workflow screens, missing FPV energy, slow gentle camera movement, generic drone footage without aggressive forward motion, no underwater scene, no fantasy fly-through, no car-light streak example, low-detail city flyovers, text artifacts, warped hands in presenter box, jittery face replacement, over-stabilized cinematic ad polish, subtitles burned into every frame, meme-style caption clutter. SPEECH PACK: [00:00-00:04] TAKE_A: “This Kling two point six workflow lets you create insane FPV-style camera motion using pure text to video.” TAKE_B: “Here’s how we’re using Kling two point six to get wild FPV motion from text-to-video.” TAKE_C: “This setup gives you high-motion FPV-style shots in Kling two point six with text alone.” [00:04-00:08] TAKE_A: “The key is that the movement is not random, it’s driven by structured JSON prompt design.” TAKE_B: “What makes this work is the JSON prompt structure controlling the motion path.” TAKE_C: “These shots work because the motion is engineered through the prompt, not left to chance.” [00:08-00:12] TAKE_A: “You can use it for underwater rushes, fast action beats, and scenes with serious camera energy.” TAKE_B: “It handles high-speed sequences like underwater pushes and aggressive action movement really well.” TAKE_C: “This workflow is built for scenes where the camera needs to feel fast, immersive, and intense.” [00:12-00:16] TAKE_A: “That’s why it feels much more dynamic than the usual static AI video look.” TAKE_B: “It gives you something way more kinetic than flat, locked-off generations.” TAKE_C: “The result is motion that feels alive instead of stuck in place.” [00:16-00:22] TAKE_A: “You can also push it into cinematic 
fly-throughs, fantasy worlds, and dramatic aerial camera paths.” TAKE_B: “It’s not just for one type of shot, you can scale the same logic into bigger cinematic worlds.” TAKE_C: “The same workflow can drive fantasy fly-throughs, sweeping aerials, and large-scale movement.” [00:22-00:27] TAKE_A: “Inside Freepik Spaces, you open the workflow and start wiring the motion logic step by step.” TAKE_B: “Once you’re in Freepik Spaces, the process becomes a repeatable workflow instead of a one-off trick.” TAKE_C: “The workflow lives inside Freepik Spaces, where you can set the motion system up properly.” [00:27-00:33] TAKE_A: “Step one is about building the right motion instruction so the model understands the camera behavior.” TAKE_B: “The first step is giving the model a motion-first instruction block with clear camera intent.” TAKE_C: “You start by telling the model exactly how the camera should travel, not just what the scene looks like.” [00:33-00:38] TAKE_A: “Then you expand that into a structured JSON prompt that keeps the movement aggressive and readable.” TAKE_B: “Next you build out the JSON prompt so the motion stays consistent and deliberate.” TAKE_C: “From there, the structured prompt block becomes the engine for the whole movement style.” [00:38-00:42.2] TAKE_A: “If you want the full workflow, comment AI and I’ll send you the link.” TAKE_B: “Comment AI if you want the workflow and I’ll send it over.” TAKE_C: “Drop AI in the comments and I’ll send you the full setup.”
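As a rough illustration of the "structured JSON prompt" idea in the speech pack above, here is a minimal Python sketch that assembles a motion-first prompt block and serializes it to JSON. Every field name (`camera`, `path`, `speed`, `avoid`) is an assumption made up for illustration; the source does not show the actual template, and this is not a documented Kling or Freepik Spaces schema.

```python
import json


def build_motion_prompt(scene: str, path: list[str], speed: str = "fast") -> str:
    """Return a JSON prompt string with the camera motion listed before the scene.

    All keys are hypothetical; adapt them to whatever template you actually use.
    """
    if not path:
        raise ValueError("path needs at least one camera move")
    prompt = {
        # Motion comes first so the camera intent is stated before scene details.
        "camera": {
            "style": "FPV",
            "speed": speed,
            "path": path,
        },
        "scene": scene,
        # Items taken from the negative prompt above.
        "avoid": ["static slideshow tutorial", "slow gentle camera movement"],
    }
    return json.dumps(prompt, indent=2)


if __name__ == "__main__":
    print(build_motion_prompt(
        scene="underwater reef rush",
        path=["dive forward", "bank left past coral", "burst upward to the surface"],
    ))
```

Keeping the camera path as an ordered list makes the same template reusable across scenes: only `scene` and `path` change between generations.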
MASTER PROMPT GLOBAL LOCK: Vertical creator tutorial reel about making rust-cleaning videos with generative AI. A male host in a cap speaks to camera while the reel alternates between rusty object examples, cleaning transformations, tool screens, and before-after visuals. Keep the process clear and the transformation obvious. [00:00-00:05] Open on bold text and rust-cleaning examples. [00:05-00:12] Show the host plus rusty object references and early before-after visuals. [00:12-00:20] Move through workflow pages and tool screens. [00:20-00:30] Show more transformation outputs from rusted to clean metal. [00:30-00:44] End with recap examples and workflow close. NEGATIVE PROMPT Avoid muddy textures, weak before-after contrast, unreadable UI, inconsistent object geometry, and robotic host delivery. SPEECH PACK Open by framing the topic as how to make rust-cleaning videos. Walk through references, tools, prompts, and transformations. Close by reinforcing the workflow and creator use case.
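The shotlist above uses bracketed time ranges that run back to back. Below is a small sketch, assuming whole-second `[MM:SS-MM:SS]` markers, of how such a timeline could be checked for gaps or overlaps before it is handed to a video model. The function names are illustrative and not part of any tool named on this page; fractional timestamps such as `00:42.2` are not handled.

```python
import re

# Matches [MM:SS-MM:SS] with either a hyphen or an en dash between the times.
RANGE = re.compile(r"\[(\d{2}):(\d{2})[-–](\d{2}):(\d{2})\]")


def parse_ranges(text: str) -> list[tuple[int, int]]:
    """Extract (start_seconds, end_seconds) pairs in the order they appear."""
    return [
        (int(m1) * 60 + int(s1), int(m2) * 60 + int(s2))
        for m1, s1, m2, s2 in RANGE.findall(text)
    ]


def find_breaks(ranges: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Return (previous_end, next_start) wherever two shots do not meet exactly."""
    return [
        (prev_end, start)
        for (_, prev_end), (start, _) in zip(ranges, ranges[1:])
        if start != prev_end
    ]


if __name__ == "__main__":
    shotlist = (
        "[00:00-00:05] Open on bold text. [00:05-00:12] Show the host. "
        "[00:12-00:20] Workflow pages. [00:20-00:30] Transformations. "
        "[00:30-00:44] Recap."
    )
    ranges = parse_ranges(shotlist)
    print(ranges)
    print(find_breaks(ranges))
```

An empty result from `find_breaks` means every shot starts exactly where the previous one ends.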
GLOBAL LOCK: Horizontal creator-demo video set in a minimalist white studio built around a glossy retro-futurist red terminal or kiosk branded as an AI creation device. The cast includes a young blonde man with curly hair and casual-cool styling, plus a brunette woman in a black camisole or simple fitted top. The red terminal has a built-in screen that first shows a crude stick-figure face, then transitions into a modern AI interface associated with Hedra Agent. The style blends real-life creator demo energy with clean commercial staging: white cyclorama backdrop, bold red hardware centerpiece, yellow subtitle captions, and fast transitions into generated outputs. The core promise is that casual natural-language requests can be turned into structured prompts, AI tool recommendations, and finished visuals. [00:00-00:08] Open on a cinematic shot of the blonde man sitting in or beside a vintage car with bold yellow subtitle text. The mood feels like a lifestyle ad or stylized short film. The brunette woman appears in adjacent car shots, creating the impression of a polished generated scene. [00:08-00:14] A pink title card or interstitial appears, then the video cuts into the white studio setup with the retro red terminal. The brunette woman stands beside it while the blonde man faces the screen. Yellow subtitle captions carry the spoken explanation. [00:14-00:22] The terminal screen shows a simple stick figure, then switches to a Hedra-like interface asking what should be made today. This establishes the joke and the product capability at the same time: conversational input becomes creative output. [00:22-00:32] Show the interface more clearly. A prompt field, asset options, and example thumbnails appear as the system loads. The presenter explains that the agent can understand casual requests, structure prompts, and route them toward the right generation tools and settings. 
[00:32-00:42] Cut to the visual payoff: multiple styled versions of the same man appear side by side in different looks and outfits, demonstrating reference control and character transformation. The clean white background keeps attention on the generated variations and the tool logic above them. [00:42-00:54] End with more polished studio shots of the brunette woman beside the red terminal while the narration frames Hedra Agent as an easier way to generate strong AI visuals. The overall tone should feel like a product demo wrapped in a playful, high-concept studio vignette.
GLOBAL LOCK: vertical social post layout demonstrating an AI video prompt, top half shows photoreal first-person domestic-catastrophe sequence, bottom half is a persistent black text card with yellow-white prompt copy and a bright yellow CTA reading 'Send this post!'. Top sequence begins as anonymous first-person POV with tanned forearms in bright blue latex gloves pressing a chrome soap dispenser or faucet at a bathroom sink, then escalates into impossible expanding soap foam filling the sink, hallway, staircase, living room, and eventually flooding outside around a suburban house and neighborhood. Camera language is literal and progression-based, moving from POV bathroom realism to wider interior and exterior coverage. Tone is deadpan, absurd, and photoreal. [00:00-00:03] First-person POV looking down at blue-gloved hands over a white bathroom sink beneath a mirror, pressing the chrome soap dispenser, ordinary warm-lit bathroom realism. Bottom half already displays the black prompt card explaining the scenario. [00:03-00:05] Dense white foam erupts and swells rapidly above the sink basin, hands spread apart by the pressure, top image remains centered on the bathroom event while prompt text stays locked below. [00:05-00:08] Cut to hallway and staircase views as the same white foam mass pushes through the house with unnatural volume and even pressure, filling corridors and rooms, furniture disappearing beneath the expansion, bottom prompt card unchanged. [00:08-00:12] Wider interior and exterior suburban-house shots show foam bursting out windows and doors, piling around the home in thick rounded clusters under overcast daylight, still framed above the persistent prompt text block. [00:12-00:15] Final aerial or pulled-back neighborhood-scale view reveals the foam event overtaking surrounding streets and lawns, turning the block white while the bottom prompt card and yellow 'Send this post!' CTA remain visible, ending like a shareable AI prompt concept demo.
GLOBAL LOCK: A vertical 9:16 prompt-showcase video with a cinematic letterboxed scene occupying the upper half and a full block of readable prompt text occupying the lower half. The visual sequence is a first-person or near-ground ant-scale POV inside a chaotic child's birthday party. The environment should feel like an apocalyptic landscape from the perspective of a tiny ant: giant sneakers and sandals moving unpredictably, cake chunks scattered like rubble, colorful sprinkles as boulders, a dark juice puddle like a lake, balloons blurred in the distance, and a bright red gummy bear acting as the sugar objective. The motion should feel immersive and game-like, as if the viewer is inside a miniature survival mission. The bottom text block should remain visible throughout, reinforcing that the scene is generated directly from the written prompt. [00:00-00:05] Open low to the floor with the ant-scale world fully established. Show the reflective juice puddle in the foreground, cake debris and sprinkles across the wooden floor, and a red gummy bear centered deeper in the scene. Towering children's shoes and legs form the background while the word “Prompt” and a full detailed English prompt sit below the image in a static text block. [00:05-00:11] Move the ant viewpoint or near-ant camera through the sprinkle field toward the gummy bear. Keep the shoes huge, threatening, and cinematic, with their soles and motion reading like environmental hazards. The gummy bear should remain the emotional objective while the prompt text continues to be fully readable at the bottom, making the clip function as both demo and educational content. [00:11-00:15] Resolve the prompt narrative by showing a child's hand descending from above to pick up the gummy bear while three ants remain on the floor below. The ants should feel tiny but determined, and the outcome should underline the tragicomic mission structure described in the prompt. 
End with the prompt still on screen so the viewer links the final cinematic beat back to the writing. NEGATIVE PROMPT: ordinary human eye-level party scene, no prompt text visible, empty clean floor, generic macro insect footage, blurry gummy bear, low-detail shoes, no sense of scale, multiple unrelated prompts flashing, cheerful toy commercial tone, flat lighting, missing juice puddle and sprinkle terrain. SHOT PROMPTS: ant POV birthday party apocalypse; gummy bear mission objective on floor; giant sneakers and sprinkles at micro scale; Seedance prompt showcase layout; cinematic prompt text plus generated scene; child hand taking gummy bear from ants. SPEECH PACK: No spoken dialogue needed. The clip should read as a silent or music-backed prompt demo where the written prompt and generated visual are the main content.
GLOBAL LOCK: Subject: Caucasian female, mid-20s, long blonde hair with soft waves, blue eyes, fair skin with warm undertones. Wardrobe: Simple beige or light pink silk camisole. Environment: Indoor bedroom setting, soft out-of-focus background with plants and warm interior elements. Lighting: Strong, warm directional sunlight from the side (golden hour), creating soft shadows and high-end cinematic highlights on the face. Color Grade: Warm, saturated tones, high contrast, clean highlight rolloff. Camera: Static Medium Close-Up (MCU), 50mm lens feel, shallow depth of field. Speech Signature: Friendly, conversational female voice, medium pace, warm and trustworthy tone. [00:00–00:04] Subject: The blonde woman is holding a small pink jar (Laneige Lip Sleeping Mask) near her right cheek. Action: She is looking directly into the lens, speaking with a natural smile. Her head tilts slightly to the left as she begins her sentence. Speech: "This is my secret to waking up with smooth, hydrated lips." Lip-Sync: High strictness; mouth movements must perfectly match the words "secret," "smooth," and "hydrated." Motion: Subtle hair movement as if from a light breeze, natural blinking, and micro-expressions around the eyes. [00:04–00:08] Subject: Same woman, same position holding the pink jar. Action: She continues speaking, maintaining eye contact. A slight nod of the head for emphasis on the word "works." Speech: "This mask works while I dream." Lip-Sync: High strictness; mouth must close on the "m" in "dream." Motion: The jar remains steady in her hand. The warm light creates a slight shimmer on her lips. Transitions: Continuous shot, no cuts. The video ends on a friendly, closed-mouth smile. NEGATIVE PROMPT: Visual: Robotic movement, distorted fingers, floating product, flickering light, blurry face, inconsistent hair texture, uncanny valley eyes, double chin, warped background. 
Speech: Robotic cadence, monotone delivery, muffled audio, lip-sync delay, harsh 's' sounds, unnatural pauses, background noise, clipping. SPEECH PACK: Transcript: 00:00-00:04: "This is my secret to waking up with smooth, hydrated lips." 00:04-00:08: "This mask works while I dream." TAKE_A (Natural/Influencer): [Warm, slightly upbeat] "This is my secret... to waking up with smooth, hydrated lips. This mask works... while I dream." TAKE_B (Soft/ASMR-lite): [Whispery, intimate] "This is my secret to waking up with smooth, hydrated lips. This mask works while I dream." TAKE_C (Direct/Authoritative): [Clear, punchy] "This is my secret to waking up with smooth, hydrated lips. This mask works while I dream." Prosody Markup: "This is my **secret**... to waking up with **smooth**, hydrated lips. This mask **works**... while I **dream**." (Pauses at "..." for natural breath)
MASTER PROMPT GLOBAL LOCK: Vertical talking-head tutorial reel about AI-generated drone flythrough shots. A male creator wearing a black cap and dark sweatshirt speaks directly to camera in a dim studio setup with soft warm practical lighting. The video repeatedly cuts between his presenter shots and cinematic flythrough examples, including dramatic interior passages, landscape or coastline passes, and fast-moving aerial-style camera moves. Large on-screen text references flythrough shots and the creator explains how easy they are to make. The tone is confident, instructional, and social-native. [00:00-00:06] Open on fast eye-catching flythrough example footage with bold on-screen text about drone flythrough shots. The examples should look cinematic and highly scroll-stopping. [00:06-00:15] Cut to the presenter in the studio speaking directly to camera. He should gesture with his hands while explaining that these shots are easy to create and perform well in the feed. [00:15-00:28] Alternate between presenter segments and more flythrough examples, including movement through tight spaces or dramatic environments. The edit should reinforce the point that the format is accessible and effective. [00:28-00:45] Continue the workflow explanation while showing additional examples and visual proof. The presenter remains the authority figure, but the examples carry the excitement. [00:45-01:01] End with a clear call to action around commenting for the PDF or workflow breakdown, preserving the creator-education plus lead-gen structure common in AI tutorial content. NEGATIVE PROMPT Avoid flat presenter lighting, boring static examples, unreadable text overlays, overcomplicated editing, or flythrough clips that feel generic instead of genuinely immersive. SPEECH PACK Speech should be upbeat creator-educator delivery: concise, practical, and persuasive. 
The intended meaning is that AI drone flythrough shots are easy to make, highly effective, and supported by a downloadable workflow guide.
GLOBAL LOCK: A vertical cinematic fashion tutorial video that begins with a direct hook and transitions into a moody nighttime beach portrait sequence. The subject is an East Asian young woman with short black hair and light skin, wearing a white satin slip dress, with visible tattoo sleeves and shoulder tattoos on one arm. The visual identity combines raw direct-flash photography, grainy night texture, dark ocean horizon, bright moonlight in the sky, wet sand reflections, and a dreamy editorial tone. Keep the same woman, dress, tattoos, beach-at-night setting, flash-lit skin highlights, and minimalist sensual styling throughout. The audio style is creator-led tutorial / prompt-sharing narration with concise social-video pacing, dry close mic sound, and an intimate but confident tone. [00:00–00:04] Extreme close-up of the woman’s eye and cheek under hard direct flash, with a metallic star sticker on her face and strands of black hair crossing the frame. Large bold text appears in sequence: “STEAL,” “STEAL MY,” “STEAL MY AI,” “STEAL MY AI PROMPTS.” The camera is nearly static, intimate, and confrontational, using a macro beauty framing with shallow focus and stark flash highlights against deep shadow. [00:00–00:04] The hook is spoken or implied as a fast creator-style opening line inviting viewers to take the prompts. Speech cadence is clipped and attention-grabbing, landing in sync with each text change. Lips are only partially visible, so sync matters less than timing and mood. [00:04–00:09] Cut to a full-body night beach portrait. The woman stands barefoot at the shoreline in the white slip dress, lit by harsh on-camera flash while the moon glows above the horizon. Yellow subtitle-style text begins presenting prompt-writing advice across the lower portion of the frame. The camera alternates between profile and back views as she faces the sea, then touches her hair and turns slightly toward camera. Keep the wet sand glistening and the sky nearly black-blue. 
[00:09–00:14] Continue the beach sequence with slower editorial posing. The woman steps through shallow water, then faces away from camera so the back of the dress and her damp hair are visible. Use a mix of medium full-body and lower-body shots that emphasize bare feet in the surf, dress hem in water, and direct-flash specular highlights on skin and fabric. The voiceover/tutorial text explains how the prompt should describe camera treatment and mood, while the images function as the visual result. [00:14–00:19] The woman sits or kneels near the shoreline and opens her arms outward, then shifts into seated portrait poses looking toward the horizon and back to camera. The composition becomes softer and more romantic while still retaining the raw flash look. Yellow caption blocks continue in the lower frame with practical prompt tips. Motion is minimal, with small posture changes and gentle ocean movement carrying the scene. [00:19–00:23] Move into medium close-up portraits of the seated woman in the surf. Her tattoos, shoulder line, cheekbones, and the satin texture of the dress become more prominent. She glances downward, then sideways, then leans toward the camera. Maintain the tension between harsh direct flash and soft emotional expression. The tutorial text suggests concrete structure for recreating the look rather than vague aesthetic language. [00:23–00:25] End on standing and close-up beach portraits with the woman facing camera head-on and then slightly off-axis. The dress clings softly with dampness, the tattooed arm remains a clear identity anchor, and the flash creates a glossy editorial finish. The final beat feels like a complete visual example of the prompt style being taught: raw, romantic, direct-flash night photography translated into AI-video form.
GLOBAL LOCK: The video is a vertical 9:16 split-screen composite.
BOTTOM HALF (Locked): A Caucasian male creator in his 30s with a beard, wearing a white baseball cap and a black sweater with a large white flower/heart design on the chest. He is positioned in the center of the bottom frame against a plain, flat-lit neutral wall. He acts as a commentator, looking up at the top frame, using expressive hand gestures (pointing, clapping) and speaking directly to the camera.
TOP HALF (Locked): A series of AI-generated UGC-style clips and UI screen recordings. The primary AI character is a young Caucasian woman in her 20s, with dark brown hair pulled back in a messy bun, light freckles. The environment is a cozy bedroom with natural, soft window lighting coming from the right, featuring a bed, a corkboard with photos, and a bookshelf with plants. The camera style for the AI woman is a handheld smartphone front-facing camera (MCU framing).
SPEECH STYLE: The bottom creator speaks with an energetic, instructional, authoritative tone (medium-fast pace, crisp articulation, close-mic dry room sound). The top AI woman (when speaking) has a fast, enthusiastic, gossipy UGC tone (high energy, uses filler words, slight room reverb).
[00:00–00:22]
TOP HALF: The AI woman is wearing a long-sleeved green top. She is holding a clear plastic cup of iced coffee with a straw in her right hand. She speaks enthusiastically to the camera, gesturing slightly with the coffee. At 00:18, the shot cuts, and she is now holding a red box of Colgate toothpaste, pointing to it with her left hand while continuing to speak. Text overlays appear: "AI is getting too realistic 🤯" and a large block of text titled "Veo 3.1 Prompt".
BOTTOM HALF: The male creator looks up in amazement. He points both index fingers upwards towards the top frame. He maintains an expression of shock and excitement.
SPEECH/AUDIO:
Top Speaker (AI Woman): "Oh my god guys, do you ever just stop and realize how far AI has actually come? Like, it's actually insane how normal this feels now. I remember when it used to be just little short clips, right? But now it's full videos. The face looks real, the voice stays kinda consistent, everything just works. And the craziest part? I can literally hold products, talk about them, move them around. Like I'm actually filming this myself." (Lip sync strictness: High).
Bottom Speaker: None.
[00:23–00:26]
TOP HALF: The frame splits vertically into two smaller side-by-side videos. Left side labeled "Original": A man with long hair applying makeup with a brush. Right side labeled "Kling 01": The AI woman (in the green top) mimicking the exact same makeup application motion with a brush.
BOTTOM HALF: The male creator claps his hands together once, then points directly at the camera, then points up again.
SPEECH/AUDIO:
Bottom Speaker (Male Creator): "Here's exactly how you can create the most realistic AI UGC creatives." (Lip sync strictness: High).
[00:27–00:32]
TOP HALF: A static graphic of a PDF document titled "GenHQ: Long-form UGC style videos" appears, showing workflow steps and images of the AI woman.
BOTTOM HALF: The male creator continues speaking directly to the camera, using hand gestures to emphasize his points.
SPEECH/AUDIO:
Bottom Speaker (Male Creator): "I'll even give you this PDF document giving you a detailed list of instructions of how to do it at the end of the video." (Lip sync strictness: High).
[00:33–00:38]
TOP HALF: A screen recording of a web UI (Arcads). The cursor clicks "Image" then "See more". A text prompt is typed into a box. The screen shows two generated image options (A and B) of the AI woman, now wearing a long-sleeved red top with a black bra strap visible, holding the iced coffee. Option B is selected.
BOTTOM HALF: The male creator points up with his right index finger, explaining the UI steps.
SPEECH/AUDIO:
Bottom Speaker (Male Creator): "To get started, go to Arcads and go to the image section and write in a prompt detailing what you want your character to look like." (Lip sync strictness: High).
[00:39–00:42]
TOP HALF: A screen recording of an image editing UI. A text overlay says "change the coffee to a labubu doll". The image of the woman in the red top updates; the coffee cup in her hand is replaced by a small, tan plush doll.
BOTTOM HALF: The male creator continues explaining, hands clasped together.
SPEECH/AUDIO:
Bottom Speaker (Male Creator): "Now you can select the image you like and change it using Google NanoBanana Pro." (Lip sync strictness: High).
[00:43–00:50]
TOP HALF: A screen recording of a video generation UI. The image of the woman in the red top holding the coffee is set as "Start Frame". The image of her holding the plush doll is set as "End Frame".
BOTTOM HALF: The male creator gestures with both hands, explaining the start/end frame concept.
SPEECH/AUDIO:
Bottom Speaker (Male Creator): "Once you've got these two images, you can use it as a start and an end frame with Google Veo 3. Replace your start frame with the end frame and repeat the process." (Lip sync strictness: High).
[00:51–00:56]
TOP HALF: The generated video plays: The AI woman in the red top is holding the plush doll, smiling and talking. She then holds up a printed copy of the PDF guide. A large, bold cyan text overlay "AI" appears in the center of the top frame.
BOTTOM HALF: The male creator points emphatically at the cyan "AI" text overlay.
SPEECH/AUDIO:
Top Speaker (AI Woman): [Inaudible/background speech].
Bottom Speaker (Male Creator): "Comment AI to get the entire workflow in the PDF." (Lip sync strictness: High).
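The start/end frame chaining described in the [00:43–00:50] segment ("Replace your start frame with the end frame and repeat the process") can be sketched as a simple pairing of consecutive keyframes. This is a minimal illustration only: the file names are hypothetical, and no real Veo or Arcads API is called.

```python
def chain_segments(keyframes: list[str]) -> list[tuple[str, str]]:
    """Pair consecutive keyframes into (start_frame, end_frame) segments.

    The end frame of one segment becomes the start frame of the next,
    so N keyframes yield N - 1 clips.
    """
    if len(keyframes) < 2:
        raise ValueError("need at least two keyframes")
    return list(zip(keyframes, keyframes[1:]))


if __name__ == "__main__":
    # Hypothetical image files standing in for the generated stills.
    frames = ["woman_coffee.png", "woman_doll.png", "woman_pdf_guide.png"]
    for start, end in chain_segments(frames):
        print(f"generate clip: start={start} end={end}")
```

Each pair would be submitted as one start/end frame generation, and the resulting clips joined in order to form the longer video.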
NEGATIVE PROMPT:
Visual: temporal jitter, flickering lighting, morphed hands, extra fingers, melting objects, inconsistent facial features on the AI woman, text corruption in UI screens, blurry text overlays, unnatural robotic body movements, mismatched split-screen alignment, creator's background changing.
Audio: robotic voice cadence, harsh sibilance, audio clipping, lip-sync mismatch, unnatural pauses, lack of room tone in UGC segments.
SPEECH PACK:
[00:00-00:22]
Speaker: AI Woman
Transcript: "Oh my god guys, do you ever just stop and realize how far AI has actually come? Like, it's actually insane how normal this feels now. I remember when it used to be just little short clips, right? But now it's full videos. The face looks real, the voice stays kinda consistent, everything just works. And the craziest part? I can literally hold products, talk about them, move them around. Like I'm actually filming this myself."
Prosody: Fast-paced, enthusiastic, slight vocal fry, emphasis on "insane", "full videos", and "craziest part".
[00:23-00:56]
Speaker: Male Creator
Transcript: "Here's exactly how you can create the most realistic AI UGC creatives. I'll even give you this PDF document giving you a detailed list of instructions of how to do it at the end of the video. To get started, go to Arcads and go to the image section and write in a prompt detailing what you want your character to look like. Now you can select the image you like and change it using Google NanoBanana Pro. Once you've got these two images, you can use it as a start and an end frame with Google Veo 3. Replace your start frame with the end frame and repeat the process. Comment AI to get the entire workflow in the PDF."
Prosody: Instructional, clear articulation, punchy delivery, emphasis on tool names ("Arcads", "NanoBanana Pro", "Veo 3") and the final CTA ("Comment AI").
GLOBAL LOCK: The video consists of a series of "impossible POV" shots where the camera is placed inside objects. The visual style is consistently cinematic, photorealistic, and high-detail. Lighting is motivated by the environment, often warm and soft. The camera uses macro or wide-angle lenses depending on the internal space. Textures like skin, metal, and liquid are hyper-detailed. [00:00–00:02] Subject: A young Caucasian girl with light brown hair, wearing a dark blue hoodie. Environment: Viewed from inside an open human mouth. The camera is placed on the tongue. Action: The girl leans forward toward the camera as if to kiss it or look closely. Framing: Extreme macro. The upper and lower rows of teeth and pink gums frame the top and bottom of the image. Lighting: Soft, natural light coming from behind the girl, creating a slight rim light on her hair. Motion: Subtle movement of the girl's head and the camera's slight handheld shake. [00:03–00:06] Subject: A mailman in a blue uniform and gloves. Environment: Viewed from inside a dark metal mailbox looking out onto a city street. A brown UPS truck is parked in the background. Action: The mailman opens the door and slides a stack of white envelopes into the mailbox toward the camera. Framing: Wide-angle POV. The dark interior of the mailbox frames the street scene. Lighting: Bright, overcast daylight outside; the interior of the box is in deep shadow. Motion: Fast motion of the mail being inserted. [00:07–00:10] Subject: A person's fingers (macro skin texture). Environment: Viewed from inside the eye of a large sewing needle. Action: A thick, blue-colored thread is being pushed through the eye of the needle toward the camera. Framing: Microscopic macro. The scratched, silver metallic edges of the needle eye dominate the frame. 
Lighting: Harsh, direct studio lighting highlighting the metallic texture and skin pores. Motion: Slow, deliberate threading motion. [00:11–00:15] Subject: A man's eye and forehead. Environment: Viewed from inside an antique brass clock mechanism. Action: A man stares through a circular opening in the clock face, his eye moving as he inspects the gears. Framing: Close-up. Large, out-of-focus brass gears and springs frame the circular opening. Lighting: Warm, golden light reflecting off the brass components. Motion: Rotating gears in the foreground; the man's eye blinks and shifts. [00:16–00:19] Subject: Carbonated dark liquid (cola). Environment: Viewed from the bottom of a metallic soda can, looking upward toward the opening. Action: Dark liquid rushes into the can, creating violent streams and a mass of dense, fizzy bubbles that explode toward the lens. Framing: Dynamic POV. The circular opening of the can is at the top of the frame. Lighting: Backlit through the can opening, creating high-contrast highlights on the bubbles. Motion: Fast, turbulent fluid dynamics. [00:20–00:23] Subject: A coastal landscape with a lighthouse. Environment: Viewed from deep within the cranial cavity of a weathered, sun-bleached skull resting on a beach. Action: Static landscape shot. Framing: The two eye sockets and nasal cavity of the skull frame the ocean and lighthouse in the distance. Cobwebs are visible inside the skull. Lighting: Natural, diffused daylight. Motion: Subtle waves in the background; slight camera drift. [00:24–00:28] Subject: The internal anatomy of a flower. Environment: Viewed from the center of a blooming tulip or lily. Action: Looking outward from the base of the pistil. Framing: Large, yellow stamens with pollen grains tower like pillars around the frame; soft pink/orange petals form the "walls." Lighting: Bright, ethereal sunlight filtering through the translucent petals. Motion: Pollen particles floating in the air; gentle swaying of the petals. 
[00:29–00:33] Subject: A girl blowing out a candle. Environment: A birthday cake with colorful blue and yellow frosting. Action: The camera is placed low in the frosting. A girl leans down into the frame and blows toward a single lit candle. Framing: Low-angle macro. Swirls of frosting frame the bottom and sides. Lighting: Warm, flickering candlelight; soft bokeh of party lights in the background. Motion: The flame flickering and then being extinguished; smoke rising. [00:34–00:38] Subject: Text on black background. Action: "COMMENT 'ARCADS' FOR THE PROMPTS" appears in bold white and yellow font. NEGATIVE PROMPT: blurry, low resolution, distorted anatomy, extra fingers, cartoonish, 2D, flat lighting, watermark, text (except for the intended overlays), shaky camera, glitchy transitions, unrealistic physics. SPEECH PACK: [00:00–00:33] No speech, only background music. [00:34–00:38] Text-to-speech or silent CTA. Transcript: "Comment 'ARCADS' for the prompts." TAKE_A: (Energetic) Comment ARCADS for the prompts! TAKE_B: (Direct) Just comment ARCADS and I'll send you the prompts. TAKE_C: (Casual) Want these? Comment ARCADS.
AI POV Video Maker
AI POV Video Maker is for creators who want videos built around first-person perspective rather than standard third-person framing. The page should guide them toward examples and prompts that simulate being inside the scene, with scenario-led hooks, viewer-positioned framing, and immediate role-play logic.
The strongest angle is immersion. Users here are not looking for a generic short video maker. They want a format where the viewer feels placed inside a moment or fictional situation. The copy should keep that first-person framing central.
What this page should make clear:
- The output is based on first-person perspective and scenario immersion.
- Hook framing matters because the viewer needs to understand the role instantly.
- This style works for TikTok, Reels, and scenario-led short content.
- The best examples feel like a situation the viewer is dropped into, not just shown.
FAQ
Q: What is an AI POV video maker? A: It is a tool for making first-person scenario videos where the viewer experiences the scene from inside it.
Q: What makes a video feel like POV? A: First-person framing, role-based setup, and an immediate sense of being inside the scenario.
Q: What is it best for? A: TikTok POV trends, role-play clips, scenario storytelling, and immersive short-form content.