How to Make an AI Meme Video

This page is for beginners who want a clear workflow from idea to shareable clip. It covers how to choose a tool, prepare source material, pick a format, write captions, and export the result for the right platform, so you can get from a rough joke or reaction idea to a finished meme video without prior editing experience.

Video
GLOBAL LOCK: preserve a creator-led talking-head tutorial format mixed with vertical phone screen recordings. Keep one young male creator in a backward black cap and dark hoodie speaking directly to camera in a studio setup with a microphone. Intercut iPhone-style screen captures showing ChatGPT/OpenAI image workflow steps, uploaded object photos, prompt entry, and AI video generation screens. Maintain a practical “make from your phone” educational reel structure. No random B-roll, no unrelated tools, no logo overlays beyond app UI already present in the source.

Create a 37.8-second social-first AI tutorial reel showing how to turn ordinary phone photos into animated AI character videos. Begin with a hook using a simple hand-held object photo and bold on-screen teaching posture from the creator. Then show phone interfaces: photo selection, ChatGPT or image-tool screens, prompt entry, image transformation results, switching to an AI video tool, uploading the generated image, entering a motion prompt, and generating the final animated output. Use repeated face-cam segments where the creator explains the steps and emphasizes that the workflow can be done from a phone.

Include the specific examples visible in the source: tiny object/food photos held in a hand, ChatGPT app icon and mobile interface, typed prompts that turn objects into cute expressive characters, a generated pear-like baby character image, a switch to another AI generation interface, upload and prompt steps for video, and a final generated moving result shown on-screen. Preserve the educational pacing and creator-marketing vibe.

SHOT SEGMENTS:
[00:00-00:06] Hook with object photos in hand and creator talking-head intro about making AI content from your phone.
[00:06-00:14] Mobile screens show ChatGPT / image workflow setup, app screens, and prompt entry.
[00:14-00:22] Creator explains the key steps while on-screen phone UI shows prompt refinement and generated object-to-character image outputs.
[00:22-00:30] The tutorial switches to an AI video tool, showing upload, prompt, and generation steps from the phone.
[00:30-00:37.8] Final result displays the generated animated character clip, while the creator closes with a call to try the workflow.
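
A segment list like the one above is easy to break when timings are edited by hand. The following is a minimal Python sketch, not part of the original prompt, that parses the bracketed timecodes and checks that the segments are contiguous and end at the stated 37.8-second runtime. The helper names `parse_timecode` and `check_segments` are hypothetical, and the segment descriptions are shortened for illustration.

```python
import re

# Matches "[MM:SS-MM:SS]" with an optional fractional part on the seconds.
SEGMENT_RE = re.compile(r"\[(\d{2}:\d{2}(?:\.\d+)?)-(\d{2}:\d{2}(?:\.\d+)?)\]")


def parse_timecode(tc: str) -> float:
    """Convert 'MM:SS' or 'MM:SS.s' into seconds."""
    minutes, seconds = tc.split(":")
    return int(minutes) * 60 + float(seconds)


def check_segments(lines, total, tol=1e-6):
    """Return (start, end) pairs in seconds.

    Raises ValueError if no segments are found, if the list does not
    start at zero, if neighbours leave a gap or overlap, or if the last
    segment does not end at `total`.
    """
    spans = []
    for line in lines:
        match = SEGMENT_RE.search(line)
        if match:
            spans.append((parse_timecode(match.group(1)),
                          parse_timecode(match.group(2))))
    if not spans:
        raise ValueError("no timecoded segments found")
    if abs(spans[0][0]) > tol:
        raise ValueError("first segment does not start at 00:00")
    for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
        if abs(prev_end - next_start) > tol:
            raise ValueError(f"gap or overlap at {prev_end}s / {next_start}s")
    if abs(spans[-1][1] - total) > tol:
        raise ValueError(f"segments end at {spans[-1][1]}s, expected {total}s")
    return spans


# Shortened copies of the five segments listed above.
segments = [
    "[00:00-00:06] Hook",
    "[00:06-00:14] Image workflow setup",
    "[00:14-00:22] Prompt refinement",
    "[00:22-00:30] AI video tool",
    "[00:30-00:37.8] Final result",
]
spans = check_segments(segments, total=37.8)
```

If a timecode is edited so that two neighbouring segments no longer meet, `check_segments` raises a `ValueError` naming the mismatched boundary.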

ENVIRONMENT: creator desk/studio face-cam plus crisp mobile screen recordings. CAMERA: direct-to-camera presenter shots alternating with full-screen phone UI captures. LIGHTING: clean creator-studio lighting on face-cam; bright legible phone UI on inserts. MOTION: tutorial pacing, finger taps on phone UI, creator emphasis gestures, no cinematic narrative scenes.

NEGATIVE PROMPT: generic AI ad montage, unrelated tools, desktop-only workflow, missing phone UI, missing creator face-cam, subtitles replacing the actual visible UI, blurry screens, watermark, logo overlays.

SPEECH PACK: creator-to-camera tutorial speech implied, but do not transcribe captions here.
Video
GLOBAL LOCK:
- Format: vertical 9:16 short-form tutorial reel, creator-education pacing, black background UI inserts, high contrast social video polish.
- Keep one consistent male creator for all talking-head shots: young adult male, light skin, black backwards baseball cap, black hoodie/jacket, seated at desk, direct-to-camera framing, confident tutorial delivery.
- Keep one consistent demo subject inside the generated example image/video: a plush panda lying on a worn circular rug in a dim rustic room with warm overhead spotlight, scattered objects around the floor, soft moody shadows.
- No character drift, no costume drift, no sudden age changes, no extra presenters, no unrelated cutaways.

SHOT TIMELINE:

[00:00-00:03]
Talking-head intro. Creator sits centered against dark background and speaks straight to camera with energetic tutorial tone. Large editorial text overlays summarize the hook: make cinematic scenes from your phone. Insert fast teaser flashes of social posts showing the panda image/video result and yellow headline blocks.

[00:03-00:06]
Phone close-up UI. Vertical smartphone screen fills frame. A circularly framed panda image appears inside a social-style composition. Overlaid kinetic words emphasize the concept of turning a phone photo into a scene. Screen recording aesthetic should remain crisp and legible.

[00:06-00:09]
Back to talking head. Creator gestures lightly while saying the workflow starts by opening the app. Tight chest-up framing, direct eye contact, subtle head movement, clean synced speech.

[00:09-00:12]
Phone settings interface. User taps through app menu and settings-like pages to reach AI generation tools. Interface is dark mode, minimal, modern, with distinct list items and icons.

[00:12-00:16]
Prompt-building section on phone. Search field, model selection, and text-entry screens appear. User searches for GPT/prompt helper style tools, selects options, and opens a text area. On-screen rhythm should clearly communicate “build the prompt first.”

[00:16-00:20]
Text drafting flow on phone. Long paragraph prompt appears in a dark text box. User chooses/copies prompt text, then taps through action buttons. Highlight the exact motions: choose, copy, click, and go. The UI should feel like a real mobile workflow, not abstract fake panels.

[00:20-00:24]
Model/generation interface. User pastes the prompt into an AI image/video generation tool, selects the correct model or preset, and taps generate. Show dark-mode tool UI with image prompt area, buttons, and tabs.

[00:24-00:28]
Example asset preview returns. The panda scene appears again as a generated image/video preview. The phone screen cycles from prompt entry to generated result. Add supporting overlay words that reinforce the logic of generating the scene from a single photo.

[00:28-00:32]
Phone-to-output transition. The generated panda shot becomes larger and more immersive, as if stepping out of the interface into the final cinematic frame. Keep the panda, rug, spotlight, and room layout consistent with the reference image.

[00:32-00:35]
Talking-head recap. Creator returns on camera and explains the final step or CTA. He maintains same wardrobe and setup, speaking with persuasive, practical creator-teacher energy.

[00:35-00:39]
Final CTA and social proof. Talking-head remains center frame while comment-style overlays and platform UI elements appear below, suggesting engagement and repeatability. End on a clean, punchy tutorial finish.

VISUAL STYLE:
- Social tutorial reel, fast but readable editing.
- Mix talking-head shots with direct phone-screen recordings.
- Dark UI, white text, occasional high-contrast yellow hook text.
- Clean mobile creator aesthetic with authentic app interaction.

CAMERA AND EDITING:
- Talking-head: locked tripod or subtle digital push-in.
- Phone segments: full-screen mobile capture with smooth taps and transitions.
- Fast snap cuts between explanation, interface, and result.
- Keep chronological clarity so the viewer can follow the workflow in order.

SPEECH PACK:
- Spoken language: English.
- Creator voice: young male creator educator, confident, concise, practical, slightly hyped but not cheesy.
- Delivery style: short tutorial phrases, clear CTA emphasis, social-video pacing.
- Lip sync must stay natural and tightly aligned during talking-head shots.

NEGATIVE PROMPT:
- No extra hands floating over the phone.
- No unreadable UI gibberish replacing app text.
- No switching creator identity between talking-head shots.
- No panda changing species, color, pose logic, or room layout between preview and final output.
- No random additional animals or fantasy objects appearing in the room.
- No horizontal framing, no cinematic letterboxing, no documentary cutaways.
- No blurred phone screens, broken typography, or unusable interface text.
Video
GLOBAL LOCK: A 9:16 vertical creator tutorial video showing how to build cinematic AI videos inside Freepik Spaces using Kling 3.0. The structure alternates between a casual male creator talking directly to camera, screen-like workflow panels, and polished AI-generated example sequences. The speaker is a white male in his 20s or 30s with beard, cap, and casual streetwear, filmed in a warm apartment or studio environment. He should feel approachable, creator-native, and energetic rather than corporate. Keep the edit fast and legible, with repeated “How to do this” framing, visual examples of cinematic shots, and interface scenes that imply prompt building, scene sequencing, and generation controls. Audio is speech-first and educational, with the creator explaining the workflow in concise steps.

[00:00-00:05] Open on a catchy example visual or lifestyle shot with bold tutorial framing like “How to do this,” immediately pairing aspirational output with educational intent.

[00:05-00:10] Cut to the creator talking directly to camera in a casual indoor setup, hands gesturing upward as he introduces the workflow and hooks viewers with the promise of showing the full process.

[00:10-00:18] Alternate between creator face-cam, finished AI shots, and screen-style panels showing thumbnails or interface blocks, making it clear that multiple scenes are being built inside one pipeline.

[00:18-00:28] Include more practical inserts: example frames, real-world pose or filming inspiration, and workflow interface layouts that suggest prompt control, shot planning, and visual refinement.

[00:28-00:40] Keep cycling between explanation and proof, with the creator speaking in short, punchy segments while the examples show the quality ceiling of the method.

[00:40-00:56] End with a clearer recap feel: more screen panels, more finished outputs, and a final face-cam summary that reinforces this as a repeatable Freepik Spaces plus Kling production workflow.

NEGATIVE PROMPT: dry webinar, plain slideshow only, missing example outputs, stiff face-cam, dark podcast studio, random office footage, unreadable UI, over-designed captions everywhere, broken hands, uncanny face, robotic speech, disconnected examples, generic stock footage, text-heavy PowerPoint feel, poor pacing, muddy screen inserts, lip-sync errors, low-quality AI art, unrelated memes.

SHOT PROMPT DELTAS:
1) Aspirational example frame with tutorial hook text treatment.
2) Casual creator face-cam explaining workflow.
3) Screen-style interface panels and scene thumbnails.
4) Example cinematic outputs paired with explanation.
5) Final recap with tools, outputs, and creator closeout.

SPEECH PACK:
[00:00-00:56] One male speaker throughout. Tone should be concise, confident, and creator-educational, explaining how to structure prompts, build shots, and use Freepik Spaces with Kling 3.0 to generate cinematic AI videos. Medium lip-sync strictness when on-camera.
Video
GLOBAL LOCK: A vertical 9:16 creator tutorial reel teaching how to make first-person time-travel vlogs with AI. The lower half of the video holds a young male creator speaking directly to camera in a dark studio with red side lighting, black hoodie or jacket, and a backward cap. The upper half alternates between social-proof examples, smartphone search screens, browser pages, prompt-writing documents, and final generated historical selfie videos. The core output style is a realistic vlog shot where a modern creator appears to be filming himself inside major historical moments such as Viking England, the Wild West, or D-Day. The entire reel should feel practical and system-driven, built for viewers who want repeatable viral history content.

[00:00-00:12] Open on two successful example clips above the speaker: one where a young woman appears to selfie-vlog among Vikings in England in 865 AD, and another where she appears in a Wild West town in 1880. Both examples should look like genuine first-person historical vlogs with modern camera behavior but era-correct surroundings. View counts or social-proof markers should be visible to show that this content format already works.

[00:12-00:28] Move into the workflow entry step through a smartphone UI. Show a phone search screen with “Time Travel” typed in, then a Google-like result page for “Higgsfield AI.” The creator below explains the process in clear terms, making the tutorial feel accessible. The emphasis is on how surprisingly simple the setup is once the right tools are known.

[00:28-00:46] Show prompt-building and script-generation stages. Display a prompt document or text page labeled for text-to-video prompts, with entries for historical scenarios like landing craft before a beach assault or other era-specific vlog scripts. The interface should feel like a practical creator workflow rather than a polished marketing demo. The point is that the output begins with scripting the right first-person historical situation.

[00:46-01:01] End on a dramatic finished example where the creator appears to be selfie-vlogging during a World War II beach landing, with smoke, soldiers, landing craft, and battlefield chaos behind him. Overlay a small thumbnail or packaging element suggesting how the final video can be turned into a clickable social or YouTube asset. The result should feel both absurd and convincing: modern vlog behavior dropped into a massive historical event.

NEGATIVE PROMPT: static history painting look, third-person documentary framing, missing selfie perspective, bland phone UI, generic prompts, inconsistent main character face, casual modern backgrounds, low-detail crowds, weak historical setting, missing social-proof packaging.

SHOT PROMPTS: Viking time-travel selfie vlog; Wild West selfie vlog; phone search Time Travel; Higgsfield AI search result; ChatGPT prompt document; text-to-video historical script; D-Day beach selfie vlog; viral history series tutorial.

SPEECH PACK: One male speaker only. Tone is practical and energetic, emphasizing simplicity, virality, and repeatability. Stress “time travel vlogs,” “Higgsfield AI,” “ChatGPT prompts,” and the historical selfie angle.
Video
GLOBAL LOCK: vertical 9:16 static poster-style social promo, bold high-contrast creator-marketing layout, black background with bright yellow headline bars, two example phone-screen mockups centered in the composition, one showing a translucent human skeleton figure standing indoors and one showing the same skeleton in a domestic scene holding cookware, glossy thumbnail polish, crisp readable typography, tutorial-ad aesthetic, no camera shake, no extra elements, no watermark.

[00:00-00:02] Open on the full poster layout with a large all-caps headline reading “HOW TO MAKE VIRAL SKELETON SHORTS.” Two phone-style panels dominate the center: the left panel shows a translucent skeleton-like figure in a softly lit home interior, and the right panel shows a skeleton character in a more playful domestic pose, creating an immediate “viral AI content formula” feel.

[00:02-00:03] Hold the layout with a slight digital push-in so the example panels become more legible. Preserve the bright yellow headline bar, the black poster background, and the swipe-for-the-full-guide messaging at the bottom. The overall frame should still read like a reel cover or short-form promo graphic.

[00:03-00:05] Finish on the same static promo composition, optimized for mobile viewing and creator education. Keep the two skeleton examples clear, the tutorial promise dominant, and the bottom CTA visible so the final frame looks like a conversion-focused guide advertisement for AI short-form content creators.

NEGATIVE PROMPT: unreadable text, broken skeleton anatomy, extra limbs, warped phone frames, low-resolution poster, muddy contrast, duplicate panels, generic stock layout, flicker, watermark, distorted cookware, text artifacts, messy background clutter, weak CTA.

SPEECH PACK:
- Hook: Here’s how to make viral skeleton shorts like these.
- Beat 1: The format works because the character is instantly recognizable and the scenes are simple.
- Beat 2: Use a strong repeatable prompt structure and clear domestic actions.
- CTA: Swipe for the full guide.
Video
Create a vertical 9:16 futuristic AI product-promo visual centered on a hyper-realistic fashion portrait of a young woman with slicked-back hair, pale skin, blue-grey eyes, and bold matte red lipstick, wearing a reflective chrome silver high-collar outfit in a bright metallic environment filled with iridescent foil-like textures. Behind her, large bold yellow text reads Meta AI, integrated like a clean social-ad headline. The image should feel like a premium generative-AI campaign frame promoting free image generation and AI lip sync tools, combining polished beauty-editorial realism with tech branding. Keep the composition crisp, symmetrical, high contrast, and optimized for short-form creator marketing. No extra clutter, no subtitles, no cartoon styling, no unrelated props.
Video
by.shlabu

GLOBAL LOCK: Horizontal creator-demo video set in a minimalist white studio built around a glossy retro-futurist red terminal or kiosk branded as an AI creation device. The cast includes a young blonde man with curly hair and casual-cool styling, plus a brunette woman in a black camisole or simple fitted top. The red terminal has a built-in screen that first shows a crude stick-figure face, then transitions into a modern AI interface associated with Hedra Agent. The style blends real-life creator demo energy with clean commercial staging: white cyclorama backdrop, bold red hardware centerpiece, yellow subtitle captions, and fast transitions into generated outputs. The core promise is that casual natural-language requests can be turned into structured prompts, AI tool recommendations, and finished visuals.

[00:00-00:08] Open on a cinematic shot of the blonde man sitting in or beside a vintage car with bold yellow subtitle text. The mood feels like a lifestyle ad or stylized short film. The brunette woman appears in adjacent car shots, creating the impression of a polished generated scene.

[00:08-00:14] A pink title card or interstitial appears, then the video cuts into the white studio setup with the retro red terminal. The brunette woman stands beside it while the blonde man faces the screen. Yellow subtitle captions carry the spoken explanation.

[00:14-00:22] The terminal screen shows a simple stick figure, then switches to a Hedra-like interface asking what should be made today. This establishes the joke and the product capability at the same time: conversational input becomes creative output.

[00:22-00:32] Show the interface more clearly. A prompt field, asset options, and example thumbnails appear as the system loads. The presenter explains that the agent can understand casual requests, structure prompts, and route them toward the right generation tools and settings.

[00:32-00:42] Cut to the visual payoff: multiple styled versions of the same man appear side by side in different looks and outfits, demonstrating reference control and character transformation. The clean white background keeps attention on the generated variations and the tool logic above them.

[00:42-00:54] End with more polished studio shots of the brunette woman beside the red terminal while the narration frames Hedra Agent as an easier way to generate strong AI visuals. The overall tone should feel like a product demo wrapped in a playful, high-concept studio vignette.
Video

INVARIANTS TO LOCK
- Vertical 9:16 tutorial Reel about making handmade crafting videos with AI.
- Main talking-head presenter is a young adult man in a black hoodie and backwards black cap, framed mid-shot against a dark indoor background.
- Supporting example visuals show rustic village crafting scenes in warm earthy daylight: handmade sculptures, carved figures, people working outdoors, and a surreal luxury-car-in-village juxtaposition.
- Screen recordings and phone mockups demonstrate how to source or structure the content inside an app workflow.
- Tone is “how to make viral videos” with direct platform-growth framing.

SHOTLIST
1. [00:00-00:06] Open on high-view-count handmade village visuals with bold text like HOW TO MAKE VIRAL VIDEOS, showing a green luxury sports car in a rural handcrafted setting.
2. [00:06-00:12] Presenter appears explaining the concept while phone mockups show scrolling grids of crafting clips.
3. [00:12-00:18] More examples of handmade scenes: carved statues, people working in dirt courtyards, vertical-video thumbnails with large view counts.
4. [00:18-00:26] Screen recordings display app lists, prompts, and a selected topic or generator labeled Handmade Craft.
5. [00:26-00:35] Presenter returns to explain how to open the workflow, store the idea, and turn one handcrafted niche into repeatable viral short-form content.

STYLE BIBLE
Visual style: creator-growth tutorial mixed with rustic AI-generated craft-content examples.
Camera signature: static talking-head inserts, phone UI overlays, grid thumbnails, and attention-grabbing example clips.
Lighting signature: presenter in dim neutral indoor light; supporting examples in warm outdoor village daylight.
Grade signature: social-platform contrast, bright yellow text accents, earthy browns in craft scenes, clean UI whites.
Speech style: fast, instructional, platform-native, optimized around virality and repeatability.

MASTER PROMPT
GLOBAL LOCK: Create a vertical tutorial Reel showing how to make viral handmade crafting videos with AI. Keep a young male creator in a black hoodie and backwards black cap as the talking-head guide. Intercut him with grids of short-form craft examples, phone mockups, and screen recordings. The example scenes should feature rural handmade environments with dirt courtyards, sculpted figures, artisans, huts, and high-contrast visual hooks like a green luxury sports car appearing inside a village craft scene. The whole structure should feel like a growth hack breakdown for short-form platforms.

[00:00-00:06] Open with a collage of high-performing handmade-craft-style clips in a rural village aesthetic, including a green sports car absurdly placed in the scene. Overlay bold text such as how to make viral videos.

[00:06-00:11] Cut to the presenter speaking directly to camera while a phone screen mockup beside him shows a grid of vertical craft content. He frames the opportunity as a repeatable niche, not a one-off trend.

[00:11-00:17] Show more example clips: carved statues, artisans working outdoors, and platform thumbnails with large view counts. The imagery should feel satisfying and scrollable.

[00:17-00:24] Transition into process screens: app menus, topic selection, and a prompt or tool page labeled Handmade Craft. This is the workflow proof section.

[00:24-00:35] Return to the presenter to explain how to open the niche, store the workflow, and repeat the concept across TikTok and Shorts. End with the feeling that this can be productized into a repeatable content system.

NEGATIVE PROMPT
Do not make the craft examples generic factory footage or polished luxury ads. Avoid unclear handmade action, unreadable UI, weak hook text, or lifeless presenter delivery. The contrast between rustic craft content and strategic AI workflow is the point.

SPEECH PACK
[00:00-00:12] Speaker A. Meaning: handmade craft videos can be turned into viral AI content with the right format. Delivery: direct, energetic.
TAKE_A: “Here is how to make handmade crafting videos with AI that actually go viral.”
TAKE_B: “This niche is blowing up, and the format is way easier to build than people think.”
TAKE_C: “If you want a repeatable viral niche, handmade craft content is one of the best to study.”

[00:12-00:24] Speaker A. Meaning: show examples and platform distribution across TikTok and Shorts. Delivery: tutorial pacing.
TAKE_A: “The key is packaging the visuals in the right format and pushing them across TikTok and Shorts.”
TAKE_B: “You need satisfying scenes, strong thumbnail moments, and a workflow you can repeat fast.”
TAKE_C: “This works because the visuals are simple, surprising, and easy to scale.”

[00:24-00:35] Speaker A. Meaning: the workflow can be saved and reused as a system. Delivery: practical close.
TAKE_A: “Once you set up the workflow, this becomes a repeatable content machine.”
TAKE_B: “You are not making one video, you are building a niche system.”
TAKE_C: “Use the same workflow, swap the craft scenario, and you can keep publishing.”
Video

A) MISE EN PLACE

Reference summary
- Duration: 00:57.79
- Format: vertical 9:16, 720x1280, 24 fps
- Structure: talking-head tutorial reel demonstrating HeyGen AI Agent for UGC-style content creation
- Audio: direct-to-camera creator narration; exact words inferred best-effort from caption, visible UI, and pacing

Scene / shot segmentation
1. 00:00.00-00:10.00
   Hook section with phone-shot UGC example footage on screen, presenter lower center. A female creator-style vertical clip is shown as the practical target output while the host frames the feature as a new way to make UGC content.
2. 00:10.00-00:22.00
   More UGC examples and social-style before/after proof, including a hand pointing at the screen to emphasize generated results and mobile-native output.
3. 00:22.00-00:38.00
   HeyGen product interface section. Dark dashboard and setup screens take over, showing AI Agent-related controls, workflow panels, and configuration blocks while presenter keeps explaining.
4. 00:38.00-00:49.00
   Deeper editor / media management section. Grid-based asset views and back-office screens appear, suggesting avatar, scene, or media orchestration.
5. 00:49.00-00:57.79
   Presenter-forward close with strong CTA energy, likely asking viewers to comment “AI” for the link.

Visual evidence keyframes
- 00:00.00: UGC-style female selfie/creator shot framed on a phone screen, presenter lower center
- 00:08.00: finger pointing at screen, emphasizing mobile-native proof
- 00:16.00: second UGC-style clip with presenter continuing explanation
- 00:24.00: dark HeyGen interface with AI Agent-style workflow card and controls
- 00:32.00: dashboard-like panels and configuration widgets
- 00:40.00: media grid / project management view
- 00:52.00: presenter larger in frame with CTA close energy

Speech evidence (best-effort)
- speaker_count: 1
- speaker A: male-presenting creator speaking on-camera throughout
- speech style: upbeat tutorial narration, positioning the new HeyGen AI Agent feature as a way to produce UGC-style ad/social content
- likely content themes in order:
  1) how to create UGC-style content using HeyGen’s new AI Agent feature
  2) quick proof that the format works for social-style output
  3) walkthrough of the HeyGen setup / dashboard / workflow
  4) explanation of how the tool helps generate content faster
  5) comment “AI” for the link
- lip visibility: full for most presenter segments
- lip_sync_strictness: medium

Invariants list (LOCK THESE)
- presenter identity: male creator in casual cap, beard, light t-shirt, speaking directly to camera from a seated setup
- layout: presenter near bottom center while examples and interface screens rotate above and behind him
- product context: HeyGen AI Agent, UGC-style content creation, social media / ad creative workflow
- design language: creator tutorial, mobile-first, dark dashboard UI, concrete examples before tool explanation
- motion grammar: hard cuts between example clips and dashboard screens, no elaborate cinematic camera move
- lighting / grade: presenter evenly lit, warm-neutral skin tones, dark interface background, bright phone-screen examples
- audio style: concise, creator-education voice optimized for shorts/reels

Variables list (TWEAK THESE)
- exact UGC example faces and scenes
- exact dashboard panels and wording on HeyGen screens
- precise narration phrasing
- exact CTA wording beyond the comment-for-link mechanic

B) SHOTLIST

Shot 1
- shot_id: 1
- timecode_start: 00:00.00
- timecode_end: 00:10.00
- duration: 10.00s
- framing: presenter lower center beneath a large mobile-video example
- lens: presenter webcam/phone-style medium crop
- camera movement: static presenter crop, brisk background swaps
- subject: presenter introduces the HeyGen AI Agent use case for UGC content
- environment: female selfie-style UGC clip filling the upper frame, social-media-native layout
- speech/audio: Speaker A hook line about creating UGC-style content using the new feature

Shot 2
- shot_id: 2
- timecode_start: 00:10.00
- timecode_end: 00:22.00
- duration: 12.00s
- framing: more UGC proof clips and touch/point emphasis on screen
- camera movement: quick cuts and proof refreshes
- subject: presenter reinforces that the output looks like social-native creator content
- environment: phone-screen examples, finger pointing, comparative proof frames
- speech/audio: Speaker A highlights the outcome and use case

Shot 3
- shot_id: 3
- timecode_start: 00:22.00
- timecode_end: 00:38.00
- duration: 16.00s
- framing: HeyGen dashboard fills most of the frame, presenter remains lower center
- camera movement: rapid UI cuts
- subject: presenter explains AI Agent setup / workflow
- environment: dark product interface, cards, toggles, and pipeline sections
- speech/audio: Speaker A turns practical and tool-specific

Shot 4
- shot_id: 4
- timecode_start: 00:38.00
- timecode_end: 00:49.00
- duration: 11.00s
- framing: deeper project/media management screens
- camera movement: hard cuts through interface states
- subject: presenter explains scaling or organizing content generation
- environment: asset grid, project thumbnails, management view
- speech/audio: Speaker A continues the workflow explanation

Shot 5
- shot_id: 5
- timecode_start: 00:49.00
- timecode_end: 00:57.79
- duration: 8.79s
- framing: presenter-forward close with remaining dashboard context behind him
- camera movement: mostly static close
- subject: presenter lands the CTA and link offer
- environment: dark interface or blurred dashboard backdrop
- speech/audio: Speaker A asks viewers to comment “AI” for the link
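
The shot durations above can be cross-checked against the reference summary (00:57.79 at 24 fps). The following is a minimal Python sketch, not part of the original breakdown, that recomputes each shot's length from its start and end timecodes, compares it with the declared duration, and converts the lengths to frame counts. It uses `Decimal` so that values such as 8.79 are not affected by binary floating-point rounding. The names `SHOTS`, `to_seconds`, and `audit` are hypothetical, and rounding each shot to the nearest frame is an assumption made for illustration.

```python
from decimal import Decimal

FPS = 24  # frame rate from the reference summary

# (timecode_start, timecode_end, declared duration in seconds) per shot.
SHOTS = [
    ("00:00.00", "00:10.00", "10.00"),
    ("00:10.00", "00:22.00", "12.00"),
    ("00:22.00", "00:38.00", "16.00"),
    ("00:38.00", "00:49.00", "11.00"),
    ("00:49.00", "00:57.79", "8.79"),
]


def to_seconds(tc: str) -> Decimal:
    """Convert 'MM:SS.ss' into exact decimal seconds."""
    minutes, seconds = tc.split(":")
    return int(minutes) * 60 + Decimal(seconds)


def audit(shots):
    """Return (length_seconds, frame_count) per shot.

    Raises ValueError when a declared duration disagrees with
    timecode_end minus timecode_start.
    """
    rows = []
    for start, end, declared in shots:
        length = to_seconds(end) - to_seconds(start)
        if length != Decimal(declared):
            raise ValueError(f"{start}-{end}: declared {declared}s, computed {length}s")
        # Assumption: each shot is rounded to the nearest whole frame.
        rows.append((length, round(length * FPS)))
    return rows


rows = audit(SHOTS)
total_seconds = sum(length for length, _ in rows)
total_frames = sum(frames for _, frames in rows)
print(total_seconds, total_frames)
```

For these five shots the lengths add up to the 57.79-second reference duration. Per-shot rounding can drift from the rounded total in other edits, so the frame counts are best treated as an estimate.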

C) STYLE BIBLE (GLOBAL)

- visual_style: AI creator tutorial reel, UGC marketing workflow breakdown
- camera_signature: persistent talking-head lower-third with changing proof and interface backgrounds
- lighting_signature: soft creator lighting on presenter; bright mobile examples contrasted with dark software UI
- grade_signature: warm-neutral presenter, darker dashboard, high-contrast phone-screen inserts
- texture_signature: crisp app interface, handheld/phone-look proof clips, creator desk setup feel
- pacing_signature: quick promise, quick proof, practical workflow, CTA
- speech_style: direct-to-camera tutorial narration
- speaker_profile: enthusiastic, practical, creator-marketer tone
- pronunciation_profile: casual English, medium-fast, emphasis on tool name and outcome
- mic_mix_profile: dry, clear creator audio with light compression

D) PROMPT SYNTHESIS

MASTER PROMPT

GLOBAL LOCK: Create a vertical 9:16 creator tutorial reel about using HeyGen’s new AI Agent feature to make UGC-style content. Keep one male creator presenter seated near the bottom center for most of the video. He has a short beard, baseball cap, casual light t-shirt, and speaks directly to camera with energetic but practical tutorial cadence. The background rotates between UGC-style phone footage, mobile-screen examples, dark HeyGen dashboard screens, AI Agent workflow panels, media-management views, and a final comment CTA. Preserve a mobile-first, scroll-stopping structure: proof first, interface next, conversion close. Lighting on the presenter stays soft and even, with a clean creator-desk feel.

[00:00-00:10.00] Open with a realistic UGC-style female selfie or creator clip filling the upper frame, as if viewed on a phone screen, while the presenter appears lower center and introduces how to create this kind of content using HeyGen’s new AI Agent feature. Keep the frame immediately legible for social media: the viewer should instantly understand that the end goal is ad-ready, creator-native short-form content. Speaker A is upbeat and explanatory, lips visible, medium lip-sync strictness.

[00:10.00-00:22.00] Continue with more proof-driven UGC examples and mobile-native frames. Include finger-pointing or screen-emphasis moments to make the tutorial feel tactile and practical rather than abstract. The presenter keeps speaking and gesturing while showing that the output can pass as social-ready creator content. Use quick cuts with clear result-first momentum.

[00:22.00-00:38.00] Transition into the HeyGen product interface. Show a dark dashboard with AI Agent workflow blocks, setup cards, toggles, and configuration panels. Keep the presenter lower center and have him explain how the feature works in practice. The background should clearly read as real software, not a mockup. Sync sentence accents to UI changes.

[00:38.00-00:49.00] Show deeper operational screens such as a media grid, project organization view, content assets, or an editor-style management panel. The presenter continues with a practical explanation about building, organizing, or scaling UGC outputs through the tool. Maintain a creator-tutorial pace with clean hard cuts and readable interface detail.

[00:49.00-00:57.79] Close with the presenter more dominant in the frame while HeyGen context remains visible behind him. End with a direct CTA asking viewers to comment “AI” for the link. Make the final frame readable, conversion-oriented, and clearly tied to the value already demonstrated.

NEGATIVE PROMPT

Avoid warped phone screens, unreadable dashboard text, messy cutout edges around the presenter, drifting face identity, fake-looking UGC footage, over-animated transitions, robotic narration, slurred speech, lip-sync mismatch, clipping, room echo, low-contrast CTA text, random wardrobe changes, muddy UI panels, flicker, frame jitter, and generic ad visuals that do not feel native to social feeds.

SHOT PROMPTS

- Hook delta: mobile-native UGC proof clip with presenter lower center
- Proof delta: more creator-style examples and finger-point emphasis
- Dashboard delta: dark HeyGen AI Agent setup interface
- Management delta: media grid / project organization view
- CTA delta: presenter-forward finish with comment-for-link ask

SPEECH PACK

Timecoded transcript (best-effort observable reconstruction)
- [00:00.00-00:10.00] Speaker A: “Here’s how to create UGC-style content using HeyGen’s new AI Agent feature.” Emotion: upbeat, hook-first.
- [00:10.00-00:22.00] Speaker A: “This lets you generate social-native creator content much faster while keeping the output usable for marketing.” Emotion: confident, proof-oriented.
- [00:22.00-00:38.00] Speaker A: “Let me show you the HeyGen workflow and how the AI Agent part fits in.” Emotion: practical, tutorial-focused.
- [00:38.00-00:49.00] Speaker A: “From here you can manage the content, examples, or project setup inside the dashboard.” Emotion: tactical, steady pace.
- [00:49.00-00:57.79] Speaker A: “Comment ‘AI’ for the link.” Emotion: punchy CTA close.

TAKE_A
- Keep the wording close to the lines above with creator-marketing energy.

TAKE_B
- Same meaning, slightly faster and more ad-operator focused.

TAKE_C
- Same meaning, calmer and more educational.

Closest audible version
- Exact speech was not transcribed verbatim, so the lines above represent closest observable tutorial intent supported by caption, UI context, and pacing.

Safe paraphrase version
- The reel explains how to use HeyGen AI Agent to create UGC-style content and ends by asking viewers to comment “AI” for the link.
Video

GLOBAL LOCK: vertical social-media promo/tutorial reel teaching viewers how to create viral matchstick-style shorts; static poster-like layout with bold headline text at top, swipe-callout at bottom, and a sequence of AI-generated matchstick or burning-object characters showcased in the center; examples include pop-culture-inspired figures, flaming drink cup character, and dark charred variants; clean creator-brand ad style; no unrelated scenes, no camera wandering, no color drift.

00:00-00:03
The reel opens with a bold tutorial poster layout introducing a “how to make viral matchstick shorts” concept. In the center, AI-generated matchstick-style characters appear side by side like examples from a creative prompt pack.

00:03-00:07
The showcased examples cycle through variations: a sponge-like cartoon-inspired figure, a pink starfish-inspired figure, and a flaming cup or beverage character with matchstick/fire aesthetics. The layout remains consistent like a swipe-worthy social ad.

00:07-00:10
The sequence ends with darker charred matchstick forms and a call-to-action style frame encouraging viewers to get the full guide or tutorial. The overall feel stays instructional, promotional, and optimized for social scrolling.

NEGATIVE PROMPT:
landscape format, naturalistic vlog, complex background scenes, no text layout, low-detail character examples, random unrelated footage, soft cinematic storytelling, chaotic motion blur, messy UI clutter, muted unreadable typography
Video

GLOBAL LOCK: A vertical 9:16 social video featuring one white European-looking man in his late 20s to early 30s with fair neutral skin, blue eyes, dark brown side-swept hair, athletic build, clean-shaven face, and fitted black t-shirt, always presented as the same creator across every shot. Keep his identity, facial proportions, hairstyle, shirt, black watch, and confident tutorial energy locked. The visual world alternates between a warm tungsten bedroom-office with textured walls, shelf decor, practical lamp glow, and shallow depth of field, and clean dark UI demo layouts with rounded white software panels floating above black backgrounds. Camera language is creator-economy cinematic UGC: medium close-ups, chest-up framings, slight handheld energy, occasional push-ins, and crisp eye-level talking-head setups. Lighting stays motivated and contrasty, with orange practical light on one side and cooler fill on the opposite side during the “after” setup. Grade is rich, warm, polished, slightly contrasty, with soft highlight rolloff and subtle skin texture. One male speaker only, on-camera and off-camera from the same person, speaking energetic tutorial English with quick cadence, punchy emphasis, clean studio-style voice, close microphone presence in the second half, and tight lip sync whenever his mouth is visible.

[00:00-00:03] In a dim, warm, low-budget looking setup, frame the creator seated against a textured gray wall, lit by a harsh orange practical glow. He gestures with both hands while looking directly into the lens. Large bold white words appear one beat at a time over his chest, matching the spoken hook: “you go from this”. Keep the frame slightly cramped, the background plain, and the mood intentionally mediocre to set up contrast.

[00:03-00:07] Smash cut to a cleaner, brighter version of the same creator in a polished bedroom-office. Use a centered medium shot with soft warm lamp light behind him, shelf decor and trailing green plant on camera left, and a vertical tube light on camera right. Continue the chest-level kinetic subtitles: “to this, or this”. Keep his black t-shirt and posture consistent while the room looks instantly more premium.

[00:07-00:10] Show another talking-head angle in the polished room, then briefly cut to a behind-the-scenes view with a large softbox, chair, phone, and wall, revealing the practical filming setup. Bold subtitle words continue timing with speech: “AI”, “a key”, “45 degrees”, “of you”, “pocket”, “background”. Preserve the tutorial rhythm and creator hand gestures.

[00:10-00:15] Cut to smartphone recording interface views and the creator framed vertically on a phone screen. Emphasize that a screenshot of the clip is taken. Show the recording button, the portrait frame, and a quick screen capture moment. Transition into dark UI layouts branded around ElevenLabs, keeping the creator visible in a picture-in-picture talking-head box at the bottom.

[00:15-00:24] On a dark background, display rounded white software panels with image upload areas and labels such as “Image refs” and “Nano Banana Pro.” The creator appears in a small lower talking-head box speaking directly into a large black microphone. He points upward and times his gestures to each UI step. Show his portrait reference being uploaded, then the generated clean headshot-style reference. Keep the mic close, the room tone dry, and the delivery crisp.

[00:24-00:33] Reveal multiple generated stills of the same creator in slightly different rooms and lighting conditions, including warm interiors and cool-blue accented backgrounds. Show a blue “Download” button on one version. Then move into a “Kling 2.6 Motion Control” interface with two slots labeled for a character image and a motion video. The creator keeps explaining while pointing up toward the interface, maintaining fast tutorial cadence.

[00:33-00:39] Fill the motion-control interface step by step: first add the clean portrait as the character reference, then add the motion source video, then display the prompt text instructing the tool to transfer the motion of the first attached video into the attached image perfectly. Show the cursor moving to the upload arrow. Keep the software card large, centered, and readable, with the presenter anchored below.

[00:39-00:46] Cut back to vertical result examples of the same creator composited into new backgrounds while preserving body motion and framing. Show one scene with a dark studio doorway and plants, another with a warm shelf-lit interior. The creator continues speaking into the mic from the lower frame, emphasizing that the method preserves motion while swapping environment.

[00:46-00:52] Switch to the ElevenLabs Creative Platform UI. Show the creator clip inside the workspace, then navigate into audio features. Surface labels like “Sound effects” and “Studio quality voice,” plus a dropdown list of available voices. Keep the UI white and minimal, floating on a black canvas, while the creator explains how to finish the polish.

[00:52-00:57] Display a detailed equipment/setup page with headings like camera and lens suggestions, price examples, and notes about depth of field and aperture. Then cut back to a dark layout where the motion-transfer prompt card is visible alongside stacked vertical examples of the creator in different backgrounds. The creator maintains an urgent, confident CTA tone.

[00:57-00:59] End on a strong conversion frame: oversized yellow and white text reads Comment “Setup” while the creator points upward with one finger from the bottom talking-head box. Keep the black background clean, the examples stacked above, and the CTA unmistakable, optimized for saves and comments.

NEGATIVE PROMPT: do not change the presenter’s face, hairstyle, age range, build, shirt color, or watch between shots; avoid extra fingers, warped arms, asymmetrical eyes, rubbery skin, unstable jawline, drifting hairline, or mismatched ear shape; avoid random wardrobe swaps, logo changes, or added accessories; no flicker, temporal jitter, morphing backgrounds, UI text corruption, duplicated limbs, or inconsistent room geometry; no muddy compression, over-sharpening, clipped highlights, strange shadow directions, or cartoon skin smoothing; do not let the microphone appear in shots where it should be absent; avoid robotic speech, flat cadence, clipped plosives, harsh sibilance, room echo, bad lip sync, or subtitles that lag the spoken emphasis.
Video
GLOBAL LOCK:
The video features a white male creator in his mid-30s with medium-length, wavy brown hair and a groomed beard, wearing a clean white t-shirt. He is positioned in a bright home office with a professional black condenser microphone on a boom arm in the foreground. The video uses a split-screen or multi-panel layout to compare "Source Video" (the creator) with "AI Generated Results" (various celebrities and characters). The AI characters must perfectly mirror the creator's head tilt, facial expressions, lip-sync, and hand gestures. The lighting is soft, natural window light from the side. The color grade is clean and realistic.

[00:00–00:03]
The screen is split into three vertical panels. Top panel: The creator waves both hands excitedly and points to his right. Middle panel: Sabrina Carpenter in a pink feathered dress mimics the exact hand wave and pointing. Bottom panel: Billie Eilish in a black outfit and sunglasses mimics the same gestures. High-fidelity lip-sync as they all say "Hear me out."

[00:03–00:07]
The layout shifts. Top panel: Creator continues talking with expansive hand gestures. Middle panel: Taylor Swift in a red dress mimics the gestures. Bottom panel: Kim Kardashian in a black tank top mimics the gestures. The transitions between characters are sharp cuts.

[00:07–00:10]
Split screen: Creator (top) vs. Queen Elizabeth II (bottom). The creator looks to his left and then back to the camera with a skeptical expression. The Queen, wearing a crown and sash, mirrors the look perfectly.

[00:10–00:13]
Split screen: Creator (top) vs. Edna Mode from The Incredibles (bottom). The creator scratches the top of his head with his right hand. Edna Mode, with her signature bob and glasses, scratches her head in perfect sync.

[00:13–00:20]
A screen recording of a software interface (Enhancor). A cursor selects the "Wan2.2" model from a dropdown menu. The UI shows a "Source Video" of the creator and a "Character Image" of a woman. The cursor toggles "Pro Mode" on and adjusts resolution to 720p.

[00:20–00:23]
Split screen: Creator (top) vs. a woman with long brown hair in a floral dress (bottom). They are both in the same room. The creator raises his hands in a "stop" gesture; the woman mirrors him perfectly.

[00:23–00:27]
The UI returns, showing the "Photo Animate" tab being selected. A different reference photo of the same woman is used. The cursor clicks "Generate Video."

[00:27–00:35]
Final comparison. Split screen: Creator (top) vs. the woman (bottom). The creator looks around the room and then smiles at the camera while touching his hair. The woman mirrors the hair-touching and the smile, but her background is now a different indoor setting matching her reference photo. The text "AI" appears centered on the screen.

NEGATIVE PROMPT:
Visual: flickering faces, distorted limbs, extra fingers, blurry textures, face-swapping artifacts, unnatural skin smoothing, background warping, robotic movements, low resolution, watermarks.
Speech: robotic voice, mismatched lip-sync, muffled audio, background noise, unnatural pauses, clipping audio.

SPEECH PACK:
[00:00–00:07]
Transcript: "Hear me out, all of your favorite movies and animations are going to be completely acted out by someone else in the next two years."
TAKE_A: Energetic, fast-paced, direct-to-camera.
TAKE_B: Mysterious, slightly slower, emphasizing "completely."
TAKE_C: Casual, conversational, like a friend sharing a secret.

[00:07–00:13]
Transcript: "So I'm going to teach you everything you need to know about this in the next 20 seconds so that you can do this for yourself and stay ahead of the curve."
TAKE_A: Authoritative, instructional, rhythmic.
TAKE_B: Helpful, warm, encouraging.
TAKE_C: Urgent, fast-talking to fit the "20 seconds" claim.

[00:13–00:35]
Transcript: "So right now you have two options with this new AI video model called Wan 2.2. The first option is Character Swap... The second option is Photo Animate... This is absolutely mind-blowing. Comment AI for the link."
TAKE_A: Professional narrator style, clear enunciation.
TAKE_B: Enthusiastic, high energy on "mind-blowing."
TAKE_C: Calm, tech-reviewer tone, clear CTA at the end.
Video
GLOBAL LOCK: a fast vertical promo montage designed as a repeating educational teaser card for social media, black background with bold yellow headline bars, every card reading HOW TO MAKE VIRAL AI ANIMATION at the top and SWIPE FOR THE FULL GUIDE at the bottom, two image examples centered in each card showing different viral AI animation concepts, no live-action people speaking, no on-camera presenter, no dialogue, no narration required in-frame, no lip-sync, no subtitles beyond the built-in card text, no logo changes, consistent black-yellow-white branding, clean carousel-trailer energy, static card-by-card cuts rather than camera motion, rapid pacing that highlights niche variety: sharks, monsters, giant creatures, dragons, fantasy attacks, strange animals, cinematic action scenes, and surreal spectacle.

[00:00-00:04]
Begin with the branded promo card in full vertical frame: black background, thick yellow title block at the top reading HOW TO MAKE VIRAL AI ANIMATION in bold uppercase, two dramatic AI example thumbnails below showing giant shark and creature-action concepts, and a bottom call-to-action bar reading SWIPE FOR THE FULL GUIDE. The layout is static and poster-like, with hard cuts between card variations rather than any camera movement.

[00:04-00:08]
Cycle through additional cards using the exact same layout and typography while swapping the example images: monstrous open mouths, giant sharks, underwater threat scenes, or fantasy attack imagery. Maintain strict brand consistency so the viewer instantly understands this is one guide being advertised through multiple niche examples. No zooms, no parallax, no presenter face, only brisk card replacement.

[00:08-00:12]
Continue the template rhythm with more examples, now introducing oversized animals, desert creatures, and surreal danger shots. Each card still has the same yellow headline, centered dual-image examples, and the same swipe call to action at the bottom. Keep transitions hard and rhythmic, like a carousel trailer previewing what the audience will unlock by swiping.

[00:12-00:16]
Move into cards featuring fantasy-adventure and cinematic-scale setups such as dragons, towering beasts, or dramatic character-versus-creature scenes. The typography and CTA remain fixed. The examples are there to signal breadth: this guide is not about one trick, but many viral AI animation niches packaged under one offer.

[00:16-00:20]
End with a final burst of cards that reinforce the same educational promise, including additional creature-action and cinematic concept thumbnails, while preserving the identical black-yellow layout and SWIPE FOR THE FULL GUIDE footer. Finish on the impression of a rapid-fire tutorial promo reel that sells variety, niche fluency, and creator utility rather than telling a continuous story.

NEGATIVE PROMPT: talking presenter, face-to-camera coaching, white background minimalist ad, handwritten typography, changing brand colors, soft pastel palette, slow cinematic camera movement, motion graphics explainer charts, subtitles beyond the card text, random logos, blurred thumbnails, weak contrast, crowded multi-column layout, realistic office scene, dialogue bubbles, lip-sync character, unrelated aesthetic examples, low-readability text, inconsistent CTA wording.

SPEECH PACK: no visible speaker, no lip-sync requirement, no dialogue performed on screen, any audio intent should support fast promo pacing only, with the visual structure driven by repeated headline cards and image swaps rather than spoken explanation.
Video
GLOBAL LOCK: vertical 9:16 creator-style AI demo video explaining how to turn an ordinary room recording into a high-end studio look using HeyGen digital twin / AI avatar tools. The presenter is a young adult man with a beard, dark cap, black shirt, and a black microphone, speaking directly to camera in a simple indoor room. The core visual mechanic is that while he continues talking in the same pose and framing, the background and overall environment transform into different premium studio or luxury spaces. Use bold white headline text at the top such as “THIS IS WILD” with a fire emoji in repeated sections, and later a “2 minutes” overlay during the payoff section.

[00:00-00:08] Open on the presenter in a plain neutral room, seated and speaking naturally into a microphone. The composition is centered and clean, with a minimal home-office vibe. Large top text reads “THIS IS WILD” to frame the demo as a surprising tool discovery. The creator gestures with his hands while explaining the capability.

[00:08-00:18] Without changing his body position too much, swap the background into a polished luxury studio or cinematic interior. One variation should feel like an upscale pink-toned set with arches and refined lighting; another should look like a designer mountain-view backdrop with large windows or rocky scenery. The point is that the presenter remains consistent while the environment upgrades dramatically.

[00:18-00:30] Continue cycling through premium virtual environments: a modern living room with ocean or pool view, a high-end apartment, and a bright architectural interior. Keep the presenter lighting believable and integrated so it feels like he was recorded in those spaces. The creator continues speaking directly to the audience about HeyGen and the speed of creating studio-quality backgrounds.

[00:30-00:40] Emphasize the transformation speed and usefulness for creators. Show another round of polished environments while preserving the same speaking performance, framing, and microphone setup. The visual message should be: one ordinary room recording can become many professional background looks without reshooting.

[00:40-00:59] End with a stronger UI-style or promo-style payoff. Introduce a “2 minutes” top text card over a soft gradient or glow-backed composition that frames the presenter as if inside a premium AI video product ad. The last section should feel like a clean branded conclusion: high-quality AI avatar output, studio-grade backgrounds, and content-ready presentation in minutes.

VISUAL DNA:
- Same male presenter throughout, seated and speaking into microphone.
- Repeated background swaps into luxury studio, architectural, or scenic high-end environments.
- Bold top text like “THIS IS WILD” and later “2 minutes.”
- Creator-economy ad / demo energy, not cinematic fiction.
- Clean direct-to-camera delivery with practical use-case framing.

STYLE LOCK:
- Social-native creator tutorial / ad hybrid.
- Realistic background replacement rather than fantasy visual effects.
- Emphasis on consistency of the speaker while the environment changes.
- Useful for creators, educators, and marketers wanting professional video presence.

NEGATIVE PROMPT: full body avatar walking around, no microphone, gaming stream layout, cartoon avatar, low-quality green-screen edges, chaotic meme montage, political content, horror styling, dark dystopian sets, no text overlays, no background variation, noisy office clutter, dramatic action scene, crowded room, subtitles burned in, unrelated app dashboard dominating the frame.

SHOT PROMPTS:
SHOT 1: plain-room talking-head creator with “THIS IS WILD” text.
SHOT 2: same speaker composited into luxury pink arch studio and scenic mountain backdrop.
SHOT 3: modern sea-view or high-end apartment studio background while creator keeps talking.
SHOT 4: repeated premium room transformations showing HeyGen-style consistency.
SHOT 5: final “2 minutes” payoff frame with polished AI-avatar studio presentation.

SPEECH PACK:
[00:00-00:59] Natural creator commentary explaining that HeyGen can turn any regular room recording into a professional-looking studio setup and digital twin style output without needing a complex physical set.
Video
GLOBAL LOCK: A vertical 9:16 creator-economy tutorial reel that alternates between one male presenter speaking directly to camera and rounded-corner cinematic demo clips or dark-mode screen recordings above him. The presenter is a light-skinned man in his 20s or early 30s with side-parted brown hair, clean-shaven face, slim build, expressive hands, and a friendly but high-energy delivery style. He wears a cream textured overshirt or knit jacket over a black crew-neck shirt and speaks into a black podcast microphone positioned centrally in front of him. The base environment is a dark charcoal studio with soft frontal key light, warm amber background glow, crisp digital sharpness, and social-first edit pacing. The insert window above him cycles through realistic AI film shots, portrait references, and Higgsfield/Kling 3.0 interface screens. Speech should feel like an enthusiastic tutorial and sales-demo hybrid: one speaker, close-mic audio, clean articulation, medium-fast cadence, excited emphasis on realism, workflow ease, and the CTA to comment for the guide.

[00:00-00:07] Open on a dark vertical layout with bold white headline text reading “100% Made with AI” across the top. In the upper rounded insert window, show moody green-and-gold cinematic scenes with shallow depth of field, including a dim interior and an extreme close-up of a burning match or cigarette ember touching the floor. In the lower rounded talking-head panel, the creator points upward and speaks directly into the microphone with animated eyebrows and raised finger, introducing how realistic the AI results now look. Keep the lighting warm on his face and the lip-sync fairly tight.

[00:07-00:14] Accelerate into a realism montage in the upper insert: a boxing-ring close-up with a glove pushing into lens, a sharply lit city-street action shot of a man smashing glass with a bat, and a vintage car interior with a suited man driving through daylight streets. In the lower panel the same presenter keeps talking continuously, hands moving in small punches that match edit accents. Preserve clean, close podcast audio and energetic tutorial cadence.

[00:14-00:20] Cut to a portrait-reference stage. In the upper portion, show a full-body male character standing barefoot in a Japanese-style tatami room under a paper lantern, with the word “PORTRAIT” visible above. The man has dark hair, a dark hoodie, and light sweatpants, arms folded, used as the identity anchor for later generations. The presenter below explains this is the starting character image or reference needed for consistent output. Lighting in the reference image is neutral indoor daylight with soft warm wood trim.

[00:20-00:26] Transition to a dark-mode Higgsfield interface screen recording. The cursor scrolls past model cards where “Kling AI 3.0” is clearly visible, along with other video-generation options. The creator remains in the lower panel, still speaking in a persuasive, teacher-like tone about using the newest model and current offer. UI motion is smooth and cursor-driven; edits land on emphasized words.

[00:26-00:35] Move deeper into the workflow. Show upload panels, prompt fields, and example cinematic stills in the upper insert while the creator explains how to set up the generation. One prompt card references a character smoking and another visible text prompt describes the person getting frustrated while drawing, tearing up the page, and throwing it away. Keep the interface dark, minimal, and product-demo realistic. The presenter below gestures with one hand while staying centered in the lower frame.

[00:35-00:45] Display the generated sketching sequence in the upper insert: the same male character sits in a workshop or cluttered room with a cigarette in his mouth, sketching intensely on paper under greenish tungsten lighting. Follow with a close-up of the pencil drawing a car, then show a start-frame and end-frame layout above a bright yellow “Generate” button, making the interpolation workflow obvious. Speech continues as a single uninterrupted explanation about how to prompt scenes and transitions while preserving realism and identity.

[00:45-00:54] Finish with a rapid cinematic payoff montage. The upper insert cycles through fireworks reflecting in a man’s sunglasses, a pink balloon near an older man’s face, a fiery explosion in the sky, a plane-window travel shot, and finally a suited man by the airplane window. Over the top, bold CTA text appears: “Comment ‘AI’”. The presenter below raises his finger again and delivers the closing call to action for the guide and links. Audio remains one-speaker, close-mic, confident, slightly urgent, with no crowd noise and with the final CTA synced to the on-screen text.

NEGATIVE PROMPT: inconsistent face shape between shots, different hair color, extra fingers, broken glasses reflections, rubber skin, flat UI screenshots, unreadable prompt boxes, cheap green-screen compositing, low-detail backgrounds, jittery motion, robotic lips, muddy audio, crowd ambience, subtitles, watermarks, duplicated props, oversaturated neon color cast.

SHOT PROMPTS: dark studio creator tutorial; rounded-corner insert window; 100 percent made with AI hook; cinematic realism montage; boxing insert; glass-smash action shot; vintage car driver; portrait reference in tatami room; Higgsfield dark-mode UI; Kling 3.0 model card; upload-image workflow; prompt field; frustrated drawing prompt; cigarette sketching scene; start-frame end-frame generation; fireworks reflected in glasses; plane-window final montage; comment AI CTA.

SPEECH PACK: Single male speaker only. Tone should be excited, persuasive, and instructional, like a creator sharing a breakthrough workflow and an exclusive offer. Keep close-mic podcast texture, medium-fast pace, clear consonants, and strong emphasis on “Kling 3.0,” “realism,” and the final “comment AI” call to action.
Video

MASTER PROMPT
GLOBAL LOCK: Vertical creator tutorial reel about making rust-cleaning videos with generative AI. A male host in a cap speaks to camera while the reel alternates between rusty object examples, cleaning transformations, tool screens, and before-after visuals. Keep the process clear and the transformation obvious.

[00:00-00:05] Open on bold text and rust-cleaning examples.
[00:05-00:12] Show the host plus rusty object references and early before-after visuals.
[00:12-00:20] Move through workflow pages and tool screens.
[00:20-00:30] Show more transformation outputs from rusted to clean metal.
[00:30-00:44] End with recap examples and workflow close.

NEGATIVE PROMPT
Avoid muddy textures, weak before-after contrast, unreadable UI, inconsistent object geometry, and robotic host delivery.

SPEECH PACK
Open by framing the topic as how to make rust-cleaning videos. Walk through references, tools, prompts, and transformations. Close by reinforcing the workflow and creator use case.
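
The timecoded prompts above share a bracketed [start-end] pattern, and a single slip in one timecode can throw off the whole shot list. The following is a minimal sketch, not part of any tool named on this page, for checking that bracketed segments run back to back. The function names and the 0.01-second tolerance are assumptions made for illustration, and only the bracketed MM:SS form (with optional fractional seconds) is handled.

```python
import re

# Matches "[MM:SS-MM:SS]" with optional fractional seconds, hyphen or en dash.
SEGMENT = re.compile(
    r"\[(\d{2}):(\d{2}(?:\.\d+)?)\s*[-–]\s*(\d{2}):(\d{2}(?:\.\d+)?)\]"
)


def parse_segments(prompt_text):
    """Return (start, end) pairs in seconds for every well-formed timecode."""
    segments = []
    for match in SEGMENT.finditer(prompt_text):
        start = int(match.group(1)) * 60 + float(match.group(2))
        end = int(match.group(3)) * 60 + float(match.group(4))
        segments.append((start, end))
    return segments


def find_timeline_problems(segments, tolerance=0.01):
    """List segments that run backwards or leave a gap or overlap."""
    problems = []
    for i, (start, end) in enumerate(segments):
        if end <= start:
            problems.append(f"segment {i} ends before it starts")
        if i > 0 and abs(start - segments[i - 1][1]) > tolerance:
            problems.append(
                f"segment {i} does not start where segment {i - 1} ends"
            )
    return problems


prompt = "[00:00-00:05] hook\n[00:05-00:12] references\n[00:12-00:20] tools"
print(find_timeline_problems(parse_segments(prompt)))
```

A malformed timecode does not match the pattern and is skipped, so it shows up as a gap between its neighbours instead of passing silently.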

How to Make an AI Meme Video

Making an AI meme video starts with a simple joke, reaction, or trend idea and turns it into a short clip people want to share. Beginners tend to get the best results when they commit to one format early: a reaction clip, a captioned short, a loop, or a remix built from an existing source. Keep the setup simple enough that the punchline still lands.

The workflow has four steps: pick a tool, prepare the source image or clip, decide where the captions should appear, and trim the final video to a length that suits the platform you want to post on. If the first result feels flat, adjust the timing, caption placement, or pacing instead of rebuilding the whole thing. The easiest meme videos usually come from small edits that make the joke read faster.
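
The workflow can be written down as a small pre-flight checklist before any generation or editing starts. This is a rough sketch under stated assumptions: `MemePlan`, `preflight`, the format identifiers, and the 60-character caption default are all hypothetical names and values chosen for illustration, and the platform duration limit is passed in by the caller because it differs by platform.

```python
from dataclasses import dataclass

# The four formats named in the text; the identifiers are illustrative.
FORMATS = ("reaction", "captioned_short", "loop", "remix")
CAPTION_POSITIONS = ("top", "center", "bottom")


@dataclass
class MemePlan:
    tool: str
    source_path: str
    meme_format: str
    caption: str
    caption_position: str
    duration_s: float


def preflight(plan, max_duration_s, max_caption_chars=60):
    """Return a list of issues to fix before generating or exporting."""
    issues = []
    if not plan.tool.strip():
        issues.append("pick a tool")
    if not plan.source_path.strip():
        issues.append("prepare a source image or clip")
    if plan.meme_format not in FORMATS:
        issues.append(f"choose one format: {', '.join(FORMATS)}")
    if plan.caption_position not in CAPTION_POSITIONS:
        issues.append("decide where the caption should appear")
    if len(plan.caption) > max_caption_chars:
        issues.append("shorten the caption so the joke reads faster")
    if plan.duration_s > max_duration_s:
        issues.append("trim the clip to fit the platform limit")
    return issues


plan = MemePlan(
    tool="any-video-tool",        # placeholder, not a real product name
    source_path="reaction.png",   # placeholder path
    meme_format="reaction",
    caption="When the render finishes on the first try",
    caption_position="top",
    duration_s=22.0,
)
print(preflight(plan, max_duration_s=15.0))
```

An empty list means the plan covers every step; each remaining issue points at one small fix, so the plan never has to be rebuilt from scratch.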

FAQ

What do I need to make an AI meme video?

You need an AI video or editing tool, a joke or reaction idea, and a source image or clip that can be cut down to a short format.

How should I choose the format?

Pick the format that matches the joke best, such as a reaction layout, short loop, or caption-driven clip.

What if the meme video does not feel funny enough?

Adjust the timing, caption placement, or pacing so the punchline lands more clearly in the first few seconds.
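
Caption timing is easier to adjust when the cues live in a plain subtitle file rather than being baked into the video. The sketch below writes cues in the SubRip (.srt) layout and checks whether the punchline starts early enough. The helper names and the 3-second threshold standing in for "the first few seconds" are assumptions for illustration, and whether a given tool imports .srt files is not something this page confirms.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Build SRT text from (start_s, end_s, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


def punchline_lands_early(cues, punchline_index, within_s=3.0):
    """True if the punchline cue starts inside the opening window."""
    return cues[punchline_index][0] <= within_s


cues = [
    (0.0, 1.5, "Me: just one quick edit"),   # setup
    (1.5, 4.0, "The export bar at 99%:"),    # punchline
]
print(to_srt(cues))
print(punchline_lands_early(cues, punchline_index=1))
```

If the check fails, shortening the setup cue is usually a smaller change than rewriting the joke.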

How do I export for the right platform?

Choose a frame size (for example vertical 9:16, as in the reels described on this page) and a length that fit the platform you want to post on, then keep the clip short and easy to replay.
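
For anyone exporting outside an app, one possible approach is to let the ffmpeg command-line tool resize and trim the clip. The sketch below only builds the command; it assumes ffmpeg is installed separately, and the 1080x1920 frame, the 15-second cap, and the file names are illustrative defaults, not platform requirements stated on this page.

```python
def build_export_command(src, dst, width=1080, height=1920, max_seconds=15.0):
    """Build an ffmpeg command that fits a clip into a fixed vertical frame."""
    # Scale to fit inside the frame without distortion, then pad to center it.
    video_filter = (
        f"scale={width}:{height}:force_original_aspect_ratio=decrease,"
        f"pad={width}:{height}:(ow-iw)/2:(oh-ih)/2"
    )
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-t", str(max_seconds),     # cap the output length
        "-vf", video_filter,
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",      # widely compatible pixel format
        "-c:a", "aac",
        "-movflags", "+faststart",  # lets playback begin before full download
        dst,
    ]


cmd = build_export_command("meme_raw.mp4", "meme_vertical.mp4")
print(" ".join(cmd))
# To run it: import subprocess; subprocess.run(cmd, check=True)
```

Padding keeps the whole source visible with bars around it; cropping instead would fill the frame but can cut off captions near the edges.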

How to Make an AI Meme Video: Beginner Workflow Guide | Alici.AI