AI Meme Video Generator

An AI meme video generator is for creators who want motion to be part of the joke. They are not after a static meme with text dropped on top; they want loops, short edits, moving captions, or punchlines that land better in video form. Use this page to compare meme video directions that feel native to TikTok, Reels, X, and other fast-sharing platforms.

Video
GLOBAL LOCK: 
Subject: A young woman in her mid-20s, light skin with warm undertones, long wavy dark brown hair parted in the middle. She wears a white ribbed turtleneck sweater and a silver watch on her left wrist. 
Environment: A clean studio with a soft purple and pink gradient background. A dark desk and the edge of a laptop are visible in the foreground.
Style: High-definition UGC tech tutorial, clean lighting, vibrant colors.
AI Animation Style: High-fidelity 3D cartoon animation (saturated colors, smooth motion) and cinematic photorealistic action.
Speech: Female voice, enthusiastic and clear, medium pace, professional mic quality with slight room resonance.

[00:00–00:02]
Subject: Host looking directly at camera, speaking.
B-roll Overlay: A cinematic, high-speed desert buggy racing through sand dunes, massive dust clouds billowing behind it. High-contrast, bright sunlight.
Action: Host gestures slightly with hands. Buggy moves rapidly from left to right.
Speech: "There's a website where you can create"
Sync: High lip-sync strictness.

[00:02–00:04]
Subject: Host speaking.
B-roll Overlay: Close-up of the buggy's wheels churning sand, intense motion blur. Text "Consistent AI Videos" appears in bold yellow.
Action: Fast-paced action shot.
Speech: "consistent AI videos like Higgsfield AI"
Sync: Cut lands on "Higgsfield".

[00:04–00:06]
Subject: Host speaking, smiling.
B-roll Overlay: Higgsfield AI logo (green square with a black squiggle).
Speech: "and it's completely free."
Sync: High lip-sync strictness.

[00:06–00:09]
Subject: Host speaking.
B-roll Overlay: Screen recording of the Higgsfield interface. A panda is shown in a video preview. Text "Just paste or prompt" appears.
Action: Mouse cursor hovers over the "Create Video" button.
Speech: "Just paste your image or prompt and the platform"

[00:09–00:13]
Subject: Host speaking.
B-roll Overlay: A scrolling list of AI models (Claude, Gemini, Grok) followed by a grid of AI video tool logos.
Action: Rapid scrolling motion.
Speech: "generates the video for you. Now you might think every AI video tool can do that,"

[00:13–00:17]
Subject: Host speaking, leaning forward slightly.
B-roll Overlay: Screen recording of a 3D cartoon cat chasing a mouse. A right-click menu appears over the video.
Action: Mouse selects "Copy Video Frame".
Speech: "but here's what makes this one special. After the video is generated,"

[00:17–00:20]
Subject: Host speaking.
B-roll Overlay: The copied frame is pasted into the prompt box.
Action: UI interaction showing the image being uploaded.
Speech: "you can right-click and copy the last frame, then paste that frame"

[00:20–00:26]
Subject: Host speaking (small window) / Full-screen animation.
Visual: The 3D cartoon cat continues the chase, running through a hole in the wall. The mouse is seen inside the wall with a piece of cheese.
Action: Smooth, high-speed character animation. The cat looks frustrated.
Speech: "back into the tool and continue the prompt. The AI will continue the story exactly where the first video ended,"

[00:26–00:31]
Subject: Host speaking.
B-roll Overlay: A new animation of a stylized 3D family (father, mother, two children) standing outside a house. A yellow school bus drives into the frame.
Action: The bus stops, the camera pans slightly.
Speech: "keeping the same characters and visual style. So instead of random clips, you can create consistent story-based videos"

[00:31–00:34]
Subject: Host speaking directly to camera, friendly expression.
Visual: Text "Comment Video" and "send you Video" appear in yellow.
Action: Host clasps hands on the desk.
Speech: "scene after scene. Want to try it yourself? Comment 'Video' and I'll send you over."
Sync: High lip-sync strictness on CTA.

NEGATIVE PROMPT: Visual artifacts, distorted faces, flickering backgrounds, inconsistent clothing colors, robotic mouth movements, blurry UI text, harsh shadows on the host, unnatural hair physics in animation, audio clipping, background noise, muffled speech.

SPEECH PACK:
[00:00-00:04] "There's a website where you can create consistent AI videos like Higgsfield AI"
TAKE_A: (Enthusiastic, fast) "There's a website where you can create consistent AI videos like Higgsfield AI"
TAKE_B: (Informative, steady) "There's a website... where you can create consistent AI videos... like Higgsfield AI"

[00:04-00:13] "and it's completely free. Just paste your image or prompt and the platform generates the video for you. Now you might think every AI video tool can do that,"
TAKE_A: (Emphasizing 'free' and 'every') "and it's completely FREE. Just paste your image or prompt and the platform generates the video for you. Now you might think EVERY AI video tool can do that,"

[00:13-00:20] "but here's what makes this one special. After the video is generated, you can right-click and copy the last frame, then paste that frame"
TAKE_A: (Intriguing tone) "but here's what makes THIS one special. After the video is generated, you can right-click and copy the last frame, then paste that frame"

[00:20-00:34] "back into the tool and continue the prompt. The AI will continue the story exactly where the first video ended, keeping the same characters and visual style. So instead of random clips, you can create consistent story-based videos scene after scene. Want to try it yourself? Comment 'Video' and I'll send you over."
TAKE_A: (Helpful and encouraging) "back into the tool and continue the prompt. The AI will continue the story EXACTLY where the first video ended... keeping the same characters and visual style. So instead of random clips, you can create consistent story-based videos scene after scene. Want to try it yourself? Comment 'Video' and I'll send you over!"
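
The narration above describes chaining clips: copy the last frame of one generation, then paste it in as the starting image for the next prompt. Outside the tool's right-click menu, one way to grab that frame from a downloaded clip is ffmpeg. This is a swapped-in technique, not something the source shows. The sketch below only builds the command; it assumes ffmpeg is installed, and the file names are hypothetical.

```python
from typing import List


def last_frame_command(video_path: str, image_path: str, tail_seconds: float = 1.0) -> List[str]:
    """Build an ffmpeg command that saves the final frame of a clip.

    -sseof seeks relative to the end of the input, and -update 1 makes the
    image writer overwrite a single file, so the last decoded frame is what
    remains on disk.
    """
    if tail_seconds <= 0:
        raise ValueError("tail_seconds must be positive")
    return [
        "ffmpeg",
        "-y",                            # overwrite the output image if it exists
        "-sseof", f"-{tail_seconds:g}",  # start this many seconds before the end
        "-i", video_path,
        "-update", "1",                  # keep rewriting one image file
        "-q:v", "2",                     # high JPEG quality
        image_path,
    ]


# Hypothetical file names: clip_01.mp4 stands in for the first generated clip.
cmd = last_frame_command("clip_01.mp4", "clip_01_last.jpg")
print(" ".join(cmd))
```

Running the list with `subprocess.run(cmd, check=True)` would write the image, which can then be uploaded as the start frame for the continuation prompt.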
Video
GLOBAL LOCK: preserve a creator-led talking-head tutorial format mixed with vertical phone screen recordings. Keep one young male creator in a backward black cap and dark hoodie speaking directly to camera in a studio setup with a microphone. Intercut iPhone-style screen captures showing ChatGPT/OpenAI image workflow steps, uploaded object photos, prompt entry, and AI video generation screens. Maintain a practical “make from your phone” educational reel structure. No random B-roll, no unrelated tools, no logo overlays beyond app UI already present in the source.

Create a 37.8-second social-first AI tutorial reel showing how to turn ordinary phone photos into animated AI character videos. Begin with a hook using a simple hand-held object photo and bold on-screen teaching posture from the creator. Then show phone interfaces: photo selection, ChatGPT or image-tool screens, prompt entry, image transformation results, switching to an AI video tool, uploading the generated image, entering a motion prompt, and generating the final animated output. Use repeated face-cam segments where the creator explains the steps and emphasizes that the workflow can be done from a phone.

Include the specific examples visible in the source: tiny object/food photos held in a hand, ChatGPT app icon and mobile interface, typed prompts that turn objects into cute expressive characters, a generated pear-like baby character image, a switch to another AI generation interface, upload and prompt steps for video, and a final generated moving result shown on-screen. Preserve the educational pacing and creator-marketing vibe.

SHOT SEGMENTS:
[00:00-00:06] Hook with object photos in hand and creator talking-head intro about making AI content from your phone.
[00:06-00:14] Mobile screens show ChatGPT / image workflow setup, app screens, and prompt entry.
[00:14-00:22] Creator explains the key steps while on-screen phone UI shows prompt refinement and generated object-to-character image outputs.
[00:22-00:30] The tutorial switches to an AI video tool, showing upload, prompt, and generation steps from the phone.
[00:30-00:37.8] Final result displays the generated animated character clip, while the creator closes with a call to try the workflow.
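
A shot list like the one above is easy to sanity-check before it goes to a generator: the segments should start at zero, run back to back, and end on the stated 37.8-second runtime. A minimal sketch of that check follows; the function names are illustrative, and the segment labels are shortened for the example.

```python
import re
from typing import List, Tuple

# Matches "[MM:SS-MM:SS]" with optional fractional seconds and a hyphen or en dash.
SEGMENT_RE = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)[-\u2013](\d+):(\d+(?:\.\d+)?)\]")


def parse_segments(text: str) -> List[Tuple[float, float]]:
    """Return (start, end) pairs in seconds for every timecode range in text."""
    return [
        (int(m1) * 60 + float(s1), int(m2) * 60 + float(s2))
        for m1, s1, m2, s2 in SEGMENT_RE.findall(text)
    ]


def check_timeline(segments: List[Tuple[float, float]], runtime: float, tol: float = 1e-6) -> List[str]:
    """List problems: a late start, gaps or overlaps, or an end that misses runtime."""
    if not segments:
        return ["no segments found"]
    problems = []
    if abs(segments[0][0]) > tol:
        problems.append(f"timeline starts at {segments[0][0]:g}s, not 0s")
    for (_, prev_end), (start, _) in zip(segments, segments[1:]):
        if abs(start - prev_end) > tol:
            problems.append(f"gap or overlap between {prev_end:g}s and {start:g}s")
    if abs(segments[-1][1] - runtime) > tol:
        problems.append(f"timeline ends at {segments[-1][1]:g}s, expected {runtime:g}s")
    return problems


shot_segments = """
[00:00-00:06] Hook
[00:06-00:14] Mobile screens
[00:14-00:22] Creator explains
[00:22-00:30] AI video tool
[00:30-00:37.8] Final result
"""
segments = parse_segments(shot_segments)
print(check_timeline(segments, runtime=37.8))  # prints []
```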

ENVIRONMENT: creator desk/studio face-cam plus crisp mobile screen recordings. CAMERA: direct-to-camera presenter shots alternating with full-screen phone UI captures. LIGHTING: clean creator-studio lighting on face-cam; bright legible phone UI on inserts. MOTION: tutorial pacing, finger taps on phone UI, creator emphasis gestures, no cinematic narrative scenes.

NEGATIVE PROMPT: generic AI ad montage, unrelated tools, desktop-only workflow, missing phone UI, missing creator face-cam, subtitles replacing the actual visible UI, blurry screens, watermark, logo overlays.

SPEECH PACK: creator-to-camera tutorial speech implied, but do not transcribe captions here.
Video
GLOBAL LOCK:
- Format: vertical 9:16 short-form tutorial reel, creator-education pacing, black background UI inserts, high contrast social video polish.
- Keep one consistent male creator for all talking-head shots: young adult male, light skin, black backwards baseball cap, black hoodie/jacket, seated at desk, direct-to-camera framing, confident tutorial delivery.
- Keep one consistent demo subject inside the generated example image/video: a plush panda lying on a worn circular rug in a dim rustic room with warm overhead spotlight, scattered objects around the floor, soft moody shadows.
- No character drift, no costume drift, no sudden age changes, no extra presenters, no unrelated cutaways.

SHOT TIMELINE:

[00:00-00:03]
Talking-head intro. Creator sits centered against dark background and speaks straight to camera with energetic tutorial tone. Large editorial text overlays summarize the hook: make cinematic scenes from your phone. Insert fast teaser flashes of social posts showing the panda image/video result and yellow headline blocks.

[00:03-00:06]
Phone close-up UI. Vertical smartphone screen fills frame. A circularly framed panda image appears inside a social-style composition. Overlaid kinetic words emphasize the concept of turning a phone photo into a scene. Screen recording aesthetic should remain crisp and legible.

[00:06-00:09]
Back to talking head. Creator gestures lightly while saying the workflow starts by opening the app. Tight chest-up framing, direct eye contact, subtle head movement, clean synced speech.

[00:09-00:12]
Phone settings interface. User taps through app menu and settings-like pages to reach AI generation tools. Interface is dark mode, minimal, modern, with distinct list items and icons.

[00:12-00:16]
Prompt-building section on phone. Search field, model selection, and text-entry screens appear. User searches for GPT/prompt helper style tools, selects options, and opens a text area. On-screen rhythm should clearly communicate “build the prompt first.”

[00:16-00:20]
Text drafting flow on phone. Long paragraph prompt appears in a dark text box. User chooses/copies prompt text, then taps through action buttons. Highlight the exact motions: choose, copy, click, and go. The UI should feel like a real mobile workflow, not abstract fake panels.

[00:20-00:24]
Model/generation interface. User pastes the prompt into an AI image/video generation tool, selects the correct model or preset, and taps generate. Show dark-mode tool UI with image prompt area, buttons, and tabs.

[00:24-00:28]
Example asset preview returns. The panda scene appears again as a generated image/video preview. The phone screen cycles from prompt entry to generated result. Add supporting overlay words that reinforce the logic of generating the scene from a single photo.

[00:28-00:32]
Phone-to-output transition. The generated panda shot becomes larger and more immersive, as if stepping out of the interface into the final cinematic frame. Keep the panda, rug, spotlight, and room layout consistent with the reference image.

[00:32-00:35]
Talking-head recap. Creator returns on camera and explains the final step or CTA. He maintains same wardrobe and setup, speaking with persuasive, practical creator-teacher energy.

[00:35-00:39]
Final CTA and social proof. Talking-head remains center frame while comment-style overlays and platform UI elements appear below, suggesting engagement and repeatability. End on a clean, punchy tutorial finish.

VISUAL STYLE:
- Social tutorial reel, fast but readable editing.
- Mix talking-head shots with direct phone-screen recordings.
- Dark UI, white text, occasional high-contrast yellow hook text.
- Clean mobile creator aesthetic with authentic app interaction.

CAMERA AND EDITING:
- Talking-head: locked tripod or subtle digital push-in.
- Phone segments: full-screen mobile capture with smooth taps and transitions.
- Fast snap cuts between explanation, interface, and result.
- Keep chronological clarity so the viewer can follow the workflow in order.

SPEECH PACK:
- Spoken language: English.
- Creator voice: young male creator educator, confident, concise, practical, slightly hyped but not cheesy.
- Delivery style: short tutorial phrases, clear CTA emphasis, social-video pacing.
- Lip sync must stay natural and tightly aligned during talking-head shots.

NEGATIVE PROMPT:
- No extra hands floating over the phone.
- No unreadable UI gibberish replacing app text.
- No switching creator identity between talking-head shots.
- No panda changing species, color, pose logic, or room layout between preview and final output.
- No random additional animals or fantasy objects appearing in the room.
- No horizontal framing, no cinematic letterboxing, no documentary cutaways.
- No blurred phone screens, broken typography, or unusable interface text.
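
The format lock (vertical 9:16, no horizontal framing) can be verified on any rendered frame with exact integer arithmetic. A minimal sketch follows; the 1080x1920 and 720-wide sizes are example values assumed for illustration, not settings named in the prompt.

```python
from fractions import Fraction

VERTICAL_9_16 = Fraction(9, 16)  # width : height


def is_vertical_9_16(width: int, height: int) -> bool:
    """True when the frame is exactly 9:16, so taller than it is wide."""
    return Fraction(width, height) == VERTICAL_9_16


def height_for_width(width: int) -> int:
    """Height of a 9:16 frame for a given width, which must be a multiple of 9."""
    if width % 9 != 0:
        raise ValueError("width must be a multiple of 9 for an exact 9:16 frame")
    return width * 16 // 9


print(is_vertical_9_16(1080, 1920))  # True
print(is_vertical_9_16(1920, 1080))  # False: this is horizontal 16:9
print(height_for_width(720))         # 1280
```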
Video
GLOBAL LOCK: A 9:16 vertical creator tutorial video showing how to build cinematic AI videos inside Freepik Spaces using Kling 3.0. The structure alternates between a casual male creator talking directly to camera, screen-like workflow panels, and polished AI-generated example sequences. The speaker is a white male in his 20s or 30s with beard, cap, and casual streetwear, filmed in a warm apartment or studio environment. He should feel approachable, creator-native, and energetic rather than corporate. Keep the edit fast and legible, with repeated “How to do this” framing, visual examples of cinematic shots, and interface scenes that imply prompt building, scene sequencing, and generation controls. Audio is speech-first and educational, with the creator explaining the workflow in concise steps.

[00:00-00:05] Open on a catchy example visual or lifestyle shot with bold tutorial framing like “How to do this,” immediately pairing aspirational output with educational intent.

[00:05-00:10] Cut to the creator talking directly to camera in a casual indoor setup, hands gesturing upward as he introduces the workflow and hooks viewers with the promise of showing the full process.

[00:10-00:18] Alternate between creator face-cam, finished AI shots, and screen-style panels showing thumbnails or interface blocks, making it clear that multiple scenes are being built inside one pipeline.

[00:18-00:28] Include more practical inserts: example frames, real-world pose or filming inspiration, and workflow interface layouts that suggest prompt control, shot planning, and visual refinement.

[00:28-00:40] Keep cycling between explanation and proof, with the creator speaking in short, punchy segments while the examples show the quality ceiling of the method.

[00:40-00:56] End with a clearer recap feel: more screen panels, more finished outputs, and a final face-cam summary that reinforces this as a repeatable Freepik Spaces plus Kling production workflow.

NEGATIVE PROMPT: dry webinar, plain slideshow only, missing example outputs, stiff face-cam, dark podcast studio, random office footage, unreadable UI, over-designed captions everywhere, broken hands, uncanny face, robotic speech, disconnected examples, generic stock footage, text-heavy PowerPoint feel, poor pacing, muddy screen inserts, lip-sync errors, low-quality AI art, unrelated memes.

SHOT PROMPT DELTAS:
1) Aspirational example frame with tutorial hook text treatment.
2) Casual creator face-cam explaining workflow.
3) Screen-style interface panels and scene thumbnails.
4) Example cinematic outputs paired with explanation.
5) Final recap with tools, outputs, and creator closeout.

SPEECH PACK:
[00:00-00:56] One male speaker throughout. Tone should be concise, confident, and creator-educational, explaining how to structure prompts, build shots, and use Freepik Spaces with Kling 3.0 to generate cinematic AI videos. Medium lip-sync strictness when on-camera.
Video
GLOBAL LOCK:
The video features a white male creator in his mid-30s with medium-length, wavy brown hair and a groomed beard, wearing a clean white t-shirt. He is positioned in a bright home office with a professional black condenser microphone on a boom arm in the foreground. The video uses a split-screen or multi-panel layout to compare "Source Video" (the creator) with "AI Generated Results" (various celebrities and characters). The AI characters must perfectly mirror the creator's head tilt, facial expressions, lip-sync, and hand gestures. The lighting is soft, natural window light from the side. The color grade is clean and realistic.

[00:00–00:03]
The screen is split into three vertically stacked panels. Top panel: The creator waves both hands excitedly and points to his right. Middle panel: Sabrina Carpenter in a pink feathered dress mimics the exact hand wave and pointing. Bottom panel: Billie Eilish in a black outfit and sunglasses mimics the same gestures. High-fidelity lip-sync as they all say "Hear me out."

[00:03–00:07]
The layout shifts. Top panel: Creator continues talking with expansive hand gestures. Middle panel: Taylor Swift in a red dress mimics the gestures. Bottom panel: Kim Kardashian in a black tank top mimics the gestures. The transitions between characters are sharp cuts.

[00:07–00:10]
Split screen: Creator (top) vs. Queen Elizabeth II (bottom). The creator looks to his left and then back to the camera with a skeptical expression. The Queen, wearing a crown and sash, mirrors the look perfectly.

[00:10–00:13]
Split screen: Creator (top) vs. Edna Mode from The Incredibles (bottom). The creator scratches the top of his head with his right hand. Edna Mode, with her signature bob and glasses, scratches her head in perfect sync.

[00:13–00:20]
A screen recording of a software interface (Enhancor). A cursor selects the "Wan2.2" model from a dropdown menu. The UI shows a "Source Video" of the creator and a "Character Image" of a woman. The cursor toggles "Pro Mode" on and adjusts resolution to 720p.

[00:20–00:23]
Split screen: Creator (top) vs. a woman with long brown hair in a floral dress (bottom). They are both in the same room. The creator raises his hands in a "stop" gesture; the woman mirrors him perfectly.

[00:23–00:27]
The UI returns, showing the "Photo Animate" tab being selected. A different reference photo of the same woman is used. The cursor clicks "Generate Video."

[00:27–00:35]
Final comparison. Split screen: Creator (top) vs. the woman (bottom). The creator looks around the room and then smiles at the camera while touching his hair. The woman mirrors the hair-touching and the smile, but her background is now a different indoor setting matching her reference photo. The text "AI" appears centered on the screen.

NEGATIVE PROMPT:
Visual: flickering faces, distorted limbs, extra fingers, blurry textures, face-swapping artifacts, unnatural skin smoothing, background warping, robotic movements, low resolution, watermarks.
Speech: robotic voice, mismatched lip-sync, muffled audio, background noise, unnatural pauses, clipping audio.

SPEECH PACK:
[00:00–00:07]
Transcript: "Hear me out, all of your favorite movies and animations are going to be completely acted out by someone else in the next two years."
TAKE_A: Energetic, fast-paced, direct-to-camera.
TAKE_B: Mysterious, slightly slower, emphasizing "completely."
TAKE_C: Casual, conversational, like a friend sharing a secret.

[00:07–00:13]
Transcript: "So I'm going to teach you everything you need to know about this in the next 20 seconds so that you can do this for yourself and stay ahead of the curve."
TAKE_A: Authoritative, instructional, rhythmic.
TAKE_B: Helpful, warm, encouraging.
TAKE_C: Urgent, fast-talking to fit the "20 seconds" claim.

[00:13–00:35]
Transcript: "So right now you have two options with this new AI video model called Wan 2.2. The first option is Character Swap... The second option is Photo Animate... This is absolutely mind-blowing. Comment AI for the link."
TAKE_A: Professional narrator style, clear enunciation.
TAKE_B: Enthusiastic, high energy on "mind-blowing."
TAKE_C: Calm, tech-reviewer tone, clear CTA at the end.
Video
GLOBAL LOCK: A vertical 9:16 creator-economy tutorial reel that alternates between one male presenter speaking directly to camera and rounded-corner cinematic demo clips or dark-mode screen recordings above him. The presenter is a light-skinned man in his 20s or early 30s with side-parted brown hair, clean-shaven face, slim build, expressive hands, and a friendly but high-energy delivery style. He wears a cream textured overshirt or knit jacket over a black crew-neck shirt and speaks into a black podcast microphone positioned centrally in front of him. The base environment is a dark charcoal studio with soft frontal key light, warm amber background glow, crisp digital sharpness, and social-first edit pacing. The insert window above him cycles through realistic AI film shots, portrait references, and Higgsfield/Kling 3.0 interface screens. Speech should feel like an enthusiastic tutorial and sales-demo hybrid: one speaker, close-mic audio, clean articulation, medium-fast cadence, excited emphasis on realism, workflow ease, and the CTA to comment for the guide.

[00:00-00:07] Open on a dark vertical layout with bold white headline text reading “100% Made with AI” across the top. In the upper rounded insert window, show moody green-and-gold cinematic scenes with shallow depth of field, including a dim interior and an extreme close-up of a burning match or cigarette ember touching the floor. In the lower rounded talking-head panel, the creator points upward and speaks directly into the microphone with animated eyebrows and raised finger, introducing how realistic the AI results now look. Keep the lighting warm on his face and the lip-sync fairly tight.

[00:07-00:14] Accelerate into a realism montage in the upper insert: a boxing-ring close-up with a glove pushing into lens, a sharply lit city-street action shot of a man smashing glass with a bat, and a vintage car interior with a suited man driving through daylight streets. In the lower panel the same presenter keeps talking continuously, hands moving in small punches that match edit accents. Preserve clean, close podcast audio and energetic tutorial cadence.

[00:14-00:20] Cut to a portrait-reference stage. In the upper portion, show a full-body male character standing barefoot in a Japanese-style tatami room under a paper lantern, with the word “PORTRAIT” visible above. The man has dark hair, a dark hoodie, and light sweatpants, arms folded, used as the identity anchor for later generations. The presenter below explains this is the starting character image or reference needed for consistent output. Lighting in the reference image is neutral indoor daylight with soft warm wood trim.

[00:20-00:26] Transition to a dark-mode Higgsfield interface screen recording. The cursor scrolls past model cards where “Kling AI 3.0” is clearly visible, along with other video-generation options. The creator remains in the lower panel, still speaking in a persuasive, teacher-like tone about using the newest model and current offer. UI motion is smooth and cursor-driven; edits land on emphasized words.

[00:26-00:35] Move deeper into the workflow. Show upload panels, prompt fields, and example cinematic stills in the upper insert while the creator explains how to set up the generation. One prompt card references a character smoking and another visible text prompt describes the person getting frustrated while drawing, tearing up the page, and throwing it away. Keep the interface dark, minimal, and product-demo realistic. The presenter below gestures with one hand while staying centered in the lower frame.

[00:35-00:45] Display the generated sketching sequence in the upper insert: the same male character sits in a workshop or cluttered room with a cigarette in his mouth, sketching intensely on paper under greenish tungsten lighting. Follow with a close-up of the pencil drawing a car, then show a start-frame and end-frame layout above a bright yellow “Generate” button, making the interpolation workflow obvious. Speech continues as a single uninterrupted explanation about how to prompt scenes and transitions while preserving realism and identity.

[00:45-00:54] Finish with a rapid cinematic payoff montage. The upper insert cycles through fireworks reflecting in a man’s sunglasses, a pink balloon near an older man’s face, a fiery explosion in the sky, a plane-window travel shot, and finally a suited man by the airplane window. Over the top, bold CTA text appears: “Comment ‘AI’”. The presenter below raises his finger again and delivers the closing call to action for the guide and links. Audio remains one-speaker, close-mic, confident, slightly urgent, with no crowd noise and with the final CTA synced to the on-screen text.

NEGATIVE PROMPT: inconsistent face shape between shots, different hair color, extra fingers, broken glasses reflections, rubber skin, flat UI screenshots, unreadable prompt boxes, cheap green-screen compositing, low-detail backgrounds, jittery motion, robotic lips, muddy audio, crowd ambience, subtitles, watermarks, duplicated props, oversaturated neon color cast.

SHOT PROMPTS: dark studio creator tutorial; rounded-corner insert window; 100 percent made with AI hook; cinematic realism montage; boxing insert; glass-smash action shot; vintage car driver; portrait reference in tatami room; Higgsfield dark-mode UI; Kling 3.0 model card; upload-image workflow; prompt field; frustrated drawing prompt; cigarette sketching scene; start-frame end-frame generation; fireworks reflected in glasses; plane-window final montage; comment AI CTA.

SPEECH PACK: Single male speaker only. Tone should be excited, persuasive, and instructional, like a creator sharing a breakthrough workflow and an exclusive offer. Keep close-mic podcast texture, medium-fast pace, clear consonants, and strong emphasis on “Kling 3.0,” “realism,” and the final “comment AI” call to action.
Video
A) MISE EN PLACE
1) Video segmented into scenes:
- [00:00-00:01]: Static UI establishment.
- [00:01-00:04]: First animation cycle (clips drop down).
- [00:04-00:05]: Retraction.
- [00:05-00:08]: Second animation cycle.
- [00:08-00:09]: Final retraction.
2) Visual evidence extracted:
- Keyframes show a dark UI background, bold yellow/white text top and bottom, a central horizontal video player, and a timeline strip.
3) Speech evidence:
- No original audio provided. Assuming a standard promotional voiceover matching the text.
4) Invariants list:
- Visuals: Black background, top text ("2: MEET THE AI TOOL THAT UNDERSTANDS YOUR VIDEO👇"), bottom text ("TIP: Comment 'AI' and I'll send it directly to your DMs right now"), pointing hand icon, central horizontal video player showing two men talking.
- Speech: Upbeat, clear promotional tone.
5) Variables list:
- Visuals: Position of the three vertical dropdown clips, position of the red playhead on the timeline.

B) SHOTLIST
- shot_id: 1, timecode: 00:00-00:09, duration: 9s
- framing: Full screen graphic layout.
- lens: N/A (2D motion graphics).
- camera movement: Static camera, elements animate within the frame.
- subject: UI elements.
- environment: Dark digital canvas.
- lighting: Flat, graphic illumination.
- color grade: High contrast, black background, bright yellow (#FFD700) and white text.
- motion cues: Vertical sliding of rectangular frames, horizontal sliding of a thin red line.
- SPEECH / AUDIO:
  - speech_present: true
  - speakers: [A] (Off-camera narrator)
  - transcript_segments:
    - {00:00-00:04, A, "Meet the AI tool that actually understands your video.", energetic, 150wpm}
    - {00:04-00:07, A, "It analyzes the entire thing and cuts the best takes.", informative, 150wpm}
    - {00:07-00:09, A, "Comment AI and I'll send it to your DMs.", call-to-action, 160wpm}
  - delivery_direction: Energetic, clear, direct-response marketing style.
  - mic_room_signature: Close mic, dry studio sound.
  - sync_requirements: None (off-camera).
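
The wpm labels on the transcript segments above can be checked with simple arithmetic: words divided by seconds, times 60. A minimal sketch, counting whitespace-separated words (an assumption, since the source does not say how its rates were derived):

```python
def words_per_minute(text: str, start_s: float, end_s: float) -> float:
    """Speaking rate implied by fitting text into the window start_s..end_s."""
    duration = end_s - start_s
    if duration <= 0:
        raise ValueError("segment must have a positive duration")
    return len(text.split()) / duration * 60


# (start, end, line) copied from the transcript_segments in the shotlist.
segments = [
    (0, 4, "Meet the AI tool that actually understands your video."),
    (4, 7, "It analyzes the entire thing and cuts the best takes."),
    (7, 9, "Comment AI and I'll send it to your DMs."),
]
for start, end, text in segments:
    rate = words_per_minute(text, start, end)
    print(f"{start:02d}-{end:02d}s: {rate:.0f} wpm")
# 00-04s: 135 wpm
# 04-07s: 200 wpm
# 07-09s: 270 wpm
```

On this count the implied rates do not match the 150 to 160 wpm labels, and the last two segments come out well above them. That suggests either trimming those lines or widening their time windows, or reading the labels as delivery targets rather than measurements.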

C) STYLE BIBLE
- visual_style: Clean, modern 2D motion graphics / UI mockup.
- camera_signature: Completely static.
- lighting_signature: Flat graphic design.
- grade_signature: High contrast, dark mode aesthetic.
- pacing_signature: Fast, looping animation.
- SPEECH STYLE BIBLE:
  - speech_style: Ad VO.
  - speaker_profile: Energetic, authoritative but friendly.
  - pronunciation_profile: Crisp enunciation.
  - mic_mix_profile: Dry, highly compressed for clarity on mobile devices.

D) PROMPT SYNTHESIS

1. MASTER PROMPT:
GLOBAL LOCK: A 2D digital motion graphics screen recording. The background is solid black. At the top, bold sans-serif text reads "2: MEET THE AI TOOL THAT UNDERSTANDS YOUR VIDEO👇" with the word "UNDERSTANDS" in bright yellow and the rest in white. Below this is smaller white text: "This free AI analyzes your entire video and cuts the best takes." At the bottom, text reads "TIP: Comment 'AI' and I'll send it directly to your DMs right now" with "AI" in yellow. In the bottom right corner is a white outline icon of a hand pointing left. In the center of the screen is a mock video editing interface. It features a horizontal video player showing a podcast setup with two men sitting at a table. Directly below the video player is a horizontal filmstrip timeline showing thumbnails of the video. The overall style is clean, high-contrast UI animation.

[00:00–00:01] The screen is static, displaying the global lock layout clearly.
[00:01–00:04] Animation begins. Three vertical rectangular frames (9:16 aspect ratio) smoothly slide down from behind the horizontal timeline strip. Each vertical frame contains a cropped, vertical version of the central podcast video. On top of the left frame is an Instagram icon; on the middle frame is a TikTok icon; on the right frame is a YouTube Shorts icon. Simultaneously, a thin red vertical line (a playhead) moves steadily from left to right across the horizontal timeline strip.
[00:04–00:05] The three vertical rectangular frames quickly slide back up and disappear behind the horizontal timeline strip. The red playhead resets to the left.
[00:05–00:08] The animation repeats exactly as before. The three vertical rectangular frames with social icons slide down again. The red playhead moves from left to right across the timeline.
[00:08–00:09] The three vertical rectangular frames quickly slide back up and disappear, returning the screen to the static state seen at the beginning.
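
The playhead motion in the beats above reduces to a small piecewise function of time. A minimal sketch under stated assumptions: position is normalized (0.0 is the left edge of the timeline strip, 1.0 the right edge), "moves steadily" is read as linear motion, and the reset during each retraction is treated as instant. None of these specifics are stated in the prompt.

```python
def playhead_position(t: float) -> float:
    """Normalized playhead position at time t (seconds) in the 9-second clip."""
    if not 0.0 <= t <= 9.0:
        raise ValueError("t must be within the 9-second clip")
    for sweep_start in (1.0, 5.0):          # the two animation cycles
        if sweep_start <= t < sweep_start + 3.0:
            return (t - sweep_start) / 3.0  # steady left-to-right sweep over 3 s
    return 0.0  # static intro and retraction beats: playhead parked at the left


for t in (0.5, 1.0, 2.5, 4.0, 6.5, 8.5):
    print(t, playhead_position(t))
```

The same two sweep windows can drive the three vertical frames, which slide down while the playhead sweeps and retract when it resets.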

2. NEGATIVE PROMPT:
3D elements, realistic camera movement, lens flare, depth of field, live-action camera shake, messy text, misspelled words, blurry UI, low contrast, cluttered background, realistic lighting, shadows, temporal jitter, morphing text.

3. SHOT PROMPTS:
(Not applicable as this is a single continuous graphic shot)

4. SPEECH PACK:
Transcript:
[00:00-00:04] Meet the AI tool that actually understands your video.
[00:04-00:07] It analyzes the entire thing, and automatically cuts the best takes.
[00:07-00:09] Comment AI and I'll send it directly to your DMs right now.

TAKE_A (Energetic & Punchy):
[00:00-00:04] MEET the AI tool... that actually UNDERSTANDS your video.
[00:04-00:07] It analyzes the ENTIRE thing... and automatically cuts the BEST takes.
[00:07-00:09] Comment A-I... and I'll send it directly to your DMs right now.

TAKE_B (Smooth & Professional):
[00:00-00:04] Meet the AI tool that actually understands your video.
[00:04-00:07] It analyzes the entire thing, and automatically cuts the best takes.
[00:07-00:09] Just comment AI, and I'll send it directly to your DMs right now.

TAKE_C (Fast & Urgent):
[00:00-00:04] Meet the AI tool that actually understands your video!
[00:04-00:07] It analyzes the entire thing and automatically cuts the best takes!
[00:07-00:09] Comment AI and I'll send it directly to your DMs right now!
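
Timeline check (a minimal sketch, not part of the original prompt): the visual beats and the speech lines above are both meant to cover the same 9 seconds with no gaps. The Python below parses the bracketed timecodes and confirms that each segment starts where the previous one ends. The helper names parse_segments and is_contiguous are hypothetical, and the pattern assumes whole-second "MM:SS" timecodes separated by a hyphen or an en dash.

```python
import re

# Assumed tag shape: "[MM:SS-MM:SS]" with a hyphen or an en dash between the two times.
SEGMENT_RE = re.compile(r"\[(\d{2}):(\d{2})[-–](\d{2}):(\d{2})\]")


def parse_segments(text):
    """Return (start, end) pairs in seconds for every timecode tag found in text."""
    return [
        (int(m1) * 60 + int(s1), int(m2) * 60 + int(s2))
        for m1, s1, m2, s2 in SEGMENT_RE.findall(text)
    ]


def is_contiguous(segments):
    """True when each segment starts exactly where the previous one ends."""
    return all(prev[1] == cur[0] for prev, cur in zip(segments, segments[1:]))


# Timecodes copied from the visual timeline and the transcript of this entry.
visual = "[00:00–00:01] [00:01–00:04] [00:04–00:05] [00:05–00:08] [00:08–00:09]"
speech = "[00:00-00:04] [00:04-00:07] [00:07-00:09]"

visual_segments = parse_segments(visual)
speech_segments = parse_segments(speech)

# Both tracks should be gap-free and end at the same second.
print(is_contiguous(visual_segments), visual_segments[-1][1])
print(is_contiguous(speech_segments), speech_segments[-1][1])
```

The same check can be run on any entry that uses bracketed timecodes. Tags with fractional seconds or an hours field would need a wider pattern.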
Video
Core format and topic lock: a vertical creator tutorial about using Runway Aleph or a similar in-context AI video editor to replace background, lighting, and clothing in video clips. The video uses a bald male subject as the demonstration character, showing before/after edits, green-screen style isolation, character-image inputs, driving-video inputs, and transformed outputs in different roles and environments such as a professional kitchen and a desert setting. A male presenter in a rounded webcam frame explains the workflow beneath the examples.

Shot-by-shot reconstruction

0.0s-12.0s
Open on a stacked before-and-after example of a bald male subject seated at a table. The lower example introduces a green replacement area or edited plate to demonstrate how the background can be swapped while preserving the subject.

12.0s-24.0s
Show the editing interface where the creator adds or references the subject image. Keep the focus on how the system understands the character identity as an editable element rather than just raw footage.

24.0s-42.0s
Display a transformed output where the same bald subject appears as a chef-like figure inside a commercial kitchen. The person remains recognizable while the environment, wardrobe cues, and overall scene treatment change.

42.0s-59.7s
Show a more explicit character-image plus driving-video workflow with model selection and settings. End on comparison shots proving the same identity can be remapped into multiple contexts, such as a desert scene and a kitchen scene, demonstrating combined background, lighting, and clothing edits.

Visual style
Vertical AI editing tutorial, dark app interface, talking-head explainer overlay, clear before/after examples, practical creator-workflow presentation, no cinematic scene changes beyond app windows and example swaps.

Motion notes
Motion should come from interface navigation, example swaps, and the presenter’s gestures. Keep the same subject identity throughout the clip so the audience can clearly judge how the model changes environment and wardrobe while preserving facial consistency.

Negative prompt
messy UI, unreadable settings, extra presenters, watermark, subtitles unrelated to tutorial, random unrelated footage, broken face consistency, nonhuman subjects, unstable frame crops, complex cinematic montage unrelated to the workflow

Speech pack
English tutorial narration explaining how to swap backgrounds, relight scenes, and change clothing in video by combining a source character image, a driving clip, and the Runway Aleph editing workflow.
Video
by.shlabu
Create a short-form creator tutorial video about how to make cinematic AI clips from simple ideas. The piece should feel like an Instagram Reel or TikTok posted by an AI filmmaking educator, combining direct-to-camera instruction with polished cinematic sample shots and interface cutaways. Use a confident creator host in a dark studio or moody workspace, speaking naturally to camera while explaining a repeatable workflow for generating cinematic AI videos. The pacing should be fast, sharp, and social-first, with frequent visual resets to keep attention high.

Open with a strong hook where the creator talks directly to camera and promises to show viewers how to make cinematic AI clips that feel dramatic, polished, and scroll-stopping. Then cut into multiple example shots that look like finished outputs: moody action moments, dramatic close-ups, atmospheric character scenes, and premium-looking cinematic frames. Intercut those examples with prompt panels, tool UI, timeline views, or settings screens so the workflow feels grounded in real AI video creation rather than abstract inspiration.

The host should stay visually consistent across talking segments: same person, same wardrobe, same lighting setup, same direct creator-teacher tone. Their performance should feel natural and creator-native, not overly scripted. They should gesture casually, point toward on-screen examples, and deliver the lesson with energetic clarity, like someone used to teaching AI video tricks on social media.

The visual design should alternate between two clear modes. Mode one is the tutorial studio setup: dark background, controlled lighting, crisp face detail, shallow depth of field, subtle color accents, and a premium creator-desk atmosphere. Mode two is the cinematic demo footage: dramatic compositions, intentional movement, filmic contrast, moody lighting, and stronger environmental storytelling. Keep cutting between those modes so the audience always sees both the result and the process.

Keep the entire piece optimized for vertical video. For talking-head sections, use close-ups and medium close-ups with subtle push-ins or light handheld energy. For the cinematic examples, vary the framing with wides, dramatic close-ups, push-ins, tracking shots, and controlled motion that sells the idea of “cinematic” without becoming chaotic. Everything should feel curated and premium.

Lighting is important. The host footage should use flattering key light with soft falloff and a clean but moody creator-studio look. The cinematic sample shots should lean harder into contrast, rim light, atmosphere, practicals, and dramatic highlight control. The overall grade should feel modern, contrasty, and polished, with rich blacks, sharp visual separation, and subtle filmic texture.

Include insert shots of prompts, settings, or example workflow screens to reinforce the educational angle. These moments can show how ideas become prompts, how cinematic references are structured, or how the creator chooses scenes and visual style. The UI should feel real and useful, not decorative.

The edit should stay fast and social-first: hook, creator explanation, cinematic example, interface proof, another teaching beat, then more examples. Use cuts, punch-ins, overlays, and visual comparison moments so the viewer always feels momentum. The final result should feel like a practical creator tutorial that teaches viewers how to make cinematic AI clips while also showcasing enough premium output to inspire them to try the workflow themselves.
Video
WORKFLOW
A) MISE EN PLACE
1) Segment the video into scenes/shots:
- [00:00–00:05] Single continuous shot (A composite split-screen showing two distinct scenes simultaneously).

2) Extract visual evidence:
- Keyframes: 0s, 2s, 4s.
- Left Panel: Caucasian woman, early 30s, blonde hair in a messy ponytail, wearing a mustard-yellow zip-up bomber jacket over a black top. Sitting outdoors at a cafe, daylight, string lights in the blurred background. She is laughing.
- Right Panel: Same woman, identical hair and wardrobe. Sitting indoors at a bar, warm directional lighting, amber bokeh in the background. She is holding a pint glass of beer and taking a sip.
- Overlays: White sans-serif text at the top and bottom.

3) Extract speech evidence:
- No speech. Audio is likely a trending BGM track.

4) Create an "invariants list" (LOCK THESE):
- visuals: The split-screen layout (left/right). The exact appearance of the woman (facial features, blonde ponytail, mustard jacket, black shirt). The static camera framing (MCU) on both sides. The text overlays.
- speech: N/A.

5) Create a "variables list" (TWEAK THESE):
- visuals: The micro-expressions of the laugh on the left. The liquid movement inside the beer glass on the right. The subtle background motion (patrons, bokeh shimmer).

B) SHOTLIST
- shot_id: 1
- timecode_start: 00:00
- timecode_end: 00:05
- duration: 5s
- framing: Split-screen. Both sides are Medium Close-Up (MCU), eye-level camera.
- lens: 50mm equivalent feel, shallow depth of field, creamy bokeh on both sides.
- camera movement: Static on both sides.
- subject: Left: Laughing naturally, slight shoulder movement. Right: Bringing a beer glass to her lips, taking a sip, maintaining eye contact.
- environment: Left: Outdoor cafe, daytime. Right: Indoor bar, evening.
- lighting: Left: Soft, overcast natural daylight. Right: Warm, moody practical lights, directional key light on the face.
- color grade: Warm overall tint, high contrast between the cool/neutral left and the amber/orange right.
- motion cues: Left: Subtle hair movement in the breeze. Right: Liquid dynamics in the glass.
- SPEECH / AUDIO:
  - speech_present: false
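
The shotlist fields above can also be kept as structured data, so that the duration is derived from the timecodes instead of typed by hand. This is a rough sketch only: the Shot class and its field subset are hypothetical and are not a format required by any video tool.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    """Hypothetical container for a subset of the shotlist fields."""
    shot_id: int
    timecode_start: str  # "MM:SS"
    timecode_end: str    # "MM:SS"
    framing: str
    camera_movement: str
    speech_present: bool

    @staticmethod
    def _seconds(timecode):
        """Convert an "MM:SS" string into whole seconds."""
        minutes, seconds = timecode.split(":")
        return int(minutes) * 60 + int(seconds)

    @property
    def duration(self):
        """Shot length in seconds, derived from the start and end timecodes."""
        return self._seconds(self.timecode_end) - self._seconds(self.timecode_start)


# Values taken from shot 1 in the shotlist above.
shot = Shot(
    shot_id=1,
    timecode_start="00:00",
    timecode_end="00:05",
    framing="Split-screen, MCU on both sides, eye-level",
    camera_movement="Static on both sides",
    speech_present=False,
)

print(shot.duration)  # 5
```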

C) STYLE BIBLE
- visual_style: Cinematic UGC / High-end lifestyle B-roll.
- camera_signature: Locked-off tripod feel, shallow depth of field to isolate the subject.
- lighting_signature: Motivated lighting (natural outdoors vs. practical indoors).
- grade_signature: Warm, filmic, rich skin tones, vibrant mustard yellow.
- texture_signature: Photorealistic, sharp subject with soft, pleasing background blur.
- pacing_signature: Slow, deliberate motion suitable for looping.

D) PROMPT SYNTHESIS

MASTER PROMPT
GLOBAL LOCK: A vertical 9:16 split-screen video divided exactly down the middle. On both sides, the exact same subject is featured: a 30-year-old Caucasian woman with blonde hair pulled back into a messy ponytail, wearing a distinctive mustard-yellow zip-up bomber jacket over a black t-shirt. The camera is static on both sides, framed as a Medium Close-Up (MCU) with a shallow depth of field. The top of the video features bold white sans-serif text: "STEP 5: ANIMATE YOUR VIDEOS AS B-ROLL OR TALKING HEAD VIDEOS". The bottom features text: "Animate using Google Veo 3.1 for perfect lip sync or Kling 2.6 Pro for smooth cinematic clips."

[00:00–00:05] The video plays as a continuous 5-second loop. 
ON THE LEFT SIDE: The woman is sitting at an outdoor cafe table during the day. The lighting is soft, natural daylight. The background is blurred, showing outdoor seating and string lights. She is looking directly at the camera, smiling broadly and laughing naturally, with subtle, realistic head and shoulder movements. 
ON THE RIGHT SIDE: The woman is sitting at an indoor bar. The lighting is warm, moody, and directional, casting a soft glow on her face. The background features rich, amber bokeh from pendant lights. She is holding a clear pint glass filled with beer. She slowly brings the glass to her mouth, takes a sip, and lowers it slightly, maintaining steady eye contact with the camera throughout the motion. The liquid in the glass moves realistically. Both sides play simultaneously in a photorealistic, cinematic style.

NEGATIVE PROMPT
morphing, warping, inconsistent facial features, changing clothes, different person on left and right, bad anatomy, extra fingers, distorted glass, floating objects, unnatural lighting, plastic skin texture, jittery motion, flickering text, spelling errors in text overlays.

SPEECH PACK
No speech present in the reference video.
Video
GLOBAL LOCK: 
Subject: A 25-year-old Caucasian woman, radiant skin with natural texture and visible pores, long straight blonde hair with subtle highlights, bright blue eyes, wearing bold glossy red lipstick. 
Wardrobe: Simple neutral-colored top, thin straps visible. 
Prop: Holding a small pink jar of "LANEIGE Lip Sleeping Mask" in her right hand, positioned near her chin. 
Environment: Indoor bedroom setting, soft-focus background with hints of green plants and white walls. 
Lighting: Strong natural sunlight from the side (golden hour), creating high-contrast highlights on the face and soft shadows. 
Color Grade: Warm, vibrant, high saturation on reds and pinks, cinematic editorial look. 
Camera: Medium Close-Up (MCU), static position, slight handheld micro-jitter for realism. 
Speech: Female voice, warm, energetic, clear articulation, medium pace.

[00:00–00:03]
Subject is looking directly into the camera lens with a friendly, knowing smile. She holds the pink jar steady. Her lips begin to move in perfect sync with the words: "This is my secret to waking up with smooth hydrated lips." Her head tilts slightly to the left as she emphasizes "secret." The sunlight catches the gloss on her lips and the highlights in her hair.

[00:03–00:06]
Subject continues speaking: "This mask works while I dream." She blinks naturally once. Her expression is soft and convincing. The camera remains in a tight MCU. The pink jar remains visible in the frame, slightly catching the light.

[00:06–00:08]
Subject finishes the sentence with a slight nod and a wider smile, maintaining eye contact with the camera. The video ends on a high-energy, positive note. The background remains softly blurred.

NEGATIVE PROMPT:
Uncanny valley, plastic skin, missing teeth, distorted fingers on the hand holding the jar, blurry product label, robotic head movements, frozen eyes, mismatched lip-sync, flickering lighting, low resolution, watermark, text artifacts on skin, unnatural hair movement, popping shadows.

SPEECH PACK:
Transcript: "This is my secret to waking up with smooth hydrated lips. This mask works while I dream."

TAKE_A (Energetic): "This is my **SECRET** to waking up with smooth... hydrated lips! This mask works while I **DREAM**." (High energy, emphasis on secret and dream).
TAKE_B (Soft/Intimate): "This is my secret... to waking up with smooth, hydrated lips. [breath] This mask works while I dream." (Lower volume, more breathy, slower pace).
TAKE_C (Direct/UGC): "This is my secret to waking up with smooth hydrated lips. This mask works while I dream!" (Fast-paced, casual, friendly).

Prosody: Pause after "lips" (0.5s). Emphasis on "Secret" and "Dream".
Sync: High strictness on "Secret", "Smooth", and "Dream".
Mic: Close-proximity condenser mic feel, dry room tone, no reverb.
Video
GLOBAL LOCK: A vertical 9:16 creator tutorial reel teaching how to make first-person time-travel vlogs with AI. The lower half of the video holds a young male creator speaking directly to camera in a dark studio with red side lighting, black hoodie or jacket, and a backward cap. The upper half alternates between social-proof examples, smartphone search screens, browser pages, prompt-writing documents, and final generated historical selfie videos. The core output style is a realistic vlog shot where a modern creator appears to be filming himself inside major historical moments such as Viking England, the Wild West, or D-Day. The entire reel should feel practical and system-driven, built for viewers who want repeatable viral history content.

[00:00-00:12] Open on two successful example clips above the speaker: one where a young woman appears to selfie-vlog among Vikings in England in 865 AD, and another where she appears in a Wild West town in 1880. Both examples should look like genuine first-person historical vlogs with modern camera behavior but era-correct surroundings. View counts or social-proof markers should be visible to show that this content format already works.

[00:12-00:28] Move into the workflow entry step through a smartphone UI. Show a phone search screen with “Time Travel” typed in, then a Google-like result page for “Higgsfield AI.” The creator below explains the process in clear terms, making the tutorial feel accessible. The emphasis is on how surprisingly simple the setup is once the right tools are known.

[00:28-00:46] Show prompt-building and script-generation stages. Display a prompt document or text page labeled for text-to-video prompts, with entries for historical scenarios like landing craft before a beach assault or other era-specific vlog scripts. The interface should feel like a practical creator workflow rather than a polished marketing demo. The point is that the output begins with scripting the right first-person historical situation.

[00:46-01:01] End on a dramatic finished example where the creator appears to be selfie-vlogging during a World War II beach landing, with smoke, soldiers, landing craft, and battlefield chaos behind him. Overlay a small thumbnail or packaging element suggesting how the final video can be turned into a clickable social or YouTube asset. The result should feel both absurd and convincing: modern vlog behavior dropped into a massive historical event.

NEGATIVE PROMPT: static history painting look, third-person documentary framing, no selfie perspective, bland phone UI, generic prompts, inconsistent main character face, casual modern backgrounds, low-detail crowds, weak historical setting, no social-proof packaging.

SHOT PROMPTS: Viking time-travel selfie vlog; Wild West selfie vlog; phone search Time Travel; Higgsfield AI search result; ChatGPT prompt document; text-to-video historical script; D-Day beach selfie vlog; viral history series tutorial.

SPEECH PACK: One male speaker only. Tone is practical and energetic, emphasizing simplicity, virality, and repeatability. Stress “time travel vlogs,” “Higgsfield AI,” “ChatGPT prompts,” and the historical selfie angle.
Video
GLOBAL LOCK:
Subject is a Caucasian male in his early 30s, dark wavy hair, well-groomed medium-length beard, expressive brown eyes. He maintains a consistent facial structure across all shots. The visual style is a mix of high-end editorial photography and UGC tutorial footage. Lighting is cinematic with soft key lights and motivated rim lighting. Color grade is professional with deep blacks and vibrant but natural skin tones. Speech is clear, energetic, and instructional, delivered with a warm, authoritative tone.

[00:00–00:01]
Subject: MCU of the man wearing a dark suit, white dress shirt, black tie, and a white baseball cap with a green brim.
Action: Talking directly to the camera. A vertical white rectangular mask moves across his face, revealing a slightly different version of the same scene.
Camera: Static MCU, eye-level.
Lighting: Soft studio lighting, neutral background.
Speech: "This is how you can create..."

[00:01–00:04]
Subject: Rapid montage of AI-generated images. 
1. Man in a dark suit and sunglasses driving a green car at night, "AI MAG" text overlay.
2. Man in a checkered blazer and paisley tie in front of a brick wall.
3. Man in a white short-sleeve shirt with multiple pens in his pocket, standing in a white studio.
Action: Static editorial poses.
Camera: Various (MS, MCU).
Lighting: Cinematic, high contrast, nighttime car lighting, studio softbox.
Grade: Magazine editorial style.

[00:05–00:08]
Subject: A 3x4 grid of 12 different AI portraits of the same man in various outfits (boxing gloves, red car, street style, suit).
Action: Static images.
Overlay: Large bold text "UNLIMITED GENERATIONS" in orange and blue.
Camera: Flat grid layout.
Lighting: Varied per image.

[00:09–00:14]
Environment: Screen recording of the Higgsfield.ai website interface. A cursor moves to click "Image" then "Soul ID Character".
Action: UI navigation.
Speech: "On Higgsfield.ai, go to image and select Soul ID Character..."

[00:15–00:20]
Subject: Picture-in-picture of the man talking (wearing a tan cap and beige shirt) over a screen recording of the "Make Your Own Character" page.
Action: Explaining the process while gesturing.
Speech: "...where you can actually create your own custom character of yourself by uploading a bunch of photos."

[00:21–00:24]
Subject: Montage of AI images with text prompts.
1. Man in a suit drinking from a glass (trippy lens effect).
2. Man in a tan suit with a "Micky Mouse Bag" in a city street.
3. Man in a white tank top and jeans in front of a "Tokyo Red Car".
Action: Posing.
Camera: Full body and MS.
Lighting: Bright daylight, stylized urban lighting.

[00:25–00:34]
Environment: Screen recording of the "Lipsync Studio" interface. Subject's PIP continues.
Action: Selecting "Video", then "Lipsync Studio", uploading an image of himself at the beach, and dragging an audio file named "voiceover.wav".
Speech: "Now you can go to video at the top of the page and select the Lipsync Studio where you can upload your photo and audio..."

[00:35–00:38]
Subject: CU of the man at a tropical beach. He is shirtless, wearing black swimming goggles on his head.
Action: He is lip-syncing perfectly to the audio, smiling slightly.
Environment: Bright blue ocean water with small waves in the background.
Camera: CU, static.
Lighting: Bright, direct sunlight with natural shadows.
Speech: "...and it will combine those two together with the best lip-sync models."

NEGATIVE PROMPT:
Visual: robotic movement, distorted facial features, inconsistent beard growth, blurry textures, flickering background, extra fingers, warped UI elements, low resolution, watermarks.
Speech: robotic monotone, lip-sync delay, muffled audio, background hiss, unnatural pauses, slurred consonants, popping sounds.

SPEECH PACK:
[00:00-00:08]
Transcript: "This is how you can create 25 magazine-ready images of yourself using AI and then you can even lip-sync on top of them with this brand new feature."
TAKE_A: (Energetic, fast-paced) "This is how you can create TWENTY-FIVE magazine-ready images of yourself using AI... and then you can even LIP-SYNC on top of them with this brand new feature!"

[00:09-00:20]
Transcript: "On Higgsfield.ai, go to image and select Soul ID Character where you can actually create your own custom character of yourself by uploading a bunch of photos."
TAKE_A: (Instructional, clear) "On Higgsfield dot A-I, go to image and select Soul I-D Character... where you can actually create your own custom character of yourself... by uploading a bunch of photos."

[00:25-00:38]
Transcript: "Now you can go to video at the top of the page and select the Lipsync Studio where you can upload your photo and audio and it will combine those two together with the best lip-sync models."
TAKE_A: (Helpful, concluding) "Now you can go to video at the top of the page and select the Lipsync Studio... where you can upload your photo and audio... and it will combine those two together with the best lip-sync models."
Video

MASTER PROMPT
GLOBAL LOCK: Vertical tutorial reel about making explainer videos with generative AI. Use a host in a cap, cartoon examples, script pages, tool dashboards, character sheets, and output previews. Keep the pace educational and process-driven.

[00:00-00:05] Open on text and visual examples that frame the topic as how to make explainer videos.
[00:05-00:12] Show the host, profile-style context, and interface screens.
[00:12-00:20] Move through scripts, planning docs, and design boards.
[00:20-00:28] Show character sheets, tool dashboards, and generated assets.
[00:28-00:36] End on output recap and workflow close.

NEGATIVE PROMPT
Avoid unreadable UI, inconsistent characters, weak text legibility, random workflow jumps, and robotic host delivery.

SPEECH PACK
Open by framing the topic as how to make explainer videos. Walk through the workflow from scripts and references to characters, tools, and output. Close by reinforcing that the process is repeatable for creators and brands.
Video
GLOBAL LOCK: A vertical social promo card for an educational AI content creator account. The layout is bold, static, and highly legible: a bright yellow uppercase headline at the top reading "HOW TO MAKE VIRAL 3D ANIMATION," two example 3D render thumbnails in the center, and a strong bottom call to action reading "SWIPE FOR THE FULL GUIDE." The lower area also includes the account branding and a small "Save for later" style UI button. Keep the whole composition graphic, centered, and poster-like rather than cinematic or narrative.

[00:00-00:02] Show the full promo card immediately with no reveal. The top headline dominates the frame in high-contrast yellow and white text. Two side-by-side sample images underneath show glossy anthropomorphic 3D characters, including a stylized drink carton on the left and a fiery or toasted food-like character on the right.

[00:02-00:04] Keep the card stable so viewers can read the offer. The CTA "SWIPE FOR THE FULL GUIDE" remains clear near the bottom, and the small account signature plus utility button reinforce that this is an educational carousel teaser rather than a standalone tutorial.

[00:04-00:05.1] End on the same locked layout with all text still readable. The visual priority stays on the headline, the two sample renders, and the swipe prompt, preserving maximum scroll-stopping clarity.

SUBJECT: one static educational promo card about viral 3D animation workflows, featuring two sample 3D character renders.

ENVIRONMENT: black poster-style background with bright high-contrast typography and simple social graphic layout, no physical setting.

ACTION: minimal motion or no motion; the card functions as a held teaser image in video format.

CAMERA: vertical 9:16, static full-frame graphic layout, no pan, no zoom.

LIGHTING: not scene-based; graphic design uses bold contrast and bright saturation for readability.

GRADE: crisp social-media promo aesthetic with strong yellow, white, and black contrast plus saturated example thumbnails.

MOTION: essentially static, optimized for reading rather than animation.

SPEECH PACK: no visible dialogue or subtitles required; this is a teaser card encouraging swipe-through behavior.

NEGATIVE PROMPT: cinematic live-action scene, cluttered infographic, soft pastel typography, low-contrast text, busy background, extra panels, missing CTA, watermark emphasis, unreadable headline, narrative character animation replacing the card layout.
Video
GLOBAL LOCK: A charismatic Black male instructor with long dreadlocks, wearing a black durag, black sunglasses, and a sharp black suit. He has a confident, tech-mogul persona. The environment is "AI Genesis Academy," a grand, cinematic university campus with a mix of classical architecture and futuristic high-tech labs. Lighting is bright, cinematic, and high-contrast. Color grade is warm and saturated for exteriors, cool and blue-tinted for interiors. Speech is energetic, direct-to-camera, with a crisp, professional mic signature.

[00:00–00:05]
Subject: Wide shot of the instructor walking toward the camera in front of a massive stone wall with a gold "AI GENESIS" logo.
Environment: Grand university campus, lush green grass, students in white uniforms walking in the background.
Action: Instructor gestures toward the sign while walking confidently.
Camera: Low-angle tracking shot, moving backward.
Lighting: Bright golden hour sunlight.
Speech: "AI Genesis Academy is the best place to learn AI."

[00:06–00:12]
Subject: Instructor walking through a futuristic indoor hallway.
Environment: High-tech corridor with white marble pillars and floating holographic AR screens showing human anatomy.
Action: Instructor looks at the camera, gesturing to the screens.
Camera: Medium tracking shot, eye-level.
Lighting: Cool blue interior lighting with bright practical lights.
Speech: "In this academy, we teach everything about the AI visual world, from prompting to making masterpieces."

[00:13–00:20]
Subject: A young Asian student sitting at a desk, wearing a white uniform.
Environment: A classroom filled with students using holographic keyboards and floating UI screens.
Action: The student is typing; a red holographic error box appears. The instructor enters the frame and points at the screen.
Camera: Medium shot, slight pan to reveal the instructor.
Lighting: Soft, diffused lab lighting.
Speech: "Here's all the beginners learning their prompt in class. You need to practice your JSON prompting technique more often."

[00:21–00:34]
Subject: Instructor and a student in a grand hallway.
Environment: Classical architecture with "IMAGE GENERATION" sign above a doorway.
Action: A massive silver semi-truck suddenly manifests in the hallway, nearly hitting the student. The student, wearing a VR headset, looks terrified.
Camera: Wide shot to capture the scale of the truck, then a quick cut to a medium reaction shot.
Lighting: Natural light from large windows.
Speech: "Once students really get the hang of prompting, that's when the fun starts. This is where they move on to image generation. And honestly, I'm especially proud of this class because—"
Student: "I'm so sorry! I forgot to specify the truck size in the prompt!"

[00:35–00:45]
Subject: Close-up of the instructor standing next to the massive truck.
Environment: The truck's chrome grill is visible behind him.
Action: Instructor leans against the truck, looking coolly into the camera.
Camera: Close-up (CU), shallow depth of field.
Lighting: Rim lighting on the instructor's suit.
Speech: "All right. What AI tool did you use for this? Nano Banana 2. And this is exactly why this academy stays ahead. We teach every latest AI tool the moment it drops."

[00:46–01:00]
Subject: Instructor on a high stone balcony, pointing toward a field.
Environment: A vast green field where students in white are lined up. A massive rocket structure is being built in real-time by orange holographic beams.
Action: The rocket launches with a massive burst of fire and purple smoke.
Camera: Extreme wide shot (EWS) from the balcony, then a tracking shot following the rocket up.
Lighting: Bright daylight, lens flare.
Speech: "And last, but not least, here's the video generation class. They need more space to let their creativity speak. Comment the word JOIN to be an elite AI creative."

[01:01–01:06]
Subject: End card with "AI GENESIS" logo.
Environment: Purple ethereal background with floating bubbles and UI buttons: "Ai images", "Ai videos", "Prompt mastery", "Control".
Action: Logo glows and pulses.
Camera: Static graphic.
Lighting: Neon purple glow.

NEGATIVE PROMPT: blurry faces, inconsistent dreadlock length, flickering holographic screens, distorted truck wheels, unnatural rocket smoke, robotic speech cadence, muffled audio, low-resolution textures, jittery camera movement.

AI Meme Video Generator

AI meme video generator content matters when the motion itself is part of the humor. A lot of creators are no longer satisfied with static image memes alone. They want clips that loop well, captions that hit at the right moment, and edits that feel natural inside short-form feeds. This makes video timing just as important as the joke itself.

The best examples in this category usually feel short, sharp, and easy to repost. The meme lands because the movement adds something the still image cannot. When you compare ideas on this page, focus on whether the clip structure supports the joke, whether the pacing feels social-first, and whether the end result looks like something built for real sharing rather than a generic video effect demo.

FAQ

What is an AI meme video generator used for?

It is used to create animated meme clips, short loops, and moving joke formats that work better in video feeds than static posts.

Why choose video memes over image memes?

Video can add timing, reactions, and looping humor that make the joke hit harder on platforms centered around motion.

Who is this page helpful for?

It is helpful for creators posting on TikTok, Instagram Reels, X, Discord, and anywhere short-form video humor spreads fast.

What should I compare on this page?

Compare pacing, loop quality, caption timing, and whether each output feels ready to share as a real meme clip.