Image To Video AI Meme
Turn a static meme image into motion without rebuilding the joke from zero. This page collects image-to-video meme workflows that animate a still, preserve recognizability, and make the result feel newly shareable.
GLOBAL LOCK: The video features a white male creator in his mid-30s with medium-length, wavy brown hair and a groomed beard, wearing a clean white t-shirt. He is positioned in a bright home office with a professional black condenser microphone on a boom arm in the foreground. The video uses a split-screen or multi-panel layout to compare "Source Video" (the creator) with "AI Generated Results" (various celebrities and characters). The AI characters must perfectly mirror the creator's head tilt, facial expressions, lip-sync, and hand gestures. The lighting is soft, natural window light from the side. The color grade is clean and realistic.
[00:00–00:03] The screen is split into three vertically stacked panels. Top panel: The creator waves both hands excitedly and points to his right. Middle panel: Sabrina Carpenter in a pink feathered dress mimics the exact hand wave and pointing. Bottom panel: Billie Eilish in a black outfit and sunglasses mimics the same gestures. High-fidelity lip-sync as they all say "Hear me out."
[00:03–00:07] The layout shifts. Top panel: Creator continues talking with expansive hand gestures. Middle panel: Taylor Swift in a red dress mimics the gestures. Bottom panel: Kim Kardashian in a black tank top mimics the gestures. The transitions between characters are sharp cuts.
[00:07–00:10] Split screen: Creator (top) vs. Queen Elizabeth II (bottom). The creator looks to his left and then back to the camera with a skeptical expression. The Queen, wearing a crown and sash, mirrors the look perfectly.
[00:10–00:13] Split screen: Creator (top) vs. Edna Mode from The Incredibles (bottom). The creator scratches the top of his head with his right hand. Edna Mode, with her signature bob and glasses, scratches her head in perfect sync.
[00:13–00:20] A screen recording of a software interface (Enhancor). A cursor selects the "Wan2.2" model from a dropdown menu. The UI shows a "Source Video" of the creator and a "Character Image" of a woman. The cursor toggles "Pro Mode" on and adjusts resolution to 720p.
[00:20–00:23] Split screen: Creator (top) vs. a woman with long brown hair in a floral dress (bottom). They are both in the same room. The creator raises his hands in a "stop" gesture; the woman mirrors him perfectly.
[00:23–00:27] The UI returns, showing the "Photo Animate" tab being selected. A different reference photo of the same woman is used. The cursor clicks "Generate Video."
[00:27–00:35] Final comparison. Split screen: Creator (top) vs. the woman (bottom). The creator looks around the room and then smiles at the camera while touching his hair. The woman mirrors the hair-touching and the smile, but her background is now a different indoor setting matching her reference photo. The text "AI" appears centered on the screen.
NEGATIVE PROMPT: Visual: flickering faces, distorted limbs, extra fingers, blurry textures, face-swapping artifacts, unnatural skin smoothing, background warping, robotic movements, low resolution, watermarks. Speech: robotic voice, mismatched lip-sync, muffled audio, background noise, unnatural pauses, clipping audio.
SPEECH PACK:
[00:00–00:07] Transcript: "Hear me out, all of your favorite movies and animations are going to be completely acted out by someone else in the next two years." TAKE_A: Energetic, fast-paced, direct-to-camera. TAKE_B: Mysterious, slightly slower, emphasizing "completely." TAKE_C: Casual, conversational, like a friend sharing a secret.
[00:07–00:13] Transcript: "So I'm going to teach you everything you need to know about this in the next 20 seconds so that you can do this for yourself and stay ahead of the curve." TAKE_A: Authoritative, instructional, rhythmic. TAKE_B: Helpful, warm, encouraging. TAKE_C: Urgent, fast-talking to fit the "20 seconds" claim.
[00:13–00:35] Transcript: "So right now you have two options with this new AI video model called Wan 2.2. The first option is Character Swap... The second option is Photo Animate... This is absolutely mind-blowing. Comment AI for the link." TAKE_A: Professional narrator style, clear enunciation. TAKE_B: Enthusiastic, high energy on "mind-blowing." TAKE_C: Calm, tech-reviewer tone, clear CTA at the end.
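Because a prompt like the one above relies on back-to-back timecoded beats, it can help to check the visual timeline for gaps or overlaps before sending it to a video model. The following is a minimal sketch, not part of any tool named on this page; the function names `parse_segments` and `find_problems` are hypothetical, and the pattern assumes `MM:SS` or `MM:SS.s` timestamps joined by an en dash or a hyphen.

```python
import re

# Matches markers such as [00:00–00:03] or [00:06.5-00:19.0];
# accepts either an en dash or a hyphen between the two timestamps.
TIMECODE = re.compile(
    r"\[(\d{2}):(\d{2}(?:\.\d+)?)[–-](\d{2}):(\d{2}(?:\.\d+)?)\]"
)


def parse_segments(prompt: str) -> list[tuple[float, float]]:
    """Return (start, end) pairs in seconds, in order of appearance."""
    segments = []
    for match in TIMECODE.finditer(prompt):
        start = int(match.group(1)) * 60 + float(match.group(2))
        end = int(match.group(3)) * 60 + float(match.group(4))
        segments.append((start, end))
    return segments


def find_problems(segments: list[tuple[float, float]]) -> list[str]:
    """List issues found in a timeline: empty ranges, gaps, overlaps."""
    problems = []
    for start, end in segments:
        if end <= start:
            problems.append(f"segment {start}-{end} has no positive duration")
    for (_, prev_end), (next_start, _) in zip(segments, segments[1:]):
        if next_start > prev_end:
            problems.append(f"gap between {prev_end} and {next_start}")
        elif next_start < prev_end:
            problems.append(f"overlap between {next_start} and {prev_end}")
    return problems


visual = "[00:00–00:03] wave [00:03–00:07] talk [00:07–00:10] look"
timeline = parse_segments(visual)
issues = find_problems(timeline)
```

Run the check on the visual beats only: the SPEECH PACK restarts at 00:00, so feeding the whole prompt in one pass would report a false overlap.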
WORKFLOW
A) MISE EN PLACE
1) Segment the video into scenes/shots:
- [00:00–00:05] Single continuous shot (A composite split-screen showing two distinct scenes simultaneously).
2) Extract visual evidence:
- Keyframes: 0s, 2s, 4s.
- Left Panel: Caucasian woman, early 30s, blonde hair in a messy ponytail, wearing a mustard-yellow zip-up bomber jacket over a black top. Sitting outdoors at a cafe, daylight, string lights in the blurred background. She is laughing.
- Right Panel: Same woman, identical hair and wardrobe. Sitting indoors at a bar, warm directional lighting, amber bokeh in the background. She is holding a pint glass of beer and taking a sip.
- Overlays: White sans-serif text at the top and bottom.
3) Extract speech evidence:
- No speech. Audio is likely a trending BGM track.
4) Create an "invariants list" (LOCK THESE):
- visuals: The split-screen layout (left/right). The exact appearance of the woman (facial features, blonde ponytail, mustard jacket, black shirt). The static camera framing (MCU) on both sides. The text overlays.
- speech: N/A.
5) Create a "variables list" (TWEAK THESE):
- visuals: The micro-expressions of the laugh on the left. The liquid movement inside the beer glass on the right. The subtle background motion (patrons, bokeh shimmer).
B) SHOTLIST
- shot_id: 1
- timecode_start: 00:00
- timecode_end: 00:05
- duration: 5s
- framing: Split-screen. Both sides are Medium Close-Up (MCU), eye-level camera.
- lens: 50mm equivalent feel, shallow depth of field, creamy bokeh on both sides.
- camera movement: Static on both sides.
- subject: Left: Laughing naturally, slight shoulder movement. Right: Bringing a beer glass to her lips, taking a sip, maintaining eye contact.
- environment: Left: Outdoor cafe, daytime. Right: Indoor bar, evening.
- lighting: Left: Soft, overcast natural daylight. Right: Warm, moody practical lights, directional key light on the face.
- color grade: Warm overall tint, high contrast between the cool/neutral left and the amber/orange right.
- motion cues: Left: Subtle hair movement in the breeze. Right: Liquid dynamics in the glass.
- SPEECH / AUDIO:
  - speech_present: false
C) STYLE BIBLE
- visual_style: Cinematic UGC / High-end lifestyle B-roll.
- camera_signature: Locked-off tripod feel, shallow depth of field to isolate the subject.
- lighting_signature: Motivated lighting (natural outdoors vs. practical indoors).
- grade_signature: Warm, filmic, rich skin tones, vibrant mustard yellow.
- texture_signature: Photorealistic, sharp subject with soft, pleasing background blur.
- pacing_signature: Slow, deliberate motion suitable for looping.
D) PROMPT SYNTHESIS
MASTER PROMPT
GLOBAL LOCK: A vertical 9:16 split-screen video divided exactly down the middle. On both sides, the exact same subject is featured: a 30-year-old Caucasian woman with blonde hair pulled back into a messy ponytail, wearing a distinctive mustard-yellow zip-up bomber jacket over a black t-shirt. The camera is static on both sides, framed as a Medium Close-Up (MCU) with a shallow depth of field. The top of the video features bold white sans-serif text: "STEP 5: ANIMATE YOUR VIDEOS AS B-ROLL OR TALKING HEAD VIDEOS". The bottom features text: "Animate using Google Veo 3.1 for perfect lip sync or Kling 2.6 Pro for smooth cinematic clips."
[00:00–00:05] The video plays as a continuous 5-second loop. ON THE LEFT SIDE: The woman is sitting at an outdoor cafe table during the day. The lighting is soft, natural daylight. The background is blurred, showing outdoor seating and string lights. She is looking directly at the camera, smiling broadly and laughing naturally, with subtle, realistic head and shoulder movements. ON THE RIGHT SIDE: The woman is sitting at an indoor bar. The lighting is warm, moody, and directional, casting a soft glow on her face. The background features rich, amber bokeh from pendant lights. She is holding a clear pint glass filled with beer. She slowly brings the glass to her mouth, takes a sip, and lowers it slightly, maintaining steady eye contact with the camera throughout the motion. The liquid in the glass moves realistically. Both sides play simultaneously in a photorealistic, cinematic style.
NEGATIVE PROMPT
morphing, warping, inconsistent facial features, changing clothes, different person on left and right, bad anatomy, extra fingers, distorted glass, floating objects, unnatural lighting, plastic skin texture, jittery motion, flickering text, spelling errors in text overlays.
SPEECH PACK
No speech present in the reference video.
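The workflow above moves from a shotlist (B) to a synthesized master prompt (D). As a rough sketch of that hand-off, a shotlist entry can be held in a small data structure and rendered into the GLOBAL LOCK / timecode / NEGATIVE PROMPT layout. Everything here is illustrative: `Shot`, `to_seconds`, and `synthesize_prompt` are hypothetical names, the `action` field is an assumed stand-in for the shotlist's subject line, and the duration is derived from the timecodes instead of being typed by hand.

```python
from dataclasses import dataclass


def to_seconds(timecode: str) -> int:
    """Convert an MM:SS timecode such as '00:05' to whole seconds."""
    minutes, seconds = timecode.split(":")
    return int(minutes) * 60 + int(seconds)


@dataclass
class Shot:
    shot_id: int
    timecode_start: str
    timecode_end: str
    framing: str
    camera_movement: str
    action: str  # hypothetical field: what the subject does in the shot

    @property
    def duration(self) -> int:
        """Shot length in seconds, computed from the two timecodes."""
        return to_seconds(self.timecode_end) - to_seconds(self.timecode_start)


def synthesize_prompt(global_lock: str, shots: list[Shot], negative: list[str]) -> str:
    """Join the locked description, per-shot beats, and negatives."""
    lines = [f"GLOBAL LOCK: {global_lock}"]
    for shot in shots:
        lines.append(
            f"[{shot.timecode_start}–{shot.timecode_end}] "
            f"{shot.framing}. {shot.camera_movement}. {shot.action}"
        )
    lines.append("NEGATIVE PROMPT: " + ", ".join(negative))
    return "\n".join(lines)


shot = Shot(
    shot_id=1,
    timecode_start="00:00",
    timecode_end="00:05",
    framing="Split-screen MCU",
    camera_movement="Static camera",
    action="She laughs on the left and sips on the right.",
)
prompt = synthesize_prompt(
    "A vertical 9:16 split-screen video.", [shot], ["morphing", "warping"]
)
```

Keeping the invariants in `global_lock` and the variables in each `Shot` mirrors the "LOCK THESE" and "TWEAK THESE" lists, so a variation only touches the shot fields.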
GLOBAL LOCK: Create a vertical tutorial-style AI motion-control reel that demonstrates how a Frida Kahlo-inspired woman and Diego Rivera-inspired man can be animated into a realistic couple dance. Preserve the recognizable art-inspired styling of both characters: Frida with floral hair adornments, traditional dress, and bold folk-art color accents; Diego with a fuller build, blue shirt, dark trousers, and painterly portrait realism. Structure the video as a workflow reel with three layers: final generated output, software interface walkthrough, and live-action dance reference. Keep motion fidelity, couple synchronization, and art-character identity stable throughout. No dialogue or lip sync. [00:00-00:06.5] Open on the final generated result: a Frida-and-Diego-inspired painted couple dancing together in a warm indoor room. They step side to side, raise arms, and move in sync while a small reference dancer inset appears near the lower edge of frame. Keep the art style stable and the choreography readable. [00:06.5-00:19.0] Cut to a dark software interface screen recording showing the motion-control workflow inside an AI tool. Display generation settings, control panels, and progress elements that explain how the dance is created from source footage. Keep the interface legible and clearly framed as a process demonstration. [00:19.0-00:24.5] Show the live-action reference pair in a phone-like vertical frame performing the original couple dance. Their steps, arm lifts, and body timing should match the generated output logic. Keep this section straightforward and tutorial-oriented. [00:24.5-00:35.2] Return to the final generated Frida-and-Diego dance result, now letting the viewer compare it mentally against the reference. Preserve the pair's stable identities, coordinated body movement, and painterly cultural styling while they continue dancing side by side in the warm interior.
GLOBAL LOCK: preserve a creator-led talking-head tutorial format mixed with vertical phone screen recordings. Keep one young male creator in a backward black cap and dark hoodie speaking directly to camera in a studio setup with a microphone. Intercut iPhone-style screen captures showing ChatGPT/OpenAI image workflow steps, uploaded object photos, prompt entry, and AI video generation screens. Maintain a practical “make from your phone” educational reel structure. No random B-roll, no unrelated tools, no logo overlays beyond app UI already present in the source. Create a 37.8-second social-first AI tutorial reel showing how to turn ordinary phone photos into animated AI character videos. Begin with a hook using a simple hand-held object photo and bold on-screen teaching posture from the creator. Then show phone interfaces: photo selection, ChatGPT or image-tool screens, prompt entry, image transformation results, switching to an AI video tool, uploading the generated image, entering a motion prompt, and generating the final animated output. Use repeated face-cam segments where the creator explains the steps and emphasizes that the workflow can be done from a phone. Include the specific examples visible in the source: tiny object/food photos held in a hand, ChatGPT app icon and mobile interface, typed prompts that turn objects into cute expressive characters, a generated pear-like baby character image, a switch to another AI generation interface, upload and prompt steps for video, and a final generated moving result shown on-screen. Preserve the educational pacing and creator-marketing vibe. SHOT SEGMENTS: [00:00-00:06] Hook with object photos in hand and creator talking-head intro about making AI content from your phone. [00:06-00:14] Mobile screens show ChatGPT / image workflow setup, app screens, and prompt entry. [00:14-00:22] Creator explains the key steps while on-screen phone UI shows prompt refinement and generated object-to-character image outputs. 
[00:22-00:30] The tutorial switches to an AI video tool, showing upload, prompt, and generation steps from the phone. [00:30-00:37.8] Final result displays the generated animated character clip, while the creator closes with a call to try the workflow. ENVIRONMENT: creator desk/studio face-cam plus crisp mobile screen recordings. CAMERA: direct-to-camera presenter shots alternating with full-screen phone UI captures. LIGHTING: clean creator-studio lighting on face-cam; bright legible phone UI on inserts. MOTION: tutorial pacing, finger taps on phone UI, creator emphasis gestures, no cinematic narrative scenes. NEGATIVE PROMPT: generic AI ad montage, unrelated tools, desktop-only workflow, no phone UI, missing creator face-cam, subtitles replacing the actual visible UI, blurry screens, watermark, logo overlays. SPEECH PACK: creator-to-camera tutorial speech implied, but do not transcribe captions here.
GLOBAL LOCK: A Caucasian male in his mid-30s with wavy brown hair, light stubble, and a friendly, approachable expression. He wears a black zip-up jacket over a plain white t-shirt. The environment is a bright, sunny urban street with a white corner building featuring large windows and a "COFFEE" sign. The lighting is natural, direct sunlight with clear blue skies and palm trees in the background. Cinematic color grade with warm highlights and deep, natural shadows. [00:00–00:03] Wide shot (WS) of the man walking confidently toward the camera on a city sidewalk. He is holding a large, lush bouquet of flowers with yellow roses, white lilies, and green foliage. The camera is positioned at a low angle, slowly dollying backward to maintain a consistent distance as he walks. The background shows a street intersection with traffic lights and a white building. Natural motion blur on his legs and the swaying flowers. [00:03–00:06] A clean cut to a Close-up (CU) of the man's face. He is now closer to the lens, looking directly at the viewer and breaking into a genuine, warm smile. The background is heavily blurred with a creamy bokeh effect, showing hints of the urban street. The lighting is soft on his face, highlighting the texture of his skin and hair. The top of the flower bouquet is visible at the bottom of the frame. NEGATIVE PROMPT: flickering, character drift, distorted face, extra limbs, blurry subject, low resolution, cartoonish, oversaturated, robotic movement, inconsistent lighting, floating objects, text overlays on the subject, watermarks. SPEECH PACK: (No speech present in the original video, but if adding VO:) [00:00-00:06] TAKE_A: "This is Nano Banana Pro. I spent the last two days testing it. It is mind-blowing." (Energetic, tech-enthusiast tone) TAKE_B: "Check out this consistency. From the scene to the character, it's perfect." (Calm, instructional tone) TAKE_C: "AI video just changed forever. Look at these cinematic shots." (Awestruck, slow pacing)
MASTER PROMPT GLOBAL LOCK: A vertical split-screen composition. The left half features a 30-year-old Caucasian woman with blonde hair tied back, wearing a chunky pink knit sweater and blue jeans, sitting by a window in a cozy cafe with warm, natural daylight. The right half features the exact same woman, wearing a plain white t-shirt, sitting at a modern desk with a silver laptop, illuminated by cool, soft office lighting. Both sides maintain photorealistic cinematic quality, 35mm lens feel, soft depth of field, and identical facial features. [00:00–00:05] Left side: The woman sits relaxed, holding a white ceramic mug with both hands near her chest. She looks out the window to her left with a gentle, serene expression. Subtle movement in her hair and slight breathing motion. Right side: The woman is focused, looking down at the silver laptop screen, typing on the keyboard. At 00:02, she pauses typing, reaches with her right hand to pick up a grey ceramic mug on the desk, brings it to her lips to take a sip, places it back down, and resumes typing. The camera remains completely static on both sides throughout the duration. No speech. NEGATIVE PROMPT text, watermarks, logos, split-screen bleeding, morphing faces, inconsistent identity between left and right, unnatural hand anatomy, extra fingers, flickering lighting, temporal jitter, robotic movements, exaggerated expressions, harsh shadows, low resolution. SHOT PROMPTS Shot 1 (Left Side Base): A 30-year-old Caucasian woman with blonde hair tied back, wearing a chunky pink knit sweater, sitting by a window in a cozy cafe, warm natural daylight, holding a white ceramic mug, looking out the window, serene expression, cinematic, 35mm lens, soft depth of field. Shot 2 (Right Side Base): A 30-year-old Caucasian woman with blonde hair tied back, wearing a plain white t-shirt, sitting at a modern desk typing on a silver laptop, cool soft office lighting, focused expression, cinematic, 35mm lens, soft depth of field. 
SPEECH PACK
speech_present: false
transcript_segments: []
delivery_direction: N/A
mic_room_signature: N/A
sync_requirements: N/A
mix_notes: Background music only, no dialogue or voiceover.
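The SPEECH PACK fields above read like a small record, so one way to keep them consistent across prompts is a typed container with a sanity check. This is only a sketch: the field names are taken from the pack above, while the `SpeechPack` class and its `problems` method are hypothetical and not part of any tool mentioned here.

```python
from dataclasses import dataclass, field


@dataclass
class SpeechPack:
    speech_present: bool
    transcript_segments: list[str] = field(default_factory=list)
    delivery_direction: str = "N/A"
    mic_room_signature: str = "N/A"
    sync_requirements: str = "N/A"
    mix_notes: str = ""

    def problems(self) -> list[str]:
        """Flag packs whose speech flag contradicts their transcript list."""
        issues = []
        if self.speech_present and not self.transcript_segments:
            issues.append("speech_present is true but no transcript segments are given")
        if not self.speech_present and self.transcript_segments:
            issues.append("speech_present is false but transcript segments are given")
        return issues


silent_pack = SpeechPack(
    speech_present=False,
    mix_notes="Background music only, no dialogue or voiceover.",
)
silent_issues = silent_pack.problems()
```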
GLOBAL LOCK: A 9:16 vertical creator tutorial video showing how to build cinematic AI videos inside Freepik Spaces using Kling 3.0. The structure alternates between a casual male creator talking directly to camera, screen-like workflow panels, and polished AI-generated example sequences. The speaker is a white male in his 20s or 30s with beard, cap, and casual streetwear, filmed in a warm apartment or studio environment. He should feel approachable, creator-native, and energetic rather than corporate. Keep the edit fast and legible, with repeated “How to do this” framing, visual examples of cinematic shots, and interface scenes that imply prompt building, scene sequencing, and generation controls. Audio is speech-first and educational, with the creator explaining the workflow in concise steps. [00:00-00:05] Open on a catchy example visual or lifestyle shot with bold tutorial framing like “How to do this,” immediately pairing aspirational output with educational intent. [00:05-00:10] Cut to the creator talking directly to camera in a casual indoor setup, hands gesturing upward as he introduces the workflow and hooks viewers with the promise of showing the full process. [00:10-00:18] Alternate between creator face-cam, finished AI shots, and screen-style panels showing thumbnails or interface blocks, making it clear that multiple scenes are being built inside one pipeline. [00:18-00:28] Include more practical inserts: example frames, real-world pose or filming inspiration, and workflow interface layouts that suggest prompt control, shot planning, and visual refinement. [00:28-00:40] Keep cycling between explanation and proof, with the creator speaking in short, punchy segments while the examples show the quality ceiling of the method. [00:40-00:56] End with a clearer recap feel: more screen panels, more finished outputs, and a final face-cam summary that reinforces this as a repeatable Freepik Spaces plus Kling production workflow. 
NEGATIVE PROMPT: dry webinar, plain slideshow only, no example outputs, stiff face-cam, dark podcast studio, random office footage, unreadable UI, over-designed captions everywhere, broken hands, uncanny face, robotic speech, disconnected examples, generic stock footage, text-heavy PowerPoint feel, poor pacing, muddy screen inserts, lip-sync errors, low-quality AI art, unrelated memes. SHOT PROMPT DELTAS: 1) Aspirational example frame with tutorial hook text treatment. 2) Casual creator face-cam explaining workflow. 3) Screen-style interface panels and scene thumbnails. 4) Example cinematic outputs paired with explanation. 5) Final recap with tools, outputs, and creator closeout. SPEECH PACK: [00:00-00:56] One male speaker throughout. Tone should be concise, confident, and creator-educational, explaining how to structure prompts, build shots, and use Freepik Spaces with Kling 3.0 to generate cinematic AI videos. Medium lip-sync strictness when on-camera.
GLOBAL LOCK: Subject is a Black male in his mid-20s, long dark dreadlocks, wearing a black fur-trimmed ushanka hat, a black quilted leather jacket, a large silver star pendant necklace, and dark sunglasses. Skin tone is deep with warm undertones. Environment is a dry, yellow-grass field at golden hour with a blurred black pickup truck in the background. Cinematic editorial style, 70mm lens feel, shallow depth of field, warm color grade with high contrast. Speech is energetic, confident, and rhythmic. [00:00–00:03] The subject is initially sitting in a camping chair but suddenly jumps up and lunges to the left as a large black pickup truck speeds past him from behind, narrowly missing him. Handheld camera shake to simulate impact. Dust and debris kick up from the ground. Lighting is strong golden hour backlight. [00:03–00:07] Medium close-up of the subject standing, facing the camera directly. He is talking with high energy, gesturing with his hands to emphasize his points. The background shows the truck stopped in the distance with dust settling. Lips are clearly visible and synced to the dialogue: "This is how you make a hook with AI in only three simple steps." [00:07–00:13] Split screen or overlay showing the subject on the bottom and a digital UI (Arcads) on top. The UI shows a text prompt being typed: "A Black man stands casually directly facing the camera... rugged off-road truck... golden hour." The subject continues talking, pointing upwards toward the UI. [00:13–00:18] Full-screen cinematic shot of the generated image: A Black man in a dark utility jacket standing in front of a truck that has just skidded to a stop, surrounded by a massive cloud of dust and smoke. The man is perfectly still while the dust particles are suspended in the air. High-end editorial magazine aesthetic. [00:18–00:22] Return to the subject in the field, medium close-up. He is smiling and pointing at the camera. 
A UI overlay for "Video Settings" (Kling 3.0) appears next to him, showing a cursor selecting "Kling 3.0" from a dropdown menu. Dialogue: "Then take that generated image, animate it with Kling 3.0..." [00:22–00:26] The subject walks back to his camping chair and sits down casually. He picks up a silver water bottle and takes a sip. A large text overlay appears: "COMMENT HOOK FOR THE FORMULA." The camera slowly zooms out. The lighting is a soft, fading golden hour glow. NEGATIVE PROMPT: Visuals: cartoonish, low resolution, blurry face, inconsistent clothing, extra limbs, static dust, flat lighting, cold colors, robotic movement, flickering background. Speech: monotone delivery, robotic cadence, muffled audio, background noise, lip-sync mismatch, stuttering, flat intonation. SPEECH PACK: [00:03–00:07] Transcript: "This is how you make a hook with AI in only three simple steps." TAKE_A: (Energetic, fast-paced, emphasis on "HOOK" and "THREE") TAKE_B: (Confident, instructional, slight pause after "AI") TAKE_C: (Hype-man style, loud and punchy) [00:07–00:13] Transcript: "First, take the time to gather the most absurd idea you have in the back of your mind and make a personalized prompt with them." TAKE_A: (Thoughtful, then building excitement) TAKE_B: (Clear, instructional, emphasis on "ABSURD IDEA") [00:22–00:26] Transcript: "Just comment HOOK below for my viral AI formula." TAKE_A: (Casual, inviting, pointing at camera) TAKE_B: (Direct, authoritative, emphasis on "HOOK")
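In the prompt above, the [00:18–00:22] beat quotes a line of dialogue, yet the SPEECH PACK lists no entry for that range. Whether that is intentional is not stated, but a quick comparison of the two timelines makes such differences visible. The sketch below assumes the ranges have already been copied out as strings; `segments_without_speech` is a hypothetical helper name.

```python
def segments_without_speech(visual: list[str], speech: list[str]) -> list[str]:
    """Return visual timecode ranges that have no matching SPEECH PACK entry."""
    covered = set(speech)
    return [segment for segment in visual if segment not in covered]


# Ranges copied from the prompt above.
visual = [
    "00:00–00:03",
    "00:03–00:07",
    "00:07–00:13",
    "00:13–00:18",
    "00:18–00:22",
    "00:22–00:26",
]
speech = ["00:03–00:07", "00:07–00:13", "00:22–00:26"]

missing = segments_without_speech(visual, speech)
```

Silent beats such as the opening stunt will show up in the result too, so the list is a prompt for review, not a list of errors.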
INVARIANTS TO LOCK
- Vertical 9:16 split-comparison Reel.
- Same young adult white male creator in every shot: light skin, slim build, side-swept brown hair, clean-shaven, expressive face.
- Neutral studio setup with soft gray background, clean frontal lighting, medium framing from chest to head.
- Video alternates between “Original:” and “AI:” versions of the same gesture performance.
- The AI versions keep the exact body movement and timing, but swap wardrobe, accessories, and visual effects.
- Tone is demo-first, highly legible, fast, and social-native.
SHOTLIST
1. [00:00-00:02] AI label over a dark tactical outfit, then a red-and-blue spider-inspired superhero suit, then a brown aviator jacket with patches and sunglasses. Matching “Original:” frames underneath show the presenter in a plain black shirt doing the same finger snap gesture.
2. [00:02-00:05] The comparison continues with the aviator look in a warmer room setting with vertical blinds and a plant, still mirroring the original hand choreography.
3. [00:05-00:07] Fire effects appear behind and around the AI version while the original remains clean and unstyled below.
4. [00:07-00:09] Large subtitle CTA appears over the AI version: comment “AI” for guide. Final frames push the fiery transformation while the original keeps the same open-handed pose.
STYLE BIBLE
Visual style: creator demo of motion-consistent character transformation.
Camera signature: locked tripod, eye-level medium shot, no camera movement.
Lighting signature: soft even front light on the original clip; AI variants maintain similar face lighting while changing wardrobe and environment mood.
Grade signature: clean studio neutrals in the original; richer contrast and warmer highlights in the AI versions.
Speech style: brief solo creator commentary or silent caption-driven demo; if voice is present, it should sound casual, impressed, and direct.
MASTER PROMPT GLOBAL LOCK: Create a vertical 9:16 Instagram Reel that compares an original studio performance against AI-transformed outputs. Use the same young adult white male creator with light skin, slim build, side-swept brown hair, and clean-shaven face throughout. Keep the original clip on a soft gray studio background with the creator in a plain fitted black shirt, medium framing, frontal lighting, and simple hand gestures. Every AI version must preserve identical timing, pose, eye line, and hand motion, while changing outfit, accessories, background mood, and effects. Use bold yellow labels “AI:” and “Original:” so the comparison is instantly readable. [00:00-00:02] Show the creator snapping or flicking his fingers in sync across paired comparison frames. In the AI version, first dress him in a dark armored tactical costume, then switch to a red-and-blue spider-inspired superhero suit, then to a brown aviator jacket with sewn patches and black sunglasses. In the original version, keep the same gesture in a plain black shirt against a gray backdrop. [00:02-00:05] Continue the gesture-matched comparison. The AI variant now settles into the aviator look in a warmer cinematic room with vertical blinds and a leafy plant, preserving exact mouth shape and hand timing from the original clip. The original remains unchanged below, emphasizing how the motion has been transferred rather than reanimated from scratch. [00:05-00:07] Add stylized flames behind the AI character and subtle orange light wrapping around the jacket sleeves. Keep the original clip clean and neutral for contrast. Maintain sharp alignment between both performances so viewers can read the transformation as one-to-one motion mapping. [00:07-00:09] End with the most dramatic fiery aviator transformation while overlaying a clear CTA: comment “AI” for guide. The original clip still mirrors the same open-handed pose. Finish on a high-energy, creator-demo beat. 
NEGATIVE PROMPT Do not drift the face identity, hairstyle, body proportions, or gesture timing between original and AI versions. Avoid extra fingers, broken sunglasses, distorted jacket patches, muddy flames, inconsistent eye direction, unreadable labels, flickering backgrounds, or cartoonish facial deformation. Do not let the AI transformation lose the exact one-to-one motion match with the original clip. SPEECH PACK [00:00-00:04] Speaker A, direct-to-camera, meaning: this is how the same motion can be restyled with AI. Delivery: short, confident, creator-demo cadence. TAKE_A: “Same motion, completely different character styling.” TAKE_B: “This is the exact same performance, just transformed with AI.” TAKE_C: “Watch how the motion stays locked while the look changes.” [00:04-00:09] Speaker A or on-screen text, meaning: these tools save creators time and a guide is available by comment. Delivery: casual CTA. TAKE_A: “Comment AI if you want the full guide.” TAKE_B: “If you want the workflow, comment AI below.” TAKE_C: “Comment AI and I will send the guide.”
MASTER PROMPT GLOBAL LOCK: Vertical 9:16 creator explainer reel. Lower half shows one male creator in a warm studio, fair skin, brown side-parted hair, slim build, dark shirt, speaking into camera with fast creator energy. Upper half shows cinematic AI action visuals with dark backgrounds, strong orange firelight, rugged male hero styling, and motion-enhanced fantasy frames. Keep the host stable and readable while the upper visuals deliver the proof. Audio is one male speaker, close mic, dry room, fast CTA cadence. [00:00-00:03] Split-screen opening. Upper half shows a dramatic fantasy action frame with fire, smoke, and a rugged male character. Lower half shows the host addressing camera and asking viewers to comment AI for the link and quick guide. Strong contrast, warm studio below, dark cinematic image above. [00:03-00:06] The upper visual changes to another fire-lit action shot in the same style family. The host explains that Kling Motion Control is his favorite AI tool right now, especially inside Higgsfield. Keep the edit fast, clean, and social-native. [00:06-00:09] Finish on one more strong action visual while the host repeats the CTA. Preserve the orange-black grade, premium game-trailer feel, and simple direct recommendation energy. NEGATIVE PROMPT Avoid muddy firelight, broken armor or costume details, plastic skin, weak motion, unreadable split-screen layout, host identity drift, robotic voice, bad lip sync, and sloppy action-frame artifacts. SPEECH PACK [00:00-00:03] Closest audible: Comment AI and I will send you the link and a quick guide. Safe paraphrase: Open with a simple keyword CTA tied to a guide. [00:03-00:06] Closest audible: Kling Motion Control is my favorite AI tool so far, especially inside Higgsfield. Safe paraphrase: He recommends the workflow as practical and fun. [00:06-00:09] Closest audible: Comment AI for the full guide. Safe paraphrase: Close by repeating the same easy CTA.
GLOBAL LOCK: A Caucasian male in his mid-30s with short, light brown hair and a full, well-groomed beard. He is wearing a vibrant, solid red crewneck sweatshirt. The environment is the interior of a vintage luxury car, featuring tan leather upholstery and a polished dark wood dashboard. The lighting is warm, golden-hour sunlight coming from the side. The camera is positioned for a medium side-profile shot. [00:00–00:07] The man is seated in the driver's seat of a classic Rolls-Royce. His right hand is firmly gripping the black steering wheel, while his left arm rests naturally. He is looking straight ahead through the windshield with a focused but calm expression. Outside the window, a scenic coastal highway is visible, with the blue ocean and green cliffs rushing past in a motion-blurred parallax effect. The steering wheel has subtle, realistic micro-movements as if he is steering. The sunlight creates a sharp rim light on his beard and hair. The texture of the red sweatshirt and the grain of the wood dashboard are highly detailed. No speech is present, but the man's jaw is set firmly. The camera has a very slight handheld shake to simulate the vibration of a moving car. High-quality cinematic film stock appearance with natural color saturation. NEGATIVE PROMPT: blurry face, inconsistent beard shape, changing sweatshirt color, distorted car interior, static background, robotic movement, flickering lighting, extra fingers, morphing steering wheel, low resolution, cartoonish texture, text overlays, logos.
GLOBAL LOCK: Subject: A 25-year-old Caucasian woman, radiant skin with natural texture and visible pores, long straight blonde hair with subtle highlights, bright blue eyes, wearing bold glossy red lipstick. Wardrobe: Simple neutral-colored top, thin straps visible. Prop: Holding a small pink jar of "LANEIGE Lip Sleeping Mask" in her right hand, positioned near her chin. Environment: Indoor bedroom setting, soft-focus background with hints of green plants and white walls. Lighting: Strong natural sunlight from the side (golden hour), creating high-contrast highlights on the face and soft shadows. Color Grade: Warm, vibrant, high saturation on reds and pinks, cinematic editorial look. Camera: Medium Close-Up (MCU), static position, slight handheld micro-jitter for realism. Speech: Female voice, warm, energetic, clear articulation, medium pace. [00:00–00:03] Subject is looking directly into the camera lens with a friendly, knowing smile. She holds the pink jar steady. Her lips begin to move in perfect sync with the words: "This is my secret to waking up with smooth hydrated lips." Her head tilts slightly to the left as she emphasizes "secret." The sunlight catches the gloss on her lips and the highlights in her hair. [00:03–00:06] Subject continues speaking: "This mask works while I dream." She blinks naturally once. Her expression is soft and convincing. The camera remains in a tight MCU. The pink jar remains visible in the frame, slightly catching the light. [00:06–00:08] Subject finishes the sentence with a slight nod and a wider smile, maintaining eye contact with the camera. The video ends on a high-energy, positive note. The background remains softly blurred. NEGATIVE PROMPT: Uncanny valley, plastic skin, missing teeth, distorted fingers on the hand holding the jar, blurry product label, robotic head movements, frozen eyes, mismatched lip-sync, flickering lighting, low resolution, watermark, text artifacts on skin, unnatural hair movement, popping shadows. 
SPEECH PACK: Transcript: "This is my secret to waking up with smooth hydrated lips. This mask works while I dream." TAKE_A (Energetic): "This is my **SECRET** to waking up with smooth... hydrated lips! This mask works while I **DREAM**." (High energy, emphasis on secret and dream). TAKE_B (Soft/Intimate): "This is my secret... to waking up with smooth, hydrated lips. [breath] This mask works while I dream." (Lower volume, more breathy, slower pace). TAKE_C (Direct/UGC): "This is my secret to waking up with smooth hydrated lips. This mask works while I dream!" (Fast-paced, casual, friendly). Prosody: Pause after "lips" (0.5s). Emphasis on "Secret" and "Dream". Sync: High strictness on "Secret", "Smooth", and "Dream". Mic: Close-proximity condenser mic feel, dry room tone, no reverb.
GLOBAL LOCK: The protagonist is a Black man in his late 20s with long, dark dreadlocks, wearing a black durag/headband, vibrant orange-tinted wrap-around sports sunglasses, and a plain black t-shirt. The setting is a bright, sunny Miami-style city with glass skyscrapers, palm trees, and wide asphalt roads. The lighting is high-contrast daylight with deep shadows and bright highlights. The camera style is cinematic action, utilizing low-angle tracking shots, handheld shakes for intensity, and wide-angle lenses for stunts. The color grade is vibrant with saturated blues and yellows. Speech is energetic and shouting, with high lip-sync strictness for on-camera dialogue. [00:00–00:05] Subject: Close-up of the protagonist riding a motorcycle. Action: He is leaning forward, shouting intensely at the camera while the city blurs behind him. Camera: Low-angle MCU, tracking the motorcycle's movement with a slight handheld vibration. Lighting: Hard sunlight hitting his face from the side. Speech: "Your AI suck because you let it guess for you. Do this instead." (Energetic, shouting over wind/engine). [00:05–00:08] Subject: The protagonist on his motorcycle performing a stunt. Action: The motorcycle launches into the air, jumping over a large white truck and a police car with flashing lights. Smoke and debris are visible on the ground. Camera: Wide shot, side profile, tracking the arc of the jump. Motion: High-speed motion with a slight slow-motion ramp at the peak of the jump. Environment: Busy city street with multiple police cars in pursuit. [00:08–00:11] Subject: Four diverse teenagers (Black and Asian) sitting on a city bus. Action: They are sitting side-by-side, looking down at their smartphones, appearing bored. Camera: Static medium shot from across the aisle. Lighting: Soft, natural light coming through the bus windows. Environment: Interior of a modern city bus with blue seats and "Wheelchair Priority Seating" signs. 
[00:11–00:16] Subject: The motorcycle chase continues. Action: The protagonist weaves through traffic. A quick cut to a tall, modern white skyscraper with glass balconies against a bright blue sky. Camera: Fast tracking shot of the motorcycle followed by a rapid tilt-up to the building. Lighting: High-key, brilliant sunshine. [00:16–00:20] Subject: A man in a dark suit and tie on a rooftop. Action: He is lying prone, aiming a modern sniper rifle with a large scope. Camera: Medium shot, side view, low-angle looking up at the rooftop edge. Environment: Clean, sunlit rooftop with a beige stone ledge. [00:20–00:24] Subject: POV through the sniper scope. Action: The crosshairs are centered on the protagonist riding the motorcycle on the road far below. Cars move quickly across the frame. Camera: POV through a circular scope mask, slight jitter to simulate aiming. Lighting: Slightly desaturated and technical. [00:24–00:28] Subject: The protagonist on the motorcycle and a woman in a red convertible sports car. Action: They are driving side-by-side at high speed under a concrete highway overpass. Camera: Low-angle tracking shot moving between the two vehicles. Environment: Urban underpass with large concrete pillars and palm trees in the background. [00:28–00:34] Subject: Close-up of the woman in the red car. Action: She has dark hair, heavy makeup (smudged), and is shouting to the motorcyclist while driving. Camera: MCU from the passenger side of the car. Speech: "You got it on. You brought it." (Urgent, shouting). [00:34–00:40] Subject: Close-up of the protagonist on the motorcycle. Action: He looks over at the woman and shouts back, then looks ahead. Camera: MCU, tracking his profile. Speech: "You mean the AI playbook to make the best AI visuals? Yeah, I've got it. And that's exactly why the cops are chasing us right now."
[00:40–00:46] Subject: The woman in the red car and the protagonist. Action: She shouts "Good! Get in!" and he prepares to jump from the moving motorcycle into the moving red convertible. Camera: Wide tracking shot showing both vehicles under the overpass. Speech: "Good! Get in! We've got the future in our hands." Motion: High-speed parallax between the cars and the concrete pillars. NEGATIVE PROMPT: Visual artifacts, morphing limbs, flickering backgrounds, inconsistent character features, blurry faces, robotic lip-sync, unnatural physics, muted colors, low contrast, text/logos on clothing, floating objects, temporal jitter. SPEECH PACK: [00:00-00:05] "Your AI suck because you let it guess for you. Do this instead." TAKE_A: (Aggressive, high energy) TAKE_B: (Authoritative, shouting) TAKE_C: (Fast-paced, urgent) [00:29-00:31] "You got it on. You brought it." TAKE_A: (Screaming over engine noise) TAKE_B: (Relieved but urgent) [00:31-00:36] "You mean the AI playbook to make the best AI visuals? Yeah, I've got it." TAKE_A: (Confident, shouting) TAKE_B: (Cool, collected despite the chase) [00:37-00:40] "Good! Get in! We've got the future in our hands." TAKE_A: (Exhilarated, shouting) TAKE_B: (Determined, loud)
GLOBAL LOCK: A vertical 9:16 AI demo video for Pollo.ai Mimic Motion featuring a male creator with short reddish-blond hair, fair skin, trimmed beard, and a light t-shirt speaking directly to camera in front of a warm wooden wall. A black podcast-style microphone sits in front of him. The key visual structure is a stacked comparison layout where the creator's exact expressions, head movement, hand gestures, and lip-sync are transferred onto multiple different characters. The swapped identities should include high-recognition fantasy and movie-inspired figures such as a Shrek-style ogre, a half-human cyborg reminiscent of Terminator, a Gollum-like creature, a Harry Potter-style wizard, a Pennywise-style clown, and a Tyler Durden-style gritty male lead. The demo should feel clear, fast, and proof-driven rather than cinematic storytelling. [00:00-00:10] Open on a three-panel stacked comparison. The top panel shows the original creator speaking with both hands raised and expressive brows. The middle and bottom panels show alternate characters performing the exact same mouth movement, gaze direction, and hand pose in sync. Start with obvious contrast pairings like Shrek and a cyborg face to make the motion transfer immediately readable. [00:10-00:24] Continue the stacked format while rotating through more dramatic character swaps. Show the same creator performance mapped onto a gaunt cave-dweller like Gollum, a young wizard in glasses, a white-faced clown with red makeup lines, and a gritty sunglass-wearing antihero. Each variant must preserve the exact source rhythm and gesture language, with only the identity layer changing. [00:24-00:35] Transition back to the original creator in a single full-screen talking-head view with the microphone clearly visible. Let him continue speaking and gesturing naturally so viewers understand that the earlier transformations all came from this simple source performance. Keep the overall tone instructional and creator-focused. 
NEGATIVE PROMPT: unsynced lip movement between variants, different poses in each comparison panel, heavy VFX clutter, cinematic story scenes replacing the demo structure, inaccurate parody costumes, random background changes, low-detail face swaps, no microphone or creator setup, generic montage without proof. SHOT PROMPTS: creator talking-head source video; stacked mimic motion comparison panels; Shrek-style face swap synced to creator; cyborg half-face character remap; Harry Potter and clown motion transfer demo; original creator talking to microphone after swaps. SPEECH PACK: One male speaker only. The important audio behavior is clean creator-style direct-to-camera speech with lip-sync accuracy preserved across every swapped character.
Create a short-form creator tutorial video about how to make cinematic AI clips from simple ideas. The piece should feel like an Instagram Reel or TikTok posted by an AI filmmaking educator, combining direct-to-camera instruction with polished cinematic sample shots and interface cutaways. Use a confident creator host in a dark studio or moody workspace, speaking naturally to camera while explaining a repeatable workflow for generating cinematic AI videos. The pacing should be fast, sharp, and social-first, with frequent visual resets to keep attention high. Open with a strong hook where the creator talks directly to camera and promises to show viewers how to make cinematic AI clips that feel dramatic, polished, and scroll-stopping. Then cut into multiple example shots that look like finished outputs: moody action moments, dramatic close-ups, atmospheric character scenes, and premium-looking cinematic frames. Intercut those examples with prompt panels, tool UI, timeline views, or settings screens so the workflow feels grounded in real AI video creation rather than abstract inspiration. The host should stay visually consistent across talking segments: same person, same wardrobe, same lighting setup, same direct creator-teacher tone. Their performance should feel natural and creator-native, not overly scripted. They should gesture casually, point toward on-screen examples, and deliver the lesson with energetic clarity, like someone used to teaching AI video tricks on social media. The visual design should alternate between two clear modes. Mode one is the tutorial studio setup: dark background, controlled lighting, crisp face detail, shallow depth of field, subtle color accents, and a premium creator-desk atmosphere. Mode two is the cinematic demo footage: dramatic compositions, intentional movement, filmic contrast, moody lighting, and stronger environmental storytelling. Keep cutting between those modes so the audience always sees both the result and the process. 
Keep the entire piece optimized for vertical video. For talking-head sections, use close-ups and medium close-ups with subtle push-ins or light handheld energy. For the cinematic examples, vary the framing with wides, dramatic close-ups, push-ins, tracking shots, and controlled motion that sells the idea of “cinematic” without becoming chaotic. Everything should feel curated and premium. Lighting is important. The host footage should use flattering key light with soft falloff and a clean but moody creator-studio look. The cinematic sample shots should lean harder into contrast, rim light, atmosphere, practicals, and dramatic highlight control. The overall grade should feel modern, contrasty, and polished, with rich blacks, sharp visual separation, and subtle filmic texture. Include insert shots of prompts, settings, or example workflow screens to reinforce the educational angle. These moments can show how ideas become prompts, how cinematic references are structured, or how the creator chooses scenes and visual style. The UI should feel real and useful, not decorative. The edit should stay fast and social-first: hook, creator explanation, cinematic example, interface proof, another teaching beat, then more examples. Use cuts, punch-ins, overlays, and visual comparison moments so the viewer always feels momentum. The final result should feel like a practical creator tutorial that teaches viewers how to make cinematic AI clips while also showcasing enough premium output to inspire them to try the workflow themselves.
GLOBAL LOCK: Horizontal creator-demo video set in a minimalist white studio built around a glossy retro-futurist red terminal or kiosk branded as an AI creation device. The cast includes a young blonde man with curly hair and casual-cool styling, plus a brunette woman in a black camisole or simple fitted top. The red terminal has a built-in screen that first shows a crude stick-figure face, then transitions into a modern AI interface associated with Hedra Agent. The style blends real-life creator demo energy with clean commercial staging: white cyclorama backdrop, bold red hardware centerpiece, yellow subtitle captions, and fast transitions into generated outputs. The core promise is that casual natural-language requests can be turned into structured prompts, AI tool recommendations, and finished visuals. [00:00-00:08] Open on a cinematic shot of the blonde man sitting in or beside a vintage car with bold yellow subtitle text. The mood feels like a lifestyle ad or stylized short film. The brunette woman appears in adjacent car shots, creating the impression of a polished generated scene. [00:08-00:14] A pink title card or interstitial appears, then the video cuts into the white studio setup with the retro red terminal. The brunette woman stands beside it while the blonde man faces the screen. Yellow subtitle captions carry the spoken explanation. [00:14-00:22] The terminal screen shows a simple stick figure, then switches to a Hedra-like interface asking what should be made today. This establishes the joke and the product capability at the same time: conversational input becomes creative output. [00:22-00:32] Show the interface more clearly. A prompt field, asset options, and example thumbnails appear as the system loads. The presenter explains that the agent can understand casual requests, structure prompts, and route them toward the right generation tools and settings. 
[00:32-00:42] Cut to the visual payoff: multiple styled versions of the same man appear side by side in different looks and outfits, demonstrating reference control and character transformation. The clean white background keeps attention on the generated variations and the tool logic above them. [00:42-00:54] End with more polished studio shots of the brunette woman beside the red terminal while the narration frames Hedra Agent as an easier way to generate strong AI visuals. The overall tone should feel like a product demo wrapped in a playful, high-concept studio vignette.
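Most of the prompts above lay out their shots as bracketed time windows such as [00:00-00:03]. The following is a minimal sketch, not part of any of the tools named above, for checking that a prompt's shot windows run back to back without gaps or overlaps. The function names `shot_windows` and `timeline_problems` are hypothetical, and the sketch assumes windows are written as MM:SS pairs joined by a hyphen or an en dash.

```python
import re

# Matches windows such as [00:00-00:03] or [00:00–00:03] (hyphen or en dash).
WINDOW = re.compile(r"\[(\d{2}):(\d{2})\s*[–-]\s*(\d{2}):(\d{2})\]")


def shot_windows(prompt: str) -> list[tuple[int, int]]:
    """Return (start, end) in seconds for each bracketed window, in text order."""
    return [
        (int(m1) * 60 + int(s1), int(m2) * 60 + int(s2))
        for m1, s1, m2, s2 in WINDOW.findall(prompt)
    ]


def timeline_problems(windows: list[tuple[int, int]]) -> list[str]:
    """Describe reversed windows, gaps, and overlaps between consecutive windows."""
    problems = []
    for start, end in windows:
        if end <= start:
            problems.append(f"empty or reversed window {start}-{end}s")
    for (_, prev_end), (next_start, _) in zip(windows, windows[1:]):
        if next_start > prev_end:
            problems.append(f"gap between {prev_end}s and {next_start}s")
        elif next_start < prev_end:
            problems.append(f"overlap between {next_start}s and {prev_end}s")
    return problems


if __name__ == "__main__":
    sample = "[00:00-00:03] Opening. [00:03–00:07] Second shot. [00:08-00:10] Close."
    print(timeline_problems(shot_windows(sample)))
```

Pass only the shot portion of a prompt (for example, the text before NEGATIVE PROMPT), since a SPEECH PACK repeats time windows and would otherwise be reported as overlaps.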
Image To Video AI Meme
Image To Video AI Meme is for users who already have a still meme image and want to animate it. The page should guide them toward examples and prompts that preserve the original joke, add motion without losing readability, and make static meme material new enough to post again.
The strongest angle is conversion. Users here are not looking for general meme generation. They want the specific workflow of taking an image they already understand and making it move. The copy should keep that narrow, practical use case explicit.
What this page should make clear:
- The workflow starts from an existing still meme image.
- The goal is to add motion without destroying the original joke.
- This style works for repostable remixes, animated reaction memes, and short-form updates to familiar formats.
- The best examples feel like a clean upgrade from image to motion, not a total rewrite.
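The points above can be sketched as a small prompt builder that starts from a description of the existing still and adds motion beats without touching the joke. This is an illustrative sketch only: the class `MemeMotionPrompt`, its fields, the default negatives, and the three-second beat length are all assumptions, and the output simply mirrors the GLOBAL LOCK / timestamp / NEGATIVE PROMPT layout used by the prompts on this page rather than any specific tool's required format.

```python
from dataclasses import dataclass, field


def mmss(seconds: int) -> str:
    """Format a second count as MM:SS."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


@dataclass
class MemeMotionPrompt:
    # Hypothetical fields: a description of the source still and the motion beats to add.
    still_description: str
    beats: list[str]
    seconds_per_beat: int = 3  # assumed beat length, not a tool requirement
    negatives: list[str] = field(default_factory=lambda: [
        "changed caption text", "identity drift", "unreadable framing",
    ])

    def render(self) -> str:
        """Assemble one prompt string: locked still first, then timed beats, then negatives."""
        parts = [
            "GLOBAL LOCK: Animate the provided still image. "
            f"{self.still_description} "
            "Keep the original framing, caption, and subject recognizable."
        ]
        for i, beat in enumerate(self.beats):
            start = i * self.seconds_per_beat
            end = start + self.seconds_per_beat
            parts.append(f"[{mmss(start)}-{mmss(end)}] {beat}")
        parts.append("NEGATIVE PROMPT: " + ", ".join(self.negatives) + ".")
        return " ".join(parts)


if __name__ == "__main__":
    prompt = MemeMotionPrompt(
        still_description="A cat stares at a salad.",
        beats=["The cat blinks slowly.", "The cat turns its head away."],
    )
    print(prompt.render())
```

Keeping the still description in the locked section and the motion in separate timed beats makes it easy to change the animation while leaving the recognizable parts of the meme fixed.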
FAQ
Q: What is an image-to-video AI meme? A: It is a workflow that takes a static meme image and turns it into an animated or video meme.
Q: Why do this instead of making a new meme? A: It keeps a familiar joke structure while making it feel fresh and more dynamic.
Q: What is it best for? A: Animated remixes, reaction updates, repostable meme motion, and bringing still images to life.