AI meme-from-photo pages are for people who already have the raw material and want the joke to land fast. The photo might be a selfie, a pet picture, a friend's reaction shot, or something chaotic from the camera roll. This page helps you compare ways to turn a photo into a meme that is easy to caption, quick to remix, and strong enough to post without much extra editing.

Video
A vertical educational social post built around the classic “distracted boyfriend” street photo composition. Place the original meme-like image near the top of a black background: a young man in a blue plaid short-sleeve shirt walks with his girlfriend on a busy European stone-paved street in daylight, but turns back over his shoulder to stare at another woman in a red sleeveless dress crossing the foreground. The girlfriend, wearing a light blue sleeveless top, looks at him with disbelief and irritation. Below the image, add the heading “Prompt” and a dense block of small yellowish-white text formatted like a detailed AI generation prompt describing subject positions, movement vectors, shallow depth of field, camera behavior, and cinematic grain. At the bottom, add a bright call-to-action line: “Save this post!” The overall design should feel like an AI prompt-education carousel cover turned into a short looping video: black background, meme image, compact typography, creator-tip format, high contrast, legible social layout.
Video
Create a vertical 9:16 futuristic AI product-promo visual centered on a hyper-realistic fashion portrait of a young woman with slicked-back hair, pale skin, blue-grey eyes, and bold matte red lipstick, wearing a reflective chrome silver high-collar outfit in a bright metallic environment filled with iridescent foil-like textures. Behind her, large bold yellow text reads Meta AI, integrated like a clean social-ad headline. The image should feel like a premium generative-AI campaign frame promoting free image generation and AI lip sync tools, combining polished beauty-editorial realism with tech branding. Keep the composition crisp, symmetrical, high contrast, and optimized for short-form creator marketing. No extra clutter, no subtitles, no cartoon styling, no unrelated props.
Video
GLOBAL LOCK: A vertical 9:16 creator-marketing Reel, approximately 33 seconds, built around one recurring host and a dark-mode AI character-generation interface. Keep three visual layers consistent across the whole video: (1) the host, a white male in his late 20s to early 30s with side-parted brown hair, slim build, expressive face, clean-shaven, wearing a fitted off-white knit sweater and speaking into a matte-black desktop microphone, lit by a warm amber key and soft vignetted studio background; (2) stylized portrait outputs of the same handsome male AI character, usually white, early 20s to early 30s, chiseled jaw, thick dark hair, slim-athletic build, shown in different fashion/editorial presets such as city streetwear, convenience-store candid, studio portrait, tank-top fashion, foggy road noir, cowboy desert, and black-and-white urban scenes; (3) Higgsfield.ai interface captures in dark mode featuring the Character section, Higgsfield Soul 2.0 highlighted in the left model list, a grid of example source faces, preset tiles labeled Editorials, Fashion, Street Photography, Double exposure, a bright lime-green Generate button with a coin cost indicator, and an Animate button on selected outputs. The pacing must stay aggressive and social-native with a new visual beat every one to two seconds, strong contrast between warm host footage and colder generated sample cards, crisp UI sharpness, black/charcoal backgrounds, neon-lime accent labels, and one energetic male speaker throughout with close-mic, dry, high-intelligibility audio. Lips are visible during all host sections and sync must feel tight.

[00:00-00:03] Start on a dark background with bold white uppercase text reading STOP DOING THIS, flanked by red X marks. Under the headline, show generic AI male portrait samples: first a black-coat city street shot, then a casual black sweater portrait, then another generic urban fashion image. The host appears in a rounded rectangle at the bottom, urgently raising one hand toward camera as if interrupting the viewer. Audio: same male host delivers a sharp pattern-break hook telling viewers to stop making the same boring AI character photos.

[00:03-00:07] Cut between the host in warm studio close-up and more bland sample outputs: a crouched white-sweater studio pose, a convenience-store fashion portrait with bomber jacket and bow tie, another convenience-store variation. The host points upward with both index fingers while speaking quickly. Camera on the host remains static medium close-up with 35mm to 50mm lens feel, shallow depth, warm amber falloff. Audio: one speaker, emphatic, corrective tone, lips fully visible.

[00:07-00:11] Introduce stronger preset-driven examples. Show a clean editorial portrait card labeled Editorials, then a Fashion preset with a white ribbed tank top, then Street Photography over a bright outdoor male portrait, then Double exposure with a grayscale silhouette overlay. Each sample occupies the upper two-thirds while the host continues in the lower panel. The transition rhythm should feel like flipping through creative options rather than a tutorial menu. Audio: host pivots from criticism to the better alternative.

[00:11-00:14] Briefly isolate the Higgsfield.ai logo on a dark bar, then cut to the platform interface. Show the Character tab area with Soul 2.0 in the model list highlighted and the host below continuing to explain. Use dark graphite UI, lime-green badges, and readable white text. Audio: same speaker names the tool and frames it as an easier route to ultra-realistic character creation.

[00:14-00:18] Show a grid of source reference portraits inside the character workflow: multiple male selfies and studio shots, the cursor hovering over them as if choosing a base identity. Host remains bottom-center, speaking calmly but with momentum. Emphasize that one character identity can be turned into many outputs. Audio: host explains consistency and customization, crisp consonants, no background reverb.

[00:18-00:21] Cut to a full-height preset card of a standing male figure against a white seamless with a lime Presets label, then to the generation composer showing a dark prompt box, a character token or preset mention, and a lime Generate button with a coin cost. Cursor movement should imply that generation is about to happen. Audio: host explains that the system can create polished images in a couple of clicks.

[00:21-00:24] Reveal generated outputs in different environments: a dark cinematic portrait of a bespectacled man, a convenience-store streetwear shot with Presets badge, and an outdoor coastal portrait with Animate highlighted in lime. The host gestures with one hand as if listing options. Color shifts between cool storefront daylight, neutral portrait lighting, and warm natural outdoor scenes while the UI frame stays dark.

[00:24-00:28] Expand the sample range further with a foggy road full-body shot in a long black coat, a desert cowboy standing in front of a stepped stone structure, and a top-down tank-top fashion portrait. These three outputs should feel dramatically different in location and styling while keeping premium realism and the same polished character aesthetic. Audio: same male narrator sells variety, speed, and realism for creators.

[00:28-00:31] Tighten into darker cinematic portraits: a serious close-up male face against a charcoal backdrop, then a black-and-white street portrait with overlaid CTA text Comment "AI", then a fashion portrait with the same CTA treatment. Keep typography large, bold, white, and lime-yellow, centered over the images. The host points upward from the bottom frame to reinforce the CTA timing.

[00:31-00:33] End on another fast CTA repetition using the strongest portrait samples while the host lands the final line. Maintain the warm studio box below, sharp microphone silhouette, and dark premium brand palette. Audio: one male speaker, punchy final comment-gate instruction, no fade, no music swell overpowering the words.

NEGATIVE PROMPT: avoid identity drift between generated male portraits, avoid uncanny skin texture, avoid distorted eyes or asymmetrical jawlines, avoid over-smoothed plastic faces, avoid broken hands in host gestures, avoid unreadable UI labels, avoid cluttered text overlays beyond STOP DOING THIS and Comment "AI", avoid fake logos, avoid low-resolution preset cards, avoid inconsistent sweater color on the host, avoid muddy shadows on the warm studio shot, avoid robotic speech, lip-sync mismatch, clipped peaks, harsh sibilance, or over-compressed voice.
Video
GLOBAL LOCK: High-definition screen recording of a professional web application interface (Freepik AI Suite). The UI is clean, minimalist, with a white and light gray color palette. The cursor is a standard black pointer. The video features a persistent black header at the top with white text "4. How to get started 👇" and a persistent black footer at the bottom with white text "Swipe for more —>".

[00:00–00:02]
The screen shows the "AI Suite" dashboard with categories for IMAGE, VIDEO, AUDIO, and DESIGN. The cursor moves smoothly to the "Image Editor" link under the "IMAGE" column and clicks it. The UI is bright and responsive.

[00:02–00:04]
The browser transitions to the "Image Editor" page. A file explorer window briefly appears over the interface. The cursor selects a file named "Google Nano Banana". The background of the editor shows a "Drop an image or video" area before the image loads.

[00:04–00:07]
The selected image loads into the center of the editor. The image is a cinematic, low-light portrait of a young woman with dark hair and a white top, holding a bright yellow banana near her face. The lighting is warm and urban. The editor UI shows tools like "Retouch," "Resize," and "Upscale" below the image.

[00:07–00:10]
The user clicks an "Add annotations" or "AI Edit" button. A small text input box appears over the banana in the image. The user types the phrase "add text 'nano banana'". A blue progress bar at the bottom right of the editor indicates the AI is processing the request. The camera remains static on the browser window.

NEGATIVE PROMPT: blurry UI, shaky camera, low resolution, messy desktop background, visible browser tabs, slow loading times, distorted faces in the AI image, robotic cursor movement, flickering screen, watermark on the UI.

SPEECH PACK:
(No speech present in the original video. The video relies on visual UI cues and background music.)
TRANSCRIPT: [Silence/Background Music]
DELIVERY_DIRECTION: N/A
MIC_ROOM_SIGNATURE: N/A
SYNC_REQUIREMENTS: Visual sync between cursor clicks and UI transitions is high priority.
Video

GLOBAL LOCK: A vertical 9:16 creator explainer video with a matte-black background and subtle neon grid-floor perspective, a large rounded-rectangle demo panel on the upper half showing Higgsfield x NanoBanana editing examples, and a bottom talking-head creator framed from chest up in a softly lit indoor room. The speaker is a white male creator in his late 20s to mid 30s with medium brown hair, short beard, light skin, wearing a beige baseball cap backwards and a slate-blue oversized T-shirt with cream sleeve/shoulder panels. Keep the top caption text locked in bright yellow-green reading “Higgsfield x NanoBanana” followed by a banana emoji. The upper demo panel should alternate between sketch-to-image, pose sketch editing, character/IP remix examples, product insertion, and draw-to-edit interface states with clear toolbar icons and a bright lime-green “Higgsfield” or “Generate” button. The style is creator-news meets product-demo: clean UI, high readability, quick example swaps, no cinematic camera movement, one presenter speaking directly to camera with energetic but controlled gestures. Speech is English direct-to-camera narration, one speaker only, close-mic, dry room sound, informative hype tone, with lips visible most of the time and cuts aligned to example changes.

[00:00-00:05] The video opens with the title “Higgsfield x NanoBanana” at the top over a dark background. In the large upper panel, a rough black-line sketch appears on a white canvas with small reference images tucked into the corners, showing a loose hand-drawn figure pose. The presenter appears in the lower third, facing camera and raising one hand while introducing the collaboration. Framing is static vertical medium shot, warm lamp light on the face, dark background around him, no extra text beyond the title. Speaker A introduces the partnership and signals that a powerful new editing capability is available.

[00:05-00:10] The top panel switches from sketch to a polished cinematic result resembling pop-culture character imagery, showing how the rough drawing can become a finished scene. The creator below leans in slightly and gestures with both hands, emphasizing the transformation. Maintain crisp UI borders and a clean black margin around the demo panel. Speaker A explains that the tool can take rough input and generate controlled visual outcomes.

[00:10-00:18] The upper examples continue rotating: a fashion-like full-body figure on a clean white stage, seated-pose line drawings, and a stylized scene with a man in dark clothes sitting in a sunlit interior while a branded bottle or product card appears at the side. The presenter keeps speaking with measured, open-palm gestures. The key idea is controllable composition, pose, and inserted elements rather than random generation.

[00:18-00:26] The demo panel moves into more explicit pose-control examples: a sketched figure carrying another body, with character references like Joker and Batman pinned in the corners, followed by drawn action silhouettes with face references. Keep the toolbar visible at the bottom of the upper panel and the bright action button readable. Speaker A explains the flexibility of using sketches, references, and image guidance to direct the final scene. Lips visible, medium lip-sync strictness, emphasis on edit control and freedom.

[00:26-00:38] A rapid set of sketch-to-scene and sketch-plus-reference examples continues, including drawn bodies, anime-like or stylized references, and dramatic generated outcomes. The presenter below stays constant, nodding and gesturing in rhythm with the example swaps. The tone should feel like “look how much control this gives you,” not a calm tutorial. No secondary speakers, no music-led montage logic.

[00:38-00:50] The top panel shifts to a more app-like frame with visible mode tabs such as “Draw to Edit” and “Draw to Video,” then shows a humorous generated image of the creator composited with a celebrity in matching tuxedo-like outfits holding prop weapons. The UI looks more like a final product window rather than a floating demo card. Speaker A stresses that the workflow is practical and fun for creators, not just a research toy.

[00:50-01:02.4] The ending holds on further edit examples and interface states, reinforcing that rough sketches, masks, and reference images can steer image edits with high fidelity. The presenter keeps speaking directly to camera, hands opening and closing as he lands the CTA. Finish with the sense that the feature is live, generous, and worth trying immediately. One speaker only, close and intelligible, no other dialogue.

NEGATIVE PROMPT: no second presenter, no podcast framing, no desktop clutter, no cinematic handheld motion, no dark horror grade, no missing top title, no wrong cap orientation, no inconsistent shirt colors, no melted faces, no distorted reference thumbnails, no unreadable toolbar, no broken sketch anatomy, no random extra UI windows, no fake watermark overload, no low-resolution outputs, no jitter between example swaps, no extra fingers, no robotic lip movement, no echo, no crowd noise, no background chatter, no subtitles unrelated to the observed title or UI.

SHOT PROMPTS:
[00:00-00:10] Black background with neon-grid floor, title “Higgsfield x NanoBanana”, upper panel showing sketch-to-image transformation, bottom talking-head creator in backwards beige cap and slate-blue shirt.
[00:10-00:26] Controlled editing showcase: body pose sketches, seated figure scene, branded product insert, reference-driven transformations, toolbar and bright green action button visible.
[00:26-00:38] More advanced sketch plus reference examples emphasizing pose control, identity guidance, and scene remixing while the creator speaks enthusiastically below.
[00:38-01:02.4] Product-window UI with Draw to Edit / Draw to Video modes and playful high-fidelity generated examples, creator closes with try-it-now energy.

SPEECH PACK:
[00:00-00:10] Speaker A: announces Higgsfield x NanoBanana and frames it as a big update for creators. TAKE_A: excited reveal. TAKE_B: cleaner product-news tone. TAKE_C: hype-driven introduction.
[00:10-00:18] Speaker A: explains that sketches and rough drawings can be turned into polished outputs with strong control. TAKE_A: practical tone. TAKE_B: slightly more amazed tone. TAKE_C: creator-benefit emphasis.
[00:18-00:26] Speaker A: says you can use pose guides, references, and edits to shape the scene you want. TAKE_A: workflow explanation. TAKE_B: feature-summary cadence. TAKE_C: punchier social-video cadence.
[00:26-00:50] Speaker A: expands on creative flexibility, showing character remixes, product insertions, and more expressive control than normal image generation. TAKE_A: informative. TAKE_B: feature-hype balance. TAKE_C: tool-for-creators framing.
[00:50-01:02.4] Speaker A: closes with urgency that the offer is live for Pro+ users and worth testing now, likely tied to a comment CTA. TAKE_A: clear CTA. TAKE_B: more urgent CTA. TAKE_C: softer invitation to try. Prosody markup: energetic sentence starts, brief pauses between examples, emphasis on tool names and control words. Closest audible version: creator explains Higgsfield x NanoBanana editing control and limited-time availability. Safe paraphrase version: one-speaker explainer about a sketch-and-reference-driven AI editor that creators should try this week.
Video
GLOBAL LOCK: preserve a creator-led talking-head tutorial format mixed with vertical phone screen recordings. Keep one young male creator in a backward black cap and dark hoodie speaking directly to camera in a studio setup with a microphone. Intercut iPhone-style screen captures showing ChatGPT/OpenAI image workflow steps, uploaded object photos, prompt entry, and AI video generation screens. Maintain a practical “make from your phone” educational reel structure. No random B-roll, no unrelated tools, no logo overlays beyond app UI already present in the source.

Create a 37.8-second social-first AI tutorial reel showing how to turn ordinary phone photos into animated AI character videos. Begin with a hook using a simple hand-held object photo and bold on-screen teaching posture from the creator. Then show phone interfaces: photo selection, ChatGPT or image-tool screens, prompt entry, image transformation results, switching to an AI video tool, uploading the generated image, entering a motion prompt, and generating the final animated output. Use repeated face-cam segments where the creator explains the steps and emphasizes that the workflow can be done from a phone.

Include the specific examples visible in the source: tiny object/food photos held in a hand, ChatGPT app icon and mobile interface, typed prompts that turn objects into cute expressive characters, a generated pear-like baby character image, a switch to another AI generation interface, upload and prompt steps for video, and a final generated moving result shown on-screen. Preserve the educational pacing and creator-marketing vibe.

SHOT SEGMENTS:
[00:00-00:06] Hook with object photos in hand and creator talking-head intro about making AI content from your phone.
[00:06-00:14] Mobile screens show ChatGPT / image workflow setup, app screens, and prompt entry.
[00:14-00:22] Creator explains the key steps while on-screen phone UI shows prompt refinement and generated object-to-character image outputs.
[00:22-00:30] The tutorial switches to an AI video tool, showing upload, prompt, and generation steps from the phone.
[00:30-00:37.8] Final result displays the generated animated character clip, while the creator closes with a call to try the workflow.

ENVIRONMENT: creator desk/studio face-cam plus crisp mobile screen recordings. CAMERA: direct-to-camera presenter shots alternating with full-screen phone UI captures. LIGHTING: clean creator-studio lighting on face-cam; bright legible phone UI on inserts. MOTION: tutorial pacing, finger taps on phone UI, creator emphasis gestures, no cinematic narrative scenes.

NEGATIVE PROMPT: generic AI ad montage, unrelated tools, desktop-only workflow, missing phone UI, missing creator face-cam, subtitles replacing the actual visible UI, blurry screens, watermark, logo overlays.

SPEECH PACK: creator-to-camera tutorial speech implied, but do not transcribe captions here.
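For anyone adapting the shot segments above, a quick sanity check on the timeline can catch gaps or overlaps before the prompt goes to a video tool. The sketch below is only an illustration: `parse_segments` and `check_timeline` are hypothetical helper names, not part of any tool mentioned here, and it assumes markers in the `[mm:ss-mm:ss]` form used in this prompt. The segment labels in `shot_list` are shortened placeholders.

```python
import re

# Matches markers such as [00:06-00:14] or [00:30-00:37.8]; accepts a hyphen or an en dash.
SEGMENT = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)[-–](\d+):(\d+(?:\.\d+)?)\]")


def parse_segments(text):
    """Return a list of (start_seconds, end_seconds) for each marker found in text."""
    segments = []
    for m1, s1, m2, s2 in SEGMENT.findall(text):
        segments.append((int(m1) * 60 + float(s1), int(m2) * 60 + float(s2)))
    return segments


def check_timeline(segments, total=None, tol=1e-6):
    """Return a list of human-readable problems; an empty list means the timeline is contiguous."""
    problems = []
    for i, (start, end) in enumerate(segments):
        if end <= start:
            problems.append(f"segment {i} ends at or before its start")
        if i > 0 and abs(start - segments[i - 1][1]) > tol:
            problems.append(f"segment {i} does not start where segment {i - 1} ends")
    if total is not None and segments and abs(segments[-1][1] - total) > tol:
        problems.append("last segment does not end at the stated total length")
    return problems


# Timestamps copied from the SHOT SEGMENTS list; labels are shortened placeholders.
shot_list = """
[00:00-00:06] Hook
[00:06-00:14] Setup
[00:14-00:22] Key steps
[00:22-00:30] Video tool
[00:30-00:37.8] Final result
"""

segments = parse_segments(shot_list)
print(check_timeline(segments, total=37.8))  # prints []
```

An empty list means the segments run back to back and end at the stated 37.8-second length. Any gap, overlap, or length mismatch shows up as a short message instead.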
Video
GLOBAL LOCK:
Subject is a young South Asian woman with long, straight dark hair parted in the middle. She wears a light lavender/purple long-sleeved ribbed top. She holds a small black Rode wireless microphone with a grey "deadcat" windscreen. The environment is a modern indoor room with a blurred bookshelf and a dark decorative object on the left. Lighting is soft, warm, and cinematic with a shallow depth of field. The video has a high-quality mobile aesthetic with a slight chromatic aberration effect at the edges. Speech is clear, energetic, and direct-to-camera.

[00:00–00:03]
The subject is in a medium close-up, looking directly at the camera, holding the microphone near her chest. She is speaking with an expressive face. Text "don't have a passport size picture" appears in green at the top.
[00:03–00:04]
A graphic overlay of a white government form (Election Commission of India) appears over the subject. The subject is still visible behind the form.
[00:04–00:07]
Screen recording of a mobile browser showing the Google Gemini interface. A finger/cursor clicks the "Upload" button and selects a selfie of the subject from a gallery.
[00:07–00:10]
The screen shows the Gemini chat box where the following prompt is typed: "Convert the attached portrait into a professional passport photo (35x45mm or 2x2in). Please replace the background with solid white (#FFFFFF), digitally swap the current attire for a formal blazer and white shirt, and remove any eyewear. Ensure studio-quality lighting, a neutral facial expression, and absolute likeness to the original subject. Output must be print-ready and high-resolution."
[00:10–00:13]
The screen reveals the generated image: the subject now appears in a professional navy blue blazer and white shirt against a plain white background. The subject (in real life) is shown in the background, gesturing with her hands.
[00:13–00:17]
The subject is back in full view, speaking. An overlay of a printing app UI ("Documents" and "Passport photos") appears, showing options for paper quality and quantity.
[00:17–00:22]
The subject continues speaking. An overlay of an "old" passport photo with a blue background and a stamp appears, creating a contrast with the new AI version.
[00:22–00:25]
Final shot of the subject speaking to the camera, gesturing for the viewer to follow. Text "the CYBORG girl" appears at the bottom with a glowing effect.

NEGATIVE PROMPT:
Visual: blurry face, inconsistent hair texture, distorted hands, robotic movements, flickering background, low resolution, harsh shadows, unnatural skin smoothing.
Speech: robotic voice, monotone delivery, background noise, lip-sync delay, muffled audio, heavy breathing sounds.

SPEECH PACK:
[00:00-00:03] "If you don't have a latest passport size picture for forms like this..."
[00:04-00:07] "Head to Gemini, upload a picture of yourself, preferably a selfie."
[00:08-00:10] "Prompt this, hit generate, and boom!"
[00:11-00:13] "You have a passport size photo ready as well."
[00:14-00:17] "You can use your printer or your five-minute apps to get it printed ASAP."
[00:18-00:22] "So you no longer have to look like this in your pictures or use your five-year-old passport size photos."
[00:23-00:25] "And for cool AI hacks as such, follow the Cyborg Girl for more."

TAKE_A (Energetic): High energy, fast pace, emphasis on "Gemini" and "Boom!"
TAKE_B (Helpful): Calm, instructional tone, steady pace.
TAKE_C (Casual): Relaxed, conversational, like talking to a friend.
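The Gemini prompt above asks for a print-ready 35x45mm or 2x2in photo. As a rough guide to what that means in pixels, the sketch below converts a print size to pixel dimensions. The 300 DPI default is an assumption (a common print resolution), not something stated in the source, and `print_pixels` is a hypothetical helper name.

```python
MM_PER_INCH = 25.4


def print_pixels(width, height, unit="mm", dpi=300):
    """Return (width_px, height_px) for a print size at the given DPI.

    dpi=300 is an assumed print resolution; adjust it to the printer's requirement.
    """
    if unit == "mm":
        width, height = width / MM_PER_INCH, height / MM_PER_INCH
    elif unit != "in":
        raise ValueError(f"unsupported unit: {unit}")
    return round(width * dpi), round(height * dpi)


print(print_pixels(35, 45))            # prints (413, 531)
print(print_pixels(2, 2, unit="in"))   # prints (600, 600)
```

If the generated image comes back smaller than these dimensions, it may look soft when printed at that size.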
Video
GLOBAL LOCK: The subject is a Caucasian male in his early 30s with medium-length, wavy brown hair and a full, well-groomed brown beard. He consistently wears a dark forest-green crewneck sweatshirt and a cream-colored trucker hat with a black "VANS" logo on the front. The lighting is bright, professional studio lighting. The video style is a high-energy montage of photorealistic AI-generated scenes mixed with a UI walkthrough.

[00:00–00:01]
Subject: Matthew McConaughey lookalike in a blue Dodgers jersey, holding a plastic cup of beer and a hot dog.
Environment: A sunny, crowded baseball stadium (Dodger Stadium) with "DODGERS WIN" on the big screen.
Action: Smiling broadly at the camera.
Camera: Medium shot, static.
Lighting: Bright, direct afternoon sunlight.
Grade: Saturated, vibrant colors.

[00:01–00:02]
Subject: Kai Cenat (Black male with dreadlocks) and Steve Jobs (older Caucasian male with glasses and black turtleneck).
Environment: A modern podcast studio with professional microphones and soundproofing.
Action: Kai is pointing and laughing; Steve Jobs is smiling and looking at a monitor.
Camera: Medium shot, side-by-side composition.
Lighting: Soft studio lighting with green LED accents in the background.

[00:02–00:04]
Subject: A basketball player in a white Lakers jersey being interviewed by a female reporter. A person in a giant yellow banana mascot suit stands behind them.
Environment: An indoor basketball arena (Crypto.com Arena) with "LAKERS WIN" on the screens.
Action: The reporter holds an ESPN microphone; the banana mascot waves.
Camera: Medium wide shot, broadcast TV style.
Lighting: Bright arena floodlights.

[00:04–00:06]
Subject: The GLOBAL LOCK subject (creator) wearing a teal-green "Squid Game" tracksuit with the number "456".
Environment: The glass bridge from Squid Game, high above a dark abyss.
Action: The subject is lying flat on a glass pane, looking down with a terrified expression.
Camera: High-angle shot looking down, then a low-angle shot looking up at him.
Lighting: Moody, dramatic, with cool blue and green tones.

[00:06–00:08]
Subject: The GLOBAL LOCK subject in the Squid Game tracksuit.
Environment: A CNN-style news studio with a "BREAKING NEWS" ticker that says "SQUID GAME 'SURVIVOR' SPEAKS OUT".
Action: The subject is being interviewed by a news anchor, gesturing with his hands while speaking.
Camera: Medium shot, over-the-shoulder of the anchor.
Lighting: Flat, bright newsroom lighting.

[00:08–00:10]
Subject: The GLOBAL LOCK subject and an older male commentator.
Environment: An F1 commentary booth overlooking a race track with cars speeding by in the rain.
Action: The subject is shouting into a headset, giving a "thumbs up" and looking ecstatic.
Camera: Medium shot inside the booth.
Lighting: Natural overcast light from the track mixed with warm interior booth lights.

[00:10–00:13]
Environment: A large, empty, modern white living room with light wood floors and large windows.
Action: Furniture (sofas, rugs, chairs, lamps) appears in a "pop-in" animation, fully furnishing the room.
Camera: Wide shot, static.
Lighting: Bright, airy, natural daylight.

[00:13–00:16]
Visual: A hand with a yellow pencil drawing a 6-panel storyboard.
Action: The sketches transform into finished, colored comic-book style panels showing a man drinking a Red Bull and gaining wings to run a race.
Camera: Top-down view of the paper.

[00:16–00:19]
Visual: A blue architectural blueprint of a two-story house.
Action: The blueprint seamlessly transitions into a photorealistic 3D render of the finished house with a green lawn and stone path.
Camera: Front elevation view.

[00:19–00:22]
Subject: The GLOBAL LOCK subject.
Action: An extreme close-up of his face, focusing on the eye and skin texture.
Camera: Extreme close-up (ECU).
Lighting: Soft, directional light highlighting skin pores and beard detail.
Text: "4K Resolution" overlays the screen.

[00:22–00:35]
Visual: Screen recording of the Higgsfield AI interface.
Action: A cursor navigates through "Explore", "Image", and selects "Nano Banana Pro". A face photo of the subject is uploaded. A prompt is typed into the box: "the bachelor tv show, with the tv ui interface around it". The "1k" quality button is clicked, showing a dropdown for "4k". The "Generate" button is pressed.

[00:35–00:40]
Subject: The GLOBAL LOCK subject in a white t-shirt and his "Vans" hat.
Environment: The set of "The Bachelor" finale, with a host and several female contestants in evening gowns on couches.
Action: The subject is sitting on the couch, looking slightly awkward but smiling, clapping his hands.
Camera: Wide shot of the set, then a medium shot of the subject.
Lighting: Warm, high-key romantic studio lighting.

NEGATIVE PROMPT: robotic movement, distorted faces, inconsistent beard growth, blurry textures, low resolution, flickering lights, extra fingers, warped background architecture, unnatural lip-sync, watermarks, text logos on clothing (except VANS), jittery camera motion.

SPEECH PACK:
[00:00–00:01] "Holy sh*t, Google's done it again." (TAKE_A: High energy, shocked. TAKE_B: Fast, breathless. TAKE_C: Deep, impressed.)
[00:01–00:04] "You can now create AI imagery that is so realistic, that it's indistinguishable from reality." (TAKE_A: Authoritative, clear. TAKE_B: Enthusiastic, rhythmic. TAKE_C: Slow, emphasizing 'indistinguishable'.)
[00:04–00:10] "And you can even be the main character in any scene that you can dream of." (TAKE_A: Personal, inviting. TAKE_B: Fast-paced, exciting. TAKE_C: Warm, storytelling tone.)
[00:10–00:19] "You can upload six reference images and combine it into one scene. And the creative application that people are using this for right now is genuinely mind-blowing." (TAKE_A: Informative, steady. TAKE_B: Punchy on 'mind-blowing'. TAKE_C: Professional, instructional.)
[00:19–00:22] "The crazy part is that you can generate images in 4k resolution." (TAKE_A: Whispered excitement. TAKE_B: Direct to camera, confident. TAKE_C: Emphasizing '4k'.)
[00:22–00:35] "To access it, go to Higgsfield and go to image and select Nano Banana Pro. From here, upload a reference image of your face and put in a basic prompt. Select this button and you can generate images in 4k resolution and it's unlimited with 65% off right now." (TAKE_A: Fast tutorial pace. TAKE_B: Clear, step-by-step. TAKE_C: Sales-oriented, energetic.)
[00:35–00:40] "So if you want to try it out, type AI in the comments and I'll send you the link." (TAKE_A: Direct CTA, friendly. TAKE_B: Pointing up, engaging. TAKE_C: Casual, helpful.)
Video
GLOBAL LOCK:
The video features a white male creator in his mid-30s with medium-length, wavy brown hair and a groomed beard, wearing a clean white t-shirt. He is positioned in a bright home office with a professional black condenser microphone on a boom arm in the foreground. The video uses a split-screen or multi-panel layout to compare "Source Video" (the creator) with "AI Generated Results" (various celebrities and characters). The AI characters must perfectly mirror the creator's head tilt, facial expressions, lip-sync, and hand gestures. The lighting is soft, natural window light from the side. The color grade is clean and realistic.

[00:00–00:03]
The screen is split into three vertical panels. Top panel: The creator waves both hands excitedly and points to his right. Middle panel: Sabrina Carpenter in a pink feathered dress mimics the exact hand wave and pointing. Bottom panel: Billie Eilish in a black outfit and sunglasses mimics the same gestures. High-fidelity lip-sync as they all say "Hear me out."

[00:03–00:07]
The layout shifts. Top panel: Creator continues talking with expansive hand gestures. Middle panel: Taylor Swift in a red dress mimics the gestures. Bottom panel: Kim Kardashian in a black tank top mimics the gestures. The transitions between characters are sharp cuts.

[00:07–00:10]
Split screen: Creator (top) vs. Queen Elizabeth II (bottom). The creator looks to his left and then back to the camera with a skeptical expression. The Queen, wearing a crown and sash, mirrors the look perfectly.

[00:10–00:13]
Split screen: Creator (top) vs. Edna Mode from The Incredibles (bottom). The creator scratches the top of his head with his right hand. Edna Mode, with her signature bob and glasses, scratches her head in perfect sync.

[00:13–00:20]
A screen recording of a software interface (Enhancor). A cursor selects the "Wan2.2" model from a dropdown menu. The UI shows a "Source Video" of the creator and a "Character Image" of a woman. The cursor toggles "Pro Mode" on and adjusts resolution to 720p.

[00:20–00:23]
Split screen: Creator (top) vs. a woman with long brown hair in a floral dress (bottom). They are both in the same room. The creator raises his hands in a "stop" gesture; the woman mirrors him perfectly.

[00:23–00:27]
The UI returns, showing the "Photo Animate" tab being selected. A different reference photo of the same woman is used. The cursor clicks "Generate Video."

[00:27–00:35]
Final comparison. Split screen: Creator (top) vs. the woman (bottom). The creator looks around the room and then smiles at the camera while touching his hair. The woman mirrors the hair-touching and the smile, but her background is now a different indoor setting matching her reference photo. The text "AI" appears centered on the screen.

NEGATIVE PROMPT:
Visual: flickering faces, distorted limbs, extra fingers, blurry textures, face-swapping artifacts, unnatural skin smoothing, background warping, robotic movements, low resolution, watermarks.
Speech: robotic voice, mismatched lip-sync, muffled audio, background noise, unnatural pauses, clipping audio.

SPEECH PACK:
[00:00–00:07]
Transcript: "Hear me out, all of your favorite movies and animations are going to be completely acted out by someone else in the next two years."
TAKE_A: Energetic, fast-paced, direct-to-camera.
TAKE_B: Mysterious, slightly slower, emphasizing "completely."
TAKE_C: Casual, conversational, like a friend sharing a secret.

[00:07–00:13]
Transcript: "So I'm going to teach you everything you need to know about this in the next 20 seconds so that you can do this for yourself and stay ahead of the curve."
TAKE_A: Authoritative, instructional, rhythmic.
TAKE_B: Helpful, warm, encouraging.
TAKE_C: Urgent, fast-talking to fit the "20 seconds" claim.

[00:13–00:35]
Transcript: "So right now you have two options with this new AI video model called Wan 2.2. The first option is Character Swap... The second option is Photo Animate... This is absolutely mind-blowing. Comment AI for the link."
TAKE_A: Professional narrator style, clear enunciation.
TAKE_B: Enthusiastic, high energy on "mind-blowing."
TAKE_C: Calm, tech-reviewer tone, clear CTA at the end.
Video
GLOBAL LOCK:
Subject is a Caucasian male in his early 30s, dark wavy hair, well-groomed medium-length beard, expressive brown eyes. He maintains a consistent facial structure across all shots. The visual style is a mix of high-end editorial photography and UGC tutorial footage. Lighting is cinematic with soft key lights and motivated rim lighting. Color grade is professional with deep blacks and vibrant but natural skin tones. Speech is clear, energetic, and instructional, delivered with a warm, authoritative tone.

[00:00–00:01]
Subject: MCU of the man wearing a dark suit, white dress shirt, black tie, and a white baseball cap with a green brim.
Action: Talking directly to the camera. A vertical white rectangular mask moves across his face, revealing a slightly different version of the same scene.
Camera: Static MCU, eye-level.
Lighting: Soft studio lighting, neutral background.
Speech: "This is how you can create..."

[00:01–00:04]
Subject: Rapid montage of AI-generated images. 
1. Man in a dark suit and sunglasses driving a green car at night, "AI MAG" text overlay.
2. Man in a checkered blazer and paisley tie in front of a brick wall.
3. Man in a white short-sleeve shirt with multiple pens in his pocket, standing in a white studio.
Action: Static editorial poses.
Camera: Various (MS, MCU).
Lighting: Cinematic, high contrast, nighttime car lighting, studio softbox.
Grade: Magazine editorial style.

[00:05–00:08]
Subject: A 3x4 grid of 12 different AI portraits of the same man in various outfits (boxing gloves, red car, street style, suit).
Action: Static images.
Overlay: Large bold text "UNLIMITED GENERATIONS" in orange and blue.
Camera: Flat grid layout.
Lighting: Varied per image.

[00:09–00:14]
Environment: Screen recording of the Higgsfield.ai website interface. A cursor moves to click "Image" then "Soul ID Character".
Action: UI navigation.
Speech: "On Higgsfield.ai, go to image and select Soul ID Character..."

[00:15–00:20]
Subject: Picture-in-picture of the man talking (wearing a tan cap and beige shirt) over a screen recording of the "Make Your Own Character" page.
Action: Explaining the process while gesturing.
Speech: "...where you can actually create your own custom character of yourself by uploading a bunch of photos."

[00:21–00:24]
Subject: Montage of AI images with text prompts.
1. Man in a suit drinking from a glass (trippy lens effect).
2. Man in a tan suit with a "Micky Mouse Bag" in a city street.
3. Man in a white tank top and jeans in front of a "Tokyo Red Car".
Action: Posing.
Camera: Full body and MS.
Lighting: Bright daylight, stylized urban lighting.

[00:25–00:34]
Environment: Screen recording of the "Lipsync Studio" interface. Subject's PIP continues.
Action: Selecting "Video", then "Lipsync Studio", uploading an image of himself at the beach, and dragging an audio file named "voiceover.wav".
Speech: "Now you can go to video at the top of the page and select the Lipsync Studio where you can upload your photo and audio..."

[00:35–00:38]
Subject: CU of the man at a tropical beach. He is shirtless, wearing black swimming goggles on his head.
Action: He is lip-syncing perfectly to the audio, smiling slightly.
Environment: Bright blue ocean water with small waves in the background.
Camera: CU, static.
Lighting: Bright, direct sunlight with natural shadows.
Speech: "...and it will combine those two together with the best lip-sync models."

NEGATIVE PROMPT:
Visual: robotic movement, distorted facial features, inconsistent beard growth, blurry textures, flickering background, extra fingers, warped UI elements, low resolution, watermarks.
Speech: robotic monotone, lip-sync delay, muffled audio, background hiss, unnatural pauses, slurred consonants, popping sounds.

SPEECH PACK:
[00:00–00:08]
Transcript: "This is how you can create 25 magazine-ready images of yourself using AI and then you can even lip-sync on top of them with this brand new feature."
TAKE_A: (Energetic, fast-paced) "This is how you can create TWENTY-FIVE magazine-ready images of yourself using AI... and then you can even LIP-SYNC on top of them with this brand new feature!"

[00:09–00:20]
Transcript: "On Higgsfield.ai, go to image and select Soul ID Character where you can actually create your own custom character of yourself by uploading a bunch of photos."
TAKE_A: (Instructional, clear) "On Higgsfield dot A-I, go to image and select Soul I-D Character... where you can actually create your own custom character of yourself... by uploading a bunch of photos."

[00:25–00:38]
Transcript: "Now you can go to video at the top of the page and select the Lipsync Studio where you can upload your photo and audio and it will combine those two together with the best lip-sync models."
TAKE_A: (Helpful, concluding) "Now you can go to video at the top of the page and select the Lipsync Studio... where you can upload your photo and audio... and it will combine those two together with the best lip-sync models."
Video
GLOBAL LOCK: a polished vertical talking-head tutorial video for social media, one young woman presenter with straight brown hair speaking directly to camera in a clean bright studio-like setup, wearing a fitted dark top, centered medium close-up framing, confident explanatory delivery, modern creator-education vibe, intercut with screen recordings of an AI image-generation interface and visual before/after examples, crisp subtitle-style text overlays, smooth jump cuts, high clarity, no background clutter.

[00:00-00:05] Open with the presenter speaking directly to camera in a clean bright room, introducing a better workflow for getting stronger AI image results, using clear on-screen keywords and a confident educational tone.

[00:05-00:12] Cut to example outputs while the presenter continues in voice-over: anime-style dancing woman clips, then beauty or portrait examples labeled with terms like Nano Banana Pro, showing that the method can improve prompt quality and consistency.

[00:12-00:18] Show more case studies: dramatic portrait emotion, facial realism, cinematic close-ups, and a workflow explanation that the improvement comes from using a prompt processor rather than typing basic prompts manually.

[00:18-00:25] Move into screen-recorded UI footage of an AI generation interface where prompt settings and toggles are adjusted, the cursor selecting options and triggering generation, while the presenter explains how the processor expands simple ideas into stronger structured prompts.

[00:25-00:32] Intercut more examples such as cartoon/comic outputs, newspaper or infographic-style results, skincare-detail close-ups, and realistic texture samples, emphasizing how the same system improves style control across different use cases.

[00:32-00:40] Return to the presenter full-screen for the close, speaking directly to camera with a call to action, explaining that viewers can comment to get the exact process or best settings, ending on a clean, creator-coach style talking-head frame.

NEGATIVE PROMPT: messy background, shaky vlog camera, dark room, casual selfie style, fantasy scenery, low-resolution UI, unreadable screen recording, inconsistent presenter identity, extra people, chaotic editing, missing tutorial feel, poor skin detail, text glitches, watermark clutter, low-quality stock footage.
Video
GLOBAL LOCK: 
Subject: A Caucasian male in his late 20s with a short brown beard and mustache. He wears a variety of casual headwear (green trucker hat, blue baseball cap, tan cap) and hoodies (brown, grey). 
Visual Style: Split-screen composition. Top half is cinematic, high-fidelity AI video with vibrant colors and professional grading. Bottom half is UGC-style, handheld or static phone footage in a domestic indoor setting with natural/practical lighting.
Consistency: The subject's facial features and beard must remain identical across all AI-generated scenes, matching the real person in the bottom half.
Speech: Energetic, fast-paced narration with clear enunciation. Mic is close-up, dry, and professional.

[00:00–00:06]
Top: A cinematic wide shot of the Leaning Tower of Pisa under a bright blue sky. The subject, wearing a white t-shirt and green trucker hat, stands in the foreground, smiling and holding his hands up as if leaning against the tower. High saturation, sharp details.
Bottom: The real subject in a home office, wearing a brown hoodie and blue cap, mimics the same pose against a black metal bookshelf.
Speech: "Hey, get this picture. No one's ever thought of this, but it's gonna look like I'm pushing the tower."

[00:07–00:09]
Top: A close-up of the subject in a snowy arctic environment, wearing a "Vans" t-shirt and green hat. He is playfully interacting with a large, realistic polar bear that is nuzzling his head. Cold blue color grade.
Bottom: The real subject in a hallway, wearing the brown hoodie, mimics the nuzzling motion against thin air.
Speech: "Okay guys, I'm here with a [unclear] polar bear."

[00:10–00:16]
Top: A wide shot at night in Giza, Egypt. The subject sits atop a camel with the Great Pyramid in the background. He is wearing a white t-shirt and shorts, gesturing with a "call me" sign. Warm, golden-hour lighting.
Bottom: The real subject sits on a white kitchen counter, mimicking the camel-riding posture and hand gesture.
Speech: "It's a Tuesday and I'm on a [unclear] camel. What do you mean you're at work? Just have your mom and dad pay for it."

[00:17–00:23]
Top: A low-angle medium shot in a sunny LA suburb. The subject sits on the ground in front of a bright red Ferrari. He wears a purple graphic tee and a tan hat, holding up a red car key fob. High-contrast, commercial aesthetic.
Bottom: The real subject sits on a wooden chair in his living room, holding up a small white object (a piece of cheese or soap) as if it were the key fob.
Speech: "Just bought my first car, age 23 by the way. Even got the keys. Whew!"

[00:24–00:48]
The video transitions to a full-screen UI walkthrough of the Higgsfield website. The subject appears in a circular talking-head overlay at the bottom.
Visuals: Cursor navigates through "Create Image," "Higgsfield Soul," "Character Upload," and "Motion Control" menus.
Final Shot (00:43-00:46): A split screen showing the subject in a high-end casino wearing a white tuxedo (AI) vs. the subject in his home office (Real), both performing a "come here" hand gesture.
Speech: "You can do this for yourself by going to Higgsfield. Select image, then go to Higgsfield Soul... upload a bunch of images of yourself... choose the style you want... then go to Kling Motion Control... upload your driving video and the image... and it will create this effect. Comment AI and I'll send you the link."

NEGATIVE PROMPT: Visual artifacts, flickering, face swapping glitches, inconsistent beard shape, blurry textures in the AI half, robotic lip-sync, muffled audio, background noise, distorted limbs, unnatural camel movement.

SPEECH PACK:
[00:00-00:06] "Hey, get this picture. No one's ever thought of this, but it's gonna look like I'm pushing the tower."
TAKE_A: (Excited, fast) "Hey, get this shot! Nobody's done this, it'll look like I'm holding up the tower!"
TAKE_B: (Sarcastic, deadpan) "Check this out. Totally original. I'm pushing the Leaning Tower."

[00:10-00:16] "It's a Tuesday and I'm on a camel. What do you mean you're at work? Just have your mom and dad pay for it."
TAKE_A: (Arrogant influencer tone) "Tuesday vibes on a camel. Why are you working? Just get your parents to fund it."
TAKE_B: (Playful) "Just riding a camel on a weekday. Work? Never heard of her. Ask your parents for the cash."
Video
GLOBAL LOCK:
Subject is a Caucasian male, mid-30s, with a well-groomed dark beard and mustache. In the cinematic sequence, he is wearing a full suit of polished silver medieval knight armor with intricate engravings. He wears a dark green baseball cap backwards, either under his helmet or on its own as a stylistic choice. The environment is a dramatic, smoky battlefield with an overcast, moody sky and orange flames/explosions in the background. The color grade is cinematic and desaturated, with high contrast and warm highlight roll-off from the fires. Camera movement is dynamic, following the subject.

[00:00–00:05]
Split-screen view. Bottom: Creator talking to camera in a white/black striped hoodie and "VANS" cap. Top: A dark digital interface showing a node-based workflow with lines connecting "Creation," "Text," and "Image Generator" boxes. The creator points down toward the microphone.

[00:05–00:10]
Top screen: A full-body photo of the male subject in a white t-shirt and striped pants against a white wall. The background of the photo then turns into a bright, solid green screen.

[00:10–00:15]
Top screen: Individual 3D-rendered silver armor pieces (gauntlet, chest plate, greaves) float around the subject on the green screen, then snap onto his body, replacing his clothes.

[00:15–00:25]
Top screen: The background is still a green screen. A majestic white horse asset appears, and the subject, now in full knight armor, is composited onto it so that he is seated in the saddle.

[00:25–00:45]
Top screen: A cinematic wide shot of the knight on the white horse galloping through a war-torn field. Thick grey smoke billows behind him. He holds a large red and green flag with a "GenHQ" logo that waves violently in the wind. Explosions of orange fire erupt in the background. The camera tracks the horse's movement with a slight handheld shake.

[00:45–00:51]
The cinematic knight sequence continues. Large white text "Comment 'AI'" is centered on the screen. The creator in the bottom frame continues to speak and gesture enthusiastically. The horse slows to a trot as the flag continues to wave.

NEGATIVE PROMPT:
Visual: robotic movement, distorted face, inconsistent armor textures, blurry horse legs, floating objects, cartoonish colors, low resolution, flickering lighting, extra limbs, text/logos other than specified.
Speech: robotic tone, muffled audio, background noise, lip-sync mismatch, stuttering, flat delivery.

SPEECH PACK:
[00:00–00:05]
"This new method of creating AI generated content gives us so much control over the output."
TAKE_A: (Enthusiastic, fast-paced) "This new method of creating AI generated content gives us so much control over the output!"
TAKE_B: (Authoritative, measured) "This new method... of creating AI generated content... gives us so much control over the output."
TAKE_C: (Casual, friendly) "Check this out—this new AI method gives you total control over what you're making."

[00:45–00:51]
"So if you want to try this out for yourself, type AI in the comments and I'll send you the link."
TAKE_A: (Direct, urgent) "So if you want to try this out for yourself, type AI in the comments and I'll send you the link!"
TAKE_B: (Warm, inviting) "Want the link? Just type AI in the comments and I'll send it right over."
TAKE_C: (Punchy, instructional) "Type AI below and I'll DM you the link to try this yourself."
Video
GLOBAL LOCK: 
Subject identity must transition from high-profile celebrities to a consistent female creator. 
Celebrity segment: Chris Hemsworth (Caucasian male, short blonde hair, beard, black suit), Sydney Sweeney (Caucasian female, blonde hair, red lipstick, black dress), Timothée Chalamet (Caucasian male, curly brown hair, black blazer), Zendaya (African-American/Caucasian female, slicked back hair, silver choker). 
Creator segment: Caucasian female, mid-20s, wavy light brown hair, wearing a beige/cream blazer over a ribbed tan top. 
Environment: High-end studio backgrounds (dark green, white, grey) for celebrities; modern, bright office/indoor setting for creator. 
Lighting: Professional studio lighting with soft key and rim lights. 
Color Grade: High saturation and contrast for hooks; neutral, warm tones for tutorial. 
Speech: Energetic, clear female voiceover, direct-to-camera delivery.

[00:00–00:02]
Extreme close-up of Chris Hemsworth, sharp focus on eyes, studio lighting against a dark green textured background. Rapid cut to Sydney Sweeney, front-facing, bright red lipstick, white background. Camera is static. High-contrast color grading.

[00:02–00:04]
Extreme close-up of Timothée Chalamet, neutral expression, curly hair detail visible. Rapid cut to Zendaya, split-screen effect showing a "before and after" lighting change on her face. Text overlay "Photographers are officially cooked" in yellow and white.

[00:04–00:07]
A 4-way grid appears featuring the previous four celebrity portraits. The grid is static, then zooms in slightly. The text overlay remains at the bottom.

[00:07–00:10]
Screen recording of the Google Gemini mobile interface. A thumb taps the "+" icon, selects a selfie of a woman in a beige blazer, and types the text "selfie of yourself". The UI is in dark mode. The movement is smooth and functional.

[00:10–00:12]
The screen recording shows the AI processing and then reveals a stunning, professional headshot of the woman. The woman has wavy brown hair and is wearing a professional beige suit in a blurred modern office background.

[00:12–00:14]
Cut to the real-life creator (matching the AI headshot's identity). She is in a medium close-up, gesturing with her hands, speaking directly to the camera. Her expression is enthusiastic. Background is a bright, out-of-focus indoor space.

[00:14–00:16]
The creator continues speaking. A large text overlay appears: Comment "Photo". She points towards the camera/text. The cut is clean. The audio is crisp with a slight room resonance.

NEGATIVE PROMPT: 
Blurry faces, distorted eyes, inconsistent hair textures between cuts, robotic voice, laggy screen recording, messy background, low lighting, oversaturated skin tones, visible AI artifacts on hands or clothing, text flickering.

SPEECH PACK:
[00:05–00:16]
"Photographers are officially cooked, because you can go to Google's Gemini, upload any basic selfie of yourself and get this stunning professional headshot. It's that simple. If you want to try this, comment 'Photo' and I'll send you the prompt."

TAKE_A (Energetic/Fast): "Photographers are officially COOKED! Go to Google Gemini, upload a selfie, and boom—professional headshot. Simple. Comment 'Photo' for the prompt!"
TAKE_B (Informative/Steady): "Photographers are officially cooked. You can now use Google Gemini to turn any basic selfie into a stunning professional headshot. If you want to try it, just comment 'Photo' and I'll send the prompt."
TAKE_C (Casual/Friendly): "So, photographers are officially cooked. Just go to Gemini, upload your selfie, and get a pro headshot instantly. Want the prompt? Comment 'Photo' below!"

Prosody: Emphasis on "COOKED", "Gemini", and "PHOTO". Short pause after "headshot".
Sync: High lip-sync strictness for the final 4 seconds. Phrase boundaries aligned to cuts at 00:12.
Video
GLOBAL LOCK: A vertical 9:16 AI demo video for Pollo.ai Mimic Motion featuring a male creator with short reddish-blond hair, fair skin, trimmed beard, and a light t-shirt speaking directly to camera in front of a warm wooden wall. A black podcast-style microphone sits in front of him. The key visual structure is a stacked comparison layout where the creator's exact expressions, head movement, hand gestures, and lip-sync are transferred onto multiple different characters. The swapped identities should include high-recognition fantasy and movie-inspired figures such as a Shrek-style ogre, a half-human cyborg reminiscent of Terminator, a Gollum-like creature, a Harry Potter-style wizard, a Pennywise-style clown, and a Tyler Durden-style gritty male lead. The demo should feel clear, fast, and proof-driven rather than cinematic storytelling.

[00:00-00:10] Open on a three-panel stacked comparison. The top panel shows the original creator speaking with both hands raised and expressive brows. The middle and bottom panels show alternate characters performing the exact same mouth movement, gaze direction, and hand pose in sync. Start with obvious contrast pairings like Shrek and a cyborg face to make the motion transfer immediately readable.

[00:10-00:24] Continue the stacked format while rotating through more dramatic character swaps. Show the same creator performance mapped onto a gaunt cave-dweller like Gollum, a young wizard in glasses, a white-faced clown with red makeup lines, and a gritty sunglasses-wearing antihero. Each variant must preserve the exact source rhythm and gesture language, with only the identity layer changing.

[00:24-00:35] Transition back to the original creator in a single full-screen talking-head view with the microphone clearly visible. Let him continue speaking and gesturing naturally so viewers understand that the earlier transformations all came from this simple source performance. Keep the overall tone instructional and creator-focused.

NEGATIVE PROMPT: unsynced lip movement between variants, different poses in each comparison panel, heavy VFX clutter, cinematic story scenes replacing the demo structure, inaccurate parody costumes, random background changes, low-detail face swaps, missing microphone or creator setup, generic montage without proof.

SHOT PROMPTS: creator talking-head source video; stacked mimic motion comparison panels; Shrek-style face swap synced to creator; cyborg half-face character remap; Harry Potter and clown motion transfer demo; original creator talking to microphone after swaps.

SPEECH PACK: One male speaker only. The important audio behavior is clean creator-style direct-to-camera speech with lip-sync accuracy preserved across every swapped character.
Video

INVARIANTS TO LOCK
- Vertical 9:16 split-comparison Reel.
- Same young adult white male creator in every shot: light skin, slim build, side-swept brown hair, clean-shaven, expressive face.
- Neutral studio setup with soft gray background, clean frontal lighting, medium framing from chest to head.
- Video alternates between “Original:” and “AI:” versions of the same gesture performance.
- The AI versions keep the exact body movement and timing, but swap wardrobe, accessories, and visual effects.
- Tone is demo-first, highly legible, fast, and social-native.

SHOTLIST
1. [00:00-00:02] AI label over a dark tactical outfit, then a red-and-blue spider-inspired superhero suit, then a brown aviator jacket with patches and sunglasses. Matching “Original:” frames underneath show the presenter in a plain black shirt doing the same finger snap gesture.
2. [00:02-00:05] The comparison continues with the aviator look in a warmer room setting with vertical blinds and a plant, still mirroring the original hand choreography.
3. [00:05-00:07] Fire effects appear behind and around the AI version while the original remains clean and unstyled below.
4. [00:07-00:09] Large subtitle CTA appears over the AI version: comment “AI” for guide. Final frames push the fiery transformation while the original keeps the same open-handed pose.

STYLE BIBLE
Visual style: creator demo of motion-consistent character transformation.
Camera signature: locked tripod, eye-level medium shot, no camera movement.
Lighting signature: soft even front light on the original clip; AI variants maintain similar face lighting while changing wardrobe and environment mood.
Grade signature: clean studio neutrals in the original; richer contrast and warmer highlights in the AI versions.
Speech style: brief solo creator commentary or silent caption-driven demo; if voice is present, it should sound casual, impressed, and direct.

MASTER PROMPT
GLOBAL LOCK: Create a vertical 9:16 Instagram Reel that compares an original studio performance against AI-transformed outputs. Use the same young adult white male creator with light skin, slim build, side-swept brown hair, and clean-shaven face throughout. Keep the original clip on a soft gray studio background with the creator in a plain fitted black shirt, medium framing, frontal lighting, and simple hand gestures. Every AI version must preserve identical timing, pose, eye line, and hand motion, while changing outfit, accessories, background mood, and effects. Use bold yellow labels “AI:” and “Original:” so the comparison is instantly readable.

[00:00-00:02] Show the creator snapping or flicking his fingers in sync across paired comparison frames. In the AI version, first dress him in a dark armored tactical costume, then switch to a red-and-blue spider-inspired superhero suit, then to a brown aviator jacket with sewn patches and black sunglasses. In the original version, keep the same gesture in a plain black shirt against a gray backdrop.

[00:02-00:05] Continue the gesture-matched comparison. The AI variant now settles into the aviator look in a warmer cinematic room with vertical blinds and a leafy plant, preserving exact mouth shape and hand timing from the original clip. The original remains unchanged below, emphasizing how the motion has been transferred rather than reanimated from scratch.

[00:05-00:07] Add stylized flames behind the AI character and subtle orange light wrapping around the jacket sleeves. Keep the original clip clean and neutral for contrast. Maintain sharp alignment between both performances so viewers can read the transformation as one-to-one motion mapping.

[00:07-00:09] End with the most dramatic fiery aviator transformation while overlaying a clear CTA: comment “AI” for guide. The original clip still mirrors the same open-handed pose. Finish on a high-energy, creator-demo beat.

NEGATIVE PROMPT
Do not drift the face identity, hairstyle, body proportions, or gesture timing between original and AI versions. Avoid extra fingers, broken sunglasses, distorted jacket patches, muddy flames, inconsistent eye direction, unreadable labels, flickering backgrounds, or cartoonish facial deformation. Do not let the AI transformation lose the exact one-to-one motion match with the original clip.

SPEECH PACK
[00:00-00:04] Speaker A, direct-to-camera, meaning: this is how the same motion can be restyled with AI. Delivery: short, confident, creator-demo cadence.
TAKE_A: “Same motion, completely different character styling.”
TAKE_B: “This is the exact same performance, just transformed with AI.”
TAKE_C: “Watch how the motion stays locked while the look changes.”

[00:04-00:09] Speaker A or on-screen text, meaning: these tools save creators time and a guide is available by comment. Delivery: casual CTA.
TAKE_A: “Comment AI if you want the full guide.”
TAKE_B: “If you want the workflow, comment AI below.”
TAKE_C: “Comment AI and I will send the guide.”

AI Meme from Photo

AI meme from photo content is most useful when it starts from the real workflow: someone already has an image they want to use. They are not looking for a blank meme canvas. They want to upload a photo and quickly get a funny direction, whether that means a caption suggestion, a known meme format, or a more absurd remix that makes the image feel instantly shareable.

The strongest examples in this category make the jump from photo to joke feel short. A good result keeps the original image recognizable while pushing it toward humor fast enough that the creator does not lose momentum. When you compare ideas on this page, focus on whether the workflow helps a normal camera-roll image become something people would actually repost, send to friends, or use in a story.

FAQ

What is AI meme from photo used for?

It is used to turn an existing photo into a meme with captions, joke formats, or AI-assisted remix ideas that are ready to share.

What kinds of photos work best?

Selfies, pet pictures, reaction images, and expressive snapshots usually work well because they already carry a clear emotion or setup.

Why do people use this instead of a normal meme editor?

Because the AI step can shorten the jump from uploaded image to usable joke, especially when the creator wants caption ideas quickly.

What should I compare on this page?

Compare how easily each approach turns a photo into a strong meme concept, how much manual editing is still needed, and whether the final output feels shareable.

AI Meme from Photo: Photo-to-Meme Ideas and Caption Angles | Alici.AI