
Original caption: Comment “AI” & I’ll send a guide👇 In 6 months AI influencers and AI models will be taking over your feed. Learn how to create high-quality AI product photoshoots now. #nanobanana #seedream #googleveo3 #aiphotoshoot

How shedoesai Made This Rhode Lip Balm AI Photoshoot Video

1. Overview

This Instagram Reel is a masterclass in bridging the gap between abstract AI capabilities and tangible, high-end commercial utility. By seamlessly blending a static "AI image" and a "Real product" (Hailey Bieber's highly recognizable Rhode lip balm) into a photorealistic, cinematic editorial portrait video, the creator instantly proves the value of AI for product photography. The visual aesthetic is anchored in warm studio light, featuring a flawless Asian model in a minimalist white ribbed tank top and black trousers, executing high-fashion poses rendered with perfect skin texture and natural hair movement. The split-screen UI, showing the raw ingredients on the left and the stunning final video on the right, acts as a continuous visual hook. The core keyword here is "cinematic editorial portrait," elevated by an "iPhone aesthetic" that feels native to social media yet professional enough for a billboard. The creator leverages the "uncanny valley" as a feature, not a bug, using the opening line "I'm not even real" to challenge viewers to find flaws in the hyper-realistic lighting, reflections, and lip-syncing, ultimately driving massive engagement through a "Comment for guide" CTA.

2. What You’re Seeing

The video employs a persistent split-screen layout. The left third displays static elements: an "AI image" of the model and a "Real product" cutout of a pink Rhode lip balm. The right two-thirds feature a dynamic, hyper-realistic AI-generated video of the model interacting with the product and posing in various high-end studio setups. The subject is a young Asian woman with flawless, glowing skin, dark brown hair (sometimes tied back, sometimes loose), wearing a fitted white sleeveless turtleneck tank top and small silver hoop earrings. The lighting transitions dramatically between shots: from soft, diffused beauty lighting to harsh, directional spotlights creating sharp diagonal shadows across her face. The camera work mimics a professional 85mm portrait lens with a shallow depth of field, focusing intensely on her facial features and the product. Subtitles are bold, yellow, and white, placed centrally over the video, while the background music is a subtle, modern electronic beat that underscores the futuristic yet grounded mood of the piece.

Shot-by-Shot Breakdown

| Time Range | Visual Content | Shot Language | Lighting & Color Tone | Viewer Intent |
| --- | --- | --- | --- | --- |
| 00:00–00:03 | Model looking directly at camera, speaking; left UI shows static image and product. | Extreme Close-Up (ECU), static camera, shallow depth of field. | Soft, warm beauty lighting, dark background, high contrast. | The Hook: establish the hyper-realistic baseline and deliver the "I'm not real" shock. |
| 00:03–00:05 | Model holds the pink Rhode lip balm near her cheek. | Close-Up (CU), slight camera drift. | Soft, diffused studio lighting, neutral background. | Product Integration: prove the AI can handle real-world objects seamlessly. |
| 00:06–00:09 | Model turns her head toward the camera from a profile view. | Medium Shot (MS), slow pan following the head turn. | Dramatic, hard directional light creating a sharp diagonal shadow across the face. | Aesthetic Flex: demonstrate the AI's ability to handle complex, cinematic lighting changes. |
| 00:10–00:12 | Model applies the lip balm to her lips. | Close-Up (CU), static, tight framing on the mouth and product. | Bright, even beauty lighting highlighting skin texture and gloss. | Action/Utility: show the product in use, reinforcing the "photoshoot" concept. |
| 00:13–00:15 | Model sitting on the floor, one leg bent, wearing black trousers. | Wide Shot (WS), static, full-body framing. | Cooler grey studio background, soft overhead lighting. | Scale/Context: prove the AI can generate full-body poses and environments, not just faces. |
| 00:16–00:18 | Model touches her neck and jawline with both hands. | Close-Up (CU), slight push-in. | Warm, glowing rim light, deep shadows on the opposite side. | Sensory/Texture: highlight the realistic rendering of hands and skin interaction. |
| 00:19–00:21 | Model sitting on a metal stool, leaning forward, speaking. | Medium Shot (MS), static, eye-level angle. | Neutral studio lighting, slight vignette. | Re-engagement: bring the focus back to the narrator and the core message. |
| 00:22–00:24 | Model holds her hair up in a high ponytail with both hands. | Medium Shot (MS), slight low angle. | Bright, high-key lighting, white background. | Dynamic Pose: show complex limb positioning and hair rendering. |
| 00:25–00:27 | Model looking intensely at the camera, hair blowing in the wind. | Close-Up (CU), static, tight framing. | Soft, dramatic lighting, dark background. | The Climax/CTA: leave a lasting visual impression while the CTA text appears. |

3. Why It Went Viral (Breakdown of the Viral Mechanism)

The topic selection here is brilliant because it sits at the intersection of three massive trends: AI generation, e-commerce/dropshipping, and celebrity beauty brands. By choosing Hailey Bieber's Rhode lip balm—a product that frequently goes viral on its own for its aesthetic packaging and celebrity association—the creator instantly taps into an existing, highly engaged audience. This isn't just a generic "look what AI can do" video; it's a specific, highly desirable use case. The psychological hook is the "uncanny valley" turned into a magic trick. When the video starts with "What if I told you I'm not even real," it challenges the viewer's biological instinct to recognize human faces. The viewer is forced to scrutinize the video, looking for AI artifacts (extra fingers, weird blinking, unnatural lighting), which naturally increases watch time. The celebrity effect of the Rhode product grounds the abstract AI technology in reality. It answers the question, "How can I actually use this to make money?" by showing a direct application for product photography, making the content highly saveable and shareable for entrepreneurs and marketers.

From a platform perspective, this video is engineered for the algorithm. The 0-3 second hook is visually and audibly arresting. The split-screen UI acts as a constant visual anchor, reducing the cognitive load of understanding the "before and after" concept. The pacing is relentless, with a new, visually distinct shot every 2-3 seconds, preventing any drop-off in attention. The contrast between the static left side and the hyper-realistic right side creates a loop effect, as viewers might re-watch to see how the static image translated into the moving video. Finally, the caption and on-screen text ("Comment guide to learn how") create a frictionless engagement loop, driving the comment count to 867, which signals to the algorithm that the content is highly conversational and valuable.

4. Five Testable Viral Hypotheses

  1. The "Impossible Reality" Hook: Evidence: The opening line "I'm not even real" paired with a hyper-realistic face. Mechanism: Creates cognitive dissonance, forcing the viewer to pause their scroll to resolve the conflict between what they hear and what they see. Replication: Start your video by stating a counter-intuitive fact about the visuals (e.g., "This entire room was generated from a sketch").
  2. The "Ingredient vs. Meal" Split Screen: Evidence: The persistent left-side UI showing the static AI image and the real product. Mechanism: Provides continuous context, making the transformation feel more impressive and reducing the need for lengthy explanations. Replication: Use a picture-in-picture or split-screen layout to show the raw assets alongside the final AI output throughout the entire video.
  3. The Celebrity Product Piggyback: Evidence: Using Hailey Bieber's Rhode lip balm instead of a generic tube. Mechanism: Taps into the search volume and visual recognition of an already trending item, borrowing its cultural cachet. Replication: When demonstrating an AI capability, apply it to a currently trending product, brand, or pop-culture moment rather than a generic example.
  4. The Rapid Aesthetic Flex: Evidence: Changing the lighting setup (soft beauty to hard shadow to high-key) every 3 seconds. Mechanism: Keeps the visual cortex stimulated and proves the versatility of the tool, increasing the perceived value of the tutorial. Replication: Don't just show one output; show 5-6 drastically different stylistic variations of the same prompt/subject in rapid succession.
  5. The Frictionless Value Exchange CTA: Evidence: "Comment guide to learn how" text overlay and caption. Mechanism: Trades a high-value asset (a tutorial) for a low-effort action (a one-word comment), artificially inflating the engagement metrics that platforms prioritize. Replication: Gate your actual tutorial or prompt list behind a ManyChat/auto-DM trigger based on a specific comment keyword.

5. How to Recreate (Replication Tutorial: From 0 to 1)

This video requires a multi-tool workflow, blending image generation, image editing, video generation, and lip-syncing. Here is the step-by-step checklist to recreate this exact style.

  1. Topic Selection & Positioning: Choose a highly recognizable, aesthetically pleasing physical product (e.g., a trending skincare item, a popular sneaker, a high-end beverage). This positions your account as a practical resource for e-commerce and marketing, not just an AI art page.
  2. Character Consistency (The Base Image): Use Midjourney or Stable Diffusion to generate your base model. Prompt example: "Cinematic editorial portrait of a 25-year-old Asian woman, flawless glowing skin, dark brown hair tied back, wearing a fitted white ribbed sleeveless turtleneck tank top, small silver hoop earrings, studio lighting, 85mm lens, photorealistic --ar 4:5". Save this image; it is your anchor.
  3. Product Integration (The Composite): Take your base image into Photoshop or Canva. Find a high-quality, transparent PNG of your chosen real-world product. Composite the product into the model's hand or near her face, paying attention to scale and rough shadow placement. This composite is your starting point for video generation (a scripted alternative is sketched after this list).
  4. Video Generation (The Motion): Use an image-to-video model like Kling AI, Runway Gen-3 Alpha, or Luma Dream Machine. Upload your composite image. Use prompts that describe the camera movement and lighting changes, NOT the subject (since the image locks the subject). Example: "Slow push-in, dramatic hard diagonal shadow sweeps across the face, cinematic lighting." Generate multiple 3-4 second clips with different lighting and poses.
  5. Voiceover Generation: Write a punchy script (like the one in the video). Use ElevenLabs to generate a realistic, high-quality voiceover. Choose a voice that sounds professional, warm, and slightly authoritative (e.g., a commercial beauty VO style). This step can also be automated via the API (see the hedged sketch after this list).
  6. Lip-Syncing (The Magic Trick): Take the specific video clips where you want the model to speak (like the opening and the middle shot) and run them through a lip-sync tool like SyncLabs or Hedra, pairing the video clip with your ElevenLabs audio file.
  7. UI Overlay Setup: In your video editor (CapCut, Premiere Pro), set up the split-screen. Create a black background. Place your static base image and the product PNG on the left side. Add the text labels ("AI image", "Real product"). Place your generated, lip-synced video clips on the right side. (An ffmpeg alternative for the layout is sketched after this list.)
  8. Editing & Publishing: Cut the video to the beat of a modern, subtle electronic track. Add bold, dynamic subtitles in the center of the screen (yellow and white for contrast). Add the final CTA text overlay ("Comment guide to learn how"). When publishing, ensure the cover frame shows the most striking lighting contrast to maximize click-through rate.
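
For step 3, if you would rather script the composite than open Photoshop, a minimal Pillow sketch looks like this. The file names, scale ratio, and paste offset are placeholder assumptions you adjust by eye, exactly as you would when dragging layers manually:

```python
# Minimal compositing sketch using Pillow. "base_model.png" and
# "product.png" are placeholder file names for your exported assets.
from PIL import Image

base = Image.open("base_model.png").convert("RGBA")
product = Image.open("product.png").convert("RGBA")  # transparent PNG cutout

# Scale the product to roughly 15% of the base image width so it reads
# as hand-held; tune this ratio to match your model's pose.
target_w = int(base.width * 0.15)
ratio = target_w / product.width
product = product.resize((target_w, int(product.height * ratio)))

# Paste near the model's face; the (x, y) offset is a guess you refine
# visually, just like nudging a layer in an editor.
offset = (int(base.width * 0.55), int(base.height * 0.35))
base.alpha_composite(product, dest=offset)

base.convert("RGB").save("composite_start_frame.jpg", quality=95)
```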
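
For step 5, the voiceover can be generated programmatically. This is a hedged sketch of the ElevenLabs text-to-speech REST endpoint; verify the URL, model_id, and voice ID against the current ElevenLabs docs before relying on it, since API details change, and treat the placeholder values as assumptions:

```python
# Hedged sketch of an ElevenLabs text-to-speech call via its REST API.
# VOICE_ID and API_KEY are placeholders; check the official docs for
# current endpoint shape and model names.
import requests

VOICE_ID = "your-voice-id"          # pick a warm, commercial-style voice
API_KEY = "your-elevenlabs-api-key"

resp = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
    json={
        "text": "What if I told you I'm not even real?",
        "model_id": "eleven_multilingual_v2",
    },
    timeout=60,
)
resp.raise_for_status()
with open("voiceover.mp3", "wb") as f:
    f.write(resp.content)  # response body is the rendered audio
```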
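
For step 7, the persistent split screen can also be assembled with ffmpeg instead of CapCut or Premiere. This sketch assumes a 1080x1920 reel, a pre-composed static left panel (left_panel.png, authored at a 360x1920-compatible aspect with your base image, product cutout, and labels) and ffmpeg on your PATH; file names are placeholders:

```python
# Rough ffmpeg layout sketch: static left third beside the AI clip.
import subprocess

subprocess.run([
    "ffmpeg", "-y",
    "-i", "ai_clip.mp4",              # generated, lip-synced video
    "-loop", "1", "-i", "left_panel.png",
    "-filter_complex",
    "[1:v]scale=360:1920[left];"      # left third: static panel
    "[0:v]scale=720:1920[right];"     # right two-thirds: AI video
    "[left][right]hstack=inputs=2:shortest=1[v]",  # stop with the clip
    "-map", "[v]", "-map", "0:a?",
    "-pix_fmt", "yuv420p",
    "-shortest", "split_screen.mp4",
], check=True)
```

Subtitles and the CTA overlay are easier to add afterward in your editor; hstack here only handles the side-by-side layout.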

6. Growth Playbook (Distribution & Scaling Strategy)

3 Ready-to-Use Opening Hook Lines

  • "What if I told you this entire photoshoot cost $0 and took 10 minutes?"
  • "Stop paying for product photography until you watch this."
  • "This is why AI models are about to replace traditional studios."

4 Caption Templates

  • The Cost-Saver: "I'm not even real, and neither is this studio setup. 🤯 If you're an e-commerce brand, you're wasting thousands on product shoots. I generated this entire campaign using just one image and a few prompts. Want to see the exact workflow? Comment 'STUDIO' and I'll DM you the step-by-step guide. 👇"
  • The Agency Flex: "The future of commercial photography is here, and it doesn't require a camera. 📸 We took a real product and built an entire high-end campaign around it using AI. The reflections, the lighting, the skin texture—all generated in seconds. Could you tell it was AI? Let me know below! Comment 'AI' for the tool list."
  • The Tutorial Tease: "Real product. Unreal possibilities. ✨ You don't need a massive budget to create cinematic content anymore. Just imagination and the right AI tools. I broke down this exact lighting and motion prompt in my latest guide. Comment 'GUIDE' to get it sent straight to your inbox. 📥"
  • The Trend Jacker: "Testing AI product photography with the viral [Insert Product Name]. 💄 The results are actually terrifyingly good. No studio, no model fees, just pure generation. What product should I do next? Drop a comment! (And comment 'HOW' if you want the tutorial)."

Hashtag Strategy

  • Broad (Reach & Algorithm Categorization): #AIphotography, #ArtificialIntelligence, #ContentCreation, #DigitalMarketing. Why: These tell the algorithm the general bucket your content belongs in, ensuring it gets pushed to users interested in tech and marketing.
  • Mid-Tier (Target Audience & Niche): #ProductPhotography, #EcommerceTips, #AIModels, #CreativeAgency. Why: These target the specific people who need this solution—dropshippers, brand owners, and marketers looking to cut costs.
  • Niche Long-Tail (Search Intent & Specificity): #MidjourneyTutorial, #RunwayGen3, #AIPhotoshoot, #VirtualInfluencer. Why: These capture users actively searching for how-to content or specific tool workflows, driving high-intent saves and shares.

7. FAQ (Covering Long-Tail Search Queries)

What tools make it look the most similar?

To achieve this specific photorealistic, cinematic look, Midjourney v6 is best for the base image, and Kling AI or Runway Gen-3 Alpha are currently top-tier for handling the complex lighting and motion generation.

What are the 3 most important words in the prompt?

For this aesthetic, the crucial prompt keywords are "cinematic editorial portrait," "85mm lens," and "dramatic studio lighting."

Why does the generated face look inconsistent across shots?

Face inconsistency happens when you rely solely on text prompts for video generation. To fix it, use an image-to-video workflow with the exact same base image as the starting frame for every single shot.

How can I avoid making it look like AI?

Avoid the "AI look" by adding film grain in post-production, ensuring the lighting has realistic fall-off (shadows), and keeping camera movements subtle rather than sweeping and unnatural.
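
As a concrete example of the grain step, ffmpeg's noise filter can add subtle temporal grain in a single pass; the strength value and file names here are assumptions to tune by eye:

```python
# Add light temporal film grain with ffmpeg's noise filter
# (alls = strength, allf=t = temporal variation). Tune alls by eye.
import subprocess

subprocess.run([
    "ffmpeg", "-y", "-i", "split_screen.mp4",
    "-vf", "noise=alls=8:allf=t",
    "-c:a", "copy", "final_with_grain.mp4",
], check=True)
```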

Is it easier to go viral on Instagram or TikTok with this type of content?

This specific high-end, aesthetic-driven content tends to perform exceptionally well on Instagram Reels, as the platform's audience heavily indexes on visual quality, fashion, and e-commerce trends.

How should I properly disclose AI use for this type of content?

Always use platform-native AI disclosure labels (like Instagram's "Made with AI" tag) and clearly state it in the first line of your caption or within the video hook, as done here with "I'm not even real."