AI photo generator pages are for creators who care about realism first. They are not looking for painterly effects or stylized illustration. They want portraits, product shots, or lifestyle images that feel camera-made, with believable lighting, skin texture, and backgrounds. This page helps you compare photo directions that feel more lifelike, more usable, and closer to real photography without needing a full shoot setup.

Video
Kallaway
GLOBAL LOCK: The subject is a male in his mid-30s with light skin, wearing a black baseball cap with a subtle logo and a black long-sleeve shirt with a white "KITH" logo on the chest. He has an energetic, expressive face. The environment transitions between various 3D generated worlds and a studio setting. Lighting is cinematic with high contrast. The color grade is warm and saturated. Speech is direct-to-camera with high-energy delivery and crisp articulation.

[00:00–00:02]
A wide, high-angle drone-style shot of a tropical island. White sand beach, turquoise water with gentle waves, and lush green palm trees. A tiny, indistinguishable human figure stands on the sand. Bright, high-noon tropical lighting.

[00:02–00:05]
The subject appears in a circular frame overlaying the beach, then transitions to a full-screen medium close-up. He is speaking enthusiastically, gesturing with his hands. The background is the same tropical beach but slightly blurred (bokeh).

[00:05–00:08]
A medium shot from the side. The subject is walking along a path lined with tropical plants and palm trees. The lighting is dappled sunlight. He is looking off-camera and smiling. Cinematic handheld camera movement.

[00:08–00:11]
Close-up talking head shot. The background is dark and out of focus with a purple and blue rim light on the subject's shoulders. He is speaking directly to the camera, emphasizing the words "world building."

[00:11–00:14]
Medium shot of the subject sitting in a brown wicker chair inside a modern, sunlit living room with white walls and wooden stairs in the background. He gestures broadly with both hands. High-key, airy lighting.

[00:14–00:17]
A close-up of the living room set, focusing on the wicker chair and a patterned pillow. The camera pans slightly. The lighting is warm and domestic.

[00:17–00:24]
A rapid montage of digital environments: a gothic cathedral with lava flowing through the center, a snowy village under the green Aurora Borealis, and a futuristic sci-fi hallway. High-fidelity textures and dramatic lighting.

[00:24–00:30]
A screen recording of a UI. A photo of a tennis court with mountains in the background is uploaded. The UI shows a "Generate" button being clicked, and the photo transforms into a 3D navigable world.

[00:30–00:36]
The subject is back in a medium shot, gesturing toward a floating window that shows the 3D tennis court world. He explains the "digital sets" concept.

[00:36–00:45]
A grid of 8 reference images showing the subject in different poses and environments. The UI demonstrates "splicing" the subject into the living room set. The subject is seen waving in the final spliced image.

[00:45–00:52]
A screen recording of a video generation tool (Google Veo 3). A prompt is typed: "Animate the reference photo. The subject holds a cup..." The tool generates realistic motion of the subject in the digital set.

[00:52–01:05]
Close-up of the subject speaking. He transitions into a medium shot in a simple white-walled room, wearing the same KITH shirt. He uses his hands to emphasize the "sauce layer" of lip-syncing.

[01:05–01:12]
A cinematic shot of a fashion model in a green tank top walking across a city crosswalk, followed by a shot of a model in a red beret sitting in a futuristic subway car. High-end editorial lighting.

[01:12–01:18]
The subject is superimposed at the bottom of the screen, pointing up at an Instagram profile (KITH). He then shows lifestyle photos of models on a tennis court being turned into 3D worlds.

[01:18–01:26]
Final talking head shot. The subject winks and points at the camera. The video ends with quick cuts of a barn interior at sunset and a woman in a futuristic pink dress in a white, crystalline room.

NEGATIVE PROMPT: visual artifacts, distorted face, inconsistent clothing logos, flickering lighting, robotic lip movement, blurry textures, unnatural hand gestures, floating objects, low resolution, watermarks, text jitter.

SPEECH PACK:
[00:00-00:05] "This is absolutely insane. You can now use AI to put yourself in a 3D world."
TAKE_A: (High energy, fast pace) "This is absolutely insane! You can now use AI to put yourself in a 3D world!"
TAKE_B: (Awe-struck, slower pace) "This... is absolutely insane. You can actually use AI to put yourself... in a 3D world."
TAKE_C: (Direct, informative) "This is insane. AI now lets you put yourself directly into any 3D world."

[00:05-00:11] "I'm talking true world building. You can control the scene, the motion, the movement."
TAKE_A: (Emphasizing 'true') "I'm talking TRUE world building. Control the scene, the motion, the movement."
TAKE_B: (Rhythmic) "True world building. You control the scene. The motion. The movement."

[00:52-01:00] "And here is the sauce layer on top. If you want to lip sync so your character talks smoothly..."
TAKE_A: (Secretive/Excited) "And here’s the sauce layer. Want to lip sync so it looks smooth? Watch this."

PROSODY NOTES: Use punchy emphasis on tool names (World Labs, Sora, Veo). Maintain a "tech-guru" persona—warm but authoritative. High lip-sync strictness required for the "sauce layer" segment.
Video
GLOBAL LOCK:
Subject: A Caucasian woman in her late 20s, blonde hair tied in a neat ponytail, wearing a leopard-print (cheetah pattern) blouse.
Environment: A cozy home studio/office background with dark grey walls, wooden bookshelves filled with books, green indoor plants, and soft dual-tone lighting (warm orange light from one side, cool blue light from the other).
Camera: MCU (Medium Close-Up) framing, eye-level, 35mm lens feel with shallow depth of field.
Style: Professional UGC creator aesthetic, high-quality video, crisp audio.
Speech: Direct-to-camera delivery, energetic and authoritative tone.

[00:00–00:05]
Visual: Rapid montage of extreme macro close-ups (ECU). First, a human eye with visible iris patterns and eyelashes. Second, an ear with a gold hoop earring showing skin texture. Third, a wrist with a simple black line tattoo showing skin pores and fine hairs.
Action: Static macro shots.
Lighting: Bright, natural daylight feel for the macros.
Text Overlay: "most AI" -> "look fake" -> "because" -> "is trained".
Speech: "Most AI images look fake for one reason. Because AI is trained to remove flaws."

[00:05–00:11]
Visual: The woman (Subject) in the MCU studio setting, gesturing with her hands. Floating icons of AI tools (ChatGPT, Freepik, Ideogram, Nano Banana) appear around her.
Action: Subject talks directly to the camera, moving hands to emphasize points.
Lighting: Studio setup (Orange/Blue).
Text Overlay: "need" -> "AI tools" -> "to prompt".
Speech: "But we don't need better AI tools. We just need to prompt the model to create images that actually look real."

[00:11–00:21]
Visual: Transition to a black screen with white text titled "Master Prompt". The text scrolls or highlights specific sections. Then, a split screen showing the woman talking in a small window and the prompt text in a larger window.
Action: Subject continues talking while the prompt text is displayed.
Lighting: Studio setup for the talking head.
Text Overlay: "to create" -> "that actually" -> "look real".
Speech: "The key to realistic AI images is using a prompt with a specific structure. This prompt should force skin detail, including visible pores, uneven tone, and natural imperfections."

[00:21–00:30]
Visual: Montage of AI-generated faces with high realism. A man's face with stubble and pores, a woman's face with freckles and slight redness. Then, a screen recording of the Freepik interface showing a gallery of realistic portraits.
Action: Fast cuts between the portraits and the UI.
Lighting: Varied, matching the generated images.
Text Overlay: "most people start" -> "make" -> "image".
Speech: "Most people start their prompt with 'make a realistic image of'. I start by telling the model how the camera behaves."

[00:30–00:42]
Visual: Screen recording of a prompt being typed into a text box. Keywords like "iPhone 14 Pro", "handheld framing", and "imperfect composition" are highlighted in yellow.
Action: Scrolling through the prompt text.
Lighting: Digital UI.
Text Overlay: "model that" -> "camera behaves" -> "casual hand" -> "imperfect composition".
Speech: "Casual handheld framing, slightly imperfect composition, and a smartphone camera perspective. This alone already breaks the AI look."

[00:42–00:52]
Visual: The woman back in the MCU studio setting. She gestures toward floating app icons for "Enhancor" and "Higgsfield". A screen recording shows a "Skin Enhancer" tool being used on a photo of a woman wearing goggles.
Action: Subject explains the final step.
Lighting: Studio setup.
Text Overlay: "But Most People Stop There" -> "Final Step" -> "Most Creators Are Gatekeeping".
Speech: "But most people stop there. I use a final step that most creators are gatekeeping. I run each image through a final skin enhancement step using Enhancor or Higgsfield."

[00:52–01:00]
Visual: The woman in MCU, pointing down toward a text box that says "Comment GUIDE". A final zoom-out effect or a slight blur transition.
Action: Subject smiles and points.
Lighting: Studio setup.
Text Overlay: "Prompt Structure" -> "Workflow" -> "Comment GUIDE".
Speech: "If you want my exact prompt structure and the full workflow, just comment GUIDE and I'll send it over."

NEGATIVE PROMPT:
Smooth skin, plastic texture, perfect symmetry, airbrushed look, 6 fingers, distorted eyes, watermark, logo, blurry background (unless specified), robotic voice, lip-sync lag, harsh sibilance, flickering lights, low resolution.

SPEECH PACK:
[00:00-00:05] "Most AI images look fake for one reason. Because AI is trained to remove flaws."
[00:05-00:11] "But we don't need better AI tools. We just need to prompt the model to create images that actually look real."
[00:11-00:21] "The key to realistic AI images is using a prompt with a specific structure. This prompt should force skin detail, including visible pores, uneven tone, and natural imperfections."
[00:21-00:30] "Most people start their prompt with 'make a realistic image of'. I start by telling the model how the camera behaves."
[00:30-00:42] "Casual handheld framing, slightly imperfect composition, and a smartphone camera perspective. This alone already breaks the AI look."
[00:42-00:52] "But most people stop there. I use a final step that most creators are gatekeeping. I run each image through a final skin enhancement step using Enhancor or Higgsfield."
[00:52-01:00] "If you want my exact prompt structure and the full workflow, just comment GUIDE and I'll send it over."
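
The prompt ordering described in the speech above (camera behavior first, then forced skin detail) can be sketched as a small helper. This is only an illustration: the video does not show the creator's full master prompt, so the function name `build_realism_prompt` and the exact wording of each clause are assumptions that paraphrase the spoken lines.

```python
def build_realism_prompt(subject: str, device: str = "iPhone 14 Pro") -> str:
    """Assemble a prompt that leads with camera behavior, then skin detail.

    Hypothetical helper: the clause wording paraphrases the video's speech
    and is not a confirmed prompt template.
    """
    camera = (
        f"Shot on {device}, casual handheld framing, "
        "slightly imperfect composition, smartphone camera perspective"
    )
    skin = "visible pores, uneven skin tone, natural imperfections"
    # Camera behavior comes first, the subject second, skin detail last.
    return f"{camera}. {subject}. Skin detail: {skin}."


prompt = build_realism_prompt("Portrait of a woman in a home office")
print(prompt)
```

Keeping the camera clause at the front mirrors the advice in the [00:21-00:42] segments; whether ordering actually changes the output depends on the image model being used.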
Video
GLOBAL LOCK: A vertical 9:16 creator-marketing Reel, approximately 33 seconds, built around one recurring host and a dark-mode AI character-generation interface. Keep three visual layers consistent across the whole video: (1) the host, a white male in his late 20s to early 30s with side-parted brown hair, slim build, expressive face, clean-shaven, wearing a fitted off-white knit sweater and speaking into a matte-black desktop microphone, lit by a warm amber key and soft vignetted studio background; (2) stylized portrait outputs of the same handsome male AI character, usually white, early 20s to early 30s, chiseled jaw, thick dark hair, slim-athletic build, shown in different fashion/editorial presets such as city streetwear, convenience-store candid, studio portrait, tank-top fashion, foggy road noir, cowboy desert, and black-and-white urban scenes; (3) Higgsfield.ai interface captures in dark mode featuring the Character section, Higgsfield Soul 2.0 highlighted in the left model list, a grid of example source faces, preset tiles labeled Editorials, Fashion, Street Photography, Double exposure, a bright lime-green Generate button with a coin cost indicator, and an Animate button on selected outputs. The pacing must stay aggressive and social-native with a new visual beat every one to two seconds, strong contrast between warm host footage and colder generated sample cards, crisp UI sharpness, black/charcoal backgrounds, neon-lime accent labels, and one energetic male speaker throughout with close-mic, dry, high-intelligibility audio. Lips are visible during all host sections and sync must feel tight.

[00:00-00:03] Start on a dark background with bold white uppercase text reading STOP DOING THIS, flanked by red X marks. Under the headline, show generic AI male portrait samples: first a black-coat city street shot, then a casual black sweater portrait, then another generic urban fashion image. The host appears in a rounded rectangle at the bottom, urgently raising one hand toward camera as if interrupting the viewer. Audio: same male host delivers a sharp pattern-break hook telling viewers to stop making the same boring AI character photos.

[00:03-00:07] Cut between the host in warm studio close-up and more bland sample outputs: a crouched white-sweater studio pose, a convenience-store fashion portrait with bomber jacket and bow tie, another convenience-store variation. The host points upward with both index fingers while speaking quickly. Camera on the host remains static medium close-up with 35mm to 50mm lens feel, shallow depth, warm amber falloff. Audio: one speaker, emphatic, corrective tone, lips fully visible.

[00:07-00:11] Introduce stronger preset-driven examples. Show a clean editorial portrait card labeled Editorials, then a Fashion preset with a white ribbed tank top, then Street Photography over a bright outdoor male portrait, then Double exposure with a grayscale silhouette overlay. Each sample occupies the upper two-thirds while the host continues in the lower panel. The transition rhythm should feel like flipping through creative options rather than a tutorial menu. Audio: host pivots from criticism to the better alternative.

[00:11-00:14] Briefly isolate the Higgsfield.ai logo on a dark bar, then cut to the platform interface. Show the Character tab area with Soul 2.0 in the model list highlighted and the host below continuing to explain. Use dark graphite UI, lime-green badges, and readable white text. Audio: same speaker names the tool and frames it as an easier route to ultra-realistic character creation.

[00:14-00:18] Show a grid of source reference portraits inside the character workflow: multiple male selfies and studio shots, the cursor hovering over them as if choosing a base identity. Host remains bottom-center, speaking calmly but with momentum. Emphasize that one character identity can be turned into many outputs. Audio: host explains consistency and customization, crisp consonants, no background reverb.

[00:18-00:21] Cut to a full-height preset card of a standing male figure against a white seamless with a lime Presets label, then to the generation composer showing a dark prompt box, a character token or preset mention, and a lime Generate button with a coin cost. Cursor movement should imply that generation is about to happen. Audio: host explains that the system can create polished images in a couple of clicks.

[00:21-00:24] Reveal generated outputs in different environments: a dark cinematic portrait of a bespectacled man, a convenience-store streetwear shot with Presets badge, and an outdoor coastal portrait with Animate highlighted in lime. The host gestures with one hand as if listing options. Color shifts between cool storefront daylight, neutral portrait lighting, and warm natural outdoor scenes while the UI frame stays dark.

[00:24-00:28] Expand the sample range further with a foggy road full-body shot in a long black coat, a desert cowboy standing in front of a stepped stone structure, and a top-down tank-top fashion portrait. These three outputs should feel dramatically different in location and styling while keeping premium realism and the same polished character aesthetic. Audio: same male narrator sells variety, speed, and realism for creators.

[00:28-00:31] Tighten into darker cinematic portraits: a serious close-up male face against a charcoal backdrop, then a black-and-white street portrait with overlaid CTA text Comment "AI", then a fashion portrait with the same CTA treatment. Keep typography large, bold, white, and lime-yellow, centered over the images. The host points upward from the bottom frame to reinforce the CTA timing.

[00:31-00:33] End on another fast CTA repetition using the strongest portrait samples while the host lands the final line. Maintain the warm studio box below, sharp microphone silhouette, and dark premium brand palette. Audio: one male speaker, punchy final comment-gate instruction, no fade, no music swell overpowering the words.

NEGATIVE PROMPT: avoid identity drift between generated male portraits, avoid uncanny skin texture, avoid distorted eyes or asymmetrical jawlines, avoid over-smoothed plastic faces, avoid broken hands in host gestures, avoid unreadable UI labels, avoid cluttered text overlays beyond STOP DOING THIS and Comment "AI", avoid fake logos, avoid low-resolution preset cards, avoid inconsistent sweater color on the host, avoid muddy shadows on the warm studio shot, avoid robotic speech, lip-sync mismatch, clipped peaks, harsh sibilance, or over-compressed voice.
Video
A) MISE EN PLACE
2) Segment the video into scenes/shots:
- [00:00-00:03] Shot 1: ECU face, talking.
- [00:03-00:05] Shot 2: CU face, holding product.
- [00:06-00:09] Shot 3: MS, head turn, dramatic shadow.
- [00:10-00:12] Shot 4: CU, applying product.
- [00:13-00:15] Shot 5: WS, sitting on floor.
- [00:16-00:18] Shot 6: CU, touching neck.
- [00:19-00:21] Shot 7: MS, sitting on stool, talking.
- [00:22-00:24] Shot 8: MS, holding hair up.
- [00:25-00:27] Shot 9: CU, wind in hair.
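
The segment list above leaves one-second gaps between most shots (for example, nothing covers 00:05 to 00:06). A minimal sketch for surfacing such gaps before feeding a shot list to a video model is shown here; the helper names `parse_range` and `find_gaps` are hypothetical, and the gaps may be intentional cut points rather than errors.

```python
import re

# Shot lines copied from the segment list above.
SHOTS = [
    "[00:00-00:03] Shot 1: ECU face, talking.",
    "[00:03-00:05] Shot 2: CU face, holding product.",
    "[00:06-00:09] Shot 3: MS, head turn, dramatic shadow.",
    "[00:10-00:12] Shot 4: CU, applying product.",
    "[00:13-00:15] Shot 5: WS, sitting on floor.",
    "[00:16-00:18] Shot 6: CU, touching neck.",
    "[00:19-00:21] Shot 7: MS, sitting on stool, talking.",
    "[00:22-00:24] Shot 8: MS, holding hair up.",
    "[00:25-00:27] Shot 9: CU, wind in hair.",
]

# Accepts either a hyphen or an en dash between the two timecodes.
TIMECODE = re.compile(r"\[(\d{2}):(\d{2})[-–](\d{2}):(\d{2})\]")


def parse_range(line: str) -> tuple[int, int]:
    """Return (start, end) in seconds for a line starting with [MM:SS-MM:SS]."""
    match = TIMECODE.match(line)
    if match is None:
        raise ValueError(f"no timecode found in: {line!r}")
    m1, s1, m2, s2 = map(int, match.groups())
    return m1 * 60 + s1, m2 * 60 + s2


def find_gaps(lines: list[str]) -> list[tuple[int, int]]:
    """List (previous_end, next_start) pairs where consecutive shots do not touch."""
    ranges = [parse_range(line) for line in lines]
    return [
        (prev_end, start)
        for (_, prev_end), (start, _) in zip(ranges, ranges[1:])
        if start > prev_end
    ]


gaps = find_gaps(SHOTS)
print(gaps)
# [(5, 6), (9, 10), (12, 13), (15, 16), (18, 19), (21, 22), (24, 25)]
```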

3) Extract visual evidence:
- Keyframes: 00:01 (talking face), 00:04 (holding product), 00:07 (shadow face), 00:11 (applying product), 00:14 (full body), 00:17 (touching neck), 00:20 (sitting talking), 00:23 (holding hair), 00:26 (wind in hair).

4) Extract speech evidence:
- Speaker: 1 female voice (Speaker A).
- Transcript:
  [00:00-00:03] "What if I told you I'm not even real."
  [00:03-00:05] "But the product I'm holding is Hailey Bieber's Rhode lip balm."
  [00:06-00:09] "Everything you're seeing was created with AI, no camera, no studio."
  [00:10-00:12] "Just one image and a few prompts."
  [00:13-00:15] "Every reflection, every highlight, every detail was generated in seconds."
  [00:16-00:18] "Real product, unreal possibilities."
  [00:19-00:21] "You don't need a full setup anymore."
  [00:22-00:24] "Just imagination."
  [00:25-00:27] "Comment guide to learn how."
- Lip visibility: Full visibility in shots 1 and 7. Partial/implied in others.
- Sync strictness: High for shots 1 and 7.

5) Invariants list (LOCK THESE):
- Visuals: Asian woman, mid-20s, flawless glowing skin, dark brown hair, fitted white ribbed sleeveless turtleneck tank top, small silver hoop earrings. Cinematic studio lighting, 85mm lens feel, photorealistic texture.
- Speech: Female voice, warm, confident, commercial beauty tone, close-mic studio sound, dry room.

6) Variables list (TWEAK THESE):
- Visuals: Lighting direction (soft beauty vs. hard directional), hair state (tied back vs. loose), background color (black, grey, white), pose, camera framing (ECU to WS).
- Speech: Pacing, emphasis on key words ("real", "AI", "seconds").

B) SHOTLIST
[00:00-00:03]
- framing: ECU, eye level.
- lens: 85mm, shallow DoF.
- camera movement: Static.
- subject: Looking directly at lens, speaking.
- environment: Dark studio background.
- lighting: Soft beauty lighting, high contrast.
- speech: Speaker A, on-camera. "What if I told you I'm not even real." High lip-sync strictness.

[00:03-00:05]
- framing: CU, eye level.
- lens: 85mm, shallow DoF.
- camera movement: Slight drift.
- subject: Holding a pink lip balm tube near her cheek, looking at camera.
- environment: Neutral studio background.
- lighting: Soft diffused lighting.
- speech: Speaker A, VO. "But the product I'm holding is Hailey Bieber's Rhode lip balm."

[00:06-00:09]
- framing: MS, eye level.
- lens: 50mm.
- camera movement: Slow pan following head turn.
- subject: Turns head from profile to face camera.
- environment: Dark studio background.
- lighting: Dramatic hard directional light, sharp diagonal shadow across face.
- speech: Speaker A, VO. "Everything you're seeing was created with AI, no camera, no studio."

[00:10-00:12]
- framing: CU, tight on mouth.
- lens: 100mm macro feel.
- camera movement: Static.
- subject: Applying pink lip balm to lips, eyes looking slightly down.
- environment: Neutral background.
- lighting: Bright, even beauty lighting.
- speech: Speaker A, VO. "Just one image and a few prompts."

[00:13-00:15]
- framing: WS, full body.
- lens: 35mm.
- camera movement: Static.
- subject: Sitting on floor, one leg bent, wearing black trousers with the white tank top.
- environment: Grey studio floor and wall.
- lighting: Soft overhead lighting.
- speech: Speaker A, VO. "Every reflection, every highlight, every detail was generated in seconds."

[00:16-00:18]
- framing: CU.
- lens: 85mm.
- camera movement: Slight push-in.
- subject: Touching neck and jawline with both hands.
- environment: Dark background.
- lighting: Warm rim light, deep shadows.
- speech: Speaker A, VO. "Real product, unreal possibilities."

[00:19-00:21]
- framing: MS.
- lens: 50mm.
- camera movement: Static.
- subject: Sitting on a metal stool, leaning forward, speaking to camera.
- environment: Neutral studio background.
- lighting: Neutral studio lighting, slight vignette.
- speech: Speaker A, on-camera. "You don't need a full setup anymore." High lip-sync strictness.

[00:22-00:24]
- framing: MS, slight low angle.
- lens: 50mm.
- camera movement: Static.
- subject: Arms raised, holding hair up in a high ponytail.
- environment: White studio background.
- lighting: Bright, high-key lighting.
- speech: Speaker A, VO. "Just imagination."

[00:25-00:27]
- framing: CU.
- lens: 85mm.
- camera movement: Static.
- subject: Looking intensely at camera, hair blowing.
- environment: Dark background.
- lighting: Soft dramatic lighting.
- motion cues: Wind blowing hair.
- speech: Speaker A, VO. "Comment guide to learn how."

C) STYLE BIBLE
- visual_style: Photorealistic cinematic commercial beauty portrait.
- camera_signature: 85mm portrait lens dominance, shallow depth of field, mostly static or slow, deliberate movements.
- lighting_signature: Highly variable but always professional studio quality, ranging from soft high-key beauty to dramatic low-key hard shadows.
- grade_signature: High contrast, natural skin tones, deep blacks, clean whites.
- texture_signature: Flawless skin detail, sharp focus on eyes and product.
- pacing_signature: Fast-paced cuts every 2-3 seconds.
- speech_style: Commercial beauty VO, confident, direct-to-camera hybrid.
- speaker_profile: Female, warm, articulate, modern vocal fry.
- mic_mix_profile: Close-mic, dry studio, high clarity, compressed for social media.

D) PROMPT SYNTHESIS

1. MASTER PROMPT
GLOBAL LOCK: Photorealistic cinematic commercial style. Subject: Asian woman, mid-20s, flawless glowing skin, dark brown hair, wearing a fitted white ribbed sleeveless turtleneck tank top, small silver hoop earrings. Environment: Minimalist studio setting with solid neutral backgrounds (white/grey/black). Lighting: High-end beauty lighting, varying from soft diffused to dramatic hard shadows. Camera: 85mm lens, shallow depth of field. Speech: Single female speaker, warm commercial tone, close-mic studio sound.

[00:00-00:03] ECU of the woman's face against a dark background. Soft beauty lighting. She is looking directly at the lens, speaking. Lips are moving in sync with speech.
[00:03-00:05] CU. The woman holds a pink lip balm tube next to her cheek. Soft diffused lighting. She looks at the camera. Slight camera drift.
[00:06-00:09] MS. The woman is turned slightly away in profile, then turns her head towards the camera. Dramatic lighting with a harsh diagonal shadow cutting across her face. Slow pan following the head turn.
[00:10-00:12] CU tight on the mouth. The woman is applying the pink lip balm to her lips. Eyes looking slightly down. Bright, even beauty lighting highlighting skin texture.
[00:13-00:15] WS. The woman is sitting on the floor, wearing black trousers with the white tank top. One leg bent. Grey studio background. Soft overhead lighting. Static camera.
[00:16-00:18] CU. The woman touches her neck and jawline with both hands. Warm, glowing rim light, deep shadows on the opposite side. Slight camera push-in.
[00:19-00:21] MS. The woman is sitting on a metal stool, leaning forward slightly, speaking directly to the camera. Lips moving in sync. Neutral studio lighting, slight vignette. Static camera.
[00:22-00:24] MS, slight low angle. The woman has her arms raised, holding her hair up in a high ponytail. Bright, high-key lighting, white background. Static camera.
[00:25-00:27] CU. The woman's hair is blowing in the wind. She looks intensely at the camera. Soft dramatic lighting, dark background. Static camera.
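
The master prompt above follows a fixed pattern: one GLOBAL LOCK paragraph for the invariants, then one timecoded line per shot. A rough sketch of generating that layout from structured shot data follows; the `Shot` dataclass, its field names, and `render_master_prompt` are illustrative assumptions, not part of any tool named in this prompt.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    """One timecoded shot; field names are hypothetical."""
    start: str
    end: str
    framing: str
    action: str
    lighting: str
    camera: str = "Static camera"


def render_master_prompt(global_lock: str, shots: list[Shot]) -> str:
    """Render the GLOBAL LOCK paragraph followed by one line per shot."""
    lines = [f"GLOBAL LOCK: {global_lock}", ""]
    for shot in shots:
        lines.append(
            f"[{shot.start}-{shot.end}] {shot.framing}. "
            f"{shot.action} {shot.lighting} {shot.camera}."
        )
    return "\n".join(lines)


shots = [
    Shot("00:13", "00:15", "WS",
         "The woman is sitting on the floor, one leg bent.",
         "Soft overhead lighting."),
    Shot("00:16", "00:18", "CU",
         "The woman touches her neck and jawline with both hands.",
         "Warm rim light, deep shadows.",
         camera="Slight camera push-in"),
]

text = render_master_prompt("Photorealistic cinematic commercial style.", shots)
print(text)
```

Keeping the invariants in a single string and the variables in per-shot fields matches the LOCK / TWEAK split in section A, so a lighting or framing change touches one field rather than the whole prompt.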

2. NEGATIVE PROMPT
Visuals: cartoon, illustration, anime, 3d render, deformed anatomy, extra fingers, mutated hands, unnatural skin texture, plastic skin, temporal jitter, flickering lighting, morphing objects, text, watermarks, logos, low resolution, blurry, out of focus.
Audio: robotic voice, unnatural cadence, harsh sibilance, plosives, clipping, background noise, room echo, lip-sync mismatch, slurred words.

4. SPEECH PACK
Speaker: Female, 20s, warm, confident, commercial beauty tone.
[00:00-00:03] "What if I told you... I'm not even real." (Pause for dramatic effect, direct eye contact).
[00:03-00:05] "But the product I'm holding... is Hailey Bieber's Rhode lip balm." (Slight emphasis on 'Rhode').
[00:06-00:09] "Everything you're seeing was created with AI... no camera... no studio." (Paced, emphasizing the negatives).
[00:10-00:12] "Just one image... and a few prompts." (Smooth, instructional tone).
[00:13-00:15] "Every reflection... every highlight... every detail... was generated in seconds." (Staccato emphasis on 'every').
[00:16-00:18] "Real product... unreal possibilities." (Contrast emphasis).
[00:19-00:21] "You don't need a full setup anymore." (Direct, conversational).
[00:22-00:24] "Just imagination." (Soft, aspirational).
[00:25-00:27] "Comment guide... to learn how." (Clear CTA, energetic).
Video
GLOBAL LOCK:
Subject: A consistent East Asian female model, mid-20s, athletic/slender build, sleek black hair tied in a long ponytail.
Wardrobe: White ribbed cotton tank top, black high-waisted trousers, black pointed-toe heels.
Environment: Minimalist professional photography studio, neutral grey/white background, clean floor with subtle reflections.
Lighting: High-contrast chiaroscuro lighting, sharp motivated light source creating deep shadows across the face and body, editorial fashion mood.
Color Grade: Neutral palette, high contrast, warm skin tones, sharp details, 8k resolution, cinematic film grain.
Camera: 35mm and 50mm prime lenses, shallow depth of field, professional stabilization.
Speech: Female voice, calm, sophisticated, medium pace, crisp articulation, studio-dry microphone signature.

[00:00–00:02]
Subject: MCU, over-the-shoulder view. The model turns her head slowly to look directly into the camera lens.
Action: Subtle, neutral expression, slight parting of lips.
Camera: Static MCU, 50mm lens.
Lighting: Rim light on the ponytail, shadow covering the front of the shoulder.
Speech: "I told you..." (Off-camera feel, transitioning to on-camera).

[00:02–00:05]
Subject: ECU of the model's face. She is holding a pink "rhode" lip balm tube horizontally just below her nose.
Action: She speaks directly to the camera. High skin detail, visible pores, and realistic lip texture.
Camera: ECU, macro feel.
Lighting: A sharp shadow bisects her face vertically, leaving one eye in darkness.
Speech: "...not even real. What I'm holding is..." (Strict lip-sync required).

[00:05–00:08]
Subject: CU of the model holding the pink "rhode" tube.
Action: She brings the tube to her lips. The tube has clear "rhode" branding.
Camera: CU, slight handheld shake for realism.
Lighting: Glossy highlights on the product packaging and her lips.
Speech: "...Hailey Bieber's Rhode lip balm."

[00:08–00:11]
Subject: MCU of the model's profile.
Action: She applies the lip balm to her bottom lip. Her eyes are slightly closed, as if posing.
Camera: Profile MCU.
Lighting: High-key lighting on the face, dark background.
Speech: "Everything you're seeing... AI."

[00:11–00:15]
Subject: WS of the model sitting on the studio floor.
Action: She is posed with one leg bent, arm resting on her knee, looking at the camera.
Camera: WS, low angle.
Lighting: Hard shadow cast on the floor to the right.
Speech: "No camera, no... just one image and a few..."

[00:15–00:18]
Subject: MCU of the model sitting on a chrome and black leather studio stool.
Action: She rests her chin on her hand, then moves her hands to her neck.
Camera: MCU, 35mm lens.
Lighting: Soft fill light from the front, deep shadows in the background.
Speech: "...every reflection, every highlight, every detail was..."

[00:18–00:22]
Subject: MS of the model sitting on the stool, leaning forward.
Action: She speaks with expressive hand gestures, looking confident.
Camera: MS, eye-level.
Lighting: Dramatic side lighting.
Speech: "...generated in seconds. Real product, unreal possibilities."

[00:22–00:27]
Subject: MCU of the model.
Action: She reaches up with one hand to grab her ponytail and pulls it upward, letting the hair fan out. Wind blows through the loose strands of hair.
Camera: MCU, slight zoom in.
Lighting: Dynamic lighting shifting as she moves.
Speech: "You don't need... anymore. Just imagination. Learn how." (Cut lands on "Learn how").

NEGATIVE PROMPT:
Visual: Cartoonish features, distorted fingers, melting textures, flickering clothes, floating hair, blurry product labels, double limbs, unnatural eye movement, low resolution, watermark, text artifacts.
Speech: Robotic tone, monotone delivery, muffled audio, background hiss, lip-sync mismatch, popping 'p' sounds, unnatural pauses, synthesized artifacts.

SPEECH PACK:
[00:00–00:05] "I told you... not even real. What I'm holding is..."
TAKE_A: (Whispered, mysterious)
TAKE_B: (Confident, direct)
TAKE_C: (Casual, conversational)

[00:05–00:15] "Hailey Bieber's Rhode lip balm. Everything you're seeing... AI. No camera, no..."
TAKE_A: (Emphasis on "AI" and "Rhode")
TAKE_B: (Fast-paced, energetic)

[00:15–00:27] "...every reflection, every highlight, every detail was generated in seconds. Real product, unreal possibilities. You don't need... anymore. Just imagination. Learn how."
TAKE_A: (Inspiring, visionary tone)
TAKE_B: (Professional, matter-of-fact)
TAKE_C: (Slow, emphasizing "unreal possibilities")
Video
GLOBAL LOCK: A vertical AI tutorial video combining a talking-head presenter and step-by-step static visual slides. The presenter is a young woman with long dark brown hair, fair skin, and a fitted white sweater, seated in front of a soft pink-lilac studio background. The tutorial is built around Google Gemini and shows how to use prompt packs for different photo-enhancement tasks: restoring and colorizing old family photos, turning a casual portrait into a passport-style headshot, improving male portrait accuracy using face-shape and hairstyle references, and combining multiple prompt blocks into one reusable master prompt. The overall design uses a teal-green slide background, floating image cards, arrows, and large numbered sections like #3, #4, and #5. Keep the educational tone, slide-driven pacing, and Gemini branding consistent throughout. Speech should be clear, direct, and creator-oriented, with close dry mic sound and paced social-video caption timing.

[00:00–00:04] Open with the presenter promising to show prompt sets for Google Gemini. She appears in a small talking-head frame over a teal instructional background while stacked text blocks and the Gemini logo appear beside her. The tone is straightforward and valuable, like a creator giving away useful workflow templates.

[00:00–00:04] The opening line should sound like a practical tutorial intro, emphasizing that the viewer will get prompts they can reuse. Sync should align with words such as “show you,” “prompts,” and “Google Gemini.”

[00:04–00:10] Transition into a slide showing old family photographs transforming into restored or colorized versions. Use card-like images of black-and-white family portraits rotating or swapping into cleaner, modernized images. The presenter explains that Gemini can help enhance old photos and restore image quality. Keep visual arrows and before/after relationships obvious.

[00:10–00:15] Move to a passport-photo conversion section. Show a casual female portrait as input and a clean, centered passport-style headshot as the result. The presenter explains how one of the prompts can convert an ordinary image into a more formal ID / passport-ready format. Use neutral backgrounds and clear face centering to emphasize the transformation.

[00:15–00:21] Introduce a face-structure and hairstyle guidance section for male portraits. Show diagrams of head shapes, hair reference charts, a celebrity-like sports portrait, and improved portrait outputs of the same male subject in different styles. The presenter explains that adding face shape and hair references improves likeness and overall accuracy. The comparison should feel systematic and instructional rather than purely aesthetic.

[00:21–00:27] Shift to another numbered section focused on prompt construction. Show a stylish woman’s portrait, a separate prompt block, and then a refined final output. The presenter explains how to combine image references and descriptive instructions to sharpen the final look. Text overlays and slide panels should imply that several separate prompt fragments are being organized into one effective workflow.

[00:27–00:35] End with full text-slide examples showing long prompt paragraphs and a final note that the creator has combined all prompts into one. Large text urges viewers to comment “Gemini” to receive the full set. The presenter may no longer be visible in these last frames; instead, the tutorial closes with readable document-like slides and a strong CTA focused on reuse and download.
Video

MASTER PROMPT
GLOBAL LOCK: Vertical 9:16 creator-style AI image generation tutorial reel. Keep the visual structure consistent: dark background, stacked demo windows, rounded-corner presenter overlay near the lower half, and product screenshots or generated outputs occupying the upper area. The presenter is a bearded man in a beige baseball cap and brown hoodie speaking directly to camera with expressive hand gestures. The tutorial should open with a polished luxury ad-style image, then transition into a dark Generate Image interface with prompt and reference controls, and finish with generated lifestyle portraits and result examples. Preserve fast creator-educator pacing, practical workflow clarity, and social-media-friendly text hierarchy.

[00:00-00:10.00] Open with a strong proof-first visual: a luxury perfume bottle ad image against a rich purple satin-like backdrop. Place the presenter in a rounded picture-in-picture window at the bottom, speaking energetically to camera. The hook should feel like, "here is the kind of polished ad-style result you can create," with the upper image doing most of the persuasive work.

[00:10.00-00:28.00] Shift into the process section. Show a dark image-generation interface labeled around concepts like Generate Image, prompt box, reference styles, remix, auto prompt, or similar controls. Keep the presenter visible in the lower area while he explains how the workflow works. Include reference image boards, prompt panels, or app modules that make the system feel practical and reproducible.

[00:28.00-00:48.92] Move into the results and proof section. Show polished generated portraits or fashion-style outputs, app previews, and example result screens, including a casually dressed bearded man in a city street portrait. The presenter continues narrating while the upper content cycles through outputs, reinforcing that the workflow produces believable, commercially useful visuals. End on the strongest lifestyle result.

NEGATIVE PROMPT
Avoid cluttered multi-window chaos, unreadable UI, generic office stock footage, weak hook visuals, random unrelated outputs, corporate webinar styling, tiny text, dark muddy colors, or a tutorial sequence that explains too much before showing a compelling result.

SHOT PROMPTS
[00:00-00:10.00] Luxury perfume ad visual with presenter overlay.
[00:10.00-00:28.00] Dark Generate Image UI, prompt controls, reference boards, presenter explanation.
[00:28.00-00:48.92] Generated lifestyle portraits and result previews with presenter continuing narration.

SPEECH PACK
Timecoded transcript:
[00:00-00:48.92] Single-speaker tutorial explaining an AI image-generation workflow from polished ad example to interface steps to final outputs. Exact wording unclear; preserve concise creator-teacher delivery.

TAKE_A
[00:00-00:48.92] Fast creator-demo explanation with proof-first opening and simple step-by-step UI walkthrough.

TAKE_B
[00:00-00:48.92] Calm but confident tutorial tone emphasizing how to get polished commercial-looking results.

TAKE_C
[00:00-00:48.92] Slightly more enthusiastic creator cadence focused on workflow usefulness and output quality.
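
A timecoded shot list like the SHOT PROMPTS above is easy to sanity-check before it goes to a video model. The following is a minimal sketch, not part of the original pack: the helper names `parse_range` and `find_gaps` are hypothetical, and it assumes timecodes always use the bracketed `[MM:SS-MM:SS]` form seen here, with optional decimals and either a hyphen or an en dash.

```python
import re

# Matches bracketed ranges such as "[00:10.00-00:28.00]" or "[00:05–00:19]".
# Assumption: minutes and seconds only, as in the packs on this page.
TIMECODE = re.compile(
    r"\[(\d+):(\d+(?:\.\d+)?)\s*[-–]\s*(\d+):(\d+(?:\.\d+)?)\]"
)


def parse_range(line):
    """Return (start_seconds, end_seconds) for the first timecode range in line."""
    match = TIMECODE.search(line)
    if match is None:
        raise ValueError(f"no timecode range in: {line!r}")
    m1, s1, m2, s2 = match.groups()
    return int(m1) * 60 + float(s1), int(m2) * 60 + float(s2)


def find_gaps(lines, tolerance=0.01):
    """List (previous_end, next_start) pairs where consecutive shots do not meet."""
    ranges = [parse_range(line) for line in lines]
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:]):
        if abs(next_start - prev_end) > tolerance:
            gaps.append((prev_end, next_start))
    return gaps


shots = [
    "[00:00-00:10.00] Luxury perfume ad visual with presenter overlay.",
    "[00:10.00-00:28.00] Dark Generate Image UI, prompt controls, reference boards, presenter explanation.",
    "[00:28.00-00:48.92] Generated lifestyle portraits and result previews with presenter continuing narration.",
]

# An empty list means every shot starts where the previous one ended.
print(find_gaps(shots))
```

A pack whose next shot starts a second after the previous one ends would be reported as a gap, which may or may not matter depending on how the target tool treats untimed frames.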
Video
GLOBAL LOCK: A vertical cinematic-teaching reel, approximately 47 seconds, designed as a visually rich prompt-and-framing tutorial for better AI-generated film stills. The video alternates between sample portrait or scene imagery and bold centered on-screen text that critiques low-quality AI aesthetics and then replaces them with concrete visual principles. The piece opens with a polished but generic blonde beauty portrait on a black background labeled as “low quality AI,” then pivots into stronger cinematic examples: moody urban night scenes under arches, distant silhouettes in fog, soft practical lighting, handheld-style portraits, and warm sunset close-ups of a short-haired woman. The overall color world leans teal-green shadows, warm amber highlights, subtle grain, and low-key cinematic contrast.

The structure is educational, not narrative. Text captions carry the teaching flow: first rejecting weak AI image habits, then introducing simple filmmaking rules such as better frames, one dominant camera perspective, warm sunset key light from one side, natural texture, contrast, and the idea that the work should visually prove itself. The imagery should feel like proof-of-concept boards or moving mood references rather than continuous story scenes. Most shots are carefully composed single moments: a woman framed in shallow light, two people under an urban arch, a hand-held close-up with soft night lighting, and other filmic fragments that demonstrate intentional cinematography.

The tone should feel confident, minimalist, and opinionated, like a creator explaining how to stop making generic AI portraits and start making cinematic images with stronger visual grammar. Visual priorities: centered all-caps instructional text, black separators or negative space, elegant comparison between generic beauty render and moodier cinematic frames, teal-and-amber grading, shallow depth of field, strong directional light, tasteful grain, and compact tutorial pacing. Avoid busy graphics, loud meme styling, or heavy voice-dependent explanation. The point is that the lesson is readable through image-plus-caption alone.
Video
GLOBAL LOCK: 
Subject is a Caucasian male in his mid-30s with a well-groomed brown beard and medium-length wavy brown hair. He consistently wears a white and olive-green "VANS" trucker hat and a plain, high-quality white crew-neck t-shirt. The environment for the creator's shots is a warm, indoor setting with soft ambient lighting and a neutral, slightly out-of-focus background. The AI-generated content features a cinematic, high-contrast aesthetic with vibrant colors (primarily deep reds and blacks). The speech is energetic, clear, and direct-to-camera, delivered with a "tech-enthusiast" persona.

[00:00–00:05]
Visual: A cinematic, deep red Porsche 911 is shown from multiple angles: top-down, rear view, and 3/4 side profile. The car has a metallic finish and is set against a dark, moody red background with dramatic studio lighting. Text overlay reads "Multiview Perspective Change."
Subject: The creator appears in a small, rounded-square overlay at the bottom center, pointing upwards with both index fingers.
Camera: Smooth transitions between static product shots.
Speech: "This genuinely feels like a cheat code to create high-quality AI visuals for your brand or business."
Sync: Cut to the next shot on the word "business."

[00:05–00:19]
Visual: A rapid-fire montage of the creator's face swapped into various AI-generated scenes: 
1. A close-up of the VANS hat.
2. A model holding a smartphone.
3. A bold fisheye portrait of the creator wearing colorful puffer jackets and sunglasses.
4. An "Indie Garden Polaroid" shot with sunflowers and a guitar.
5. A "Halloween Party" shot of the creator in a yellow duck costume holding a red cup.
6. An "Urban Glare Portrait" in a city street.
Subject: Creator remains in the bottom overlay, gesturing with his hands as if explaining the variety.
Motion: Fast cuts (approx. 1-2 seconds each) with slight zoom-ins.
Speech: "This is called Blueprints, and it allows you to create multiple angled shots of any scene. You can upload product reference images and you can even replicate certain styles of images with a simple VFX template they've created for you."

[00:20–00:35]
Visual: Screen recording of the Leonardo.ai interface. The cursor moves to the left sidebar, hovering over and clicking the "Blueprints (Beta)" button highlighted with a red box. It then scrolls through a gallery of templates, selecting "Product Studio Photoshoot."
Subject: Creator in the overlay, looking slightly off-camera as if watching the screen, pointing to the UI elements.
Speech: "All you have to do is upload an image of yourself, and here's how to do it. To get started on Leonardo, you can go to the Blueprints section, and they have all of these different templates."

[00:36–00:45]
Visual: The UI shows the "Upload Person Photo" step. A photo of the creator in his white t-shirt and VANS hat is uploaded. Then, a "Product Photo" of a black smartphone is uploaded. The "Generate" button is clicked. The result shows the creator holding the phone in a professional studio setting.
Subject: Creator in the overlay, nodding and smiling as the result is revealed.
Speech: "You can then select one you want and upload a reference image of your face, for example, and then hit next. Now you can upload a reference image of a product, and then boom! You can actually create images of you holding the product in that environment."

[00:46–00:51]
Visual: The UI shows a "Multiview Perspective Change" generation of the creator sitting on a park bench from different angles (back view, side view, top-down). The video ends with the creator full-screen (or large overlay) against a dark background with the text "TYPE AI COMMENTS."
Subject: The creator winks at the camera and points forward.
Speech: "But it gets crazier because you can use different templates like multiview perspective... if you want to try it out for yourself, type AI in the comments and I'll send you the link."
Sync: Final wink lands exactly on the last word.

NEGATIVE PROMPT:
Visual: blurry face, inconsistent beard length, distorted VANS logo, extra fingers, flickering background, low-resolution UI, robotic body movements, unnatural skin texture, messy hair transitions.
Speech: monotone delivery, background noise, muffled audio, robotic cadence, misaligned lip-sync, harsh "S" sounds, long pauses between sentences.

SPEECH PACK:
[00:00-00:05]
Transcript: "This genuinely feels like a cheat code to create high-quality AI visuals for your brand or business."
TAKE_A: (Energetic, emphasizing "cheat code" and "business")
TAKE_B: (Fast-paced, breathless excitement)
TAKE_C: (Confident, authoritative tone)

[00:46-00:51]
Transcript: "If you want to try it out for yourself, type AI in the comments and I'll send you the link."
TAKE_A: (Friendly, inviting, with a wink at the end)
TAKE_B: (Direct, urgent, pointing at the camera)
TAKE_C: (Casual, "by the way" style delivery)
Video
A creator-style educational video in vertical format featuring a woman speaking directly to the camera outdoors while holding a small handheld microphone. She stands in front of an industrial blue-gray wall and explains how to improve AI image or video generation by writing better prompts for character identity and consistency. As she talks, visual overlays appear around her, including example faces, UI screenshots, prompt text blocks, icon graphics, and sample outputs that illustrate her points. The camera remains steady in a medium shot while she gestures with one hand, points upward for emphasis, and delivers concise teaching segments with captioned key phrases. The mood is instructional, creator-native, confident, and optimized for social learning content.
Video
GLOBAL LOCK:
The video features a split-screen layout. The bottom 30% contains a consistent male creator: Caucasian, mid-30s, brown beard, wearing a tan "Vans" trucker hat and a black quilted vest over a white t-shirt. He is in a home office/studio setting with soft indoor lighting. The top 70% features AI-generated cinematic footage. The AI footage must maintain high subject consistency, specifically a character resembling Leonardo DiCaprio in "The Wolf of Wall Street" (short brown hair, blue pinstripe suit, red polka dot tie). The environment is a luxury office with wood paneling. Lighting is cinematic, warm, and professional.

[00:00–00:03]
Subject: A man resembling Leonardo DiCaprio in a blue pinstripe suit and red polka dot tie.
Action: He holds a crisp one-dollar bill horizontally with both hands, looking directly into the camera with a slight, confident smile.
Camera: Medium close-up, static.
Lighting: Warm, high-key office lighting, soft shadows.
Speech: Creator says "It has never been easier to create multiple camera angles..."
Sync: Creator's lips visible in the bottom frame, high sync.

[00:03–00:07]
Visual: A 3x3 grid appears showing the same man from 9 different angles (overhead, profile, low angle, etc.). The shot then transitions to a Nike windbreaker jacket (black, red, white) floating in a surreal dark environment filled with glowing blue and purple crystals.
Action: The jacket rotates slowly.
Camera: Close-up on the jacket texture and Nike logo.
Lighting: Dramatic, neon-blue and purple rim lighting.
Speech: "...with consistency from a single reference image."

[00:08–00:13]
Subject: Three characters: a man (DiCaprio-lookalike), a blonde woman (Margot Robbie-lookalike in a black dress), and a muscular man with a goatee (Jon Bernthal-lookalike, shirtless with a gold chain).
Action: They stand together in a modern room with wooden doors and bookshelves. They look toward the camera.
Camera: Medium wide shot, slight handheld jitter for realism.
Lighting: Naturalistic indoor light from the side.
Speech: "So in today's video, I'm going to show you the best method..."

[00:14–00:20]
Visual: Screen recording of the Higgsfield "Shots" app interface. A cursor selects an image of a woman in a black dress and clicks a yellow "Generate" button.
Action: The UI transitions to show a grid of 9 generated black-and-white images of the woman.
Camera: Screen capture.
Speech: "Let's dive in. To get started, you can upload your image into Shots..."

[00:21–00:28]
Subject: A beautiful woman with dark hair in a flowing black dress.
Action: A montage of artistic shots: her looking at the camera, her back to the camera with hair blowing, her dancing with fabric flowing around her.
Camera: Various angles (CU, MCU, Profile), slow motion.
Lighting: High-contrast black and white, dramatic shadows, bright white background.
Text Overlay: "Comment AI" in bold white letters.
Speech: "So if you want to try this out for yourself, type AI in the comments and I'll send you a link."

NEGATIVE PROMPT:
Visual: Distorted faces, extra fingers, flickering background, blurry textures, inconsistent clothing colors, morphing objects, robotic movement, low resolution, watermark.
Speech: Robotic tone, muffled audio, background noise, lip-sync delay, stuttering, unnatural pauses.

SPEECH PACK:
[00:00-00:07]
Transcript: "It has never been easier to create multiple camera angles with consistency from a single reference image."
TAKE_A: (Enthusiastic, fast-paced) "It's NEVER been easier to create multiple camera angles... with total consistency... from just ONE image."
TAKE_B: (Educational, steady) "It has never been easier to create multiple camera angles with consistency... starting from a single reference image."

[00:21-00:28]
Transcript: "So if you want to try this out for yourself, type AI in the comments and I'll send you a link."
TAKE_A: (Direct, CTA-focused) "Want to try this? Type AI in the comments and I'll DM you the link right now."
TAKE_B: (Friendly, helpful) "If you want to try this out for yourself, just comment AI below and I'll send that link over."
Video
GLOBAL LOCK:
Subject is a Caucasian male in his mid-30s with a short, well-groomed dark beard and mustache, brown eyes, and dark wavy hair. He consistently wears a black baseball cap. The environment is a sunny, sandy beach with fine-grained sand. The lighting is high-contrast cinematic sunlight. The color grade is warm with deep shadows and saturated skin tones. The camera uses a high-end cinematic lens with shallow depth of field and visible skin texture. Speech is clear, direct-to-camera, with a warm and enthusiastic tone.

[00:00–00:05]
Macro extreme close-up of the subject's face lying horizontally on the sand. A sharp, narrow "slither" of bright sunlight cuts across his eyes, while the rest of the face is in shadow. One eye is squinting slightly, the other is closed. High detail on skin pores, eyelashes, and beard hair. The camera is static. Text "wtf." appears near the eye. Subject is silent but smiling slightly.

[00:05–00:10]
Screen recording of the Freepik AI interface. A cursor types the prompt: "The man in @img1 is laying on his back in the sand...". Keywords "Slither of light", "Cinematic Realism", and "Macro" appear as text overlays. The subject appears in a small circular inset at the bottom, speaking enthusiastically: "This is all available on Freepik using Seedream 4k as your image model."

[00:10–00:17]
A sequence of video clips. First, a medium shot of the subject in a white t-shirt and beige vest holding a box of Kellogg's Corn Flakes in a bright studio. Then, a wide shot underwater in clear blue water; the subject is swimming while holding the same cereal box. Bubbles and light rays are visible. Subject's voiceover: "And you can take that image and bring it to WAN 2.5 as your video model, which generates 1080p outputs."

[00:17–00:22]
Close-up of the Kellogg's Corn Flakes box being held. The camera has a shallow depth of field, blurring the subject in the background. Text "AI" and "Blur" appear. Subject's voiceover: "It even generates sound effects on your videos as well, and you can even add in camera blurs."

[00:22–00:28]
Return to the AI interface. Shows two image references being combined: the subject's face and a pair of orange-tinted sports sunglasses. The cursor clicks "Generate". Subject in the inset explains: "You can even add consistent products to these images by combining two reference photos together."

[00:28–00:33]
Final result: A macro close-up of the subject's face on the sand, now wearing the orange-tinted sports sunglasses. A hand enters the frame and adjusts the glasses. The reflection in the lenses shows the beach and sky. Text "AI" in red. Subject in inset: "If you want access to this for yourself, type AI in the comments and I'll send you the link."

NEGATIVE PROMPT:
Visual: blurry features, inconsistent beard shape, cartoonish skin, plastic texture, distorted cereal box logo, messy hair, flickering light, floating objects, extra fingers, low resolution, watermark.
Speech: robotic voice, monotone delivery, background noise, muffled audio, lip-sync mismatch, stuttering, unnatural pauses.

SPEECH PACK:
[00:00-00:05]
TAKE_A: "Here’s how to create the most realistic 4K imagery of yourself that’s so real, it’s scary." (Fast, energetic)
TAKE_B: "Want to see something scary? This 4K image of me is actually 100% AI." (Mysterious, slow)
TAKE_C: "This is the most realistic AI I've ever seen. 4K resolution of yourself." (Direct, punchy)

[00:05-00:10]
TAKE_A: "This is all available on Freepik, using Seedream 4k as your image model."
TAKE_B: "Head over to Freepik and use the Seedream 4k model to get this look."

[00:10-00:22]
TAKE_A: "Take that image to WAN 2.5 for video. It generates 1080p with crazy camera control and realism. It even does sound effects!"

[00:22-00:33]
TAKE_A: "Add consistent products by combining two reference photos. Type AI in the comments for the link!"

PROSODY MARKUP:
"So real... [pause] it's **SCARY**."
"Comment **AI** [emphasis] for the link!"
Video
Kallaway
GLOBAL LOCK: One single male creator remains consistent across the full video: a light-skinned man in his late 20s to early 30s with a slim build, wearing a black baseball cap and black hoodie, speaking directly to camera from a dark creator studio with subtle blue and warm accent lighting. The video is a vertical 9:16 tutorial about an “ultimate AI cheat code” for recreating image styles using visual analysis, reference images, style reference codes, prompt breakdowns, and image-generation workflows. On-screen visuals include cinematic image grids, red and black graphic compositions, moodboard-like galleries, prompt boxes, style reference code text, ChatGPT or AI assistant windows, and image-generator interfaces. The editing style alternates between talking-head explanation and crisp screen recordings, with bold subtitle emphasis and rapid creator-education pacing. Speech is single-speaker, clear, energetic, and instructional, with high lip-sync importance whenever the creator is on screen.

[00:00-00:06] Open with a strong hook calling this the ultimate AI cheat code. Flash multiple stylized image examples on screen, including cinematic portraits, surreal visuals, and polished art-directed compositions. The creator speaks directly to camera in a medium close-up, hands raised to stress the promise.

[00:06-00:14] Show how the method starts from any image or visual example. Alternate between the creator and moodboard grids of different aesthetics, including pink sunset scenes, red graphic posters, and cinematic portraits. The creator explains that the system can analyze style rather than just copy random prompts.

[00:14-00:22] Move into the reference and analysis stage. Display image-library interfaces, style examples, and tools that inspect visual characteristics. The creator explains that visual style is hidden inside references, not just in obvious prompt text. Screen recordings should be crisp and legible.

[00:22-00:31] Introduce style reference codes and code-like descriptors. Show a clean screen with “Style Reference Codes” or similar text, followed by example outputs generated from these references. The creator describes how the code or extracted pattern can be applied to other images to keep a consistent visual language.

[00:31-00:40] Bring in AI assistant windows or chat interfaces where the creator asks for word-based breakdowns of the visual style. Display prompt boxes, short analytical responses, and extracted descriptors that summarize lighting, palette, mood, composition, and texture. He explains that words plus references create stronger reproduction.

[00:40-00:49] Show comparison grids and more style examples across different subjects. The creator explains how you can take one visual system and reuse it on other scenes, people, or concepts. The interfaces display image sets, generated outputs, and moodboard transitions to demonstrate consistency.

[00:49-00:55] End on the creator in close-up with a concise final takeaway that the easiest way to recreate strong visuals is to combine references, extracted words, and style codes rather than guessing prompts from scratch. Finish with confident tutorial energy and a direct promise of better outputs.

NEGATIVE PROMPT: multiple presenters, podcast microphones, bright casual room, unrelated stock footage, blurry UI, no image grids, no reference code text, no AI assistant windows, generic filler b-roll, identity drift, unsynced lips, cartoon overlays, or slow low-energy pacing.

SPEECH PACK: Single male tutorial speaker only. Fast creator-educator cadence, crisp articulation, close-mic dry sound, emphasis on terms like style, references, words, codes, and images, high lip-sync importance in all talking-head segments, no second voice.
Video
GLOBAL LOCK: 
Subject: A young Caucasian woman in her mid-20s, long blonde hair with slight waves, blue eyes, vibrant glossy red lipstick. 
Wardrobe: A metallic gold/champagne strapless tube top, black athletic leggings with a small white logo. 
Environment: Initially a warm, dimly lit upscale pub/bar with amber lighting and wooden textures; later a clean, bright indoor setting with natural daylight. 
Style: Photorealistic cinematic UGC, high skin detail (pores visible), 4k resolution, shallow depth of field. 
Speech: Enthusiastic, direct-to-camera, clear articulation, warm tone. 
Mic Signature: Close-mic, crisp, minimal room reverb.

[00:00–00:06]
Subject: The blonde woman is in a bar, holding a pint of beer with a frothy head. She is talking directly to the camera with an expressive, smiling face.
Action: She gestures slightly with her free hand while holding the beer steady.
Camera: Medium close-up, slight handheld jitter for realism.
Lighting: Warm amber light from overhead bar lamps, creating soft highlights on her hair and shoulders.
Speech: "Guys let me tell you most AI videos look fake as..."
Sync: High lip-sync strictness on "fake as".

[00:06–00:08]
Visual: Rapid transition. A screen of black and white static noise for 0.5s, followed by a black screen with bold white grid lines and the text "HERE IS HOW TO" appearing in a stepped animation.
Motion: Fast-paced, rhythmic cuts.

[00:08–00:12]
Visual: Screen recording of a complex node-based AI software interface (dark mode). Multiple boxes (nodes) are connected by glowing lines.
Action: A cursor moves across the screen, highlighting a node labeled "Prompt" and another labeled "Higgsfield Image".
Speech: "Almost no one is learning AI workflows."

[00:13–00:17]
Visual: Close-up on the software UI. A "Prompt" box contains text: "Subject Type: female, Age: Mid 20s, Skin: Scandinavian-inspired...".
Action: The cursor clicks through a gallery of generated images showing the same blonde woman in different poses.
Lighting: Cool blue light from the monitor reflected on the "screen" view.

[00:18–00:22]
Visual: The UI shows a product photo of a pink "LANEIGE" lip mask jar being dragged into a node.
Action: A red circle highlights a node labeled "Bria - Remove Background".
Speech: "Second, drop your product photo in. The workflow automatically removes the background."

[00:23–00:28]
Visual: The UI shows a "Gemini 2.5 Flash" node. The resulting image appears: the blonde woman from the first scene is now holding the pink Laneige jar close to her face, smiling.
Action: Zoom in on the hand holding the jar to show realistic grip and reflections.
Speech: "...integration step which reflections so the model actually looks like they're holding it."

[00:29–00:32]
Visual: A side-by-side "Before and After" split-screen comparison. The left side is slightly soft; the right side (labeled "Enhancor") shows hyper-realistic skin pores, fine hairs, and sharp lip texture.
Action: A vertical slider moves across the face to reveal the detail.
Speech: "run a quick Enhancor pass for skin texture and realism."

[00:33–00:38]
Visual: Final video output. The blonde woman is holding the pink jar, talking and laughing. The background is a blurred indoor room with a green plant.
Action: She opens her mouth wide in a laugh at the end. Text overlay: "COMMENT WORK".
Camera: Close-up, stable.
Speech: "Now turn it into a video with the final stage of... COMMENT AND I WILL SEND WORKFLOW."
Sync: High lip-sync strictness on the final laugh and "workflow".

NEGATIVE PROMPT: 
Visual: plastic skin, blurry eyes, double rows of teeth, floating objects, inconsistent hair color, flickering background, distorted fingers on the beer glass or product jar, robotic movements, low resolution, watermark.
Speech: robotic monotone, muffled audio, lip-sync delay, harsh 's' sounds, background hiss, unnatural pauses.

SPEECH PACK:
[00:00-00:06] "Guys let me tell you most AI videos look fake as..."
TAKE_A: (Energetic, slightly conspiratorial) "Guys, let me tell you, most AI videos look fake as..."
TAKE_B: (Frustrated tone) "Guys, let me tell you... most AI videos? They look fake as..."

[00:08-00:12] "Almost no one is learning AI workflows."
TAKE_A: (Authoritative, teaching) "Almost no one is learning AI workflows."
TAKE_B: (Emphasizing 'no one') "Almost *no one* is learning AI workflows."

[00:33-00:38] "Now turn it into a video with the final stage of... COMMENT AND I WILL SEND WORKFLOW."
TAKE_A: (Excited, fast-paced) "Now turn it into a video with the final stage of... Comment and I will send the workflow!"
TAKE_B: (Call to action focus) "Now turn it into a video... final stage. Comment 'WORK' and I'll send it over!"
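
The node-based workflow shown in the screen-recording shots above can be pictured as a small dependency graph. The sketch below is only an illustration under stated assumptions: the stage names reuse the node labels mentioned in the shots, "Product Photo" and "Video" are placeholder names for the input and final stage, and the edges between stages are assumed rather than taken from the original workflow file. It uses Python's standard `graphlib` module to derive a valid run order.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on.
# Assumption: the wiring below is a guess at how the nodes connect.
workflow = {
    "Prompt": set(),
    "Higgsfield Image": {"Prompt"},
    "Product Photo": set(),  # placeholder name for the uploaded product image
    "Bria - Remove Background": {"Product Photo"},
    "Gemini 2.5 Flash": {"Higgsfield Image", "Bria - Remove Background"},
    "Enhancor": {"Gemini 2.5 Flash"},
    "Video": {"Enhancor"},  # placeholder name for the final video stage
}


def run_order(graph):
    """Return one valid execution order; raises graphlib.CycleError on a loop."""
    return list(TopologicalSorter(graph).static_order())


order = run_order(workflow)
print(order)
```

Listing dependencies explicitly makes it clear that the model image and the background-removed product photo can be prepared independently before the integration step combines them.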
Video
GLOBAL LOCK: A consistent female subject, Caucasian, early 20s, shoulder-length messy blonde/light-brown hair, natural makeup, wearing a simple black tank top. The environment is a minimalist studio with a dark grey, out-of-focus background. Lighting is soft-box studio style, creating gentle highlights on the face. The video is a split-screen comparison with a vertical white slider line moving across the frame.

[00:00–00:03]
The subject is framed in a medium close-up, centered. On the left side of the vertical slider, her skin appears slightly too smooth and "AI-generated." On the right side, the skin is hyper-realistic with visible pores and natural texture. The slider is positioned on the far left. The subject remains static with a neutral, calm expression, looking directly at the camera.

[00:03–00:07]
The vertical white slider line moves steadily from the left edge of the frame to the right edge. As it passes over the subject's face, the "smooth" skin on the left is replaced by "hyper-textured" skin on the right. The transition is sharp and follows the slider line exactly. The subject's hair and clothing remain perfectly consistent across the transition.

[00:07–00:10]
The slider reaches the right side of the frame, revealing the fully enhanced, realistic face. The subject maintains her neutral gaze. The lighting remains constant, emphasizing the newly revealed skin texture, fine lines, and realistic highlights on the nose and forehead. The video loops seamlessly back to the start.

NEGATIVE PROMPT: blurry, distorted facial features, inconsistent hair movement, flickering lighting, plastic-looking skin on the "after" side, unnatural eye reflections, jittery slider movement, low resolution, watermarks, text artifacts on the subject.

SPEECH PACK:
(No speech present in the original video; it relies on text overlays and background music.)
TRANSCRIPT: [Background Music Only]
TAKE_A: N/A
TAKE_B: N/A
TAKE_C: N/A
PROSODY: N/A
SYNC: N/A
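
The slider timing described above (hold at the left edge until 00:03, sweep until 00:07, hold at the right edge until 00:10) can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular video tool builds the effect: the function names `slider_x` and `composite_row` are hypothetical, the sweep is assumed to be linear, and the area the slider has already crossed is assumed to show the "after" image.

```python
def slider_x(t: float, width: int, sweep_start: float = 3.0, sweep_end: float = 7.0) -> int:
    """Column of the slider line at time t (seconds).

    Assumption: the slider is parked at the left edge until sweep_start,
    moves linearly until sweep_end, then stays at the right edge.
    """
    if t <= sweep_start:
        return 0
    if t >= sweep_end:
        return width
    progress = (t - sweep_start) / (sweep_end - sweep_start)
    return round(progress * width)


def composite_row(before_row: list, after_row: list, x: int) -> list:
    """Build one output row of pixels.

    Assumption: pixels the slider has already crossed (left of x) come from
    the "after" frame; everything ahead of the slider comes from "before".
    """
    return after_row[:x] + before_row[x:]


# Tiny demo on a 4-pixel row: "b" marks a before pixel, "a" an after pixel.
demo = composite_row(list("bbbb"), list("aaaa"), 2)
```

With these assumptions, a 1080-pixel-wide frame has the slider at column 540 at the 5-second mark, halfway through the sweep.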
Video
GLOBAL LOCK: 
Subject: A Black man in his late 20s, athletic build, warm brown skin tone with visible texture (pores, slight stubble). 
Hair: Medium-length dark dreadlocks, some strands slightly frizzy. 
Wardrobe: Dark charcoal grey knitted crew-neck sweater with a visible weave pattern. 
Environment: Minimalist indoor setting, soft cream-colored curtains in the background. 
Lighting: Warm, cinematic directional lighting (Rembrandt style), soft shadows, high-end editorial feel. 
Color Grade: Warm earthy tones, slightly desaturated, rich contrast in skin highlights. 
Camera: 35mm and 85mm lens feel, shallow depth of field, sharp focus on subject. 
Speech: Male voice, calm, authoritative, medium-low pitch, professional cadence.

[00:00–00:03]
Subject: Medium close-up of the man. He has his right hand raised, fingers gently threading through his dreadlocks near his temple. He looks directly into the camera with a neutral, intense expression.
Action: Subtle movement of the hand in the hair.
Camera: Static MCU, eye-level.
Lighting: Soft light from the left, highlighting the side of his face and hand.
Speech: "This face is 100% AI." (Lips visible, high sync strictness).

[00:04–00:07]
Subject: Extreme macro close-up of the lower right cheek and jawline.
Action: Static shot showing the fine detail of skin pores, a few micro-scars, and a patchy, short-cropped beard with individual hairs visible.
Camera: ECU (Macro), static.
Lighting: Side-lit to emphasize the 3D texture of the skin.
Speech: "and brands still pay for it. You can see every clogged pore,"

[00:08–00:11]
Subject: Transition from a close-up of his dark brown eye (showing reflections) to an extreme macro of the sweater's shoulder.
Action: A tiny white flake of lint is visible on the dark knit of the sweater.
Camera: ECU, subtle shift in focus from eye to shoulder.
Lighting: Soft, revealing the texture of the wool.
Speech: "the patchy beard that looks like it's been growing since lockdown. You can see every strand in the hair. Even that little white flake on the shoulder"

[00:12–00:16]
Subject: Return to a medium close-up. The man is still looking at the camera, his hand is now down. He blinks once, naturally.
Action: A slow, almost imperceptible zoom-in.
Camera: MCU, slow dolly-in.
Speech: "could be t-shirt lint, could be a croissant crumb from breakfast. Either way, your brain buys it." (Lips visible, high sync).

[00:17–00:23]
Subject: A rapid montage of extreme macro shots: 1) Forehead skin with micro-scars. 2) Close-up of the eye and eyebrow. 3) Side of the neck with fine hairs and skin folds. 4) Macro of the cheek texture again.
Action: Fast cuts, minimal subject motion.
Camera: ECU, static shots.
Lighting: Consistent warm, directional light.
Speech: "I call this Genesis Engineering. Stacking pores, micro scars, lens dirt and bad pixels until it passes the client zoom test."

[00:24–00:26]
Subject: Medium close-up of the man, centered. He maintains a steady, confident gaze.
Action: Static, final pose.
Camera: MCU, static.
Speech: "Comment Genesis and the prompt is yours." (Lips visible, high sync).

NEGATIVE PROMPT: 
Visual: Smooth plastic skin, "beauty filter" look, perfectly symmetrical beard, blurry textures, cartoonish dreadlocks, glowing eyes, distorted fingers, flickering light, floating hair, AI-generated text artifacts.
Speech: Robotic monotone, overly excited tone, slurred syllables, mouth movements not matching "Genesis" or "AI", background noise, echo, harsh "S" sounds.

SPEECH PACK:
[00:00-00:03] "This face is 100% AI."
TAKE_A: (Direct, factual) This face... is one hundred percent... AI.
TAKE_B: (Intriguing) This face? It's 100% AI.

[00:04-00:11] "and brands still pay for it. You can see every clogged pore, the patchy beard that looks like it's been growing since lockdown. You can see every strand in the hair. Even that little white flake on the shoulder"
TAKE_A: (Detailed, observational) ...and brands still pay for it. [pause] You can see every... clogged... pore. The patchy beard... every strand... even that flake.

[00:12-00:16] "could be t-shirt lint, could be a croissant crumb from breakfast. Either way, your brain buys it."
TAKE_A: (Conversational) Could be lint... could be a crumb. Either way? Your brain buys it.

[00:17-00:26] "I call this Genesis Engineering. Stacking pores, micro scars, lens dirt and bad pixels until it passes the client zoom test. Comment Genesis and the prompt is yours."
TAKE_A: (Professional/Closing) I call this... Genesis Engineering. [fast] Stacking pores, scars, dirt... until it passes the zoom test. Comment 'Genesis'... and the prompt is yours.
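
The shot list and the SPEECH PACK above share the same timecodes, so one sanity check a creator might run before generating is whether every spoken line starts and ends on a shot boundary. A minimal Python sketch of that check follows. The helper names `parse_range` and `misaligned` are hypothetical, and the rule that speech ranges should line up with shot boundaries is an assumption for illustration, not something the prompt format itself requires.

```python
import re

# Accepts either an en dash or a plain hyphen between the two timecodes.
TIME_RANGE = re.compile(r"\[(\d{2}):(\d{2})[–-](\d{2}):(\d{2})\]")


def parse_range(label: str) -> tuple[int, int]:
    """Turn a label such as '[00:04–00:07]' into (4, 7), in seconds."""
    match = TIME_RANGE.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a time range: {label!r}")
    m1, s1, m2, s2 = (int(group) for group in match.groups())
    return m1 * 60 + s1, m2 * 60 + s2


def misaligned(shots: list[str], speech: list[str]) -> list[str]:
    """Return the speech ranges whose start or end is not a shot boundary."""
    shot_ranges = [parse_range(label) for label in shots]
    starts = {start for start, _ in shot_ranges}
    ends = {end for _, end in shot_ranges}
    bad = []
    for label in speech:
        start, end = parse_range(label)
        if start not in starts or end not in ends:
            bad.append(label)
    return bad


# Timecodes copied from the shot list and SPEECH PACK above.
shots = ["[00:00–00:03]", "[00:04–00:07]", "[00:08–00:11]",
         "[00:12–00:16]", "[00:17–00:23]", "[00:24–00:26]"]
speech = ["[00:00-00:03]", "[00:04-00:11]", "[00:12-00:16]", "[00:17-00:26]"]
problems = misaligned(shots, speech)
```

Here `problems` comes back empty, because each speech range in this prompt spans one or two whole shots.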

AI Photo Generator

AI photo generator content works best when it centers on realism. People searching for this topic usually want images that could plausibly pass as photographs, not obviously generated art. That often means portraits with natural skin detail, product images with believable light, or lifestyle scenes that feel grounded rather than overly polished or surreal.

The strongest examples here help creators understand what makes an image read as photographic. Realism is not only about detail; it also depends on lighting logic, material behavior, camera-like framing, and backgrounds that support the subject naturally. When you compare ideas on this page, ask whether the image feels believable at a glance and whether it would actually be usable in social, branding, or commerce contexts.

FAQ

What is an AI photo generator best for?

It is best for creating realistic-looking portraits, product photos, and lifestyle scenes without needing cameras, locations, or a full shoot.

Why is this different from a general image generator?

This category is more focused on camera-like realism, so photographic lighting, believable textures, and natural composition matter more than stylization.

Who is this page useful for?

It is useful for creators, marketers, e-commerce teams, and anyone who needs photo-style images for practical visual work.

What should I compare on this page?

Compare realism, lighting quality, subject detail, and whether the final image feels believable enough to use like a real photo.