AI sketch generator pages are for people who want an image that feels hand-drawn rather than digitally polished. The style can come from a photo or a text prompt, but the goal is always the same: make it read like pencil, charcoal, or another drawn medium with convincing line weight and texture. This page helps you compare sketch directions that feel more artistic, more personal, and more useful for gifts or concept work.

Video

A) MISE EN PLACE

Reference summary
- Duration: 00:53.66
- Format: vertical 9:16, 720x1280, 30 fps
- Structure: talking-head AI tutorial with example montage, Luma interface demo, before/after proof, and CTA
- Audio: direct-to-camera creator narration; exact wording inferred best-effort from caption, on-screen examples, and pacing

Scene / shot segmentation
1. 00:00.00-00:08.00
   Hook with black-and-white storyboard / sketch-style frames and quick cinematic example shots. Presenter appears as a lower-center cutout talking directly to camera.
2. 00:08.00-00:18.00
   Example montage showing the same or similar character pushed across multiple contexts, including transport / indoor scenes and blue-lit cinematic group imagery, reinforcing the “wild update” claim.
3. 00:18.00-00:30.00
   Luma Modify Video interface section. Dark UI panels, side-by-side comparisons, and slider-based before/after views take over while presenter keeps explaining.
4. 00:30.00-00:42.00
   Workflow proof section. The UI shows image/video preview cards, green plus controls, uploaded assets, and a panel layout that suggests start-frame or reference-driven modification.
5. 00:42.00-00:53.66
   Brand and CTA finish. Luma logo appears, more UI previews and result cards stack behind the presenter, and the closing comment-driven CTA lands.

Visual evidence keyframes
- 00:00.00: sketchy monochrome storyboard look with presenter lower center
- 00:04.00: cinematic sample shot with a central male figure, stronger realism than the opening sketches
- 00:12.00: multiple sample contexts imply style transfer / modify-video consistency across scenes
- 00:20.00: Luma dark interface with before/after style comparison
- 00:28.00: side-by-side or slider preview emphasizing transformation
- 00:36.00: uploaded image/video panel with green plus icon and dark editor layout
- 00:44.00: Luma branding / logo card
- 00:50.00: stacked UI and result proof while presenter closes with CTA

Speech evidence (best-effort)
- speaker_count: 1
- speaker A: male-presenting creator, on-camera in a lower-center talking-head cutout
- speech style: excited tutorial narration, update/news angle, then workflow explanation, then CTA
- likely content themes in order:
  1) Luma’s new AI video update is wild
  2) this update lets you modify video or transfer style while keeping characters more consistent
  3) here is what the workflow / interface looks like
  4) here are examples and proof
  5) comment “AI” for a link
- lip visibility: full for most presenter moments
- lip_sync_strictness: medium

Invariants list (LOCK THESE)
- presenter identity: male creator in his late 20s to 30s with beard, wearing a dark beanie or cap and a casual striped light top, seated and speaking directly to camera in a lower-center cutout
- layout: presenter fixed near bottom center, backgrounds switching between sketches, cinematic result clips, and Luma UI demonstrations
- product context: Luma AI / Modify Video update, style transfer, consistent character or reference-driven video modification
- design language: dark UI, high-contrast examples, creator-tutorial pacing, bold demo-first structure
- motion grammar: rapid hard cuts between examples and interface screens, no elaborate camera move on presenter layer
- lighting / grade: presenter evenly lit in creator-video style; examples range from sketchy monochrome to cinematic saturated scenes
- audio style: energetic creator explainer voice, concise, update-driven, comment CTA at the end

Variables list (TWEAK THESE)
- exact example scenes used in the montage
- exact wardrobe of the transformed subject inside sample clips
- precise interface crop selection
- exact CTA phrasing beyond the comment-and-link mechanic

B) SHOTLIST

Shot 1
- shot_id: 1
- timecode_start: 00:00.00
- timecode_end: 00:08.00
- duration: 8.00s
- framing: presenter lower center over sketch/storyboard visuals and fast sample proof
- lens: webcam or phone-style medium crop for presenter
- camera movement: static presenter layer, quick background cuts
- subject: presenter opens with high-energy reaction to a new update
- environment: monochrome sketches, storyboard-like frames, early cinematic samples
- lighting: soft frontal creator light on presenter
- speech/audio: Speaker A announces the update and why it matters

Shot 2
- shot_id: 2
- timecode_start: 00:08.00
- timecode_end: 00:18.00
- duration: 10.00s
- framing: example montage dominates frame, presenter remains visible
- camera movement: rapid montage cuts
- subject: sample scenes show style transfer / character consistency possibilities
- environment: vehicle interiors, indoor scenes, blue-lit group shot, stylized transformations
- lighting: sample clips vary, presenter remains consistent
- speech/audio: Speaker A expands on what the tool can do

Shot 3
- shot_id: 3
- timecode_start: 00:18.00
- timecode_end: 00:30.00
- duration: 12.00s
- framing: dark Luma interface with before/after comparisons
- camera movement: hard cuts between UI panels
- subject: presenter explains the modify-video workflow while gesturing
- environment: comparison sliders, preview windows, editor interface
- lighting: neutral on presenter, dark contrast-heavy UI behind
- speech/audio: Speaker A becomes more practical and tool-specific

Shot 4
- shot_id: 4
- timecode_start: 00:30.00
- timecode_end: 00:42.00
- duration: 12.00s
- framing: UI panels and uploaded asset views dominate
- camera movement: quick interface swaps and proof shots
- subject: presenter points through steps or settings
- environment: dark editor panels, asset cards, green plus icons, reference-image style layout
- speech/audio: Speaker A explains how to feed the source or start frame into the workflow

Shot 5
- shot_id: 5
- timecode_start: 00:42.00
- timecode_end: 00:53.66
- duration: 11.66s
- framing: Luma logo / brand frame, more proofs, presenter lower center
- camera movement: closing proof montage leading into CTA
- subject: presenter lands the payoff and asks viewers to comment for the link
- environment: logo card, UI screenshots, result cards
- speech/audio: Speaker A closes with a direct CTA
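
The listed shot durations follow directly from the timecodes, so they can be checked mechanically. The sketch below is a minimal Python check, assuming `mm:ss.ff` timecodes and the constant 30 fps from the reference summary. The `SHOTS` table and the helper names are illustrative and do not belong to any particular tool.

```python
# Hypothetical sanity check for the shotlist; names and layout are illustrative.
SHOTS = [
    # (shot_id, timecode_start, timecode_end, listed_duration_s)
    (1, "00:00.00", "00:08.00", 8.00),
    (2, "00:08.00", "00:18.00", 10.00),
    (3, "00:18.00", "00:30.00", 12.00),
    (4, "00:30.00", "00:42.00", 12.00),
    (5, "00:42.00", "00:53.66", 11.66),
]
FPS = 30  # taken from the reference summary


def to_seconds(timecode: str) -> float:
    """Convert an 'mm:ss.ff' timecode to seconds."""
    minutes, seconds = timecode.split(":")
    return int(minutes) * 60 + float(seconds)


def check_shots(shots, total_s, tol=0.005):
    """Return a list of problems; an empty list means the shotlist is consistent."""
    problems = []
    cursor = 0.0
    for shot_id, start, end, listed in shots:
        start_s, end_s = to_seconds(start), to_seconds(end)
        if abs(start_s - cursor) > tol:
            problems.append(f"shot {shot_id}: gap or overlap at {start}")
        if abs((end_s - start_s) - listed) > tol:
            problems.append(
                f"shot {shot_id}: listed {listed}s, computed {end_s - start_s:.2f}s"
            )
        cursor = end_s
    if abs(cursor - total_s) > tol:
        problems.append(f"shots end at {cursor:.2f}s, expected {total_s:.2f}s")
    return problems


def frame_counts(shots, fps=FPS):
    """Approximate frame count per shot at a constant frame rate."""
    return {
        shot_id: round((to_seconds(end) - to_seconds(start)) * fps)
        for shot_id, start, end, _ in shots
    }
```

Calling `check_shots(SHOTS, 53.66)` should return an empty list when the five shots tile the full duration without gaps, and `frame_counts(SHOTS)` gives a rough per-shot frame budget for editing.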

C) STYLE BIBLE (GLOBAL)

- visual_style: short-form AI creator tutorial, update-news meets workflow demonstration
- camera_signature: persistent presenter cutout over changing demo backgrounds
- lighting_signature: soft even creator lighting for presenter, high-contrast dark UI for the workflow layer
- grade_signature: presenter stays warm-neutral while examples oscillate between sketch monochrome and polished cinematic color
- texture_signature: crisp UI, clear preview windows, recognizable brand/logo moments, strong contrast for mobile readability
- pacing_signature: hook with “wild update,” proof montage, software explanation, results, comment CTA
- speech_style: direct-to-camera creator explainer
- speaker_profile: energetic, slightly hyped but still instructional
- pronunciation_profile: casual English, medium-fast pace, emphasis on update novelty and action steps
- mic_mix_profile: dry short-form creator audio, compressed for clarity on phone speakers

D) PROMPT SYNTHESIS

MASTER PROMPT

GLOBAL LOCK: Create a vertical 9:16 AI creator tutorial reel about a new Luma Modify Video style-transfer / reference-driven update. Keep a male presenter in his late 20s to 30s as a cutout near the bottom center for most of the video. He has a short beard, casual creator look, dark beanie or cap, and a light striped top, seated and speaking directly to camera with energetic tutorial cadence and visible hand gestures. The background rapidly changes between sketch/storyboard visuals, cinematic proof shots, dark Luma interface screens, comparison sliders, preview panels, uploaded assets, logo cards, and CTA frames. The reel should feel like a creator showing a genuinely impressive new feature, not a polished corporate ad. Keep typography readable and mobile-first, and preserve the update-news energy all the way to the comment CTA.

[00:00-00:08.00] Open with black-and-white storyboard or sketch-style visuals filling the background, then quickly intercut to more cinematic proof shots. Keep the presenter lower center, speaking directly to camera with excited, “this is wild” energy. The opening should instantly communicate that a new AI video update changes what is possible. Lips visible, medium lip-sync strictness, clear headline-like cadence.

[00:08.00-00:18.00] Move through a proof montage that suggests the same character or source can be transformed across multiple scenes and styles. Include vehicle or indoor shots, stylized cinematic scenes, and at least one dramatic blue-lit example. The presenter continues explaining with confident gestures, reinforcing the value of the update for modify-video workflows and consistent characters.

[00:18.00-00:30.00] Cut into the Luma interface. Show dark UI panels with side-by-side or before/after comparisons, preview windows, and tool context that clearly reads as a video-modification workspace. The presenter keeps speaking directly to camera, now shifting from hype to explanation. Sync important word emphasis to the interface changes.

[00:30.00-00:42.00] Show a more practical workflow section: asset upload panels, preview cards, green plus controls, and a reference-driven layout that implies selecting a start frame or source material before generating the modified result. The presenter gestures and explains how the workflow turns a source into a stylized, consistent output. Keep this section concrete and tool-oriented.

[00:42.00-00:53.66] Finish with a branded proof-and-CTA section. Include the Luma logo card, additional UI/result views, and a final direct ask for viewers to comment “AI” for the link. Keep the presenter bottom center, looking into camera, ending on a highly readable, engagement-focused frame.

NEGATIVE PROMPT

Avoid muddy UI text, warped presenter cutout edges, face inconsistency, random wardrobe changes, unreadable comparison sliders, low-contrast branding, generic stock montage, over-animated transitions, robotic speech, slurred words, lip-sync mismatch, noisy room echo, clipping, over-sharpened screens, flicker, frame jitter, and CTA copy that is too small to read on mobile.

SHOT PROMPTS

- Hook delta: sketch storyboard look turning into cinematic sample proof
- Montage delta: multiple style-transfer / modify-video result scenes with character continuity
- Interface delta: dark Luma UI with before/after comparison views
- Workflow delta: upload/reference panel with green plus controls and preview cards
- CTA delta: Luma logo plus comment-for-link close

SPEECH PACK

Timecoded transcript (best-effort observable reconstruction)
- [00:00.00-00:08.00] Speaker A: “Luma’s new AI video update is wild.” Emotion: excited, newsy, hook-first.
- [00:08.00-00:18.00] Speaker A: “You can push the same idea or character through different looks and keep the result feeling more consistent.” Emotion: impressed but instructional.
- [00:18.00-00:30.00] Speaker A: “Here’s what the interface and modify-video workflow look like.” Emotion: practical, explanatory.
- [00:30.00-00:42.00] Speaker A: “Load your source, set the reference or start frame logic, and generate the new version.” Emotion: tactical, medium-fast.
- [00:42.00-00:53.66] Speaker A: “Comment ‘AI’ for a link.” Emotion: direct CTA, punchy close.
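
If the timecoded lines above need to travel into an editor as captions, they can be converted to SRT cues. This is a minimal sketch, assuming the segment boundaries above are in seconds from the start of the reel. The cue text reuses the best-effort reconstruction, so it inherits the same uncertainty, and the `CUES` list and function names are hypothetical.

```python
# Illustrative only: cue text is the best-effort reconstruction, not a verbatim transcript.
CUES = [
    # (start_s, end_s, text)
    (0.00, 8.00, "Luma's new AI video update is wild."),
    (8.00, 18.00, "You can push the same idea or character through different looks "
                  "and keep the result feeling more consistent."),
    (18.00, 30.00, "Here's what the interface and modify-video workflow look like."),
    (30.00, 42.00, "Load your source, set the reference or start frame logic, "
                   "and generate the new version."),
    (42.00, 53.66, "Comment 'AI' for a link."),
]


def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Render (start, end, text) tuples as numbered SRT blocks."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"
```

Writing `to_srt(CUES)` to a `.srt` file gives one caption per segment. Real captions would normally be split into shorter cues once the exact wording is known.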

TAKE_A
- Keep the wording close to the lines above with excited creator energy.

TAKE_B
- Same meaning, faster pace, stronger hype on the update and the CTA.

TAKE_C
- Same meaning, calmer and more tutorial-forward.

Closest audible version
- Exact wording was not transcribed verbatim, so treat the lines above as closest observable narration intent supported by caption, visible workflow context, and pacing.

Safe paraphrase version
- The reel introduces a new Luma Modify Video update, shows examples and interface proof, then asks viewers to comment “AI” for the link.
Video
GLOBAL LOCK: 
Subject is a Caucasian male in his mid-30s with a dark, well-groomed beard and mustache. He consistently wears a white baseball cap with a small logo and a white t-shirt. The AI-generated versions must maintain his facial structure and beard while changing costumes. The overall style is high-end cinematic photorealism with 8k textures, dramatic lighting, and professional color grading. The video follows a 3-panel vertical split-screen format: Top (Sketch), Middle (AI Video), Bottom (Live Action).

[00:00–00:03] 
SUBJECT: The subject is a medieval knight wearing a brown leather chest plate with a white deer emblem, green undershirt, and leather bracers. He is holding a wooden longbow, drawing the string back to his cheek with a focused expression.
ENVIRONMENT: A grand medieval castle courtyard with stone walls, flags, and a blurred crowd in the background.
ACTION: Drawing the bowstring, aiming, and holding the tension.
CAMERA: Medium shot, 50mm lens, slight side profile.
LIGHTING: Bright, natural sunlight with soft shadows.
SPEECH: "This new method of creating AI videos is absolutely insane." (Warm, energetic tone).

[00:04–00:08] 
SUBJECT: The subject is a master potter wearing a tan canvas apron over a white shirt. His hands are covered in wet clay.
ENVIRONMENT: A rustic, sun-drenched pottery studio with wooden shelves and ceramic pots.
ACTION: Shaping a spinning clay vase on a wooden pottery wheel. The clay is smooth and wet.
CAMERA: Close-up on hands and face, shallow depth of field.
LIGHTING: Warm, golden hour light coming from a side window.
SPEECH: "So you can now play yourself as a consistent character moving through any scene."

[00:09–00:12] 
SUBJECT: The subject is a gallery visitor in a striped shirt and white cap, holding a black picture frame that contains a vibrant floral oil painting.
ENVIRONMENT: A dark, modern art gallery with grey walls and red security laser beams crisscrossing the room.
ACTION: Holding the frame up, looking at the camera with a surprised, excited expression.
CAMERA: Medium shot, centered composition.
LIGHTING: Moody, low-key lighting with red accent lights from the lasers.
SPEECH: "And the crazy part is that you no longer need Hollywood level budgets for this."

[00:13–00:15] 
SUBJECT: The subject is a scuba diver with long flowing hair (no cap), wearing a white t-shirt.
ENVIRONMENT: A vibrant underwater coral reef with colorful fish, bubbles, and caustic light rays filtering through the surface.
ACTION: Swimming forward with a breaststroke motion, looking around in awe.
CAMERA: Wide shot, tracking the movement.
LIGHTING: Cool blue underwater lighting with shimmering highlights.
SPEECH: "You can record all of this from your own home."

[00:16–00:18] 
SUBJECT: The subject is a world-class DJ wearing a white cap and professional headphones.
ENVIRONMENT: A massive concert stage overlooking a cheering crowd of thousands. Neon lights and stage fog.
ACTION: One hand on a DJ controller, the other hand raised to the crowd in a "pumping" motion.
CAMERA: Over-the-shoulder shot looking out at the crowd.
LIGHTING: High-contrast, flashing concert lights (purple, blue, white).
SPEECH: "So I'm going to show you exactly how you could achieve the same results for yourself."

[00:19–00:21] 
SUBJECT: The subject is a professional chef in a white chef's coat and tall hat.
ENVIRONMENT: A busy, high-end restaurant kitchen with stainless steel surfaces and other chefs in the background.
ACTION: Tossing pasta in a frying pan, creating a large, controlled burst of orange flame.
CAMERA: Medium shot, dynamic movement.
LIGHTING: Bright kitchen lighting with the warm glow of the fire reflecting on the subject's face.
SPEECH: "...with a few subscriptions and a simple sketch."

[00:22–00:59] 
SUBJECT: The subject is an 18th-century opera singer in a lavish blue and gold velvet frock coat with white lace cuffs and a powdered wig (beard remains).
ENVIRONMENT: A grand, ornate opera house with red velvet seats, gold-leaf balconies, and a spotlight on the stage.
ACTION: Standing center stage, arms outstretched in a dramatic singing pose, then performing a theatrical twirl.
CAMERA: Starts as a wide shot of the theater, then punches in to a medium shot of the singer.
LIGHTING: Dramatic theatrical spotlighting, high contrast.
SPEECH: Detailed tutorial narration explaining the sketch-to-video process. (Clear, instructional, engaging).

NEGATIVE PROMPT: 
Visual: Cartoonish, low resolution, blurry, distorted facial features, inconsistent beard, flickering lights, floating objects, extra limbs, text/watermarks in the AI panel, jittery motion.
Speech: Robotic, flat tone, muffled audio, background noise, lip-sync mismatch, stuttering, unnatural pauses.

SPEECH PACK:
[00:00–00:03] "This new method of creating AI videos is absolutely insane."
TAKE_A: (Excited/High Energy) "This NEW method of creating AI videos is absolutely INSANE!"
TAKE_B: (Awestruck/Lower Pitch) "This... new method of creating AI videos... it's absolutely insane."

[00:04–00:08] "So you can now play yourself as a consistent character moving through any scene."
TAKE_A: (Informative/Smooth) "So you can now play YOURSELF as a consistent character, moving through ANY scene."
TAKE_B: (Fast-paced/Direct) "You can now play yourself as a consistent character in any scene you want."

[00:22–00:30] "To get started, you need to do a basic sketch mapping out the scene."
TAKE_A: (Instructional/Clear) "To get started, you just need a basic sketch... mapping out the whole scene."

PROSODY NOTES: 
- Use emphasis on "INSANE," "ANY," and "HOLLYWOOD."
- Maintain a rhythmic pace that matches the visual cuts.
- Ensure lip-sync is high-priority for the tutorial sections where the creator's face is visible in the bottom panel.
Video

GLOBAL LOCK: A vertical 9:16 creator explainer video with a matte-black background and subtle neon grid-floor perspective, a large rounded-rectangle demo panel on the upper half showing Higgsfield x NanoBanana editing examples, and a bottom talking-head creator framed from chest up in a softly lit indoor room. The speaker is a white male creator in his late 20s to mid 30s with medium brown hair, short beard, light skin, wearing a beige baseball cap backwards and a slate-blue oversized T-shirt with cream sleeve/shoulder panels. Keep the top caption text locked in bright yellow-green reading “Higgsfield x NanoBanana” followed by a banana emoji. The upper demo panel should alternate between sketch-to-image, pose sketch editing, character/IP remix examples, product insertion, and draw-to-edit interface states with clear toolbar icons and a bright lime-green “Higgsfield” or “Generate” button. The style is creator-news meets product-demo: clean UI, high readability, quick example swaps, no cinematic camera movement, one presenter speaking directly to camera with energetic but controlled gestures. Speech is English direct-to-camera narration, one speaker only, close-mic, dry room sound, informative hype tone, with lips visible most of the time and cuts aligned to example changes.

[00:00-00:05] The video opens with the title “Higgsfield x NanoBanana” at the top over a dark background. In the large upper panel, a rough black-line sketch appears on a white canvas with small reference images tucked into the corners, showing a loose hand-drawn figure pose. The presenter appears in the lower third, facing camera and raising one hand while introducing the collaboration. Framing is static vertical medium shot, warm lamp light on the face, dark background around him, no extra text beyond the title. Speaker A introduces the partnership and signals that a powerful new editing capability is available.

[00:05-00:10] The top panel switches from sketch to a polished cinematic result resembling pop-culture character imagery, showing how the rough drawing can become a finished scene. The creator below leans in slightly and gestures with both hands, emphasizing the transformation. Maintain crisp UI borders and a clean black margin around the demo panel. Speaker A explains that the tool can take rough input and generate controlled visual outcomes.

[00:10-00:18] The upper examples continue rotating: a fashion-like full-body figure on a clean white stage, seated-pose line drawings, and a stylized scene with a man in dark clothes sitting in a sunlit interior while a branded bottle or product card appears at the side. The presenter keeps speaking with measured, open-palm gestures. The key idea is controllable composition, pose, and inserted elements rather than random generation.

[00:18-00:26] The demo panel moves into more explicit pose-control examples: a sketched figure carrying another body, with character references like Joker and Batman pinned in the corners, followed by drawn action silhouettes with face references. Keep the toolbar visible at the bottom of the upper panel and the bright action button readable. Speaker A explains the flexibility of using sketches, references, and image guidance to direct the final scene. Lips visible, medium lip-sync strictness, emphasis on edit control and freedom.

[00:26-00:38] A rapid set of sketch-to-scene and sketch-plus-reference examples continues, including drawn bodies, anime-like or stylized references, and dramatic generated outcomes. The presenter below stays constant, nodding and gesturing in rhythm with the example swaps. The tone should feel like “look how much control this gives you,” not a calm tutorial. No secondary speakers, no music-led montage logic.

[00:38-00:50] The top panel shifts to a more app-like frame with visible mode tabs such as “Draw to Edit” and “Draw to Video,” then shows a humorous generated image of the creator composited with a celebrity in matching tuxedo-like outfits holding prop weapons. The UI looks more like a final product window rather than a floating demo card. Speaker A stresses that the workflow is practical and fun for creators, not just a research toy.

[00:50-01:02.4] The ending holds on further edit examples and interface states, reinforcing that rough sketches, masks, and reference images can steer image edits with high fidelity. The presenter keeps speaking directly to camera, hands opening and closing as he lands the CTA. Finish with the sense that the feature is live, generous, and worth trying immediately. One speaker only, close and intelligible, no other dialogue.

NEGATIVE PROMPT: no second presenter, no podcast framing, no desktop clutter, no cinematic handheld motion, no dark horror grade, no missing top title, no wrong cap orientation, no inconsistent shirt colors, no melted faces, no distorted reference thumbnails, no unreadable toolbar, no broken sketch anatomy, no random extra UI windows, no fake watermark overload, no low-resolution outputs, no jitter between example swaps, no extra fingers, no robotic lip movement, no echo, no crowd noise, no background chatter, no subtitles unrelated to the observed title or UI.

SHOT PROMPTS:
[00:00-00:10] Black background with neon-grid floor, title “Higgsfield x NanoBanana”, upper panel showing sketch-to-image transformation, bottom talking-head creator in backwards beige cap and slate-blue shirt.
[00:10-00:26] Controlled editing showcase: body pose sketches, seated figure scene, branded product insert, reference-driven transformations, toolbar and bright green action button visible.
[00:26-00:38] More advanced sketch plus reference examples emphasizing pose control, identity guidance, and scene remixing while the creator speaks enthusiastically below.
[00:38-01:02.4] Product-window UI with Draw to Edit / Draw to Video modes and playful high-fidelity generated examples, creator closes with try-it-now energy.

SPEECH PACK:
[00:00-00:10] Speaker A: announces Higgsfield x NanoBanana and frames it as a big update for creators. TAKE_A: excited reveal. TAKE_B: cleaner product-news tone. TAKE_C: hype-driven introduction.
[00:10-00:18] Speaker A: explains that sketches and rough drawings can be turned into polished outputs with strong control. TAKE_A: practical tone. TAKE_B: slightly more amazed tone. TAKE_C: creator-benefit emphasis.
[00:18-00:26] Speaker A: says you can use pose guides, references, and edits to shape the scene you want. TAKE_A: workflow explanation. TAKE_B: feature-summary cadence. TAKE_C: punchier social-video cadence.
[00:26-00:50] Speaker A: expands on creative flexibility, showing character remixes, product insertions, and more expressive control than normal image generation. TAKE_A: informative. TAKE_B: feature-hype balance. TAKE_C: tool-for-creators framing.
[00:50-01:02.4] Speaker A: closes with urgency that the offer is live for Pro+ users and worth testing now, likely tied to a comment CTA. TAKE_A: clear CTA. TAKE_B: more urgent CTA. TAKE_C: softer invitation to try.
Prosody markup: energetic sentence starts, brief pauses between examples, emphasis on tool names and control words.
Closest audible version: creator explains Higgsfield x NanoBanana editing control and limited-time availability.
Safe paraphrase version: one-speaker explainer about a sketch-and-reference-driven AI editor that creators should try this week.
Video
GLOBAL LOCK: Subject is a male in his mid-30s with light brown wavy hair, a well-groomed beard, wearing a tan "Vans" custom classic trucker hat and a plain white t-shirt. The environment transitions between macro paper textures, cinematic outdoor settings, and clean studio backgrounds. Lighting is consistently high-quality, ranging from warm golden hour to professional high-key studio. Color grade is vibrant with high contrast and sharp details. Speech is energetic, direct-to-camera, with a warm and professional tone.

[00:00–00:04]
Macro extreme close-up of a sharpened graphite pencil writing "FLUX 2" in bold, sketchy capital letters on heavily textured white watercolor paper. Small graphite particles are visible around the strokes. The camera is static. Lighting is bright, natural side-lighting emphasizing the paper grain.
SPEECH: "Flux 2 just released and honestly..."

[00:04–00:08]
A series of rapid cuts: 1) A cinematic portrait of the subject (male, Vans hat, beard) in golden hour sunlight with soft bokeh. 2) A full-body shot of the same subject wearing a multi-colored, patchwork editorial suit against a magenta background. 3) An extreme macro close-up of a blue human eye with hyper-realistic skin texture and reflection in the pupil.
SPEECH: "...the details on this AI image model are insane. You can create photo-realistic images of yourself from any angle and it's spotless."

[00:08–00:14]
Product photography shots: 1) A white tube of "Green People" sun cream with clear, legible orange and black text, centered on a clean white/blue background. 2) A matte grey "Stanley" tumbler with a black lid, standing on a vibrant orange surface with a soft shadow. The text on the tumbler is perfectly rendered.
SPEECH: "And don't even get me started on text. You can now create packaging flawlessly with any products now using this image model."

[00:14–00:17]
Medium close-up of the subject (male, beard, long hair) wearing a bright yellow blazer over a green shirt. He looks directly into the camera with a confident expression. Large white text overlay reads "Here's How".
SPEECH: "So here's how you can access it too."

[00:17–00:23]
Screen recording of the Leonardo.ai user interface in dark mode. A cursor clicks on "Image Generation," then selects the Flux 2 Pro model from a dropdown menu. The UI is clean and responsive.
SPEECH: "To get started, go to Leonardo AI, where you can go to image, then you can go to the Flux 2 Pro model."

[00:23–00:28]
A sequence showing the prompt "suncream on a product shelf in a pharmacy" being typed, followed by a grid of generated images showing realistic pharmacy shelves filled with sun cream bottles, all with legible labels and realistic store lighting.
SPEECH: "You can write in a prompt with a reference photo and it will create the most stunning, realistic AI images you've ever seen."

[00:28–00:32]
Macro shot of a hand using a pencil to draw a detailed portrait of the subject's face on paper, followed by the pencil writing "Comment AI" in elegant cursive. The camera zooms in slightly on the text.
SPEECH: "So if you want to try it out for yourself, type AI in the comments and I'll send you a link."

NEGATIVE PROMPT: Robotic speech, monotone delivery, blurry text, mangled fingers, inconsistent facial features, low-resolution textures, flickering lighting, unnatural eye movements, watermarks, distorted UI elements, muddy colors.

SPEECH PACK:
[00:00-00:04]
TAKE_A: "Flux 2 just released and honestly, the details on this AI image model are insane."
TAKE_B: "Flux 2 is finally here, and the level of detail in this model is absolutely mind-blowing."
TAKE_C: "Check this out: Flux 2 just dropped, and the image quality is on a whole different level."

[00:04-00:14]
TAKE_A: "You can create photo-realistic images of yourself from any angle and it's spotless. And don't even get me started on text. You can now create packaging flawlessly."
TAKE_B: "From perfect portraits to flawless product shots, this model handles everything. Look at how it renders text on this packaging—it's perfect."

[00:14-00:32]
TAKE_A: "So here's how you can access it too. Go to Leonardo AI, select the Flux 2 Pro model, and type your prompt. Comment AI for the link!"
TAKE_B: "Want to try it? Head over to Leonardo AI, find Flux 2 Pro, and start creating. If you want the direct link, just comment AI below."
Video
GLOBAL LOCK:
Subject: A Caucasian woman in her late 20s, blonde hair tied in a neat ponytail, wearing a leopard-print (cheetah pattern) blouse.
Environment: A cozy home studio/office background with dark grey walls, wooden bookshelves filled with books, green indoor plants, and soft dual-tone lighting (warm orange light from one side, cool blue light from the other).
Camera: MCU (Medium Close-Up) framing, eye-level, 35mm lens feel with shallow depth of field.
Style: Professional UGC creator aesthetic, high-quality video, crisp audio.
Speech: Direct-to-camera delivery, energetic and authoritative tone.

[00:00–00:05]
Visual: Rapid montage of extreme macro close-ups (ECU). First, a human eye with visible iris patterns and eyelashes. Second, an ear with a gold hoop earring showing skin texture. Third, a wrist with a simple black line tattoo showing skin pores and fine hairs.
Action: Static macro shots.
Lighting: Bright, natural daylight feel for the macros.
Text Overlay: "most AI" -> "look fake" -> "because" -> "is trained".
Speech: "Most AI images look fake for one reason. Because AI is trained to remove flaws."

[00:05–00:11]
Visual: The woman (Subject) in the MCU studio setting, gesturing with her hands. Floating icons of AI tools (ChatGPT, Freepik, Ideogram, Nano Banana) appear around her.
Action: Subject talks directly to the camera, moving hands to emphasize points.
Lighting: Studio setup (Orange/Blue).
Text Overlay: "need" -> "AI tools" -> "to prompt".
Speech: "But we don't need better AI tools. We just need to prompt the model to create images that actually look real."

[00:11–00:21]
Visual: Transition to a black screen with white text titled "Master Prompt". The text scrolls or highlights specific sections. Then, a split screen showing the woman talking in a small window and the prompt text in a larger window.
Action: Subject continues talking while the prompt text is displayed.
Lighting: Studio setup for the talking head.
Text Overlay: "to create" -> "that actually" -> "look real".
Speech: "The key to realistic AI images is using a prompt with a specific structure. This prompt should force skin detail, including visible pores, uneven tone, and natural imperfections."

[00:21–00:30]
Visual: Montage of AI-generated faces with high realism. A man's face with stubble and pores, a woman's face with freckles and slight redness. Then, a screen recording of the Freepik interface showing a gallery of realistic portraits.
Action: Fast cuts between the portraits and the UI.
Lighting: Varied, matching the generated images.
Text Overlay: "most people start" -> "make" -> "image".
Speech: "Most people start their prompt with 'make a realistic image of'. I start by telling the model how the camera behaves."

[00:30–00:42]
Visual: Screen recording of a prompt being typed into a text box. Keywords like "iPhone 14 Pro", "handheld framing", and "imperfect composition" are highlighted in yellow.
Action: Scrolling through the prompt text.
Lighting: Digital UI.
Text Overlay: "model that" -> "camera behaves" -> "casual hand" -> "imperfect composition".
Speech: "Casual handheld framing, slightly imperfect composition, and a smartphone camera perspective. This alone already breaks the AI look."

[00:42–00:52]
Visual: The woman back in the MCU studio setting. She gestures toward floating app icons for "Enhancor" and "Higgsfield". A screen recording shows a "Skin Enhancer" tool being used on a photo of a woman with goggles.
Action: Subject explains the final step.
Lighting: Studio setup.
Text Overlay: "But Most People Stop There" -> "Final Step" -> "Most Creators Are Gatekeeping".
Speech: "But most people stop there. I use a final step that most creators are gatekeeping. I run each image through a final skin enhancement step using Enhancor or Higgsfield."

[00:52–01:00]
Visual: The woman in MCU, pointing down toward a text box that says "Comment GUIDE". A final zoom-out effect or a slight blur transition.
Action: Subject smiles and points.
Lighting: Studio setup.
Text Overlay: "Prompt Structure" -> "Workflow" -> "Comment GUIDE".
Speech: "If you want my exact prompt structure and the full workflow, just comment GUIDE and I'll send it over."

NEGATIVE PROMPT:
Smooth skin, plastic texture, perfect symmetry, airbrushed look, six fingers, distorted eyes, watermark, logo, blurry background (unless specified), robotic voice, lip-sync lag, harsh sibilance, flickering lights, low resolution.

SPEECH PACK:
[00:00-00:05] "Most AI images look fake for one reason. Because AI is trained to remove flaws."
[00:05-00:11] "But we don't need better AI tools. We just need to prompt the model to create images that actually look real."
[00:11-00:21] "The key to realistic AI images is using a prompt with a specific structure. This prompt should force skin detail, including visible pores, uneven tone, and natural imperfections."
[00:21-00:30] "Most people start their prompt with 'make a realistic image of'. I start by telling the model how the camera behaves."
[00:30-00:42] "Casual handheld framing, slightly imperfect composition, and a smartphone camera perspective. This alone already breaks the AI look."
[00:42-00:52] "But most people stop there. I use a final step that most creators are gatekeeping. I run each image through a final skin enhancement step using Enhancor or Higgsfield."
[00:52-01:00] "If you want my exact prompt structure and the full workflow, just comment GUIDE and I'll send it over."
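
The prompt order described in the narration above (camera behavior first, then forced skin detail) could be assembled programmatically. The following is only a minimal sketch under stated assumptions: the class name `RealismPrompt`, its fields, and the comma-joined output format are hypothetical and are not the creator's actual "Master Prompt". The cue phrases themselves are taken from the speech and negative prompt in this section.

```python
from dataclasses import dataclass, field


@dataclass
class RealismPrompt:
    """Hypothetical container for the prompt order described in the narration."""

    subject: str
    # Camera-behavior cues lead the prompt (narration at 00:21-00:42).
    camera: list[str] = field(default_factory=lambda: [
        "smartphone camera perspective",
        "casual handheld framing",
        "slightly imperfect composition",
    ])
    # Skin-detail cues the prompt should force (narration at 00:11-00:21).
    skin: list[str] = field(default_factory=lambda: [
        "visible pores",
        "uneven tone",
        "natural imperfections",
    ])
    # A subset of the NEGATIVE PROMPT terms listed in this section.
    negative: list[str] = field(default_factory=lambda: [
        "smooth skin",
        "plastic texture",
        "perfect symmetry",
        "airbrushed look",
    ])

    def build(self) -> str:
        # Order: camera cues, then the subject, then skin-detail cues.
        return ", ".join(self.camera + [self.subject] + self.skin)

    def build_negative(self) -> str:
        return ", ".join(self.negative)


prompt = RealismPrompt(subject="portrait of a woman with freckles")
print(prompt.build())
print(prompt.build_negative())
```

Keeping the camera cues in their own list makes it easy to swap the device or framing without touching the skin-detail block.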
Video
GLOBAL LOCK: Subject is Natalia Dyer, an American actress with an oval face, high cheekbones, large expressive brown eyes, and fair skin with natural warmth. Her hair is dark brown, long, and wavy, styled into two thick, loose braids falling over her shoulders. She wears a dark, high-collared cloak/coat. Her expression is neutral, serene, and slightly melancholic, looking directly at the camera. The camera is a static Medium Close-Up (MCU) with a cinematic 35mm lens feel. High-fidelity skin textures and realistic lighting are mandatory.

[00:00–00:01]
Subject is centered in a grand, atmospheric gothic cathedral. Background features intricate stone arches and stained glass windows. Lighting: Misty, volumetric light beams (God rays) filter through the windows, creating a teal and orange contrast. Subject's face is softly lit by the ambient glow. Motion: Subtle dust motes dancing in the light beams.

[00:01–00:02]
Subject is centered in a vast golden hour meadow. Background features tall, dry grass and a distant horizon under a setting sun. Lighting: Warm, intense amber backlighting creating a soft rim light on her hair and cloak. A subtle lens flare peeks from the corner. Motion: Very slight swaying of the grass in the background.

[00:02–00:03]
Subject is centered in a dense autumn forest. Background is filled with vibrant orange and red maple leaves. Lighting: Dappled sunlight filtering through the canopy, creating soft patches of light on her face. Shallow depth of field with a creamy bokeh effect on the leaves. Motion: A few leaves slowly falling in the background.

NEGATIVE PROMPT: 
Facial distortion, changing eye color, changing hair style, inconsistent facial features, cartoonish look, plastic skin, extra limbs, blurry face, text, watermark, logo, flickering lighting, sudden jumps in subject position, robotic movement, oversaturated colors, low resolution.
Video
Create a vertical 9:16 premium AI model promo visual featuring an ultra-realistic close-up portrait of a young woman facing directly into camera against a dark teal background. She has fair skin, dark hair pulled back, subtle natural makeup, and translucent amber-orange eyeglasses catching a precise highlight across the frame. The lighting should be soft but dramatic, sculpting the face with studio precision and emphasizing realistic skin texture, calm eyes, and balanced symmetry. In the composition, glowing yellow "ImagineArt 1.0" text appears in the upper right, while "Most Realistic AI Model" is set large at the bottom like bold creator-marketing typography. The overall feeling should be a polished product ad announcing a highly realistic character-generation model for creators and brands. No clutter, no subtitles, no cartoon styling.
Video
GLOBAL LOCK: 
Subject is a Caucasian male in his mid-30s with a well-groomed brown beard and medium-length wavy brown hair. He consistently wears a white and olive-green "VANS" trucker hat and a plain, high-quality white crew-neck t-shirt. The environment for the creator's shots is a warm, indoor setting with soft ambient lighting and a neutral, slightly out-of-focus background. The AI-generated content features a cinematic, high-contrast aesthetic with vibrant colors (primarily deep reds and blacks). The speech is energetic, clear, and direct-to-camera, delivered with a "tech-enthusiast" persona.

[00:00–00:05]
Visual: A cinematic, deep red Porsche 911 is shown from multiple angles: top-down, rear view, and 3/4 side profile. The car has a metallic finish and is set against a dark, moody red background with dramatic studio lighting. Text overlay reads "Multiview Perspective Change."
Subject: The creator appears in a small, rounded-square overlay at the bottom center, pointing upwards with both index fingers.
Camera: Smooth transitions between static product shots.
Speech: "This genuinely feels like a cheat code to create high-quality AI visuals for your brand or business."
Sync: Cut to the next shot on the word "business."

[00:05–00:19]
Visual: A rapid-fire montage of the creator's face swapped into various AI-generated scenes: 
1. A close-up of the VANS hat.
2. A model holding a smartphone.
3. A bold fisheye portrait wearing colorful puffer jackets and sunglasses.
4. An "Indie Garden Polaroid" shot with sunflowers and a guitar.
5. A "Halloween Party" shot of the creator in a yellow duck costume holding a red cup.
6. An "Urban Glare Portrait" in a city street.
Subject: Creator remains in the bottom overlay, gesturing with his hands as if explaining the variety.
Motion: Fast cuts (approx. 1-2 seconds each) with slight zoom-ins.
Speech: "This is called Blueprints, and it allows you to create multiple angled shots of any scene. You can upload product reference images and you can even replicate certain styles of images with a simple VFX template they've created for you."

[00:20–00:35]
Visual: Screen recording of the Leonardo.ai interface. The cursor moves to the left sidebar, hovering over and clicking the "Blueprints (Beta)" button highlighted with a red box. It then scrolls through a gallery of templates, selecting "Product Studio Photoshoot."
Subject: Creator in the overlay, looking slightly off-camera as if watching the screen, pointing to the UI elements.
Speech: "All you have to do is upload an image of yourself, and here's how to do it. To get started on Leonardo, you can go to the Blueprints section, and they have all of these different templates."

[00:36–00:45]
Visual: The UI shows the "Upload Person Photo" step. A photo of the creator in his white t-shirt and VANS hat is uploaded. Then, a "Product Photo" of a black smartphone is uploaded. The "Generate" button is clicked. The result shows the creator holding the phone in a professional studio setting.
Subject: Creator in the overlay, nodding and smiling as the result is revealed.
Speech: "You can then select one you want and upload a reference image of your face, for example, and then hit next. Now you can upload a reference image of a product, and then boom! You can actually create images of you holding the product in that environment."

[00:46–00:51]
Visual: The UI shows a "Multiview Perspective Change" generation of the creator sitting on a park bench from different angles (back view, side view, top-down). The video ends with the creator full-screen (or large overlay) against a dark background with the text "TYPE AI COMMENTS."
Subject: The creator winks at the camera and points forward.
Speech: "But it gets crazier because you can use different templates like multiview perspective... if you want to try it out for yourself, type AI in the comments and I'll send you the link."
Sync: Final wink lands exactly on the last word.

NEGATIVE PROMPT:
Visual: blurry face, inconsistent beard length, distorted VANS logo, extra fingers, flickering background, low-resolution UI, robotic body movements, unnatural skin texture, messy hair transitions.
Speech: monotone delivery, background noise, muffled audio, robotic cadence, misaligned lip-sync, harsh "S" sounds, long pauses between sentences.

SPEECH PACK:
[00:00-00:05]
Transcript: "This genuinely feels like a cheat code to create high-quality AI visuals for your brand or business."
TAKE_A: (Energetic, emphasizing "cheat code" and "business")
TAKE_B: (Fast-paced, breathless excitement)
TAKE_C: (Confident, authoritative tone)

[00:46-00:51]
Transcript: "If you want to try it out for yourself, type AI in the comments and I'll send you the link."
TAKE_A: (Friendly, inviting, with a wink at the end)
TAKE_B: (Direct, urgent, pointing at the camera)
TAKE_C: (Casual, "by the way" style delivery)
Video
GLOBAL LOCK: Vertical 9:16 UGC tutorial reel with a persistent two-layer presentation style: the upper 60 to 70 percent of the frame shows demonstrations, screenshots, typed prompts, and generated image results; the lower portion shows the same male creator speaking directly to camera in a rounded-corner selfie window for most of the video. The creator is a white male in his late 20s to mid 30s, medium-length wavy dark brown hair, short beard and mustache, expressive eyebrows, average build, casual creator aesthetic. Keep his delivery energetic, friendly, and persuasive. Wardrobe changes are intentional by section: white tee and cream Vans cap at the opening studio desk, blue polo and backward cap for the main explainer section, yellow suit jacket and black top hat for the final gag CTA. Upper-frame design alternates between a white studio opening, black presentation slides branded "Google Nano Banana" with a banana emoji, product-demo image canvases, and dark Freepik interface screens on a soft orange-blue gradient background. The reel should feel like an AI creator tutorial ad: quick but readable, clean text overlays, obvious prompt boxes, high contrast UI, fast social pacing, light jump cuts, and consistent bottom talking-head commentary. Speech style is single-speaker direct-to-camera tutorial English with crisp articulation, upbeat cadence, short persuasive sentences, and creator-economy CTA energy. Audio should sound like a close phone or lav mic in a quiet room, lightly compressed, dry, intelligible, and synced to the speaker window.

[00:00-00:04.50] Open on a bright white studio setup. The upper frame shows the colorful Google wordmark above the title "Nano Banana" with a banana emoji. Centered below it, the creator sits behind a white table in a cream Vans cap and light shirt, leaning toward a turquoise striped cup-shaped microphone or tumbler. Softbox lights are visible on both sides, making the setup feel like a casual creator studio. In the lower portion of frame, a separate rounded-corner selfie video of the same man begins speaking directly to camera. He introduces the tool with immediate enthusiasm. Lips are fully visible in the lower video; lip-sync strictness high for the first spoken hook.

[00:04.50-00:10.00] Cut to a black presentation layout branded "Google Nano Banana" at the top. The upper demo area shows a bright outdoor image of the creator on a Grand Canyon style cliff-edge walkway, arms stretched, backpack on, huge sky and canyon behind him. A prompt box appears under the image and begins typing "Make it into a youtube thumbnail". The lower selfie speaker remains on screen in the blue polo and backward cap, gesturing with one hand while explaining the edit. The tone is excited, helpful, and a little amazed. Keep the typed prompt animation readable and central.

[00:10.00-00:14.50] The same canyon image updates into a louder thumbnail treatment with giant curved yellow "GRAND CANYON" text behind the creator’s head. Emphasize the before-and-after value clearly: same base photo, more clickable YouTube-style packaging. The lower speaker continues talking in sync with hand gestures. Audio remains a crisp tutorial voice, no music overpowering the speech.

[00:14.50-00:20.50] Transition to a luxury product-edit example. In the upper frame, a prompt card reads "Replace the bottle" with a small reference thumbnail, then the output becomes a glossy Dior Sauvage-style perfume bottle on swirling golden light trails over a dark brown-black studio background. Maintain premium ad aesthetics, reflective glass, centered bottle, and luminous streaks. The lower talking-head explains the edit use case, likely referencing product replacement or image transformation. Speech stays fast, punchy, and creator-friendly.

[00:20.50-00:24.00] Briefly show another generated image example in the upper area, including a polished portrait-style output that demonstrates broader image editing capability beyond product swaps. Keep the cut quick and social-first, serving as visual proof rather than a full tutorial pause. The bottom speaker window continues uninterrupted, preserving continuity.

[00:24.00-00:31.50] Move into the software walkthrough. The upper frame now shows the Freepik dark UI over a soft gradient backdrop, starting with an AI Suite menu containing categories like image tools, video tools, audio tools, and design tools. Then zoom into the model panel where "Google Nano Banana" is selected, with image reference slots, style/composition/effects/character/object controls, and a beta disclaimer about aspect ratio. The creator in the lower window counts features with his fingers while describing how to access the workflow. Keep the UI readable enough for social tutorial viewing, but still fast-paced.

[00:31.50-00:36.50] Continue the interface demo with more dark UI panels, prompt fields, thumbnails, and settings sections scrolling or cutting through the workflow. The creator keeps speaking in direct, practical language, as if walking viewers through where to click and how to upload references. Camera on the lower speaker remains static, head-and-shoulders, neutral indoor room with door and wall behind him.

[00:36.50-00:43.00] End with a comedic CTA transformation. The upper frame shows a prompt reading "Give him a sign to hold" while the creator appears dressed like a theatrical ringmaster or showman in a yellow jacket and tall black top hat on a sunlit balcony. He holds a handmade cardboard sign that reads "Comment AI and I'll send you the link!" The lower talking-head still speaks beneath, landing the call to action. The final beat should feel playful, persuasive, and optimized for comments. Lip-sync remains visible in the lower window; key sync accents should land on the CTA words "comment AI" and "send you the link".

NEGATIVE PROMPT: extra fingers, warped hands during gesturing, drifting facial hair, inconsistent eye color, duplicated selfie windows, unreadable UI, misspelled "Google Nano Banana", broken prompt boxes, random logos, muddy text, incorrect YouTube thumbnail lettering, deformed perfume bottle glass, floating product shadows, overexposed softboxes, messy background clutter, cinematic bokeh that hides the tutorial content, abrupt framing jumps, desynced speech, robotic cadence, slurred consonants, harsh sibilance, echoey room tone, loud background music, clipping, pumping compression, lip-sync mismatch, subtitle blocks covering the demo.

SHOT PROMPTS:
SHOT_1 [00:00-00:04.50]: White studio opener, Google Nano Banana title, creator at desk with Vans cap and turquoise cup, bottom selfie explainer starts.
SHOT_2 [00:04.50-00:10.00]: Black branded demo screen, Grand Canyon reference photo, typed prompt box for YouTube thumbnail conversion, bottom speaker explains.
SHOT_3 [00:10.00-00:14.50]: Thumbnail result reveal with giant GRAND CANYON text, same split-screen layout, energetic creator commentary.
SHOT_4 [00:14.50-00:20.50]: Product-edit demo, perfume bottle replacement prompt, luxury golden-light result, bottom speaker continues.
SHOT_5 [00:20.50-00:24.00]: Quick alternate polished image result proving editing range.
SHOT_6 [00:24.00-00:31.50]: Freepik AI Suite walkthrough, dark UI menus, Google Nano Banana model selected, image reference slots and controls visible.
SHOT_7 [00:31.50-00:36.50]: More UI steps, prompt/settings panels, creator explains workflow and uploads.
SHOT_8 [00:36.50-00:43.00]: Final joke CTA, top hat outfit, cardboard sign asking viewers to comment AI for the link, bottom talking-head closes the pitch.
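
Before feeding a shot list like the one above to a generator, it can help to confirm that the timecodes run back to back with no gaps or overlaps. The sketch below is an illustrative helper, not part of any tool named in this prompt: `parse_shots`, `find_gaps`, and the abbreviated `SHOT_LIST` descriptions are hypothetical, and it assumes the `SHOT_n [MM:SS(.ss)-MM:SS(.ss)]` layout used here.

```python
import re

# Matches headers such as "SHOT_3 [00:10.00-00:14.50]".
SHOT_RE = re.compile(
    r"SHOT_(\d+) \[(\d{2}):(\d{2}(?:\.\d+)?)-(\d{2}):(\d{2}(?:\.\d+)?)\]"
)


def parse_shots(text):
    """Return (shot_number, start_seconds, end_seconds) for each SHOT_n header."""
    shots = []
    for match in SHOT_RE.finditer(text):
        num, m1, s1, m2, s2 = match.groups()
        start = int(m1) * 60 + float(s1)
        end = int(m2) * 60 + float(s2)
        shots.append((int(num), start, end))
    return shots


def find_gaps(shots, tol=1e-6):
    """List (prev_shot, next_shot) pairs whose end and start times do not meet."""
    return [
        (a[0], b[0])
        for a, b in zip(shots, shots[1:])
        if abs(a[2] - b[1]) > tol
    ]


# Timecodes copied from the shot list; descriptions abbreviated for the example.
SHOT_LIST = """
SHOT_1 [00:00-00:04.50]: white studio opener
SHOT_2 [00:04.50-00:10.00]: Grand Canyon reference photo
SHOT_3 [00:10.00-00:14.50]: thumbnail result reveal
SHOT_4 [00:14.50-00:20.50]: product-edit demo
SHOT_5 [00:20.50-00:24.00]: alternate image result
SHOT_6 [00:24.00-00:31.50]: Freepik AI Suite walkthrough
SHOT_7 [00:31.50-00:36.50]: more UI steps
SHOT_8 [00:36.50-00:43.00]: final joke CTA
"""

shots = parse_shots(SHOT_LIST)
print(len(shots), "shots, gaps:", find_gaps(shots))
```

An empty gap list means every shot starts exactly where the previous one ends.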

SPEECH PACK:
Timecoded transcript (best-effort, inferred from visible overlays and tutorial cadence):

[00:00-00:04.50]
TAKE_A: "Please use this if you have not already. It is a game changer."
TAKE_B: "If you are not using this yet, you need to. It is a total game changer."
TAKE_C: "This tool is a game changer, and you should absolutely be using it already."
Prosody: fast hook, confident, slightly urgent, friendly creator tone.

[00:04.50-00:10.00]
TAKE_A: "You can take an image like this and ask Nano Banana to turn it into something more clickable."
TAKE_B: "Watch this. I can upload a photo and prompt Nano Banana to make it into a YouTube thumbnail."
TAKE_C: "Here is a simple example. Drop in an image and tell it to make a YouTube-ready thumbnail."
Prosody: explanatory, upbeat, demonstration-first.

[00:10.00-00:14.50]
TAKE_A: "It keeps the subject but gives you a much stronger thumbnail treatment."
TAKE_B: "Same image, better packaging. That is why this is so useful for creators."
TAKE_C: "This is the kind of upgrade that makes basic content feel publish-ready."
Prosody: impressed, selling practical value.

[00:14.50-00:20.50]
TAKE_A: "You can also do product swaps, like replacing the bottle and turning it into a premium ad."
TAKE_B: "It is not just thumbnails. You can replace products and restyle the entire scene."
TAKE_C: "This works for product creatives too. Swap the object and it rebuilds the shot around it."
Prosody: persuasive, slightly faster, feature-stack delivery.

[00:20.50-00:24.00]
TAKE_A: "And it is not limited to one type of image either."
TAKE_B: "You can use the same workflow across different visual styles."
TAKE_C: "That flexibility is what makes the tool stand out."
Prosody: transitional, concise.

[00:24.00-00:31.50]
TAKE_A: "Inside Freepik, open the AI Suite, choose Google Nano Banana, and upload your image references."
TAKE_B: "If you want to try it, go into AI Suite, pick the Nano Banana model, then add your reference image here."
TAKE_C: "This is where it lives in Freepik. Select the model, drop your images in, and start prompting."
Prosody: instructional, practical, clear enunciation.

[00:31.50-00:36.50]
TAKE_A: "Then you can use the style, composition, effects, character, and object controls to shape the result."
TAKE_B: "From here you fine-tune the edit with the controls and prompt box."
TAKE_C: "Once the image is in, the rest is just directing the model with these tools."
Prosody: matter-of-fact, tutorial rhythm.

[00:36.50-00:43.00]
TAKE_A: "Want to try it? Comment AI and I will send you the link with unlimited generations on Freepik."
TAKE_B: "If you want access, comment AI and I will send you the link."
TAKE_C: "Comment AI for the link and I will send it over."
Prosody: bright CTA, direct ask, strong emphasis on "comment AI".
Video
GLOBAL LOCK: A young man in his early 20s, Mediterranean/Southern European appearance, olive skin tone, curly dark brown hair, well-groomed mustache and goatee. He wears a black cotton t-shirt with a vintage-style graphic print. The environment is a modern home office with soft, natural indoor lighting and a blurred background containing shelves and posters. Cinematic color grading with high dynamic range and soft highlight rolloff. Speech is energetic, clear, and direct-to-camera.

[00:00–00:02]
Subject: The man in a maroon and navy blue soccer jersey with "PEOPLESTYLE 07" on the front.
Environment: A grey asphalt street with white crosswalk markings.
Action: Standing still, looking directly at the camera with a neutral expression.
Framing: Medium shot, eye level.
Lighting: Warm, sepia-toned, mimicking the aged oil painting texture of the Mona Lisa shown in the top half of the split screen.
Motion: Subtle handheld camera micro-shake.
Speech: No speech, upbeat background music starts.

[00:02–00:03]
Subject: The man in a dark charcoal suit, white shirt, and striped tie.
Environment: A high-rise office with a large window overlooking a city skyline.
Action: Holding a vintage black desk phone to his ear, looking slightly off-camera.
Framing: Medium shot, eye level.
Lighting: High contrast, deep blues and vibrant yellows, mimicking Van Gogh's "Starry Night" shown in the top half.
Motion: Static camera.

[00:03–00:05]
Subject: The man in a plain black t-shirt.
Environment: An outdoor desert landscape at dusk.
Action: Profile view, looking over his shoulder toward the camera.
Framing: Medium close-up, side angle.
Lighting: Monochromatic warm orange glow, soft backlighting, mimicking the geometric 3D art above.
Motion: Slow camera pan around the subject.

[00:05–00:11]
Subject: The man in the black graphic tee described in the GLOBAL LOCK.
Environment: Home office desk with a laptop in the foreground.
Action: Talking to the camera, using expressive hand gestures (palms up, moving outward).
Framing: Medium close-up, eye level.
Lighting: Natural window light from the side, shallow depth of field.
Speech: "...to your videos with absolutely no prompts. That's why I started using..." (Energetic, persuasive tone).
Sync: High lip-sync strictness; cuts land on phrase endings.

[00:11–00:20]
Visual: Screen recording of the Higgsfield Hex interface. A dark mode dashboard. A cursor moves to click a "Color transfer" button. An abstract red, black, and white painting is uploaded. The UI extracts a color palette (red, pink, tan).
Action: Digital UI interaction.
Lighting: Clean digital screen glow.
Speech: Narrating the process (implied).

[00:20–00:37]
Subject: Back to the man in the home office.
Environment: Same as [00:05-00:11].
Action: Continuing to talk and gesture. Floating UI cards appear in front of him showing various images (a white goat, a vintage car, a blonde woman) all styled with the same color palette.
Framing: Medium close-up.
Text Overlays: "ARTISTIC VISION NOW DECODED", "#hex", "Comment 'SOUL'".
Speech: "and that's it... choose... artistic vision now decoded... if you want to try this out, comment 'SOUL' and I'll send you..."
Sync: High lip-sync strictness. Final cut on the CTA.

NEGATIVE PROMPT: Robotic speech, flat delivery, blurry face, inconsistent facial hair, flickering lighting, distorted UI text, messy background, unnatural hand movements, low-resolution textures, over-saturated colors, lip-sync lag.

SPEECH PACK:
[00:05–00:11]
Transcript: "...to your videos with absolutely no prompts. That's why I started using..."
TAKE_A: (Fast, excited) "...to your videos with absolutely NO prompts! That's why I started using..."
TAKE_B: (Confident, steady) "...to your videos with absolutely no prompts. [pause] That's why I started using..."

[00:20–00:37]
Transcript: "And that's it. Choose... artistic vision now decoded. If you want to try this out, comment 'SOUL' and I'll send you the link."
TAKE_A: (Inviting) "And that's it! Just choose... artistic vision now decoded. If you want to try this out, comment 'SOUL' [emphasis] and I'll send you the link!"
TAKE_B: (Direct) "And that's it. Choose your style. Artistic vision decoded. Comment 'SOUL' now and I'll send it over."
Video
A vertical creator tutorial video about achieving AI character consistency across generations and workflows. A female presenter speaks directly to the camera against a clean lavender-purple background while holding a handheld microphone and explaining a multi-step process labeled with numbered sections like #1, #2, #3, and #4. As she talks, large overlays appear showing reference portraits, facial expressions, hat variations, prompt text, interface screenshots, parameter panels, model settings, and examples from different AI tools. The video walks through how to build a consistent character, refine realism, preserve facial identity, manage textures, and combine different generation tools into one repeatable system. The mood is educational, structured, creator-friendly, and optimized for short-form AI workflow teaching.
Video
GLOBAL LOCK: A vertical promotional AI video tile designed like a social-media prompt pack cover. Keep the composition consistent: a black decorative border with tiny star sparkles, large handwritten-style text at the bottom reading “+100 Prompts”, and a central portrait area showing a blonde young woman whose look shifts between stylized cartoon beauty and photoreal beauty. Keep the subject identity consistent across all frames: fair-skinned young woman, short blonde bob haircut, soft green or hazel eyes, black off-shoulder top with thin straps, black choker, delicate pretty expression. The visual concept is a smooth transformation or comparison between two aesthetics: a doll-like illustrated version and a realistic camera-ready portrait version. Background stays minimal and soft. Motion is subtle, focused on transition and light pose variation rather than action. No dialogue, no extra subtitles, no logos beyond the baked-in “+100 Prompts” design.

[00:00-00:01] Open on the stylized version of the blonde woman inside the black framed promo card. The face is slightly doll-like, with softened illustrated features, while the “+100 Prompts” text and sparkly border are already visible.

[00:01-00:02] The central portrait begins shifting into a more photoreal interpretation. Keep the bob haircut, choker, and off-shoulder black top fixed so the viewer reads this as a style transformation, not a different person.

[00:02-00:03] The realistic version becomes dominant: cleaner skin detail, natural lighting, and a more photographic face. The border, stars, and handwritten title remain static and legible.

[00:03-00:04] The portrait subtly drifts back toward the softer stylized look, as if comparing two prompt outcomes within the same branded card layout. Preserve the same gentle head angle and calm expression.

[00:04-00:05] End with the stylized portrait or a halfway blend that still clearly communicates the before-and-after concept. The final frame should feel like a course promo visual for a large prompt pack focused on portrait styles.

NEGATIVE PROMPT: missing border, missing stars, missing “+100 Prompts” text, unrelated background, hair color drift, changing clothing, extra accessories, warped bob haircut, asymmetrical face, heavy camera movement, subtitles, logos, watermark clutter, broken style transition, distorted eyes, unstable choker, aggressive morphing, uncanny blend artifacts.

SHOT PROMPTS:
SHOT 1 DELTA: establish stylized blonde portrait inside sparkly black promo frame.
SHOT 2 DELTA: begin transition toward realistic portrait while identity stays locked.
SHOT 3 DELTA: realistic beauty version fully readable, promo layout unchanged.
SHOT 4 DELTA: soften back toward stylized look for direct prompt-comparison feel.
SHOT 5 DELTA: finish on a clear branded style-comparison hero frame with “+100 Prompts”.

SPEECH PACK:
Timecoded transcript: no dialogue is present in the reference clip.
TAKE_A [00:00-00:05]: silent promo-card transformation, no speech.
TAKE_B [00:00-00:05]: no spoken words, portrait-style comparison only.
TAKE_C [00:00-00:05]: quiet prompt-pack cover animation showing stylized versus realistic portrait output.
Closest audible version: no intelligible spoken content detected.
Safe paraphrase version: a blonde portrait shifts between cartoon-like and realistic styles inside a branded “+100 Prompts” card.
Video
GLOBAL LOCK: A vertical AI tutorial video combining a talking-head presenter and step-by-step static visual slides. The presenter is a young woman with long dark brown hair, fair skin, and a fitted white sweater, seated in front of a soft pink-lilac studio background. The tutorial is built around Google Gemini and shows how to use prompt packs for different photo-enhancement tasks: restoring and colorizing old family photos, turning a casual portrait into a passport-style headshot, improving male portrait accuracy using face-shape and hairstyle references, and combining multiple prompt blocks into one reusable master prompt. The overall design uses a teal-green slide background, floating image cards, arrows, and large numbered sections like #3, #4, and #5. Keep the educational tone, slide-driven pacing, and Gemini branding consistent throughout. Speech should be clear, direct, and creator-oriented, with close dry mic sound and paced social-video caption timing.

[00:00–00:04] Open with the presenter promising to show prompt sets for Google Gemini. She appears in a small talking-head frame over a teal instructional background while stacked text blocks and the Gemini logo appear beside her. The tone is straightforward and valuable, like a creator giving away useful workflow templates. The opening line should sound like a practical tutorial intro, emphasizing that the viewer will get prompts they can reuse. Sync should align with words such as “show you,” “prompts,” and “Google Gemini.”

[00:04–00:10] Transition into a slide showing old family photographs transforming into restored or colorized versions. Use card-like images of black-and-white family portraits rotating or swapping into cleaner, modernized images. The presenter explains that Gemini can help enhance old photos and restore image quality. Keep visual arrows and before/after relationships obvious.

[00:10–00:15] Move to a passport-photo conversion section. Show a casual female portrait as input and a clean, centered passport-style headshot as the result. The presenter explains how one of the prompts can convert an ordinary image into a more formal ID / passport-ready format. Use neutral backgrounds and clear face centering to emphasize the transformation.

[00:15–00:21] Introduce a face-structure and hairstyle guidance section for male portraits. Show diagrams of head shapes, hair reference charts, a celebrity-like sports portrait, and improved portrait outputs of the same male subject in different styles. The presenter explains that adding face shape and hair references improves likeness and overall accuracy. The comparison should feel systematic and instructional rather than purely aesthetic.

[00:21–00:27] Shift to another numbered section focused on prompt construction. Show a stylish woman’s portrait, a separate prompt block, and then a refined final output. The presenter explains how to combine image references and descriptive instructions to sharpen the final look. Text overlays and slide panels should imply that several separate prompt fragments are being organized into one effective workflow.

[00:27–00:35] End with full text-slide examples showing long prompt paragraphs and a final note that the creator has combined all prompts into one. Large text urges viewers to comment “Gemini” to receive the full set. The presenter may no longer be visible in these last frames; instead, the tutorial closes with readable document-like slides and a strong CTA focused on reuse and download.
Video
GLOBAL LOCK:
Subject is a Caucasian male in his early 30s, dark wavy hair, well-groomed medium-length beard, expressive brown eyes. He maintains a consistent facial structure across all shots. The visual style is a mix of high-end editorial photography and UGC tutorial footage. Lighting is cinematic with soft key lights and motivated rim lighting. Color grade is professional with deep blacks and vibrant but natural skin tones. Speech is clear, energetic, and instructional, delivered with a warm, authoritative tone.

[00:00–00:01]
Subject: MCU of the man wearing a dark suit, white dress shirt, black tie, and a white baseball cap with a green brim.
Action: Talking directly to the camera. A vertical white rectangular mask moves across his face, revealing a slightly different version of the same scene.
Camera: Static MCU, eye-level.
Lighting: Soft studio lighting, neutral background.
Speech: "This is how you can create..."

[00:01–00:04]
Subject: Rapid montage of AI-generated images. 
1. Man in a dark suit and sunglasses driving a green car at night, "AI MAG" text overlay.
2. Man in a checkered blazer and paisley tie in front of a brick wall.
3. Man in a white short-sleeve shirt with multiple pens in his pocket, standing in a white studio.
Action: Static editorial poses.
Camera: Various (MS, MCU).
Lighting: Cinematic, high contrast, nighttime car lighting, studio softbox.
Grade: Magazine editorial style.

[00:05–00:08]
Subject: A 3x4 grid of 12 different AI portraits of the same man in various outfits (boxing gloves, red car, street style, suit).
Action: Static images.
Overlay: Large bold text "UNLIMITED GENERATIONS" in orange and blue.
Camera: Flat grid layout.
Lighting: Varied per image.

[00:09–00:14]
Environment: Screen recording of the Higgsfield.ai website interface. A cursor moves to click "Image" then "Soul ID Character".
Action: UI navigation.
Speech: "On Higgsfield.ai, go to image and select Soul ID Character..."

[00:15–00:20]
Subject: Picture-in-picture of the man talking (wearing a tan cap and beige shirt) over a screen recording of the "Make Your Own Character" page.
Action: Explaining the process while gesturing.
Speech: "...where you can actually create your own custom character of yourself by uploading a bunch of photos."

[00:21–00:24]
Subject: Montage of AI images with text prompts.
1. Man in a suit drinking from a glass (trippy lens effect).
2. Man in a tan suit with a "Micky Mouse Bag" in a city street.
3. Man in a white tank top and jeans in front of a "Tokyo Red Car".
Action: Posing.
Camera: Full body and MS.
Lighting: Bright daylight, stylized urban lighting.

[00:25–00:34]
Environment: Screen recording of the "Lipsync Studio" interface. Subject's PIP continues.
Action: Selecting "Video", then "Lipsync Studio", uploading an image of himself at the beach, and dragging an audio file named "voiceover.wav".
Speech: "Now you can go to video at the top of the page and select the Lipsync Studio where you can upload your photo and audio..."

[00:35–00:38]
Subject: CU of the man at a tropical beach. He is shirtless, wearing black swimming goggles on his head.
Action: He is lip-syncing perfectly to the audio, smiling slightly.
Environment: Bright blue ocean water with small waves in the background.
Camera: CU, static.
Lighting: Bright, direct sunlight with natural shadows.
Speech: "...and it will combine those two together with the best lip-sync models."
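
The shot headers above follow a `[MM:SS–MM:SS]` pattern, so they can be checked mechanically before the prompt is reused. The following is a minimal sketch, not part of the original prompt: the function names `parse_shots` and `check_timeline` are hypothetical, and it assumes one-second gaps between shots are intentional and flags only reversed or overlapping ranges.

```python
import re

# Matches headers such as "[00:01–00:04]" or "[00:09-00:14]" (en dash or hyphen).
HEADER = re.compile(r"\[(\d{2}):(\d{2})\s*[–-]\s*(\d{2}):(\d{2})\]")


def parse_shots(text):
    """Return (start_seconds, end_seconds) for every shot header found in text."""
    shots = []
    for match in HEADER.finditer(text):
        m1, s1, m2, s2 = (int(group) for group in match.groups())
        shots.append((m1 * 60 + s1, m2 * 60 + s2))
    return shots


def check_timeline(shots):
    """Return a list of problems: reversed ranges or shots that overlap the previous one."""
    problems = []
    for i, (start, end) in enumerate(shots):
        if end <= start:
            problems.append(f"shot {i}: end {end}s is not after start {start}s")
        if i > 0 and start < shots[i - 1][1]:
            problems.append(f"shot {i}: starts at {start}s before the previous shot ends")
    return problems


# Headers copied from the shot list above.
headers = (
    "[00:00–00:01] [00:01–00:04] [00:05–00:08] [00:09–00:14] "
    "[00:15–00:20] [00:21–00:24] [00:25–00:34] [00:35–00:38]"
)
shots = parse_shots(headers)
print(shots)
print(check_timeline(shots))
```

An empty problem list means the headers are in order and do not overlap; it says nothing about whether the described visuals fit inside each window.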

NEGATIVE PROMPT:
Visual: robotic movement, distorted facial features, inconsistent beard growth, blurry textures, flickering background, extra fingers, warped UI elements, low resolution, watermarks.
Speech: robotic monotone, lip-sync delay, muffled audio, background hiss, unnatural pauses, slurred consonants, popping sounds.

SPEECH PACK:
[00:00-00:08]
Transcript: "This is how you can create 25 magazine-ready images of yourself using AI and then you can even lip-sync on top of them with this brand new feature."
TAKE_A: (Energetic, fast-paced) "This is how you can create TWENTY-FIVE magazine-ready images of yourself using AI... and then you can even LIP-SYNC on top of them with this brand new feature!"

[00:09-00:20]
Transcript: "On Higgsfield.ai, go to image and select Soul ID Character where you can actually create your own custom character of yourself by uploading a bunch of photos."
TAKE_A: (Instructional, clear) "On Higgsfield dot A-I, go to image and select Soul I-D Character... where you can actually create your own custom character of yourself... by uploading a bunch of photos."

[00:25-00:38]
Transcript: "Now you can go to video at the top of the page and select the Lipsync Studio where you can upload your photo and audio and it will combine those two together with the best lip-sync models."
TAKE_A: (Helpful, concluding) "Now you can go to video at the top of the page and select the Lipsync Studio... where you can upload your photo and audio... and it will combine those two together with the best lip-sync models."
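
Each speech block above pairs a transcript with a time window, so the speaking rate it implies is simple arithmetic. The sketch below is illustrative only: `words_per_second` is a hypothetical helper, and it counts whitespace-separated tokens as words, which is an assumption rather than anything the source specifies.

```python
def words_per_second(transcript, start_s, end_s):
    """Average rate needed to speak the transcript inside the window."""
    duration = end_s - start_s
    if duration <= 0:
        raise ValueError("window must have positive duration")
    # Assumption: whitespace-separated tokens approximate spoken words.
    return len(transcript.split()) / duration


# Windows and transcripts copied from the speech pack above.
speech_pack = [
    (0, 8, "This is how you can create 25 magazine-ready images of yourself "
           "using AI and then you can even lip-sync on top of them with this "
           "brand new feature."),
    (9, 20, "On Higgsfield.ai, go to image and select Soul ID Character where "
            "you can actually create your own custom character of yourself by "
            "uploading a bunch of photos."),
    (25, 38, "Now you can go to video at the top of the page and select the "
             "Lipsync Studio where you can upload your photo and audio and it "
             "will combine those two together with the best lip-sync models."),
]

rates = [words_per_second(text, start, end) for start, end, text in speech_pack]
for (start, end, _), rate in zip(speech_pack, rates):
    print(f"{start:02d}s-{end:02d}s: {rate:.2f} words/s")
```

The opening line comes out fastest, which matches its "energetic, fast-paced" delivery note; the two instructional lines leave more room for pauses.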
Video
GLOBAL LOCK: beauty-tech vertical explainer reel, photoreal close-up of a young woman with clear skin, dark center-parted hair, neutral makeup, warm indoor room lighting, framed in selfie-style portrait shots inside a softly lit home interior with lamp and shelves in the background, strong on-screen headline text about "Nano Banana Pro" and "the secret to perfect realism with AI", model points to her under-eye area, cheeks, lips, and overall face to demonstrate skin detail and realism, the edit alternates between close beauty crop and medium portrait framing, polished social-media ad aesthetic, no extra people.

00:00-00:02
Open on an extreme close-up of one eye and cheek while the model points directly under the eye to highlight skin realism and detail. Large bold text overlay fills the lower half of the frame, emphasizing Nano Banana Pro and the promise of perfect realism with AI. The shot must feel like an attention-grabbing hook for a beauty-tech reel.

00:02-00:04
Shift to another close crop of the same face from a slightly different angle, maintaining soft warm room light and ultra-clean skin texture. The model keeps one finger near the eye area so the viewer knows exactly what detail is being demonstrated.

00:04-00:06
Move to a centered face shot with both eyes and upper face visible. The model gestures to the cheek area and holds a composed, neutral expression while the text overlay remains prominent. The room background should stay softly blurred and domestic, not studio-commercial.

00:06-00:08
Cut to close beauty framing on lips and lower face, then back to a full frontal face shot. The edit should emphasize smooth transitions between macro skin-detail zones and overall portrait realism. Keep the model calm, confident, and slightly posed like an ad demonstration.

00:08-00:10
Reveal a wider portrait view where the woman touches or frames her face with one hand, then another variation with both hands near the cheeks. The text still anchors the message that the tool is about realism, while the camera proves it with consistent skin texture and facial symmetry.

00:10-00:10.63
Finish on a softer medium portrait with the model slightly angled and looking camera-right, still in the same warm room setting. End with the beauty result rather than adding more feature clutter.

CAMERA: vertical beauty-ad framing, alternating macro facial close-ups and medium portrait shots, shallow depth of field, clean static compositions, no shaky handheld motion.

LIGHTING: warm indoor practical light, soft frontal fill on the face, gentle lamp glow in the background, smooth skin highlights without harsh contrast.

GRADE: polished neutral-beauty palette, healthy skin tones, controlled warmth, crisp but flattering facial detail, creator-reel social media finish.

MOTION: subtle hand gestures pointing to under-eye, cheek, lips, and overall face, small head turns, steady eye contact, minimal camera movement so the realism demo remains readable.

SPEECH PACK:
- No spoken audio required; this reel functions as silent text-led visual proof.
- If sound is added in recreation, use light trendy social media background music with no voice-over.
- Prioritize the visual headline and facial detail over any audio storytelling.

NEGATIVE PROMPT: harsh studio flash, beauty filter blur, heavy glam makeup, extra people, outdoor scene, dramatic fashion editorial lighting, cartoon skin, waxy face texture, over-smoothed pores, noisy room background, text-free edit, watermark, logo clutter beyond a small brand mark, distorted facial symmetry.
Video
GLOBAL LOCK: A consistent female subject, Caucasian, early 20s, shoulder-length messy blonde/light-brown hair, natural makeup, wearing a simple black tank top. The environment is a minimalist studio with a dark grey, out-of-focus background. Lighting is soft-box studio style, creating gentle highlights on the face. The video is a split-screen comparison with a vertical white slider line moving across the frame.

[00:00–00:03]
The subject is framed in a medium close-up, centered. The vertical slider line starts at the far left edge, so the whole frame shows the "before" version: skin that is slightly too smooth and reads as "AI-generated." The subject remains static with a neutral, calm expression, looking directly at the camera.

[00:03–00:07]
The vertical white slider line moves steadily from the left edge of the frame to the right edge. As it passes over the subject's face, the smooth skin ahead of the line (to its right) gives way to hyper-realistic skin with visible pores and natural texture behind it (to its left). The transition is sharp and follows the slider line exactly. The subject's hair and clothing remain perfectly consistent across the transition.

[00:07–00:10]
The slider reaches the right side of the frame, revealing the fully enhanced, realistic face. The subject maintains her neutral gaze. The lighting remains constant, emphasizing the newly revealed skin texture, fine lines, and realistic highlights on the nose and forehead. The video loops seamlessly back to the start.
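
The three beats above (hold at the left edge, steady sweep, hold at the right edge) amount to a linear interpolation of the slider position plus a per-row split between two versions of the frame. The following is a rough sketch under stated assumptions: the function names are hypothetical, the 720-pixel frame width is an assumed value, and rows are plain Python lists rather than real image data.

```python
def slider_position(t, width, sweep_start=3.0, sweep_end=7.0):
    """X coordinate of the slider line at time t (seconds).

    The line rests at x=0 until sweep_start, moves linearly to x=width
    by sweep_end, then rests there.
    """
    if t <= sweep_start:
        return 0
    if t >= sweep_end:
        return width
    progress = (t - sweep_start) / (sweep_end - sweep_start)
    return round(progress * width)


def composite_row(left_row, right_row, x):
    """Take pixels left of the line from left_row and the rest from right_row."""
    return left_row[:x] + right_row[x:]


WIDTH = 720  # assumed frame width in pixels, not stated for this clip

for t in (0.0, 3.0, 5.0, 7.0, 10.0):
    print(t, slider_position(t, WIDTH))

# Tiny four-pixel row: "L" marks the version shown left of the line, "R" the other.
print(composite_row(["L"] * 4, ["R"] * 4, 1))
```

Keeping the split purely a function of the slider's x position is what makes the transition "follow the slider line exactly"; any easing or feathering would be a deliberate departure from the description.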

NEGATIVE PROMPT: blurry, distorted facial features, inconsistent hair movement, flickering lighting, plastic-looking skin on the "after" side, unnatural eye reflections, jittery slider movement, low resolution, watermarks, text artifacts on the subject.

SPEECH PACK:
(No speech present in the original video; it relies on text overlays and background music.)
TRANSCRIPT: [Background Music Only]
TAKE_A: N/A
TAKE_B: N/A
TAKE_C: N/A
PROSODY: N/A
SYNC: N/A

AI Sketch Generator

AI sketch generator content works best when it treats the drawn look as the main value, not a side effect. People searching this topic usually want something that feels like a real pencil sketch, charcoal study, or hand-drawn concept piece. That can mean converting a photo into a more graphic line-based result or generating a sketch from scratch with a specific artistic feel.

The strongest examples here should make the medium feel believable. Good sketch output depends on line weight, shading behavior, hatching, and graphite or charcoal texture that reads as intentional rather than cleanly digital. When you compare ideas on this page, focus on whether the result feels like a true drawing, whether the style suits the subject, and whether it could work as a gift, concept study, or stylized portrait.

FAQ

What is an AI sketch generator best for?

It is best for creating pencil sketches, charcoal drawings, portrait studies, and photo-to-sketch artwork with a hand-drawn feel.

How is this different from a normal image generator?

This category is judged by how well it imitates drawing techniques like line work, shading, and texture rather than by photorealism.

Who is this page useful for?

It is useful for artists, gift makers, concept designers, and anyone who wants a drawn aesthetic instead of a clean digital image.

What should I compare on this page?

Compare line quality, sketch texture, and whether the output feels like an actual drawing rather than a filtered photo.

AI Sketch Generator: Pencil and Charcoal Sketch Ideas | Alici.AI