AI Spotify Canvas Maker
Spotify Canvas asks for a very narrow kind of visual: short, loopable, vertical, and calm enough to live behind a song without hijacking it. This page gathers Alici examples and prompts for 3- to 8-second Canvas-style loops built for mobile listening and seamless repeat playback.
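As a rough pre-flight check, the constraints described above (vertical framing and a loop of 3 to 8 seconds) can be tested before a clip is exported. The sketch below is a minimal illustration: the function name `is_canvas_ready`, the 9:16 target ratio, and the tolerance value are assumptions made for this example, not an official Spotify specification.

```python
def is_canvas_ready(width: int, height: int, duration_s: float,
                    min_s: float = 3.0, max_s: float = 8.0,
                    ratio_tolerance: float = 0.01) -> bool:
    """Return True if a clip is vertical 9:16 and within the loop length range.

    The 9:16 target and the tolerance are illustrative assumptions.
    """
    if width <= 0 or height <= 0:
        return False
    target = 9 / 16  # assumed vertical target ratio
    ratio_ok = abs(width / height - target) <= ratio_tolerance
    duration_ok = min_s <= duration_s <= max_s
    return ratio_ok and duration_ok


# A 1080x1920 clip that runs 6 seconds passes; a landscape clip does not.
vertical_ok = is_canvas_ready(1080, 1920, 6.0)
landscape_ok = is_canvas_ready(1920, 1080, 6.0)
too_long_ok = is_canvas_ready(1080, 1920, 12.0)
```

Most of the tutorial-style prompts on this page run 27 to 67 seconds, so they would need to be trimmed to a short loopable excerpt before passing a check like this.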
GLOBAL LOCK: vertical 9:16 creator tutorial reel, one consistent young adult male host with light skin, slim build, black backwards baseball cap, black hoodie, seated at a desk with a black microphone accented by red lighting, dark studio background with magenta-blue rim light, clean social-media talking-head aesthetic, frequent cutaways to iPhone screen recordings and desktop UI captures, crisp contrast, sharp subtitles, direct-to-camera educational delivery, fast pacing, screen-demo workflow energy, voice remains the same confident male speaker throughout, close-mic sound with dry room tone and clear consonants. [00:00-00:03] Open with a high-speed hook collage: several glossy AI-generated coin or medallion-style motion-graphic examples appear at the top while bold thumbnail text promises viewers they can make this from their phone. Cut immediately into an iPhone screen showing a text field and app navigation, establishing a mobile-first tutorial workflow. [00:03-00:06] Continue with phone screen recordings of typing into ChatGPT or a GPT search interface. Show keyword searches for the right assistant or GPT tool while subtitle words land one by one. The host is not always visible, but his narration stays continuous, fast, and instructional, with cuts landing on emphasized phrases. [00:06-00:10] Alternate between the host’s face and mobile UI screens. The host looks directly at camera with a neutral but focused expression, speaking in a concise “here’s the exact process” tone. The phone screen shows menus, search results, and a selected motion-graphics-related GPT or helper. [00:10-00:14] Move into a message-composition phase on the phone. A long, detailed prompt is typed or pasted into a chat interface requesting motion-graphic image generation with clear visual constraints. Keep the UI legible and the pacing brisk, with punch-ins on key words like image, detailed, or copy. 
[00:14-00:18] Show the generated or referenced output and transition into desktop or browser captures featuring AI video or motion tools. Include interfaces associated with cinematic generation platforms like Higgsfield or Kling, with green-accent UI panels and creator-plan messaging visible. The host continues narrating over the demo, explaining what to do next. [00:18-00:23] Demonstrate the next workflow step inside editing or generation panels: toggling options, selecting presets, setting a background or text layer, and preparing a motion graphics sequence. Intercut brief returns to the host in the studio so the viewer stays anchored to a single teacher guiding the process. [00:23-00:28] Show more UI interactions that build the final result: adding text, adjusting layout, or exporting motion elements. The host remains seated in the same setup, speaking clearly into the desk microphone, with subtitles emphasizing functional words like background, text, yourself, and links. [00:28-00:32] End on the host full-screen in the studio, centered and speaking directly to camera with a strong CTA tone. He gestures minimally, stays upright behind the microphone, and closes by telling viewers where to get the links or workflow resources. The final beat should feel like a practical creator tutorial, not a cinematic montage. NEGATIVE PROMPT: broken smartphone UI, unreadable text, warped hands, inconsistent host identity, changing wardrobe, duplicate microphones, messy desk clutter, random overlays, flickering screen recordings, fake app interfaces, low-resolution subtitles, robotic lip sync, slurred narration, echoey room sound, harsh sibilance, clipping, jittery cuts, watermark, logo corruption. SPEECH PACK: - Hook: You can make motion graphics like this straight from your phone. - Beat 1: Start inside ChatGPT and find the right GPT or helper for motion-graphics prompts. 
- Beat 2: Ask it for a detailed image prompt first, then move that output into your video-generation workflow.
- Beat 3: Use tools like Kling or Higgsfield to animate the asset, then add your background and text treatment.
- CTA: I’ve got the links and setup in the caption, so save this and try it yourself.
GLOBAL LOCK: A 9:16 vertical social-first AI tutorial video explaining how brands or businesses can create AI videos. Alternate between a male creator speaking directly to camera and polished example visuals or screen-style workflow inserts. The speaker is a white male in his 20s or 30s with medium dark beard, shoulder-length hair, and a light baseball cap, framed chest-up against a neutral indoor background. His delivery is energetic, instructional, and creator-economy fluent. The visual examples should look like premium Midjourney-to-Kling style concept art: dreamy clouds, music-brand inspired posters, cinematic surreal scenes, and polished brand-ready compositions. Keep the edit fast, clear, and educational, with bold on-screen references to steps or examples. No heavy cinematic drama; this should feel like a viral creator tutorial optimized for Instagram. Audio is speech-first, with the speaker driving the narrative and visuals supporting each claim. [00:00-00:04] Open on a polished AI visual concept frame in a square-ish or embedded example format, such as a dreamy cloud scene with brand-style treatment, immediately establishing the quality level of the output being discussed. [00:04-00:08] Cut to the male creator speaking directly to camera, chest-up, hands visible making small emphasis gestures, clearly introducing how brands or businesses can create AI videos. [00:08-00:14] Alternate quickly between more AI-generated example images and the creator talking head, showing branded concepts, surreal poster-like compositions, and elevated campaign-style visuals while the speaker explains the workflow. [00:14-00:20] Show additional examples such as Nike-style or music-video-inspired visual mockups, then return to the speaker for practical commentary, keeping the rhythm instructional rather than purely aesthetic. 
[00:20-00:28] Introduce workflow-style inserts that resemble app or editing interfaces, indicating transformation from image generation to video generation, while the speaker continues the explanation. [00:28-00:40] Continue the pattern of speaker-to-example, making the process feel accessible: generate the image, upscale or refine it, animate it, and adapt it for brand content. [00:40-00:52] Close with more high-quality AI visuals and one final talking-head summary, reinforcing that this is a repeatable business or creator workflow rather than a one-off trend. NEGATIVE PROMPT: static corporate webinar, boring slide deck, low-detail AI art, cartoon mascot, ugly text clutter, broken face, stiff talking head, empty gestures, dark moody studio, podcast mic setup, random unrelated b-roll, unreadable UI, logos that dominate every frame, lip-sync mismatch, robotic speech cadence, badly generated hands, muddy branded visuals, overlong transitions, generic business office stock footage. SHOT PROMPT DELTAS: 1) Premium AI brand example visual in dreamy commercial style. 2) Direct-to-camera creator explanation with hand gestures. 3) Alternating branded concept art and tutorial talking head. 4) Workflow/UI inserts suggesting image-to-video pipeline. 5) Final summary with more polished example outputs. SPEECH PACK: [00:00-00:52] Speech present. One male speaker throughout. Delivery should be confident, quick, and educational, with short punchy sentences explaining how to create AI videos for brands or business use. No lip-sync perfection required if used only as narration over cuts; medium strictness if the mouth is visible.
GLOBAL LOCK: Subject is a Caucasian male, mid-20s, with short brown hair and a light beard, wearing a tan "VANS" trucker hat and a plain white t-shirt. He is positioned in the bottom third of the frame in a talking-head format. The top two-thirds of the frame is a digital workspace. The environment for the subject is a cozy room with warm, out-of-focus background lighting. The digital workspace is a clean, modern software UI with a white background. The video has a high-energy, fast-paced UGC tutorial style. Speech is enthusiastic, clear, and direct-to-camera. [00:00–00:03] The top 2/3 shows a rapid succession of Taylor Swift posters. First, a red and black vintage-style poster with "TAYLOR" in large block letters. Then, a collage-style poster with denim textures and "TAYLOR SWIFT" in a stylized font. The subject at the bottom is talking excitedly, gesturing with his hands. [00:04–00:06] The top 2/3 switches to Post Malone posters. One is a gritty, black-and-white screen-print with a red star over his eye and "POST" in red spray-paint font. The next is a profile shot with "F-1 Trillion" text in pink. The subject continues his energetic narration. [00:07–00:14] The top 2/3 shows a breakdown of a Leonardo DiCaprio poster. A portrait of DiCaprio appears on the left, a text prompt on the right. A progress bar fills, and a "Wolf of Wall Street" poster is revealed, featuring a screen-print texture and yellow/black color scheme. The subject points upwards toward the visuals. [00:15–00:25] The top 2/3 shows the "Lovart" website interface. A cursor clicks "New Project." The subject explains the tool. The cursor types "Create me a poster for Ed Sheeran" into a chat box. A model selection menu pops up, and "Nano Banana Pro" is selected. [00:26–00:37] The top 2/3 shows an Ed Sheeran poster being generated. It features him with a guitar against a sunset background. The subject demonstrates iterations: the text at the bottom changes to "NEW YEAR'S EVE" and "LAS VEGAS SPHERE." 
The style then shifts to a high-contrast green and black screen-print. [00:38–00:42] The entire frame transitions to a real-world scene. A man in a tan jumpsuit, seen from behind, is taping a large white poster onto a red brick wall. The poster features a black circular logo and the text "COMMENT AI." The subject appears in a small bubble at the bottom, saying "type AI in the comments." NEGATIVE PROMPT: Visual: blurry face, distorted hands, flickering UI elements, inconsistent hat logo, low resolution, messy background, unnatural eye movements. Speech: robotic tone, monotone delivery, background noise, muffled audio, lip-sync mismatch, stuttering, long pauses. SPEECH PACK: [00:00–00:06] TAKE_A: "Google Nano Banana Pro is mind-blowing when it comes to creating graphic design work. You can take any character and create any poster design." TAKE_B: "Nano Banana Pro is a total game-changer for design. Take any celeb, any style, and boom—instant professional posters." TAKE_C: "This new AI model is insane for graphics. One reference photo is all you need to make these incredible celebrity posters." [00:07–00:14] TAKE_A: "With one reference image of their face and a basic prompt. So I'm going to show you exactly how you can get the best results." TAKE_B: "Just one photo and a simple sentence. I'll show you the secret to getting these high-end results every single time." TAKE_C: "Reference photo plus a basic prompt equals this. Let me walk you through the process for the best output." [00:15–00:25] TAKE_A: "To get started you want to go to Lovart, which is a dedicated AI design tool. You can now write in a basic prompt, then select Google Nano Banana Pro." TAKE_B: "Head over to Lovart—it's built for designers. Type your idea, pick the Nano Banana Pro model, and you're ready." TAKE_C: "Step one: open Lovart. It’s an AI design powerhouse. Enter your prompt, choose the Google model, and watch the magic." 
[00:26–00:42] TAKE_A: "Once you hit generate, it will use its own prompt enhancer. Now you can iterate, change text or backgrounds. Type AI in the comments for the link!" TAKE_B: "Hit generate and let the AI enhance your prompt. Tweak the text, swap the background, it's that easy. Comment AI for access!" TAKE_C: "Generate, iterate, and perfect. Change anything you want in seconds. If you want to try this, just type AI below!"
A) MISE EN PLACE 2) Segment the video into scenes/shots: - [00:00-00:02]: Shot 1, Medium close-up, man singing. - [00:02-00:04]: Shot 2, Wide shot, woman floating. - [00:04-00:06]: Shot 3, Extreme close-up, mouth and mic. - [00:06-00:09]: Shot 4, Medium shot, B&W, three clones. - [00:09-00:10]: Shot 5, Close-up, woman in hat. - [00:10-00:12]: Shot 6, Medium shot, man singing. - [00:12-00:14]: Shot 7, Wide shot, woman walking in field. - [00:14-00:16]: Shot 8, Medium close-up, man singing. - [00:16-00:17]: Shot 9, Medium shot, man driving. - [00:17-00:18]: Shot 10, Medium wide, drummer on roof. - [00:18-00:20]: Shot 11, Medium shot, man driving. - [00:20-00:21]: Shot 12, Medium close-up, man singing. - [00:21-00:22]: Shot 13, Medium shot, woman in field. - [00:22-00:25]: Shot 14, Medium shot, B&W, three clones. - [00:25-00:27]: Shot 15, Medium close-up, man smoking. 3) Extract visual evidence: - Keyframes: Blonde man singing (00:01), Woman floating (00:03), Mouth close-up (00:05), B&W clones (00:07), Man driving (00:16), Drummer (00:17), Man smoking (00:26). 4) Extract speech evidence: - The audio is a continuous pop-rock song with male vocals. - Transcript: "Just to show that it'll be fine / And when I'm back in Chicago I feel it / Another version of me I was in it / I wake up back to the end / I feel it" - Lip visibility: High in singing shots. Strict lip-sync required. 5) Invariants list: - Visuals: Protagonist (Caucasian male, early 30s, short blonde hair, black shirt), cinematic lighting, 24fps motion blur, anamorphic lens feel. - Speech: Continuous song, male vocal, energetic delivery. 6) Variables list: - Visuals: Locations (city rooftop, field, car), secondary characters (woman, drummer), color grade (warm sunset vs cool night vs B&W). B) SHOTLIST [00:00–00:02] - framing: MCU, eye level. - lens: 50mm, shallow depth of field. - camera movement: Slow push-in. - subject: Blonde male, singing passionately into vintage mic. 
- environment: Outdoor, blurred city skyline. - lighting: Warm golden hour, directional from right. - color grade: Teal and orange, high contrast. - SPEECH: Male vocal, singing "Just to show that it'll be fine". Strict lip-sync. [00:02–00:04] - framing: WS. - lens: 35mm. - camera movement: Slow horizontal tracking. - subject: Woman in white dress, floating horizontally. - environment: Grassy field at dusk. - lighting: Soft sunset. - SPEECH: Song continues, no on-camera lip-sync. [00:04–00:06] - framing: ECU. - lens: Macro. - camera movement: Static. - subject: Blonde male's mouth and vintage mic. - lighting: Warm, high contrast. - SPEECH: Male vocal, singing "And when I'm back in Chicago". Strict lip-sync. [00:06–00:09] - framing: MS. - lens: 50mm. - camera movement: Static. - subject: Three identical clones of blonde male, singing into one mic. - environment: Studio backdrop. - lighting: High contrast, retro. - color grade: Black and white. - SPEECH: Male vocal, singing "I feel it / Another version of me". Strict lip-sync for all three. [00:16–00:17] - framing: MS. - lens: 35mm. - camera movement: Mounted on hood, slight vibration. - subject: Blonde male driving classic convertible. - environment: City street at night. - lighting: Cool streetlights, warm dashboard practicals. - SPEECH: Song continues, no on-camera lip-sync. [00:25–00:27] - framing: MCU. - lens: 50mm. - camera movement: Slow pan right. - subject: Blonde male smoking cigarette, exhaling. - environment: City rooftop at dusk. - lighting: Cool cinematic. - SPEECH: Song ends, instrumental fade. C) STYLE BIBLE - visual_style: Cinematic music video. - camera_signature: Anamorphic lenses, smooth tracking, shallow depth of field. - lighting_signature: High contrast, motivated sources (sunset, streetlights). - grade_signature: Teal and orange for city, warm golden for fields, stark B&W for studio shots. - texture_signature: Film grain, 24fps motion blur. 
- SPEECH STYLE BIBLE: Energetic pop-rock male vocal, clear articulation, studio-quality mix. D) PROMPT SYNTHESIS 1. MASTER PROMPT GLOBAL LOCK: A cinematic music video featuring a consistent protagonist: a Caucasian male in his early 30s, short styled blonde hair, wearing a black collared shirt. The visual style is photorealistic, shot on anamorphic lenses with a 24fps filmic motion blur. The camera work is dynamic, with smooth tracking. The audio is a pop-rock song with clear male vocals. [00:00–00:02] Medium close-up. The blonde male protagonist stands outdoors against a blurred city skyline at sunset. He is singing passionately into a vintage silver condenser microphone. Warm, golden-hour lighting hits his face from the right. The camera slowly pushes in. Strict lip-sync to the lyrics "Just to show that it'll be fine". [00:02–00:04] Wide shot. A young woman with long brown hair, wearing a flowing white dress, floats horizontally above a grassy field at dusk. The lighting is soft and ethereal. The camera tracks her movement slowly. [00:04–00:06] Extreme close-up. Profile shot of the blonde male protagonist's mouth and the vintage microphone. He is singing, lips perfectly synced to the lyrics "And when I'm back in Chicago". The background is completely out of focus. [00:06–00:09] Medium shot, black and white. Three identical clones of the blonde male protagonist stand close together, all singing into a single vintage microphone in the center. The lighting is high-contrast, reminiscent of classic 1960s music videos. The camera is static. Strict lip-sync to "I feel it / Another version of me". [00:09–00:10] Close-up. A young woman with freckles, wearing a straw hat, looks softly off-camera. Warm sunlight illuminates her face. The background is a blurred field. [00:10–00:12] Medium shot. The blonde male protagonist singing passionately into the vintage microphone, city skyline in the background. Warm sunset lighting. The camera slightly pans left. Strict lip-sync. 
[00:12–00:14] Wide shot. A woman with long brown hair, wearing a white dress and a wide-brimmed hat, walks away from the camera through a field of tall grass and flowers at sunset. The camera follows her slowly. [00:14–00:16] Medium close-up. The blonde male protagonist singing intensely into the vintage microphone, city skyline background. The camera pushes in quickly. Strict lip-sync. [00:16–00:17] Medium shot. The blonde male protagonist is driving a classic convertible car at night. The city lights blur in the background. He is looking forward, illuminated by dashboard lights and passing streetlights. The camera is mounted on the hood, facing him. [00:17–00:18] Medium wide shot. A different man, with dark hair and a beard, is energetically playing a drum set on a city rooftop at dusk. The camera pans around him. [00:18–00:20] Medium shot. The blonde male protagonist driving the convertible at night. He turns his head slightly to look towards the camera. City lights streak by. [00:20–00:21] Medium close-up. The blonde male protagonist singing into the vintage microphone, city skyline background. Strict lip-sync. [00:21–00:22] Medium shot. The woman in the white dress and straw hat stands in a field of flowers at sunset, smiling gently at the camera. [00:22–00:25] Medium shot, black and white. The three clones of the blonde male protagonist singing into the vintage microphone. The camera slowly pushes in. Strict lip-sync. [00:25–00:27] Medium close-up. The blonde male protagonist stands on a rooftop with a city skyline behind him at dusk. He is smoking a cigarette, exhaling a cloud of smoke. The lighting is cool and cinematic. The camera slowly pans right. 2. NEGATIVE PROMPT visual artifacts, anatomy issues, extra fingers, weird motion, text, logos, watermarks, flicker, temporal jitter, morphing faces, inconsistent clothing, robotic movement, unnatural lighting, overexposed highlights, cartoonish style, anime, 3d render. 
Speech negatives: robotic cadence, unnatural emphasis, slurred words, harsh sibilance, plosives, clipping, lip-sync mismatch, out of sync audio.
4. SPEECH PACK
[00:00-00:02] "Just to show that it'll be fine"
[00:04-00:06] "And when I'm back in Chicago"
[00:06-00:09] "I feel it / Another version of me I was in it"
[00:10-00:12] "I wake up back to the end"
[00:14-00:16] "I feel it"
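Timecoded prompts like the one above are easier to audit when the bracketed ranges are parsed into seconds, so that gaps between segments become visible. The following is a minimal sketch, assuming the `[MM:SS-MM:SS]` notation used on this page (with either a hyphen or an en dash); the helper names `parse_segments` and `find_gaps` are hypothetical.

```python
import re

# Matches "[MM:SS-MM:SS]" with either a hyphen or an en dash between the times.
TIMECODE = re.compile(r"\[(\d{2}):(\d{2})\s*[-–]\s*(\d{2}):(\d{2})\]")


def parse_segments(text: str) -> list[tuple[int, int]]:
    """Extract (start, end) pairs in seconds, in order of appearance."""
    segments = []
    for match in TIMECODE.finditer(text):
        m1, s1, m2, s2 = (int(group) for group in match.groups())
        segments.append((m1 * 60 + s1, m2 * 60 + s2))
    return segments


def find_gaps(segments: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Return (previous_end, next_start) wherever consecutive segments do not meet."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(segments, segments[1:]):
        if next_start != prev_end:
            gaps.append((prev_end, next_start))
    return gaps


speech_pack = (
    '[00:00-00:02] "Just to show that it\'ll be fine" '
    '[00:04-00:06] "And when I\'m back in Chicago" '
    '[00:06-00:09] "I feel it / Another version of me I was in it"'
)
segments = parse_segments(speech_pack)
gaps = find_gaps(segments)
print(segments)
print(gaps)
```

In a speech pack, a gap is not necessarily an error: the 2 to 4 second gap here lines up with the wide shot of the floating woman, which has no on-camera lip-sync. In a master prompt, where shots are meant to be contiguous, the same check would flag a missing segment.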
Core format and topic lock: a vertical creator tutorial showing how to create an AI VFX shot using Kling O1 inside Higgsfield, combined with Adobe After Effects and Adobe Illustrator. The main source material is a green-screen clip of the presenter walking toward camera in a white t-shirt and dark pants. The workflow then combines that green-screen footage with generated environment imagery and a bold black-and-white geometric Illustrator graphic that becomes part of the compositing transition or reveal. A male presenter in a rounded talking-head box explains each stage.
Shot-by-shot reconstruction:
0.0s-14.0s Open on the raw green-screen performance clip of the presenter facing and walking toward camera. The lower talking-head frame introduces the idea of turning this simple source footage into a polished AI VFX shot.
14.0s-28.0s Show the workflow combination visually: the Kling green-screen video on one side, a generated environment image on the other, and a Kling O1 Edit label or module in between. This section should make clear that AI editing is being layered onto standard source footage.
28.0s-48.0s Switch to an Illustrator-style canvas displaying a strong black-and-white radial or angular geometric graphic. The presenter explains that this designed element becomes part of the final visual transition or reveal, adding professional polish beyond the AI output alone.
48.0s-67.3s Show the composited result, where the green-screen subject is integrated into a stylized environment with shape-based wipes or angular reveal elements. End on the final VFX shot and a CTA inviting viewers to comment “AI” for the workflow link.
Visual style: Vertical AI VFX tutorial, clean software-demo presentation, green-screen source clip, dark interface backgrounds, geometric design overlays, creator talking-head guidance, no cinematic scene changes beyond workflow steps.
Motion notes: Motion should come from transitions between source clip, workflow cards, graphic design canvas, and final composited result. Preserve the same subject identity and green-screen clip so the audience can follow the full before-to-after pipeline.
Negative prompt: messy interface, unreadable labels, unrelated effects, extra presenters, watermark, subtitles unrelated to tutorial, random footage swaps, non-geometric graphics, broken green-screen edges, non-AI workflow sections, shaky handheld filming.
Speech pack: English creator narration explaining how Kling O1 Edit in Higgsfield works with green-screen footage, generated environment images, Illustrator graphics, and After Effects compositing to produce a polished VFX shot.
GLOBAL LOCK: Subject: A vintage green Land Rover Defender (classic 110 model) and a modern bright orange Lamborghini Huracan. Environment: A winding asphalt mountain road, lush green pine trees, misty background with rolling hills, overcast but bright daylight. Consistency: Maintain the specific metallic green paint of the Land Rover and the high-gloss pearl orange of the Lamborghini. Camera Language: Cinematic, high-speed movement, aggressive zooms, and speed ramping. Lighting: Natural daylight with soft shadows, high-contrast metallic reflections on car bodies. Color Grade: Earthy, desaturated greens for the Land Rover segments; high-saturation, vibrant tones for the Lamborghini segments. Speech/Audio: Male narrator, energetic and instructional tone, clear articulation, mid-range pitch, recorded in a dry room with slight compression. [00:00–00:02] Visual: A wide shot of a green Land Rover Defender driving away from the camera on a mountain road. The camera rapidly zooms in toward the rear spare tire. Action: The car is in motion; trees are blurred in the background. Camera: Fast dolly-in/zoom. Lighting: Soft daylight, misty atmosphere. Speech: "Here is exactly how you can create this epic car transition..." (Narrator on-camera in PIP). [00:02–00:04] Visual: Extreme close-up of the Land Rover front headlight and grille. The camera "whips" to the side. Action: Fast panning motion. Camera: Macro lens feel, shallow depth of field. Lighting: Glint on the glass of the headlight. [00:04–00:06] Visual: A "glitch" transition. The Land Rover engine bay (mechanical, dark, oily textures) morphs through a prismatic chromatic aberration effect into the green "Land Rover" oval badge. Action: Morphing transition. Motion: High-speed jitter and color fringing. [00:06–00:09] Visual: The green Land Rover badge morphs into the rear of the car. The camera pulls back to reveal the full vehicle. The creator's PIP bubble is visible in the bottom center. Speech: "...using AI. 
This is such a simple effect to do, so strap in." [00:09–00:26] Visual: Screen recording of the Leonardo.ai interface. A cursor navigates to "Image Generation," types a prompt for "a bright orange lamborghini," and adjusts aspect ratio to 9:16. Action: UI interaction, clicking buttons, scrolling through generated car images. Environment: Dark mode software UI. Speech: "To get started, you want to go to Leonardo AI. Select image on the left-hand side, write in a basic prompt... adjust the aspect ratio..." [00:26–00:31] Visual: Screen recording of the Kling 2.1 Video model. The user selects "Start Frame" (the wide Lamborghini) and "End Frame" (the close-up). Action: Dragging and dropping images into the video generator slots. Speech: "Go to video and select Kling 2.1... you can use the start and end frame." [00:31–00:38] Visual: Screen recording of Adobe After Effects. The "Graph Editor" shows a steep curve for speed ramping. The cursor selects "CC Force Motion Blur." Action: Adjusting keyframes on a timeline, searching for effects in the panel. Speech: "Bring that into a tool like Adobe After Effects with some simple speed ramping and some motion blur..." [00:38–00:42] Visual: Final montage of the orange Lamborghini driving fast on the mountain road, with aggressive speed ramps and motion blur. Large yellow text "AI" appears over the car. Action: High-speed driving, camera shaking, cinematic finish. Speech: "...to get an effect like this. If you want access to the tool, type AI in the comments." NEGATIVE PROMPT: Visual: Low resolution, blurry car logos, inconsistent car models (e.g., Land Rover turning into a Jeep), flickering lighting, distorted wheels, floating objects, messy UI overlays, watermark, grainy textures. Speech: Robotic voice, background noise, echo, stuttering, muffled audio, inconsistent volume levels, lip-sync delay in PIP. SPEECH PACK: [00:00-00:08] TAKE_A: "Here is exactly how you can create this epic car transition using AI. 
This is such a simple effect to do, so strap in." (Energetic, fast-paced) TAKE_B: "Want to make car transitions like this? I'll show you how to use AI to do it. It's easier than you think, let's go." (Casual, inviting) TAKE_C: "This AI car transition is going viral. Here is the step-by-step breakdown using Leonardo and Kling." (Authoritative, hook-focused) [00:09-00:25] TAKE_A: "To get started, go to Leonardo AI. Select image, type your prompt for the car you want, and set your aspect ratio to custom 9 by 16." TAKE_B: "First step, head over to Leonardo. We're generating our base car images here. Make sure you get a wide shot and a close-up." [00:26-00:42] TAKE_A: "Now use Kling 2.1. Put your wide shot as the start and the close-up as the end. Finish it in After Effects with speed ramping and motion blur." TAKE_B: "The secret is Kling 2.1's start and end frame feature. Then just add some CC Force Motion Blur in AE for that pro look." Prosody Markup: "epic car transition... (pause) ...using AI." "strap in. (emphasis)" "Leonardo AI (clear enunciation)" "Kling 2.1 (punchy)" "AI (loud, call to action)"
GLOBAL LOCK: The video features a split-screen layout. The bottom 30% contains a consistent male creator: Caucasian, mid-30s, brown beard, wearing a tan "Vans" trucker hat and a black quilted vest over a white t-shirt. He is in a home office/studio setting with soft indoor lighting. The top 70% features AI-generated cinematic footage. The AI footage must maintain high subject consistency, specifically a character resembling Leonardo DiCaprio in "The Wolf of Wall Street" (short brown hair, blue pinstripe suit, red polka dot tie). The environment is a luxury office with wood paneling. Lighting is cinematic, warm, and professional. [00:00–00:03] Subject: A man resembling Leonardo DiCaprio in a blue pinstripe suit and red polka dot tie. Action: He holds a crisp one-dollar bill horizontally with both hands, looking directly into the camera with a slight, confident smile. Camera: Medium close-up, static. Lighting: Warm, high-key office lighting, soft shadows. Speech: Creator says "It has never been easier to create multiple camera angles..." Sync: Creator's lips visible in the bottom frame, high sync. [00:03–00:07] Visual: A 3x3 grid appears showing the same man from 9 different angles (overhead, profile, low angle, etc.). Then transitions to a Nike windbreaker jacket (black, red, white) floating in a surreal dark environment filled with glowing blue and purple crystals. Action: The jacket rotates slowly. Camera: Close-up on the jacket texture and Nike logo. Lighting: Dramatic, neon-blue and purple rim lighting. Speech: "...with consistency from a single reference image." [00:08–00:13] Subject: Three characters: a man (DiCaprio-lookalike), a blonde woman (Margot Robbie-lookalike in a black dress), and a muscular man with a goatee (Jon Bernthal-lookalike, shirtless with a gold chain). Action: They stand together in a modern room with wooden doors and bookshelves. They look toward the camera. Camera: Medium wide shot, slight handheld jitter for realism. 
Lighting: Naturalistic indoor light from the side. Speech: "So in today's video, I'm going to show you the best method..." [00:14–00:20] Visual: Screen recording of the Higgsfield "Shots" app interface. A cursor selects an image of a woman in a black dress and clicks a yellow "Generate" button. Action: The UI transitions to show a grid of 9 generated black-and-white images of the woman. Camera: Screen capture. Speech: "Let's dive in. To get started, you can upload your image into Shots..." [00:21–00:28] Subject: A beautiful woman with dark hair in a flowing black dress. Action: A montage of artistic shots: her looking at the camera, her back to the camera with hair blowing, her dancing with fabric flowing around her. Camera: Various angles (CU, MCU, Profile), slow motion. Lighting: High-contrast black and white, dramatic shadows, bright white background. Text Overlay: "Comment AI" in bold white letters. Speech: "So if you want to try this out for yourself, type AI in the comments and I'll send you a link." NEGATIVE PROMPT: Visual: Distorted faces, extra fingers, flickering background, blurry textures, inconsistent clothing colors, morphing objects, robotic movement, low resolution, watermark. Speech: Robotic tone, muffled audio, background noise, lip-sync delay, stuttering, unnatural pauses. SPEECH PACK: [00:00-00:07] Transcript: "It has never been easier to create multiple camera angles with consistency from a single reference image." TAKE_A: (Enthusiastic, fast-paced) "It's NEVER been easier to create multiple camera angles... with total consistency... from just ONE image." TAKE_B: (Educational, steady) "It has never been easier to create multiple camera angles with consistency... starting from a single reference image." [00:21-00:28] Transcript: "So if you want to try this out for yourself, type AI in the comments and I'll send you a link." TAKE_A: (Direct, CTA-focused) "Want to try this? Type AI in the comments and I'll DM you the link right now." 
TAKE_B: (Friendly, helpful) "If you want to try this out for yourself, just comment AI below and I'll send that link over."
A vertical talking-head tutorial reel hosted by a young white male creator seated against a solid warm orange studio backdrop. Large kinetic captions introduce a test of multiple AI image and video tools for generating professional-looking avatars. The edit alternates between direct-to-camera explanation, moody retro-tech B-roll of the host at a vintage CRT computer in a dim teal-and-amber room, stylized example portraits arranged in tiled grids, and cinematic concept scenes featuring human characters, analog screens, and fashion-editorial lighting. One standout shot shows a television-headed figure standing beside a woman in a patterned dress, labeled “Midjourney.” Other segments show portrait matrices and tool comparisons, with the overall visual language leaning cinematic, grainy, nostalgic, and premium rather than clean SaaS tutorial aesthetics.
GOAL Maximize visual + motion + speech (audio) similarity to the reference video, prioritizing: 1) Subject consistency (identity / wardrobe / props) for the creator in the bottom half. 2) Environment consistency (location, set dressing) for the creator's room. 3) Split-screen layout consistency (Top half: UI/AI Video, Bottom half: Creator). 4) Visual fidelity of the AI-generated casino sequence (lighting, morphing transitions, specific objects). 5) UI accuracy for the screen recording segments. 6) Speech signature (script/semantics, cadence, energetic tone). WORKFLOW A) MISE EN PLACE - Invariants: Split-screen vertical format. Bottom half: Caucasian male, 30s, beard, beige t-shirt, blue baseball cap with yellow logo, sitting in a bright room with a window behind him. Top half: Alternates between high-end cinematic casino AI video and dark-mode web UI screen recordings. - Variables: The specific content shown in the top half (casino sequence vs. UI), the creator's hand gestures (pointing up, gesturing to explain). B) SHOTLIST & D) PROMPT SYNTHESIS MASTER PROMPT: GLOBAL LOCK: Vertical 9:16 video format featuring a continuous split-screen layout. The bottom half consistently shows a Caucasian man in his 30s with a beard, wearing a beige t-shirt and a blue baseball cap with a yellow logo. He is sitting indoors in a brightly lit room with a window visible in the background. He speaks directly to the camera with an energetic, educational tone, frequently using his hands to gesture upwards towards the top half of the screen. The top half of the screen displays varying content, alternating between a highly polished, cinematic AI-generated video sequence and screen recordings of a dark-mode web interface. The lighting on the man is natural and soft, while the AI sequence features warm, golden, high-contrast cinematic lighting. [00:00–00:05] Top half: A cinematic, warm-lit shot of a man in a white tuxedo at a casino poker table, pushing chips forward. 
The camera pushes in rapidly, and the scene seamlessly morphs into an extreme close-up of a hand holding a yellow and black casino chip. Text overlays "Start Frame" and "End Frame" appear briefly, along with a graphic of a video editing timeline. Bottom half: The man points both index fingers upwards at the top screen, looking impressed. Text overlay: "This is AI? 🤯". [00:05–00:15] Top half: The continuous morphing sequence continues. The yellow casino chip flips into the air, spinning, and seamlessly transforms into a spinning wooden roulette wheel. A white ball drops onto the wheel, bounces, and falls down a dark, golden-lit chute. As it falls, it morphs into two golden dice that land on a green craps table. The dice have the words "Artlist" and "AI Toolkit" engraved on them. In the background, out-of-focus people in formal wear are cheering. Bottom half: The man watches the sequence, resting his hand on his chin, then points downwards, then makes an "okay" hand gesture, reacting to the visuals above. [00:15–00:35] Top half: The view switches to a screen recording of a dark-mode web interface for "Artlist AI Toolkit". The cursor clicks on "Generate image". The screen scrolls through various AI image models, specifically highlighting "Artlist Original 1.0" and "Nano Banana Pro". Thumbnails of the generated casino images (tuxedo man, chip, roulette) are shown. Bottom half: The man speaks directly to the camera, explaining the process, using his hands to emphasize points. [00:35–00:44] Top half: The screen recording switches to a video generation interface, highlighting a model called "Kling 2.5 Turbo Pro". The UI demonstrates a workflow using a "Start Frame" and an "End Frame" input, showing the previously generated casino images being placed into these slots to create the transition. Bottom half: The man continues his explanation, gesturing with his hands to indicate the connection between the two frames. 
[00:44–00:51] Top half: A static graphic appears showing a dark-themed PDF document titled "GenHQ Workflow Overview" containing thumbnails of the casino images and blocks of text (prompts). A large, bold text overlay appears: "Comment 'AI'". Bottom half: The man points emphatically at the "Comment 'AI'" text, urging the viewer to take action. NEGATIVE PROMPT: Visuals: Loss of split-screen format, creator's appearance changing (different hat, different shirt, losing beard), inconsistent lighting on the creator, jarring cuts in the top-half AI sequence, loss of momentum during the morphing transitions, misspelled text on the golden dice (must say "Artlist" and "AI Toolkit"), blurry or illegible UI elements in the screen recording, unnatural hand anatomy during gestures. Audio: Robotic voice, low energy, muffled audio, lack of sync between creator's lips and speech. SPEECH PACK: [00:00-00:15] (No speech, upbeat electronic background music playing) [00:15-00:35] SPEAKER A (Creator): "Here's how you can make this using Artlist's new AI Toolkit. On Artlist, go to the top of the page and select 'Generate image'. From here, you can go and select all of the best image models, including Artlist's new image model called 1.0. Then, when you pair that with Google Nano Banana Pro, you can insert yourself into the scenes." [00:35-00:44] SPEAKER A (Creator): "And we'll give you this PDF document at the end of the video with all of our prompts and images. Then you can use the video section of the toolkit, where you can use the best video models, and use Kling 2.5 first and last frame to create these epic transition shots." [00:44-00:51] SPEAKER A (Creator): "If you want access to all of our prompts and images, type 'AI' in the comments and I'll send you the links. Artlist has a lot of new features coming soon."
GLOBAL LOCK: one male creator in a neon-lit studio, white hoodie, white baseball cap with front patch, short beard, seated on a stool or chair for the opening, later standing in the same room, plus one dense tabletop miniature world with many small figurines and terrain pieces, plus one glossy blue-black robotic or alien figure for the late close-up section, no Gemini, no extra lead characters, no location change outside the established studio and desk world. Create a 31-second vertical creator-style reveal video that begins as a direct-to-camera tease and then pays off with a detailed miniature tabletop world. The room is a moody neon studio with blue and purple accent lights, screens or light columns in the background, and a creator seated close to camera. The tone is “wait for the drop,” then wonder, then productized reveal, then one final high-detail creature close-up. 0.0-4.0s: medium shot of the creator seated and facing camera. He lifts one hand as if counting in or cueing the audience. Add an on-screen subtitle feel without actually rendering text. Expression should read confident and teasing, like he is about to show something better. 4.0-8.0s: he points toward the desk or toward the viewer, still seated. Keep the studio lights glowing blue and magenta behind him. The pace is calm and deliberate, like a setup before a beat drop. 8.0-12.0s: cut hard to the tabletop reveal. Show a wide miniature fantasy or sci-fi battle board packed with small figures, props, terrain, platforms, and scattered pieces across a large table. Camera glides lightly over the setup so scale and density read clearly. 12.0-18.0s: continue exploring the tabletop from different angles. Show the room around it: shelves, hanging pieces, wall-mounted objects, and the overall “desk world” feeling. The miniatures should feel handcrafted, collectible, and visually dense. 
18.0-22.0s: cut back to the creator standing in the neon room holding or presenting something toward the lens, like he is bridging the real room and the world he just showed. His body language stays casual and presenter-like. 22.0-31.0s: final macro reveal of a glossy blue-black robotic or alien figure with reflective surfaces, cables, or tendril-like forms. The figure fills the frame as a premium detail payoff. The creator is soft or partly visible behind it, reaching toward camera. End on the creature’s high-detail front view. CAMERA: vertical social-video framing, creator close-up, tabletop wide and mid shots, final creature macro push, no chaotic handheld shake. LIGHTING: neon studio lighting for the creator shots, brighter room light for the tabletop reveal, glossy reflective highlights for the final creature macro. COLOR / GRADE: cool blue-purple studio grade at the start and end, warmer neutral room tone in the tabletop section, high-contrast reflections on the final figure. MOTION: pointing, presenting, controlled reveal pacing, slight camera drift over miniatures, final macro hold on the robotic/alien figure. SPEECH PACK: no required audible dialogue, but the creator should clearly read as saying “wait for the drop” or hyping the reveal through gesture and expression. NEGATIVE PROMPT: empty desk, generic gaming setup, low-detail toys, blurry tabletop, no miniatures, random city exterior, text overlays, subtitles, watermark, broken hands, malformed creature, low-resolution product shot, horror gore.
GLOBAL LOCK: A vertical 9:16 branded artist-showcase reel for the Leonardo Imagination Fund featuring judge Shelby and a range of experimental creator visuals. The video should feel like a polished contemporary art campaign rather than a direct tutorial: tactile, emotional, and visually dense. Across the reel, maintain a premium brand-film language with shallow depth of field, macro facial details, reflective surfaces, soft but intentional lighting, and a sequence that alternates between abstract AI imagery, real artist portraits, and clean interview moments. Shelby appears as a blonde white woman in her 30s to early 40s with light skin, straight blonde hair in either long or short bob styling depending on the shot, minimal makeup, and modern understated wardrobe in pale tones or white. Supporting artists appear as a diverse set of adults shown through close-up portraiture, glasses reflections, projected textures, studio details, and poetic environmental inserts. The visual system should tie everything together with elevated color, soft contrast, and brand-safe cinematic polish. [00:00-00:14] Open with a rush of extreme close-ups: a pale blue-green eye filling the frame, partial faces, skin texture, lashes, lips, and projected color washing across a woman's face in dim light. Use macro or near-macro framing with slow drifting camera movement and intentional softness at the edges. The mood should feel intimate, observational, and creatively charged, like entering an artist's internal world. [00:14-00:31] Expand into a montage of multiple creators and imagined worlds. Show glasses catching screen reflections, sharply framed eyes, a suburban house rendered with dreamy painterly distortion, a silhouetted figure standing before magenta clouds, and a young man outdoors with a small subtitle fragment visible. The pacing should stay lyrical rather than instructional, with each image working like a poetic evidence point for experimentation and imagination. 
[00:31-00:47] Move toward Shelby more directly. Show her in a soft-lit studio interview setup wearing a sheer pale top over a light garment, then intercut with close-ups of hands touching hanging organic materials, artist-process textures, and clean portrait angles that suggest judging, reflection, and craft. Preserve a calm, thoughtful performance style and keep the camera stable, elegant, and intimate. [00:47-00:56] Shift to brighter portrait moments of Shelby in a minimal white shirt holding or standing beside artwork in a clean interior. The reel should now feel more grounded and declarative, moving from abstract inspiration toward visible artistic output and human presence. Use balanced natural light, gentle contrast, and simple composition. [00:56-01:00] End with bold text on black reading “YOURS TO CREATE,” followed by Leonardo.Ai by Canva branding surrounded by a grid of colorful creative outputs. The final beat should feel aspirational and campaign-ready, giving the impression that the platform exists to unlock a wide range of artistic voices. NEGATIVE PROMPT: generic talking-head tutorial, cheap influencer look, low-detail skin, inconsistent Shelby identity, cluttered random backgrounds, hard flash lighting, harsh corporate explainer graphics, comedy timing, shaky phone footage, heavy glitch artifacts, neon sci-fi overload without human portraiture, low-end stock footage feel. SHOT PROMPTS: macro human eye close-up; projected light portrait; artist glasses screen reflection; dreamy suburban AI scene; silhouette against magenta clouds; outdoor reflective portrait; Shelby studio interview; hands on hanging botanical materials; clean artwork presentation; black title card; Leonardo.Ai campaign end card. SPEECH PACK: Sparse campaign-style voiceover or interview fragments only. Tone is reflective, artist-centered, and quietly enthusiastic. 
Any spoken language should feel measured and sincere, with clean studio capture, minimal room echo, and edit points aligned to visual transitions rather than punchline delivery.
GLOBAL LOCK: - Subject: White male, mid-30s, curly dark brown hair, well-groomed beard. - Wardrobe: Green "Vans" trucker cap, plain white crew-neck t-shirt. - Environment: Surreal desert landscape with white sand dunes and jagged rock formations. - Lighting: Warm, high-contrast sunlight with deep shadows; occasional high-contrast black and white. - Color Grade: Warm desert tones (orange/white) vs. high-contrast monochrome. - Camera: Cinematic 4K, shallow depth of field, rhythmic fast cuts, split-screen triptych layouts. - Speech: Direct-to-camera address, energetic and professional tone, clear articulation. [00:00–00:02] Visual: A wide shot of a surreal desert with white sand dunes. A massive, hyper-realistic full moon hangs low in the sky. The frame is split into three horizontal sections. The top and bottom show the desert; the middle shows the subject (male, Vans cap) looking down and then up at the camera. Speech: "Let's talk about the future of world building with AI." Sync: Cut on "AI". [00:02–00:05] Visual: The subject is now in the center of a triptych. The top frame shows the desert moon. The bottom frame shows a close-up of swirling white sand. The subject smiles and gestures with his hands. Speech: "We are in a position right now where you can create any world that you like..." [00:05–00:08] Visual: The subject is lying flat on his back in the white sand. A translucent, flowing white fabric is draped over him, billowing in a gentle breeze. Sunlight filters through the fabric, creating dappled shadows on his face. Speech: "...in any style." [00:08–00:15] Visual: Transition to high-contrast black and white. The subject is a silhouette in profile, looking upwards. Behind him is a massive, glowing white circular light (like a halo or a second moon). A hand reaches out toward the light in the bottom frame of a split screen. Speech: "The question still remains: What AI image model should I use to create photo-realism to high standards?" 
[00:15–00:20] Visual: A rapid montage of diverse AI generations. 1) Giant stone monoliths in a desert at sunset. 2) A cosmic, glowing humanoid figure made of stars. 3) A fashion model with red hair in a structured blue vinyl dress. Text overlay: "SEEDREAM 4.5". Speech: [Music swells, rhythmic beat] [00:20–00:27] Visual: Split screen. Top: A female model behind a frosted glass pane, wearing a green blazer. Bottom: The subject in a small inset bubble, talking and gesturing. The blazer on the model changes styles and patterns (stripes, colors) rapidly. Speech: "This AI image model is not only photo-realistic, but you can edit images as well in 4K resolution." [00:27–00:32] Visual: Screen recording of the Artlist UI. A cursor selects "Seedream 4.5" from a list of models (Kling, Sora, Veo). Then, a text prompt "dynamic-FOV drone shot" is typed into a search bar. Speech: "You can access Seedream 4.5 on Artlist, along with all of the best AI image models." [00:32–00:36] Visual: A cinematic shot of an elderly male pilot with a mustache, wearing vintage goggles and a leather flight cap, flying through the clouds. Text overlay: "AI" in quotes. Final shot: Artlist.io logo on a black background with a yellow "Start Now" button. Speech: "So if you want to try it out, type AI in the comments and I'll send you a link." NEGATIVE PROMPT: Visual: Blurry textures, distorted facial features, inconsistent hat logos, flickering lighting, low resolution, messy hair silhouettes, unnatural fabric physics. Speech: Robotic monotone, muffled audio, background hiss, out-of-sync lip movements, harsh "S" sounds, inconsistent volume levels. SPEECH PACK: [00:00–00:05] TAKE_A: "Let's talk about the future of world building with AI. We are in a position right now..." (Fast, energetic) TAKE_B: "Let's talk about the future... of world building... with AI. We're in a position right now..." (Measured, thoughtful) TAKE_C: "The future of world building is here. With AI, we are in a position right now..." 
(Authoritative) [00:08–00:15] TAKE_A: "What AI image model should I use to create photo-realism to high standards?" (Inquisitive, rising intonation) TAKE_B: "The big question: which AI model actually delivers high-standard photo-realism?" (Direct, punchy) TAKE_C: "To get this level of photo-realism, you need the right model. But which one?" (Conversational)
MASTER PROMPT GLOBAL LOCK: Vertical 9:16 creator-style AI image generation tutorial reel. Keep the visual structure consistent: dark background, stacked demo windows, rounded-corner presenter overlay near the lower half, and product screenshots or generated outputs occupying the upper area. The presenter is a bearded man in a beige baseball cap and brown hoodie speaking directly to camera with expressive hand gestures. The tutorial should open with a polished luxury ad-style image, then transition into a dark Generate Image interface with prompt and reference controls, and finish with generated lifestyle portraits and result examples. Preserve fast creator-educator pacing, practical workflow clarity, and social-media-friendly text hierarchy. [00:00-00:10.00] Open with a strong proof-first visual: a luxury perfume bottle ad image against a rich purple satin-like backdrop. Place the presenter in a rounded picture-in-picture window at the bottom, speaking energetically to camera. The hook should feel like, "here is the kind of polished ad-style result you can create," with the upper image doing most of the persuasive work. [00:10.00-00:28.00] Shift into the process section. Show a dark image-generation interface labeled around concepts like Generate Image, prompt box, reference styles, remix, auto prompt, or similar controls. Keep the presenter visible in the lower area while he explains how the workflow works. Include reference image boards, prompt panels, or app modules that make the system feel practical and reproducible. [00:28.00-00:48.92] Move into the results and proof section. Show polished generated portraits or fashion-style outputs, app previews, and example result screens, including a casually dressed bearded man in a city street portrait. The presenter continues narrating while the upper content cycles through outputs, reinforcing that the workflow produces believable, commercially useful visuals. End on the strongest lifestyle result. 
NEGATIVE PROMPT Avoid cluttered multi-window chaos, unreadable UI, generic office stock footage, weak hook visuals, random unrelated outputs, corporate webinar styling, tiny text, dark muddy colors, or a tutorial sequence that explains too much before showing a compelling result. SHOT PROMPTS [00:00-00:10.00] Luxury perfume ad visual with presenter overlay. [00:10.00-00:28.00] Dark Generate Image UI, prompt controls, reference boards, presenter explanation. [00:28.00-00:48.92] Generated lifestyle portraits and result previews with presenter continuing narration. SPEECH PACK Timecoded transcript: [00:00-00:48.92] Single-speaker tutorial explaining an AI image-generation workflow from polished ad example to interface steps to final outputs. Exact wording unclear; preserve concise creator-teacher delivery. TAKE_A [00:00-00:48.92] Fast creator-demo explanation with proof-first opening and simple step-by-step UI walkthrough. TAKE_B [00:00-00:48.92] Calm but confident tutorial tone emphasizing how to get polished commercial-looking results. TAKE_C [00:00-00:48.92] Slightly more enthusiastic creator cadence focused on workflow usefulness and output quality.
GLOBAL LOCK: A consistent young Black female model with short buzz-cut hair, natural skin texture with visible freckles, and a slender build. She wears a beige distressed baseball cap with "ROSE" embroidered in large 3D letters on the front. The environment is a minimalist professional photo studio with a seamless light grey paper backdrop. Lighting is soft-box studio lighting with a subtle rim light to separate the subject from the background. The color grade is clean, neutral, and editorial with high contrast and sharp details. [00:00–00:03] A dark mode software UI on a vertical screen. A small 3D icon of a burger is centered, then quickly replaced by a leopard print belt. Text "This AI" and "Creating something beautiful... (63%)" appears. The UI is sleek with various model icons on the left sidebar. [00:04–00:07] A handheld POV shot of a beige "ROSE" baseball cap lying on a white tiled floor. A smartphone enters the frame, showing the camera app interface as it takes a photo of the hat. Text "reshoot" and "five" appears on screen. [00:08–00:12] The smartphone screen shows the "Artlist Toolkit" UI. A prompt box at the bottom contains detailed text about recreating the hat photo. The UI shows the hat being processed. Text "Now I take one" and "my workflow through" overlays the screen. [00:13–00:16] A grid view of various AI models like "Nano Banana 2," "Seedream 5.0," and "Kling 3.0" with their respective credit costs. The mouse cursor navigates the dark-themed workspace. Text "All the leading AI models," and "one workspace," appears. [00:17–00:24] A rapid montage of the Black female model wearing the beige "ROSE" hat in different outfits: a grey ribbed sweater, a white tank top, and a green knit sweater. Extreme close-up shots of the green knit fabric showing intricate weave patterns and a white "ohneis" care label. Text "Complex garments," "weird shapes," "quality materials." 
[00:25–00:31] The UI shows a video generation prompt: "slow dolly in, she moves her hand away from the hat." A progress bar moves. Text "You can generate images and videos from the same session." [00:32–00:35] A cinematic profile shot of the model wearing the beige hat and a dark grey knit sweater. She has a neutral, confident expression. The camera is static. Lighting highlights her facial structure and the texture of the hat. Text "If you're building a brand or creating for clients,". [00:36–00:39] A frontal MCU of a different blonde female model wearing oversized black heart-shaped sunglasses, followed by a shot of her in a black and grey striped "London UK" sports jersey. Final screen is black with white text: "Comment 'Artlist' and I'll send you the full workflow." NEGATIVE PROMPT: blurry textures, inconsistent hat logo, distorted facial features, shaky camera, low resolution, unnatural skin smoothing, flickering lighting, messy background, floating objects, mismatched clothing physics, robotic speech cadence, lip-sync lag. SPEECH PACK: [00:00-00:03] "This AI trick legit makes your product photos look studio level." TAKE_A: (Energetic, fast-paced) "This AI trick... legit makes your product photos... look studio level!" TAKE_B: (Confident, smooth) "This AI trick legit makes your product photos look studio level." [00:04-00:08] "I used to reshoot the same product five times and it still looked off. So I stopped." TAKE_A: (Frustrated tone on 'five times') "I used to reshoot the same product FIVE times... and it still looked off. (Pause) So I stopped." [00:09-00:16] "Now I take one flat lay on my phone and run my workflow through the Artlist Toolkit. All the leading AI models, one workspace, no switching tabs." TAKE_A: (Informative, rhythmic) "Now I take one flat lay on my phone... and run my workflow through the Artlist Toolkit. All the leading AI models... one workspace... no switching tabs." 
[00:17-00:24] "A few prompts later, it looks like a full blown studio shoot. Complex garments, weird shapes, quality materials. Still works every time." TAKE_A: (Impressed, emphasizing 'quality') "A few prompts later... it looks like a full blown studio shoot. Complex garments... weird shapes... quality materials. Still works every time." [00:25-00:31] "And the best part? You can generate images and videos from the same session. No photographer, no studio, no reshoots." TAKE_A: (Punchy delivery) "And the best part? You can generate images AND videos from the same session. No photographer... no studio... no reshoots." [00:32-00:39] "If you're building a brand or creating for clients, this will save you hours. Comment 'Artlist' and I'll send you the full workflow." TAKE_A: (Direct, CTA focus) "If you're building a brand or creating for clients... this will save you hours. Comment 'Artlist'... and I'll send you the full workflow."
Vertical AI tutorial about creating cinematic commercial shots using first-frame and last-frame control. The video opens with a creator explaining the method in a simple talking-head setup, then quickly switches into a series of dramatic visual examples presented as “How to” references. These examples include a luxury sports car at sunset, a bull standing on a reflective surface in golden-hour light, and a top-down dust-ring composition with strong cinematic atmosphere. The overall teaching angle is that high-end ad-like motion can be built from still images by controlling the opening and closing frames of the sequence. The middle of the video focuses on a mobile-style workflow. We see a smartphone interface where reference images are loaded, arranged, and turned into prompts or scene instructions. The creator demonstrates how to import the chosen visual, position the image, add text or instructions, and set up the animation logic so the output feels intentional rather than random. The examples stay consistent with a premium brand-campaign mood: glossy car photography, dramatic bull imagery, warm sunset light, dust, reflections, and slow cinematic movement. Later frames show the generated examples again in more polished detail, emphasizing how the same workflow can be used across different premium visual concepts. The video ends by reinforcing the lesson that first frame and last frame are the keys to stronger AI motion composition, especially for commercial and luxury aesthetics. Overall, the clip should feel like a concise but stylish tutorial for creators who want to turn still images into cinematic advertising visuals using AI, mobile-friendly tools, and deliberate composition planning.
GLOBAL LOCK: preserve a creator-led talking-head tutorial format mixed with vertical phone screen recordings. Keep one young male creator in a backward black cap and dark hoodie speaking directly to camera in a studio setup with a microphone. Intercut iPhone-style screen captures showing ChatGPT/OpenAI image workflow steps, uploaded object photos, prompt entry, and AI video generation screens. Maintain a practical “make from your phone” educational reel structure. No random B-roll, no unrelated tools, no logo overlays beyond app UI already present in the source. Create a 37.8-second social-first AI tutorial reel showing how to turn ordinary phone photos into animated AI character videos. Begin with a hook using a simple hand-held object photo and bold on-screen teaching posture from the creator. Then show phone interfaces: photo selection, ChatGPT or image-tool screens, prompt entry, image transformation results, switching to an AI video tool, uploading the generated image, entering a motion prompt, and generating the final animated output. Use repeated face-cam segments where the creator explains the steps and emphasizes that the workflow can be done from a phone. Include the specific examples visible in the source: tiny object/food photos held in a hand, ChatGPT app icon and mobile interface, typed prompts that turn objects into cute expressive characters, a generated pear-like baby character image, a switch to another AI generation interface, upload and prompt steps for video, and a final generated moving result shown on-screen. Preserve the educational pacing and creator-marketing vibe. SHOT SEGMENTS: [00:00-00:06] Hook with object photos in hand and creator talking-head intro about making AI content from your phone. [00:06-00:14] Mobile screens show ChatGPT / image workflow setup, app screens, and prompt entry. [00:14-00:22] Creator explains the key steps while on-screen phone UI shows prompt refinement and generated object-to-character image outputs. 
[00:22-00:30] The tutorial switches to an AI video tool, showing upload, prompt, and generation steps from the phone. [00:30-00:37.8] Final result displays the generated animated character clip, while the creator closes with a call to try the workflow. ENVIRONMENT: creator desk/studio face-cam plus crisp mobile screen recordings. CAMERA: direct-to-camera presenter shots alternating with full-screen phone UI captures. LIGHTING: clean creator-studio lighting on face-cam; bright legible phone UI on inserts. MOTION: tutorial pacing, finger taps on phone UI, creator emphasis gestures, no cinematic narrative scenes. NEGATIVE PROMPT: generic AI ad montage, unrelated tools, desktop-only workflow, no phone UI, missing creator face-cam, subtitles replacing the actual visible UI, blurry screens, watermark, logo overlays. SPEECH PACK: creator-to-camera tutorial speech implied, but do not transcribe captions here.
Ai Spotify Canvas Maker
Spotify Canvas is one of the most constrained music visual formats, which is exactly why dedicated examples matter. You only have a few seconds, the clip has to loop cleanly, and the composition needs to feel good on a phone while the music keeps playing underneath. A strong Canvas visual usually feels atmospheric and intentional rather than loud or overloaded.
This page is useful when you want to make a loop that strengthens the track instead of competing with it. Look for prompts that create subtle motion, clear vertical framing, and a repeat point that feels natural. The best examples often use one simple visual idea extremely well, rather than trying to compress a full music video concept into a tiny loop.
What makes a good Spotify Canvas? A clean loop, strong vertical composition, and motion that supports the mood of the song without becoming distracting.
How long should the loop feel? Short and seamless, in the 3 to 8 second range this page targets. The viewer should not notice a harsh restart when it repeats.
Can I reuse a social clip as Canvas? Sometimes, but most social clips are too busy. Canvas usually needs a simpler, more atmospheric visual rhythm.
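The length and framing points above can be turned into a quick pre-upload check. The sketch below is illustrative only: the 3 to 8 second range comes from this page's intro, the 9:16 ratio and its tolerance are assumptions, and `ClipSpec` and `canvas_issues` are hypothetical names, not part of any Spotify or Alici tool.

```python
from dataclasses import dataclass

# Assumed limits: 3 to 8 seconds is taken from this page's intro;
# the 9:16 target and the tolerance are illustrative assumptions.
MIN_SECONDS = 3.0
MAX_SECONDS = 8.0
TARGET_RATIO = 9 / 16
RATIO_TOLERANCE = 0.01


@dataclass
class ClipSpec:
    """Hypothetical description of a rendered loop."""
    width: int
    height: int
    duration_seconds: float


def canvas_issues(clip: ClipSpec) -> list[str]:
    """Return a list of problems; an empty list means the clip passes."""
    issues = []
    if not MIN_SECONDS <= clip.duration_seconds <= MAX_SECONDS:
        issues.append("duration outside 3 to 8 seconds")
    if clip.height <= 0 or abs(clip.width / clip.height - TARGET_RATIO) > RATIO_TOLERANCE:
        issues.append("not vertical 9:16")
    return issues


print(canvas_issues(ClipSpec(width=1080, height=1920, duration_seconds=6.0)))
print(canvas_issues(ClipSpec(width=1920, height=1080, duration_seconds=12.0)))
```

A check like this only covers length and framing. Whether the repeat point feels natural still has to be judged by watching the loop play.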