GLOBAL LOCK:
Subject is a Caucasian male in his early 30s, short dark hair, wearing a black baseball cap and a black minimalist t-shirt with a small white "KITH" logo on the left chest. He has a friendly, energetic, and authoritative demeanor. The environment is a professional home studio with a dark background, featuring warm practical lighting (a desk lamp to the left, a vertical LED light bar to the right) and shelves with tech collectibles. The lighting is low-key with a soft key light on the subject's face. The color grade is warm, high-contrast, and cinematic. The speech is fast-paced, enthusiastic tech commentary with a clear, dry microphone signature.

[00:00–00:12]
Subject is off-camera. The visual is a high-quality AI-generated screen recording. A grassy field with a stop sign is shown. A text box at the bottom says "Add a building." A large, modern apartment building seamlessly appears in the background. The camera pans right. The text changes to "And maybe some street signs." Street signs and a lamp post appear. The text changes to "And just make it look like New York City." The scene transforms into a gritty New York alleyway with brick walls and graffiti. A piece of paper is added to the ground via voice command, then deleted. The motion is smooth and iterative.

[00:13–00:21]
Hard cut to the subject in a Medium Close-Up (MCU). He is centered, gesturing with his hands. Large yellow kinetic text "vocal" and "world building" pops up over his chest. He is explaining the concept with high energy. The background is blurred (shallow depth of field).

[00:22–00:38]
Split-screen view. The top 60% of the frame shows the New York alleyway AI demo from the beginning, continuing to evolve with graffiti and a bicycle appearing. The bottom 40% shows the subject in a circular cutout, talking and gesturing. Captions appear in the middle: "Build a New York scene," "Lay the city structure," "Add more details." The subject's lip-sync is tight to the audio.

[00:39–00:55]
Rapid montage of B-roll overlays. [00:39] A first-person shooter game view (GTA style). [00:40] The Sims interface with a female character. [00:41] A man in a chicken suit running by a canal. [00:42] A jet ski on the same canal. [00:43] Cut back to the subject in MCU, gesturing wildly. [00:50] Close-up of Tony Stark's eyes from Iron Man with HUD graphics. [00:51] A garage full of classic cars. [00:53] A person in a VR headset. The subject's voiceover continues throughout.

[00:56–01:10]
The subject remains in MCU, but logos and text overlays appear. [00:58] Large white text "Genie." with "A new frontier for world building" below it. [00:59] A mammoth with jetpacks demo. [01:00] A man underwater with a tablet. [01:01] Two women walking on a cliffside. [01:03] A green monster emerging from a street crack. The subject points and gestures toward the overlays.

[01:11–01:31]
Final sequence. The subject is in MCU, speaking directly to the camera with increasing intensity. Fast cuts between his face and various AI-generated scenes: a Coca-Cola truck in a snowy tunnel, a person looking at a digital watch, a group of people in futuristic suits, a close-up of a running shoe, a person with a tattooed back in a pool. The video ends with the subject making a "strap in" gesture with his hands as the text "world building" appears.

NEGATIVE PROMPT:
Visual: blurry face, inconsistent t-shirt logo, flickering lights, low-resolution AI renders, distorted hands, robotic movements, messy background, flat lighting, dull colors.
Speech: robotic voice, monotone delivery, background noise, echo, muffled audio, poor lip-sync, stuttering, unnatural pauses.

SPEECH PACK:
[00:00–00:12]
Transcript: "Add a building and maybe some street signs and just make it look like New York City. Maybe... a piece of paper next to the building? No, delete that."
TAKE_A: (Calm, instructional, slightly hesitant on the "Maybe...")
TAKE_B: (Confident, rapid-fire commands)
TAKE_C: (Casual, conversational, like talking to a friend)

[00:13–00:21]
Transcript: "What you're seeing is called vocal world building. This is where artists and creators can use words to describe a visual environment and have it come to life in real time using AI."
TAKE_A: (High energy, emphasizing "vocal" and "real time")
TAKE_B: (Authoritative, slowing down on the definition)
TAKE_C: (Excited, breathless delivery)

[01:24–01:31]
Transcript: "So if you're an artist, creator, entrepreneur, or a brand... strap in. Because we have officially just entered the era of world building."
TAKE_A: (Building crescendo, punchy "strap in")
TAKE_B: (Serious, visionary tone)
TAKE_C: (Fast, energetic, direct call to action)