Simone Ferretti

Q: What exactly is being edited in this video?

The workflow keeps the same base subject and performance while replacing costume elements, accessories, and effects.

Q: Why is the fixed square layout effective?

A fixed layout makes the transformations easy to compare because the educational frame stays constant while the edited subject changes.

Q: What is the core lesson for creators?

Creators should lock identity and pose first, then vary wardrobe, accessories, and effects in clearly separated stages.

@sferro21

INSTAGRAM · 2026-03-23

Prompt

GLOBAL LOCK: 1:1 square short-form social tutorial ad, split composition with presenter talking in the upper portion and AI-edited character result in the lower portion, same young adult white male presenter with brown hair and dark long-sleeve shirt in a neutral studio, large bold yellow title text on the right reading about editing elements inside videos using AI, dark gray background, clean creator-education layout. Lower panel continuously transforms the same man into different character variants while keeping pose and framing consistent: superhero suit, black textured armor, aviator jacket with sunglasses, and final fire-powered action variant with glowing flames around the forearms. Keep before-to-after logic clear and maintain swipe-to-learn CTA styling at the bottom.

00:00-00:02
Open on the tutorial layout immediately: presenter at the top speaking with open-hand gestures, big yellow headline text on the right, lower panel shows the first AI-modified version of the same man in a dark superhero-style suit with red graphic accents, composition is static and ad-like, optimized for instant readability in a square feed.

00:02-00:04
Swap the lower-panel character to a black tactical or armored outfit while the presenter continues gesturing above, keep the face aligned so viewers understand the same source person is being edited rather than replaced, dark neutral studio lighting stays constant.

00:04-00:06
Introduce the aviator-style version: sunglasses, military-inspired bomber jacket, warm indoor background behind the edited subject, visual contrast becomes stronger because clothing, eyewear, and mood all change while the framing remains stable, headline remains fixed.

00:06-00:09
Finish on the most dramatic version, keeping the aviator jacket and sunglasses but adding stylized flame effects rising from the hands or forearms, the presenter above punctuates the CTA while the bottom text encourages viewers to swipe to learn, end with the strongest transformation visible and the whole layout reading like a compact educational ad for AI video element replacement.

NEGATIVE PROMPT: face drift between outfit variations, broken sunglasses, warped jacket patches, bad flame anatomy, inconsistent hand position, unreadable yellow headline, cluttered layout, mismatched skin tone between original and edited subject, cheap superhero suit texture, muddy fire glow, duplicated arms, poor square composition, low-detail background, incorrect presenter framing, logo artifacts, unstable bottom CTA text.

SHOT PROMPTS:
1. Square tutorial cover with presenter top frame and superhero edit bottom frame.
2. Same subject transformed into black armored outfit with stable face.
3. Aviator-jacket and sunglasses variant with warm background.
4. Final flame-hand action variant with swipe-to-learn CTA.

SPEECH PACK:
Single male presenter voice, short direct tutorial cadence, close-mic social audio, clear ad-style delivery. Core meaning: you can edit elements inside videos using AI, change wardrobe and visual identity while keeping the subject, and swipe to learn the workflow. Lips are visible in the presenter panel and should sync to simple persuasive gestures rather than long dialogue.

How sferro21 Made the "Edit Elements Inside Videos Using AI" Video, and How to Recreate It

This short square video is a compact proof-of-concept for one of the most useful creator workflows in AI video: changing specific visual elements inside an existing clip without rebuilding the whole performance from scratch. In under 10 seconds, Simone Ferretti uses a consistent presenter layout and transforms the same lower-panel subject through multiple versions: superhero-style suit, black armored outfit, aviator jacket with sunglasses, and a final fire-powered action variant. Because the camera framing and body position stay nearly the same while wardrobe and effects change, the video teaches a clear lesson: AI can preserve the performance while swapping key visual components.

TOC: why the ad works, first 3-second hook, transformation breakdown, visual system, prompt reconstruction, remake steps, replaceable variables, editing tips, common failures, publishing lessons, FAQ, and structured data.

Why this video works

The video works because it demonstrates a practical creator technique in under 10 seconds. Many people understand AI image generation, but fewer understand selective video element editing, and this reel shows that concept with minimal friction. The same man stays in place while his costume, styling, accessories, and effects change. That makes the workflow immediately legible for searches such as "edit elements in video with AI", "replace clothes in video using AI", "change character design without reshooting", or "AI video style swap tutorial".

What happens in the first 3 seconds

The hook combines three things at once: a talking presenter, a strong yellow headline, and a clear lower-panel transformation. The first lower-panel version already looks like a superhero reinterpretation of the original subject, so the viewer understands the promise almost instantly. The fixed square layout also helps because it feels like a product demo rather than a random clip montage.

Transformation breakdown

00:00-00:02 Superhero suit variant

The lower panel begins with a body-preserving costume swap. The subject remains centered and recognizable, but the outfit becomes a dark suit with strong red accents, creating a comic-book or action-hero impression.

00:02-00:04 Armored black variant

The second state deepens the transformation by pushing toward a heavier tactical or armor-like look. This demonstrates that the workflow can swap not only color but material and costume language.

00:04-00:06 Aviator variant

The third version adds sunglasses and a bomber jacket, changing not just the wardrobe but also the attitude of the character. This is a more lifestyle- or film-inspired edit than the earlier suit variations.

00:06-00:09 Fire-hand action variant

The final version adds energetic flame effects around the hands while preserving the aviator styling. That escalation is important because it proves the workflow can edit both costume elements and special-effect layers.

Visual style breakdown

The top frame remains visually stable: presenter, neutral dark room, warm face light, large yellow headline, and instructional gesture language. The lower frame is where all of the change happens. That separation is smart because it keeps the educational layer fixed while the edited subject cycles through visual possibilities. The result is a reel that feels easy to understand even without pausing.

Prompt reconstruction notes

To recreate this type of content, describe the original source performance first: same face, same pose, same framing, same body orientation. Then define the element changes one by one: superhero outfit, black armor textures, aviator jacket and sunglasses, flame effects on forearms or hands. The key is to tell the model what must stay stable and what can change. In a workflow like this, identity, pose, and framing are locked, while wardrobe, accessories, and effect layers are variable.
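The lock-versus-vary split described above can be sketched as a small prompt builder. This is only an illustration: `LOCKED`, `VARIANTS`, and `build_prompt` are hypothetical names, the `KEEP:`/`CHANGE:` wording is an assumed convention rather than syntax required by any particular video model, and the variant descriptions are paraphrased from the prompt on this page.

```python
# Hypothetical helper: none of these names come from a real library or model API.
# Constraints that must stay identical in every variant.
LOCKED = {
    "identity": "same face",
    "pose": "same pose and body orientation",
    "framing": "same framing",
}

# One entry per variant; only wardrobe, accessories, and effects may differ.
VARIANTS = [
    {"wardrobe": "dark superhero suit with red graphic accents"},
    {"wardrobe": "black textured armor"},
    {"wardrobe": "aviator bomber jacket", "accessories": "sunglasses"},
    {
        "wardrobe": "aviator bomber jacket",
        "accessories": "sunglasses",
        "effects": "stylized flames around the forearms",
    },
]


def build_prompt(locked, variant):
    """Join the locked constraints and one variant's changes into a prompt string."""
    keep = ", ".join(locked.values())
    change = ", ".join(variant.values())
    return f"KEEP: {keep}. CHANGE: {change}."


prompts = [build_prompt(LOCKED, v) for v in VARIANTS]
for p in prompts:
    print(p)
```

Because the locked text is generated from a single dictionary, every variant prompt opens with exactly the same stability clause, which is one simple way to avoid accidentally rewording the identity description between variants.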

Step-by-step remake workflow

1. Start with a stable source clip

The clip should have a clear subject, readable pose, and minimal camera movement so the edit focuses on visual replacement rather than motion reconstruction.

2. Decide what stays fixed

Lock the person’s face, body position, and framing. That is what makes element replacement feel impressive and believable.

3. Define one change category at a time

Separate costume changes from accessory changes and from effect changes. The reel’s progression works because each variation is distinct.

4. Build escalating variants

Move from simple wardrobe changes to more dramatic stylization and then to VFX-like effects such as fire. Escalating gives viewers a reason to keep watching until the final variant.

5. Keep the educational frame stable

If you are presenting the workflow, keep your title area and presenter framing fixed so all attention goes to the lower-panel transformation.

6. Finish with a clear next action

The “Swipe to learn” CTA works because the visual proof has already done the convincing.
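The staged plan above can be turned into a timed shot list before any generation happens. The sketch below assumes whole-second segments matching the 00:00-00:09 breakdown of this reel; `STAGES`, `fmt`, and `build_shot_list` are hypothetical names chosen for illustration.

```python
# Labels and durations (in seconds) follow the reel's 00:00-00:09 breakdown.
STAGES = [
    ("superhero suit", 2),
    ("black armored outfit", 2),
    ("aviator jacket and sunglasses", 2),
    ("aviator look plus flame effects", 3),
]


def fmt(seconds):
    """Format whole seconds as MM:SS."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


def build_shot_list(stages):
    """Turn (label, duration) pairs into back-to-back timed segments."""
    shots, start = [], 0
    for label, duration in stages:
        end = start + duration
        shots.append(f"{fmt(start)}-{fmt(end)} {label}")
        start = end
    return shots


shot_list = build_shot_list(STAGES)
print("\n".join(shot_list))
```

Keeping the durations in one list makes it easy to lengthen the final, most dramatic stage or to add a fifth variant without recalculating every timestamp by hand.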

Replaceable variables

You can replace the superhero and aviator looks with chef uniforms, luxury fashion, armor, business suits, fantasy robes, sports kits, or music-video styling. You can replace flame effects with smoke, neon glow, electric arcs, water simulation, or hologram overlays. What should remain unchanged is the structure: same source subject, same frame, visible variation in one controlled category at a time.

Editing and presentation tips

Square format works especially well here because it supports a stacked comparison. Keep the headline large and readable. Use one fixed presenter crop and one fixed transformation panel. Avoid unnecessary transitions between variants. The clarity of the changes is the main product. If the audience cannot instantly see what changed, the educational value drops sharply.

Common failure cases

The most common failure is changing too many variables at once, which makes the result feel like a different person rather than the same clip with edited elements. Another failure is weak face consistency. A third is overcomplicating the layout and reducing readability. A fourth is adding effects too early. This video escalates from clothing to accessories to flames, which makes the demonstration easier to follow.
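Two of these failures, drifting identity and changing too much at once, can be caught on paper before rendering by comparing consecutive variant descriptions. The following is a rough sketch, not a quality metric: the category names, the `max_changes=2` threshold, and both function names are assumptions made for illustration.

```python
def changed_categories(previous, current):
    """Return the sorted category names whose values differ between two variants."""
    keys = set(previous) | set(current)
    return sorted(k for k in keys if previous.get(k) != current.get(k))


def check_sequence(variants, locked_keys=("identity", "pose", "framing"), max_changes=2):
    """Collect warnings for locked-category drift or too many simultaneous changes."""
    warnings = []
    for i in range(1, len(variants)):
        diff = changed_categories(variants[i - 1], variants[i])
        drift = [k for k in diff if k in locked_keys]
        if drift:
            warnings.append(f"step {i}: locked category changed: {', '.join(drift)}")
        if len(diff) > max_changes:
            warnings.append(f"step {i}: {len(diff)} categories changed at once")
    return warnings


# Illustrative data modeled on the reel's four variants.
base = {"identity": "presenter", "pose": "centered", "framing": "lower panel"}
good = [
    {**base, "wardrobe": "superhero suit"},
    {**base, "wardrobe": "black armor"},
    {**base, "wardrobe": "aviator jacket", "accessories": "sunglasses"},
    {**base, "wardrobe": "aviator jacket", "accessories": "sunglasses", "effects": "flames"},
]
# A deliberately bad plan: jumps straight to the final look and swaps the person.
bad = [good[0], {**good[3], "identity": "different person"}]

print(check_sequence(good))
print(check_sequence(bad))
```

The staged sequence produces no warnings, while the compressed plan is flagged both for changing a locked category and for changing four categories in a single step.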

Publishing and growth actions

Position this kind of page around practical creator searches: edit video elements with AI, replace clothes in video, add effects to existing video without reshooting, AI wardrobe swap video, and AI VFX layer replacement. On social, the best cover is usually the final aviator-plus-flame frame because it communicates both realism and transformation. In supporting copy, emphasize that the workflow preserves performance while changing style. That is the actual value proposition.

FAQ

What exactly is being edited in this video?

The same base subject is kept, while costume, accessories, and visual effects are changed across multiple versions.

Why is the fixed square layout effective?

It keeps the educational context stable and makes each lower-panel transformation easier to compare.

Why end on the fire-hand version?

Because it is the strongest escalation of the idea, combining wardrobe swap with an obvious VFX layer.

What is the core lesson for creators?

When using AI for element replacement, lock identity and pose first, then vary clothing, accessories, and effects in deliberate stages.