How sferro21 Made This “Edit Elements Inside Videos Using AI” Video, and How to Recreate It
This short square video is a compact proof-of-concept for one of the most useful creator workflows in AI video: changing specific visual elements inside an existing clip without rebuilding the whole performance from scratch. In under 10 seconds, Simone Ferretti uses a consistent presenter layout and transforms the same lower-panel subject through multiple versions: superhero-style suit, black armored outfit, aviator jacket with sunglasses, and a final fire-powered action variant. Because the camera framing and body position stay nearly the same while wardrobe and effects change, the video teaches a clear lesson: AI can preserve the performance while swapping key visual components.
Contents: why this video works, the first 3-second hook, transformation breakdown, visual style, prompt reconstruction, remake steps, replaceable variables, editing tips, common failures, publishing lessons, and FAQ.
Why this video works
The video succeeds because it solves a practical creator problem very quickly. Many people understand AI image generation, but fewer understand selective video element editing. This reel demonstrates that concept with minimal friction. The same man stays in place, but his costume, styling, accessories, and effects change. That makes the workflow immediately legible for search intent such as “edit elements in video with AI”, “replace clothes in video using AI”, “change character design without reshooting”, or “AI video style swap tutorial”.
What happens in the first 3 seconds
The hook combines three things at once: a talking presenter, a strong yellow headline, and a clear lower-panel transformation. The first lower-panel version already looks like a superhero reinterpretation of the original subject, so the viewer understands the promise almost instantly. The fixed square layout also helps because it feels like a product demo rather than a random clip montage.
Transformation breakdown
00:00-00:02 Superhero suit variant
The lower panel begins with a body-preserving costume swap. The subject remains centered and recognizable, but the outfit becomes a dark suit with strong red accents, creating a comic-book or action-hero impression.
00:02-00:04 Armored black variant
The second state deepens the transformation by pushing toward a heavier tactical or armor-like look. This demonstrates that the workflow can swap not only color but material and costume language.
00:04-00:06 Aviator variant
The third version adds sunglasses and an aviator jacket, changing not just the wardrobe but also the attitude of the character. This is a more lifestyle- or film-inspired edit than the earlier suit variations.
00:06-00:09 Fire-hand action variant
The final version adds energetic flame effects around the hands while preserving the aviator styling. That escalation is important because it proves the workflow can edit both costume elements and special-effect layers.
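When planning a remake, it can help to keep the timeline above as plain data. The Python sketch below is illustrative only: the `Segment` class and the `variant_at` helper are hypothetical names, the timings are the approximate ones listed in the breakdown, and the `changed` categories are one reading of what differs from the source clip in each segment.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Segment:
    start: float              # seconds, inclusive
    end: float                # seconds, exclusive
    label: str
    changed: Tuple[str, ...]  # element categories that differ from the source clip


# Approximate timings from the breakdown; category tags are an interpretation.
TIMELINE = [
    Segment(0.0, 2.0, "Superhero suit", ("wardrobe",)),
    Segment(2.0, 4.0, "Armored black", ("wardrobe",)),
    Segment(4.0, 6.0, "Aviator", ("wardrobe", "accessories")),
    Segment(6.0, 9.0, "Fire-hand action", ("wardrobe", "accessories", "effects")),
]


def variant_at(t: float) -> Optional[Segment]:
    """Return the segment active at time t, or None if t falls outside the clip."""
    for seg in TIMELINE:
        if seg.start <= t < seg.end:
            return seg
    return None


if __name__ == "__main__":
    for t in (1.0, 5.0, 8.5):
        seg = variant_at(t)
        print(f"{t:>4.1f}s -> {seg.label} ({', '.join(seg.changed)})")
```

Listing the changed categories per segment makes the escalation explicit: each later segment keeps what came before and adds one more layer.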
Visual style breakdown
The top frame remains visually stable: presenter, neutral dark room, warm face light, large yellow headline, and instructional gesture language. The lower frame is where all of the change happens. That separation is smart because it keeps the educational layer fixed while the edited subject cycles through visual possibilities. The result is a reel that feels easy to understand even without pausing.
Prompt reconstruction notes
To recreate this type of content, describe the original source performance first: same face, same pose, same framing, same body orientation. Then define the element changes one by one: superhero outfit, black armor textures, aviator jacket and sunglasses, flame effects on forearms or hands. The key is to tell the model what must stay stable and what can change. In a workflow like this, identity, pose, and framing are locked, while wardrobe, accessories, and effect layers are variable.
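The locked-versus-variable idea can be sketched as a small prompt builder. This is a minimal illustration, not the creator's actual prompts: the wording of the template, the `LOCKED` list, the `VARIANTS` dictionary, and the `build_prompt` function are all assumptions, and the output is not tied to any specific AI video tool.

```python
# Attributes that must stay stable across every variant (from the notes above).
LOCKED = ["same face", "same pose", "same framing", "same body orientation"]

# One description per variant; the phrasing here is hypothetical.
VARIANTS = {
    "superhero": "dark superhero-style suit with red accents",
    "armor": "black armored outfit with heavier, tactical textures",
    "aviator": "aviator jacket and sunglasses",
    "fire": "aviator jacket and sunglasses, flame effects on the hands",
}


def build_prompt(change: str, locked=None) -> str:
    """Compose a prompt that states what stays fixed before what changes."""
    keep = ", ".join(LOCKED if locked is None else locked)
    return f"Keep unchanged: {keep}. Change only: {change}."


if __name__ == "__main__":
    for name, change in VARIANTS.items():
        print(f"[{name}] {build_prompt(change)}")
```

Putting the locked attributes first in every prompt keeps the stable part identical between variants, so only the final clause differs from one version to the next.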
Step-by-step remake workflow
1. Start with a stable source clip
The clip should have a clear subject, readable pose, and minimal camera movement so the edit focuses on visual replacement rather than motion reconstruction.
2. Decide what stays fixed
Lock the person’s face, body position, and framing. That is what makes element replacement feel impressive and believable.
3. Define one change category at a time
Separate costume changes from accessory changes and from effect changes. The reel’s progression works because each variation is distinct.
4. Build escalating variants
Move from simple wardrobe changes to more dramatic stylization and then to VFX-like effects such as fire. Each step raises the stakes, which gives viewers a reason to keep watching.
5. Keep the educational frame stable
If you are presenting the workflow, keep your title area and presenter framing fixed so all attention goes to the lower-panel transformation.
6. Finish with a clear next action
The “Swipe to learn” CTA works because the visual proof has already done the convincing.
Replaceable variables
You can replace the superhero and aviator looks with chef uniforms, luxury fashion, armor, business suits, fantasy robes, sports kits, or music-video styling. You can replace flame effects with smoke, neon glow, electric arcs, water simulation, or hologram overlays. What should remain unchanged is the structure: same source subject, same frame, visible variation in one controlled category at a time.
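The “one controlled category at a time” rule can be expressed as a small generator. The sketch below is a hypothetical planning aid: `BASE`, `OPTIONS`, `single_category_variants`, and `changed_categories` are illustrative names, and the option lists are a subset of the replacements mentioned above.

```python
# Baseline look of the source clip (placeholder values).
BASE = {"wardrobe": "original outfit", "effects": "none"}

# A subset of the replacement ideas listed above, grouped by category.
OPTIONS = {
    "wardrobe": ["chef uniform", "business suit", "fantasy robes"],
    "effects": ["smoke", "neon glow", "electric arcs"],
}


def single_category_variants(base, options):
    """Yield variants that differ from the base in exactly one category."""
    for category, values in options.items():
        for value in values:
            variant = dict(base)       # copy so the base is never mutated
            variant[category] = value
            yield variant


def changed_categories(base, variant):
    """List the categories where a variant differs from the base."""
    return [key for key in base if base[key] != variant[key]]


if __name__ == "__main__":
    for v in single_category_variants(BASE, OPTIONS):
        print(changed_categories(BASE, v), v)
```

Checking `changed_categories` before generating a clip is a cheap way to catch a plan that alters wardrobe and effects in the same step.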
Editing and presentation tips
Square format works especially well here because it supports a stacked comparison. Keep the headline large and readable. Use one fixed presenter crop and one fixed transformation panel. Avoid unnecessary transitions between variants. The clarity of the changes is the main product. If the audience cannot instantly see what changed, the educational value drops sharply.
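The stacked layout can be worked out as simple geometry before opening an editor. The sketch below assumes a 1080 by 1080 pixel canvas and an even split between the presenter panel and the transformation panel; both numbers, the `Rect` type, and the `stacked_layout` function are assumptions for illustration, not measurements from the original reel.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def stacked_layout(size: int = 1080, top_ratio: float = 0.5) -> Tuple[Rect, Rect]:
    """Split a square canvas into a fixed top panel and a fixed bottom panel."""
    if not 0.0 < top_ratio < 1.0:
        raise ValueError("top_ratio must be between 0 and 1 (exclusive)")
    top_h = round(size * top_ratio)
    top = Rect(0, 0, size, top_h)                  # presenter and headline
    bottom = Rect(0, top_h, size, size - top_h)    # transformation panel
    return top, bottom


if __name__ == "__main__":
    presenter, transformation = stacked_layout()
    print("presenter panel:     ", presenter)
    print("transformation panel:", transformation)
```

With the defaults, the presenter panel covers the top 540 pixels and the transformation panel covers the bottom 540. Reusing the same two rectangles for every variant is what keeps the comparison fixed from cut to cut.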
Common failure cases
The most common failure is changing too many variables at once, which makes the result feel like a different person rather than the same clip with edited elements. Another failure is weak face consistency. A third is overcomplicating the layout and reducing readability. A fourth is adding effects too early. This video escalates from clothing to accessories to flames, which makes the demonstration easier to follow.
Publishing and growth actions
Position this kind of page around practical creator searches: edit video elements with AI, replace clothes in video, add effects to existing video without reshooting, AI wardrobe swap video, and AI VFX layer replacement. On social, the best cover is usually the final aviator-plus-flame frame because it communicates both realism and transformation. In supporting copy, emphasize that the workflow preserves performance while changing style. That is the actual value proposition.
FAQ
What exactly is being edited in this video?
The same base subject is kept, while costume, accessories, and visual effects are changed across multiple versions.
Why is the fixed square layout effective?
It keeps the educational context stable and makes each lower-panel transformation easier to compare.
Why end on the fire-hand version?
Because it is the strongest escalation of the idea, combining wardrobe swap with an obvious VFX layer.
What is the core lesson for creators?
When using AI for element replacement, lock identity and pose first, then vary clothing, accessories, and effects in deliberate stages.