How ai.withphil Made This Pollo AI Face Consistency Selfie-to-Video AI Video, and How to Recreate It

Video Overview

This AI tutorial video explains how to take a single selfie, turn it into a detailed identity prompt, generate multiple highly consistent portraits, and then move those images into a video workflow. The creator presents the process as a practical shortcut for anyone trying to make the same person appear stable across many AI-generated images and scenes.

The reel uses a familiar short-form structure: talking-head explanation, screen recordings of tools, generated examples, and bold text overlays that condense each stage into a quick sequence of actions. The value proposition is clear from start to finish: fewer prompt-writing headaches, more face consistency, and faster asset creation for AI video.

Workflow Explained

The method begins with a selfie. That image is uploaded into a language model workflow so the creator can extract or refine a detailed description of facial traits, hair, skin tone, clothing cues, and visual identity markers. Instead of manually guessing how to describe a face, the user turns the original image into a reusable prompt base.
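The reusable prompt base described above can be sketched as a small structured template. This is a minimal illustration, not code from the video; the field names and wording are hypothetical stand-ins for whatever traits the language model extracts from the selfie.

```python
from dataclasses import dataclass

# Hypothetical structure for an image-derived identity description.
# Field names are illustrative, not taken from any specific tool.
@dataclass
class IdentityProfile:
    face: str
    hair: str
    skin_tone: str
    clothing: str
    extras: str = ""

    def to_prompt(self) -> str:
        """Render the profile as a reusable prompt base."""
        parts = [
            f"Face: {self.face}",
            f"Hair: {self.hair}",
            f"Skin tone: {self.skin_tone}",
            f"Clothing: {self.clothing}",
        ]
        if self.extras:
            parts.append(f"Distinctive details: {self.extras}")
        return "Same person in every image. " + "; ".join(parts)

# Example values only; in practice these come from the model's
# description of the uploaded selfie.
profile = IdentityProfile(
    face="oval face, light stubble, hazel eyes",
    hair="short dark brown hair, side part",
    skin_tone="warm medium tone",
    clothing="charcoal crew-neck sweater",
)
base_prompt = profile.to_prompt()
print(base_prompt)
```

Keeping the description in one structured object rather than loose prose makes it trivial to reuse the exact same identity text in every later generation.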

Next, that identity prompt is moved into Pollo AI or a similar generation interface. The creator shows prompt input panels, model choices, and result grids to demonstrate how the same person can be rendered repeatedly with better similarity from output to output. The final step is to drop those outputs into an editor or transformation flow so the character remains aligned across frames in the final video.
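The repetition step above amounts to holding the identity block fixed while varying only the scene text. A rough sketch, with `generate_image` as a placeholder rather than any real Pollo AI API call:

```python
# Sketch of reusing one identity prompt across many scene prompts.
# `generate_image` is a hypothetical placeholder; swap in the real
# API or UI action for whatever generation tool you use.
def generate_image(prompt: str) -> str:
    return f"<image generated from: {prompt[:40]}...>"

base_prompt = ("Same person in every image. Oval face, light stubble, "
               "short dark brown hair, warm medium skin tone.")

scenes = [
    "standing in a neon-lit city street at night, cinematic lighting",
    "sitting in a sunlit cafe, shallow depth of field",
    "close-up portrait against a plain studio background",
]

# The identity block stays verbatim in every prompt; only the
# scene text changes, which is what keeps the face stable.
outputs = [generate_image(f"{base_prompt} Scene: {scene}") for scene in scenes]
for out in outputs:
    print(out)
```

The design choice is simple: any wording drift in the identity portion reintroduces the inconsistency the workflow is meant to remove, so it is concatenated unchanged rather than rewritten per scene.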

Why the Method Works

The central problem this reel addresses is identity drift. Many AI image and video workflows produce characters that look similar in one frame but change too much in the next. By starting with a strong image-derived description and then reusing that description systematically, the workflow improves continuity and reduces the need for endless manual corrections.

The creator also highlights practical benefits: faster generation, lower iteration cost, more natural skin rendering, better emotional expression, and cleaner per-frame alignment. Those claims fit the visual examples shown on screen, where multiple portraits and stylized outputs still preserve the same recognizable core identity.

Tool Stack and Screen Flow

The reel repeatedly alternates between three visual layers: the creator speaking on camera, ChatGPT-style prompt-building screens, and Pollo AI generation or editing interfaces. This keeps the tutorial grounded. Viewers can see both the abstract logic of the workflow and the concrete UI actions needed to repeat it.

That pacing is effective because it mirrors how users actually work. First they define identity, then they generate image assets, then they move those assets into an editor or video tool. By visually sequencing those stages, the video becomes more than a sales pitch. It becomes a reproducible mini-workflow for AI creators trying to build face-consistent character content.

Why This Tutorial Is Useful

This video is useful for anyone making AI avatars, influencer clones, video transformations, cinematic character scenes, or social content based on a real person. If the same face needs to appear in multiple styles, outfits, or shots without drifting too far, this reel offers a pragmatic starting framework.

For prompt engineers and AI content creators, the biggest takeaway is that consistency usually comes from process discipline rather than one magic model. Use a reference image, derive a structured identity description, generate in batches, compare outputs, and only then move into video assembly. That is the operational logic this tutorial communicates well.
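The "generate in batches, compare outputs" step can be made mechanical with a similarity filter. This is a sketch under stated assumptions: `face_similarity` is a stand-in for any real face-embedding comparison (e.g. cosine similarity between embeddings), and the scores and filenames here are fabricated for illustration.

```python
# Sketch of filtering a batch of outputs by likeness to the reference.
# `face_similarity` is hypothetical; a real version would embed both
# images with a face-recognition model and compare the vectors.
def face_similarity(reference: str, candidate: str) -> float:
    # Illustrative fake scores standing in for a real comparison.
    fake_scores = {"out_1.png": 0.91, "out_2.png": 0.62, "out_3.png": 0.88}
    return fake_scores.get(candidate, 0.0)

reference = "selfie.png"
batch = ["out_1.png", "out_2.png", "out_3.png"]
THRESHOLD = 0.8  # tune per model and per tolerance for identity drift

# Keep only outputs whose face stays close to the reference selfie,
# then carry the survivors forward into video assembly.
keepers = [img for img in batch if face_similarity(reference, img) >= THRESHOLD]
print(keepers)  # out_2.png drifts too far and is dropped
```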