
Kling 2.6 Motion Control Tests 🎬 Here are some tests I ran with Kling... Honestly, I still think it has a long way to go before it reaches a decent result 👀 There is no consistency at any point, the face deforms with every passing second, not to mention that out of every 10 videos I try to generate, 8 give me an error 🥲 For now it works for making funny videos for the internet or social media, but in no way for a professional project 😅 Still, if you want me to send you the reference videos I used to make these clips, comment "ARIA" and I'll send them by DM 💌

Case Snapshot

This 15-second vertical clip is not selling a perfect AI video result. It is selling the gap between what a motion-control demo promises and what the output actually delivers. A brunette creator appears on the right side performing a string of social-media gestures while a teal left-side strip shows the two references used to drive the test. At first glance the clip looks strong: the base identity is attractive, the lighting is simple, and the framing is native to Instagram or TikTok. But the real hook is the visible instability. As the gestures become more complex, face shape, mouth structure, and arm anatomy start drifting. That honesty is exactly why this post works as creator content.

What You're Seeing

The left-side overlay makes the experiment instantly legible

The teal strip with the source portrait, motion reference, plus sign, and arrow graphics tells viewers this is a transformation test, not a casual selfie video.

The room setup is deliberately ordinary

A beige wall, slanted ceiling, white chair, and warm indoor light make the clip feel like a realistic bedroom or home-studio recording. That helps viewers focus on the model behavior instead of the set.

The identity match is best at low motion

In the opening seconds the woman's face looks close to the source image. Hair, eyes, jawline, and earrings all hold together reasonably well when movement is minimal.

Hand-led gestures expose the system fast

As soon as she points toward the lens or raises both fists, perspective exaggeration and anatomy softness become easier to spot.

The sweater is doing anchor work

The bold black-and-white stripes help the model preserve torso orientation. Without that garment pattern, the drift would likely feel even worse.

Expression changes are where the face starts breaking

When the mouth opens, the cheeks lift, or one eye narrows into a wink-like expression, facial consistency drops more visibly than during the neutral smile.

This is creator education disguised as entertainment

People watch because the woman is compelling on screen, but they comment because the clip demonstrates a familiar frustration: motion control can look promising in still frames and unreliable in motion.

The CTA makes the post perform better

The caption offers to send the reference videos to anyone who comments a keyword. That turns a tool critique into a comment engine.

Shot-by-shot Breakdown

00:00-00:03.60
Visual content: Centered beauty-shot pose with mild smile and clear reference strip on the left.
Motion behavior: Very light head and shoulder movement.
Stability assessment: Best identity match in the clip.
Viewer takeaway: The demo initially feels convincing.

00:03.60-00:07.20
Visual content: Forward pointing gesture and raised fists near the chest.
Motion behavior: Perspective-heavy arm motion and quicker facial shifts.
Stability assessment: Hands begin to look exaggerated and facial geometry softens.
Viewer takeaway: Good social energy, but realism starts slipping.

00:07.20-00:11.40
Visual content: Chin-on-hand pose, playful bounce, and continued fist gestures.
Motion behavior: Multiple pose transitions within the same framing.
Stability assessment: Identity consistency weakens, especially around mouth and cheeks.
Viewer takeaway: The output feels usable for memes, not premium production.

00:11.40-00:15.18
Visual content: Open-mouth expression, cheek-point gesture, and wink-like ending.
Motion behavior: Most demanding expression sequence in the clip.
Stability assessment: Highest visible drift in face shape and gesture cleanliness.
Viewer takeaway: The test ends by proving the creator's criticism.
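If you run this kind of test repeatedly, it helps to log observations against fixed time segments so drift can be compared across generations. A minimal Python sketch of the timeline above; the `Segment` structure and field names are illustrative, not part of any Kling API.

```python
# Hypothetical helper for logging a motion-control test timeline.
# Segment boundaries are taken from the shot-by-shot breakdown above.
from dataclasses import dataclass

@dataclass
class Segment:
    start: float    # seconds into the clip
    end: float      # seconds into the clip
    motion: str     # what the motion reference asks for
    stability: str  # observed result in the output

    @property
    def duration(self) -> float:
        return round(self.end - self.start, 2)

timeline = [
    Segment(0.00, 3.60, "light head and shoulder movement", "best identity match"),
    Segment(3.60, 7.20, "pointing and raised fists", "hands exaggerate, face softens"),
    Segment(7.20, 11.40, "chin-on-hand pose and bounces", "mouth and cheeks weaken"),
    Segment(11.40, 15.18, "open mouth and wink", "highest visible drift"),
]

# Sanity check: segments should tile the full 15.18 s clip with no gaps.
assert all(a.end == b.start for a, b in zip(timeline, timeline[1:]))
print(f"total: {sum(s.duration for s in timeline):.2f} s")  # → total: 15.18 s
```

Keeping the boundaries identical across test runs makes it easy to say "segment 2 improved, segment 4 still breaks" instead of comparing whole clips by feel.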

Why This Format Spreads

It is more credible than a pure hype post

Audiences are saturated with flawless AI demos. A creator showing where the tool fails feels more trustworthy and more worth saving.

The woman on screen is strong enough to hold attention

Even before viewers analyze the defects, the clip has enough beauty-shot appeal and gesture variety to stop the scroll.

The visual structure explains itself in under a second

Reference image plus motion image plus result is a clean educational format. People instantly understand what is being tested.

The post invites debate

Creators who use AI video tools already have opinions about consistency, error rates, and production readiness. This clip gives them something concrete to argue about.

The keyword CTA converts curiosity into comments

Offering the reference videos in direct messages gives viewers a practical reason to engage, not just react.

Failure Analysis

Identity holds only when motion stays shallow

The strongest frames are the early neutral ones. Once the motion reference asks for larger gesture changes, the face no longer stays locked to the source identity.

Hands are still a weak point

Pointing toward camera and raising both fists create depth cues that the model struggles to render cleanly. The result is not always grotesque, but it is not production-safe either.

Mouth animation degrades quickly

Open-mouth frames and speech-like expressions make the lower face less stable. That is why the clip feels acceptable as a silent social test and weak as a polished talking-head replacement.

High generation failure rate matters as much as visible drift

The caption states that most attempts fail before even producing a usable clip. That means workflow reliability, not just visual quality, is a major blocker for professional use.

This is why the creator frames it as social-content only

The output is still good enough for funny internet videos, reaction bait, and platform-native experiments. It is just not dependable enough for client-grade or story-critical scenes.

Prompt Breakdown

The scene succeeds because the environment is simple

A plain room, stable camera position, and single subject reduce the number of failure points. If this clip were set in a complex environment, the drift would likely be much more obvious.

The subject design is easy to track

Long dark hair, hoop earrings, pale skin, and a striped sweater create strong visual anchors that help preserve recognition for at least the first half of the video.

The overlay is part of the content package

Do not treat the left strip as decoration. It is the mechanism that makes the output educational and comment-worthy.

The honest framing is as important as the prompt

This post works because it is presented as a test, not as a breakthrough. The language around the clip turns technical weakness into valuable creator insight.

How to Recreate It

Step 1: Choose a source identity with strong visual anchors

Hair length, earrings, and a high-contrast sweater all help the model preserve continuity.

Step 2: Use a clean indoor setup

Keep the background simple and static so viewers can judge motion quality without distraction.

Step 3: Pair one portrait reference with one motion reference

The format is easiest to understand when the audience can see both inputs directly on screen.

Step 4: Escalate from easy gestures to harder expressions

Start with small head movement, then add pointing, fists, chin-rest poses, and mouth-heavy expressions to reveal where the model breaks.

Step 5: Keep the camera locked

Do not add pans or zooms. The point of the test is identity transfer under motion, not virtual cinematography.

Step 6: Publish the flaws, not just the best frame

Audiences learn more from seeing the drift than from seeing one polished still.

Step 7: Add a practical CTA

Offer the prompt, reference video, or setup notes in exchange for a comment keyword to turn critique into measurable engagement.
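The escalation idea in Step 4 can be written down as a reusable test plan so every tool gets stressed in the same order. A minimal sketch; the gesture names and difficulty scores are assumptions for illustration, not values from Kling or any real API.

```python
# Illustrative escalation ladder for a motion-control stress test.
# Difficulty scores are assumed rankings, easiest (1) to hardest (4).
GESTURE_LADDER = [
    ("small head movement", 1),
    ("pointing at camera", 2),
    ("raised fists", 2),
    ("chin-rest pose", 3),
    ("open-mouth expression", 4),
    ("wink", 4),
]

def plan(max_difficulty: int) -> list[str]:
    """Return the gestures to test, easiest first, up to a difficulty cap."""
    ordered = sorted(GESTURE_LADDER, key=lambda g: g[1])
    return [name for name, d in ordered if d <= max_difficulty]

print(plan(2))
# → ['small head movement', 'pointing at camera', 'raised fists']
```

Running the easy tier first also limits wasted generations: if identity already drifts at difficulty 2, there is little point burning failed attempts on mouth-heavy expressions.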

Growth Playbook

3 opening hook lines

  • Kling motion control still breaks the face once the gestures get real.
  • This AI video test starts strong and falls apart exactly where you'd expect.
  • Good enough for social media, not good enough for professional work yet.

4 caption templates

  1. Hook: "I tested Kling 2.6 motion control with a simple creator setup." Value: "The first seconds look promising, but the face and hands drift as the gestures get harder." Question: "Would you still use this for content?" CTA: "Comment ARIA if you want the reference videos."
  2. Hook: "This is why AI motion control still frustrates me." Value: "Identity is decent at low motion, then the anatomy starts softening." Question: "What tool should I compare next?" CTA: "Write ARIA below."
  3. Hook: "One of the clearest motion-control tests I've run this month." Value: "The overlay shows exactly what went in and why the output only partially works." Question: "Do you want the inputs?" CTA: "Type ARIA."
  4. Hook: "Useful for funny internet videos, not ready for serious production." Value: "The biggest issue is not only drift, but how often generations fail before you even get a usable take." Question: "Has that been your experience too?" CTA: "Comment ARIA."

Hashtag strategy

Broad: #AIVideo #KlingAI #MotionControl #AIContent. These catch general discovery around AI video tools.

Mid-tier: #Kling26 #AIVideoTest #MotionControlTest #CreatorTools. These fit the actual use case more closely.

Niche long-tail: #KlingMotionControl #AIFaceConsistency #AIVideoFailure #ReferenceToVideo. These target users specifically researching this workflow.
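The three tiers above can be kept as reusable data and assembled per post. A minimal sketch; the tier keys and helper name are illustrative, the tags themselves come from the strategy above.

```python
# Assemble tiered hashtags into one caption-ready block.
# Tag lists are copied from the strategy above; tier names are assumptions.
TIERS = {
    "broad": ["#AIVideo", "#KlingAI", "#MotionControl", "#AIContent"],
    "mid": ["#Kling26", "#AIVideoTest", "#MotionControlTest", "#CreatorTools"],
    "niche": ["#KlingMotionControl", "#AIFaceConsistency",
              "#AIVideoFailure", "#ReferenceToVideo"],
}

def tag_block(*tiers: str) -> str:
    """Join the selected tiers into a single space-separated tag line."""
    return " ".join(tag for t in tiers for tag in TIERS[t])

print(tag_block("broad", "niche"))
# → #AIVideo #KlingAI #MotionControl #AIContent #KlingMotionControl
#   #AIFaceConsistency #AIVideoFailure #ReferenceToVideo
```

Mixing one broad tier with one niche tier per post keeps discovery wide while still reaching people researching this exact workflow.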

FAQ

Why does the clip look best in the opening seconds?

Because the subject is nearly static, so the model has fewer chances to distort the face or anatomy.

What part of the motion causes the biggest problems?

Forward hand gestures, mouth-heavy expressions, and quick pose changes create the most visible drift.

Why keep the left-side reference strip visible?

It makes the test instantly understandable and turns the post into an educational breakdown instead of a vanity clip.

Is this format still worth posting if the result is imperfect?

Yes, because honesty about tool limitations often performs better than polished hype in creator communities.

Can this kind of output be used for paid professional projects?

Not reliably. It is safer for social experiments, trend posts, and humorous content than for client-critical production work.