Using Kling’s Image Generator For Consistent Characters


Okay, so lately I’ve been working on a thorough article about the current state of keeping a character consistent in AI videos by way of images. Everyone knows the pain is real:

meme-ai-characters — AI characters over time

But I’ve decided to split this large issue into smaller dedicated posts covering particular tools more in-depth. This one is about Kling’s image generator in this context.

The scoop:

Kling AI’s image gen keeps faces crazy consistent across images
Videos are just image sequences so this matters more than you'd think
The strength sliders are super useful but have limits
You can reuse video frames to get new angles and emotions
Kling’s Elements feature helps with clothes and scenes too

I think Kling’s image generator in general, and its face and subject reference feature in particular, is super underrated, especially for keeping a consistent character across videos. It’s surprisingly good at managing consistency in a series of images.

Videos Are Just Image Sequences

And what’s a video, really? It’s just a series of frames—basically images—playing one after another. That’s what creates animation, with subtle movement and scene changes between each frame. So yeah, a video is just a bunch of images.

When you give tools like Kling two keyframes, for example—one frame and the next—it fills in the gap with many, many intermediate images. That creates smoother animation. The more images between the keyframes (or “source” and “target”), the smoother the motion, and the better the result overall.
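To put rough numbers on that, here's a tiny sketch of the arithmetic. The 24 frames per second is my assumption for illustration, not a figure from Kling, and the function names are made up for this post.

```python
def total_frames(duration_s: float, fps: int = 24) -> int:
    """Number of still images in a clip of the given length (fps is an assumed value)."""
    return round(duration_s * fps)


def generated_in_betweens(duration_s: float, fps: int = 24) -> int:
    """Frames the tool has to invent between a start and an end keyframe."""
    return max(total_frames(duration_s, fps) - 2, 0)


for seconds in (5, 10):
    print(seconds, "s clip:", total_frames(seconds), "frames,",
          generated_in_betweens(seconds), "of them in-betweens")
```

The point is just the scale: even a short clip is a lot of individual images, and the face has to hold up in every one of them.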

The Magic (and Limits) of Kling's Image Generator's Face Consistency

Anyway, back to the image generator and keeping a consistent character.

It works really well—as long as you can feed it a good, high-quality reference image with a clear face. Once you've got that, it can generate an unlimited number of new images based on that reference.

Another example of the same face in new setting — Example of the same face in new setting generated by Kling

Now, there are some nuances to it. One of them is the strength sliders—those bars you can move back and forth—to control how strictly it sticks to the reference, the face and the subject.

klingai-reference-strength-settings-slider — Kling AI reference strength slider

By default, it sits at 42. That isn't enough for a precise likeness, only for a very similar face.

👉 I am using values between 70 and 100 on that scale

But, if you set the face adherence this high, it’ll limit variations. It'll basically just copy-paste that face, same angle and expression, into every image. That can be what you want—if you need exact facial consistency—but only if you’re okay with the same angle and expression every time.

So yeah, with one reference image, you'll get an endless supply of near-identical faces. You can change the background, outfit, setting, whatever, but the face will be almost copy-pasted. That’s not ideal for more complex works or longer videos.

That said, it’s not catastrophic either. Once a video starts animating, the next frame already introduces change—different angles, expressions, etc. But it can be a little jarring if every 5–10 seconds the face resets to the exact same expression.

You can reduce that strictness a bit by moving the slider left, but then you get more facial variation and a bunch of unintended changes: you start seeing similar people, not the same person.

By the way, when you first upload your reference image, the pop-up window lets you choose what to reference:

Face only
Subject
Entire image

Obviously, face is the most important to us, but if you want to also copy the whole figure and pose, then choose Subject, and play with those sliders for the subject reference strength.

kling-reference-subject-and-face — Sliders for face and subject reference

A Smarter Workflow: Reusing Frames

A better workaround is to build up a set of reference images from your first one. Make a short video from that first reference, then grab frames from the video where the character is at a new angle or showing a new expression. Use one of those frames as your next reference image. So: take the best first image of the face -> make a video from it -> grab stills from it.

subject-face-angles-workflow-inforgraphics — Getting more face angles and expressions from video
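If you'd rather not scrub through the video by hand, the frame grabbing can be scripted. Here's a minimal sketch, assuming you've downloaded the clip and have ffmpeg installed. The file name, the output folder and the half-second step are all placeholders of mine, and the script only builds the commands, it doesn't run them.

```python
from pathlib import Path


def still_timestamps(duration_s: float, step_s: float = 0.5) -> list[float]:
    """Evenly spaced timestamps inside the clip, skipping t=0 (that's the original reference)."""
    if step_s <= 0:
        raise ValueError("step_s must be positive")
    stamps = []
    i = 1
    while i * step_s < duration_s:
        stamps.append(round(i * step_s, 2))
        i += 1
    return stamps


def ffmpeg_grab_command(video: str, t: float, out_dir: str = "stills") -> list[str]:
    """Build (not run) an ffmpeg command that saves the frame at time t as a PNG."""
    out = Path(out_dir) / f"{Path(video).stem}_{t:06.2f}s.png"
    # -ss before -i seeks in the input; -frames:v 1 writes a single video frame.
    return ["ffmpeg", "-ss", f"{t:.2f}", "-i", video, "-frames:v", "1", str(out)]


# "audition.mp4" is a hypothetical file name for a downloaded 5-second clip.
for t in still_timestamps(5.0):
    print(" ".join(ffmpeg_grab_command("audition.mp4", t)))
```

To actually extract the stills, create the output folder first and pass each command list to `subprocess.run(cmd, check=True)`. You'll still want to pick the sharp frames with useful angles by eye.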

Ideally, your character is initially facing the camera, eyes open, mouth closed.

👉 You're better off starting with your subject's 'base', 'passport' face. I learnt this by making the mistake of starting with a more 'interesting' face with a smirky half-smile. It introduces bias further down the line.

Kling does a pretty great job animating faces. Just make sure the video is high-quality (use the pro mode, not the standard one) so the new face angles are sharp and detailed. Ask for simple, smooth movements, nothing too dynamic, as that can create face blur. To maximize emotional variance, prompt Kling with something to the effect of:

Static shot of the person auditioning for a movie as an actor. He displays various emotions to show his acting skills. He tilts his head and looks left, then right, then down, all the while showing such emotions as surprise, sadness, confusion, interest...

This is just one context you can give to focus the AI's attention on producing emotional expressions for you. Posing for a photoshoot in general works great for that. A 5-second video is likely only enough for a couple of expressions, and I recommend doing that rather than trying to cram several into a single 10-second clip.
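If you have a longer list of emotions to collect, you can split it into one short prompt per clip. Here's a rough sketch of that idea, built around the audition prompt above. The function name and the exact wording of the template are mine, so tweak them to taste.

```python
def audition_prompts(emotions: list[str], per_clip: int = 2) -> list[str]:
    """Split a list of emotions into one audition-style prompt per short clip."""
    if per_clip < 1:
        raise ValueError("per_clip must be at least 1")
    prompts = []
    for start in range(0, len(emotions), per_clip):
        chunk = emotions[start:start + per_clip]
        prompts.append(
            "Static shot of the person auditioning for a movie as an actor. "
            "The person slowly tilts their head and shows "
            + ", then ".join(chunk) + "."
        )
    return prompts


for prompt in audition_prompts(["surprise", "sadness", "confusion", "interest", "joy"]):
    print(prompt)
```

Each prompt then goes into its own 5-second generation, which costs more credits than one long clip but keeps every expression distinct.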

various-facial-expressions — Start collecting them in one folder on your computer

You can also use Kling’s Elements, not just image-to-video. Honestly, it’s gotten really good lately. And it lets you easily dress your characters in whatever you have in mind.

Kling AI Elements feature for consistency

Kling's Elements for Consistency Example: a short example of placing a character into a new setting while changing her clothes too.

So then the video shows them turn slightly, maybe a head tilt. That profile shot? That’s your new reference. That gives you fresh angles and expressions for the next video. You just repeat the process: new reference, new set of images, new animation.

Profile headshot as reference for new image

kling-subject-reference-example-side-by-side — Side by side view of reference image and new generation with new clothes and scene

More examples from this headshot:

Same woman in new clothes now driving a car — Same woman now driving a car

Here are more face variations and their derivatives.

Another face reference generation side by side comparison

Ok the girl's been working hard for us, let's send her on vacation before she organizes her union for a protest:

Kling's Virtual Try-On for Outfit Consistency or Change

Trying to describe clothing in detail is still largely futile, even if you get everything right down to the smallest detail, as there are always tons of variations. You realize that around attempt #25 of trying to get the sleeves right while the AI loses track of the collar.

I can't fully recommend the Virtual Try-On. You could try it, just don't expect too much: the output quality is kind of meh.

Virtual Try-On in Kling for outfit consistency

So basically, regarding outfits, when relying solely on Kling, I recommend just going with Elements (mentioned above) and making a video straight away by supplying the outfits you want. You can then extract frames from it and use those as new reference images, just like with faces. But this time you could make a video of a 'model posing for a photoshoot wearing these outfits, slowly changing poses' (or assuming the specific pose you need for your next video).

Some Examples of Video Made With Consistent Subject Images

Image to Video With Consistent Subject Example

Every image in this video was generated in Kling, starting from an initial image made in Recraft. You might notice, though, that I wasn't too strict about face accuracy; a more patient and thorough person could've done better.

A lot of image generation attempts went into making that one, as I was trying to get the clothes to match at first, among other things, before abandoning that fixation:

A series of generated similar images on Kling AI platform

You can also notice how all the faces there are angled the same way. It's what I told you about: the same expression gets stamped on over and over. This was my first attempt, and it's where I learnt that lesson. Produce various facial expressions first, before trying to generate final images for video production. I'm officially out of Kling credits for now 🤪 so I'm concentrating on other platforms for the time being.

So that, in a nutshell, is how you can work around the problem of consistent characters with Kling. There are other methods and tools that I'm using, which I'm going to cover in my next post, so stay tuned!

Published: Apr 14, 2025 at 7:45 AM