Machine learning researchers have produced a system that can recreate lifelike motion from just a single frame of a person’s face, opening up the possibility of animating not just photos but also paintings. It’s not perfect, but when it works, it is — like much AI work these days — eerie and fascinating.
The model is documented in a paper published by Samsung AI Center, which you can read here on arXiv. It's a new method of mapping the facial landmarks of a source face (any talking head will do) onto the facial data of a target face, making the target face do what the source face does.
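For the curious, here's roughly what "facial landmarks" look like in practice. The paper's own tracking pipeline isn't reproduced here; this is just an illustrative sketch using dlib's off-the-shelf 68-point predictor, one common way to pull this kind of data out of a video frame.

```python
# Illustrative sketch only: the Samsung paper has its own landmark tracker,
# but dlib's 68-point predictor shows what "facial landmarks" means here.
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # pretrained model file

frame = cv2.imread("driving_frame.jpg")          # one frame of the source talking head
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

for face in detector(gray):
    shape = predictor(gray, face)
    # 68 (x, y) points tracing the jaw, brows, eyes, nose and mouth.
    # This pose and expression information is what gets transferred to the target face.
    landmarks = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
```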
This in itself isn’t new — it’s part of the whole synthetic imagery issue confronting the AI world right now (we had an interesting discussion about this recently at our Robotics + AI event in Berkeley). We can already make a face in one video reflect the face in another in terms of what the person is saying or where they’re looking. But most of these models require a considerable amount of data, for instance a minute or two of video to analyze.
The new paper by Samsung’s Moscow-based researchers, however, shows that using only a single image of a person’s face, a video can be generated of that face turning, speaking and making ordinary expressions — with convincing, though far from flawless, fidelity.
It does this by frontloading the facial landmark identification process with a huge amount of data, making the model highly efficient at finding the parts of the target face that correspond to the source. The more images of the target it has, the better, but it can get by with just one (hence single-shot learning). That's what makes it possible to take a picture of Einstein or Marilyn Monroe, or even the Mona Lisa, and make it move and speak like a real person.
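To make the single-shot idea concrete, here is a hypothetical sketch of how such a pipeline hangs together. The class names, shapes and toy math below are placeholders, not the researchers' architecture; the point is that the target's identity is computed once from a single photo and then reused for every frame of the driving video.

```python
# Hypothetical sketch of a single-shot pipeline, not the paper's actual code.
# An "embedder" turns the one target photo into an identity vector; a "generator"
# renders a frame for each landmark image extracted from the driving video.
import torch
import torch.nn as nn

class Embedder(nn.Module):                      # placeholder; the real model is a deep CNN
    def forward(self, photo):
        return photo.mean(dim=(2, 3))           # toy "identity vector"

class Generator(nn.Module):                     # placeholder
    def forward(self, landmark_img, identity):
        return landmark_img + identity.view(1, -1, 1, 1)  # toy "rendered frame"

embedder, generator = Embedder(), Generator()

target_photo = torch.rand(1, 3, 256, 256)       # the single still image (say, the Mona Lisa)
identity = embedder(target_photo)               # computed once: this is the "single shot"

driving_landmarks = [torch.rand(1, 3, 256, 256) for _ in range(5)]   # one per source frame
video = [generator(lm, identity) for lm in driving_landmarks]        # one output frame each
```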

It also uses what's called a generative adversarial network, which essentially pits two models against one another, one trying to fool the other into thinking what it creates is "real." By this means the results meet a level of realism set by the creators: the "discriminator" model has to be, say, 90 percent sure this is a human face for the process to continue.
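Here is what that adversarial loop looks like in bare-bones PyTorch. It's a generic GAN training step for illustration, with toy networks, not the specific losses or models used in the paper.

```python
# Generic GAN step, illustrating the adversarial idea (not the paper's exact losses).
import torch
import torch.nn as nn

disc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 256 * 256, 1))   # toy discriminator
gen = nn.Sequential(nn.Linear(100, 3 * 256 * 256), nn.Tanh())     # toy generator
bce = nn.BCEWithLogitsLoss()
d_opt = torch.optim.Adam(disc.parameters(), lr=2e-4)
g_opt = torch.optim.Adam(gen.parameters(), lr=2e-4)

real = torch.rand(8, 3, 256, 256)                     # batch of real face crops
fake = gen(torch.randn(8, 100)).view(8, 3, 256, 256)  # batch of generated faces

# The discriminator learns to score real frames as "real" (1) and generated ones as "fake" (0).
d_loss = bce(disc(real), torch.ones(8, 1)) + bce(disc(fake.detach()), torch.zeros(8, 1))
d_opt.zero_grad(); d_loss.backward(); d_opt.step()

# The generator is rewarded when the discriminator is fooled into scoring its output as real.
g_loss = bce(disc(fake), torch.ones(8, 1))
g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```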
In the other examples provided by the researchers, the quality and obviousness of the fake talking heads vary widely. Some, which attempt to replicate a person whose image was taken from cable news, also recreate the news ticker shown at the bottom of the image, filling it with gibberish. And the usual smears and weird artifacts are omnipresent if you know what to look for.
That said, it’s remarkable that it works as well as it does. Note, however, that this only works on the face and upper torso — you couldn’t make the Mona Lisa snap her fingers or dance. Not yet, anyway.