This demo is doubly amazing: First, the obviously impressive (and slightly unsettling in its implications) manipulation of the target face, but second, the fact that this is all being done with a single RGB camera.
In terms of gaming applications this could be huge for virtual reality avatars. You could stay anonymous while still conveying facial expressions with a webcam!
This is insanely awesome. One use that comes to mind is dubbing movies and TV series; I'm very sensitive to mismatches between what's being said and the lip movements we see, to the point where I only watch movies in their native tongue - even if I don't know the language and need subtitles. This could be a game-changer.
I'm curious, how do you both read subtitles and check that the sound is properly synced to an actor's lips? I can't read that fast, so I spend 80% of my time "watching a movie" simply reading text at the bottom of the screen.
Very well done. The best part is how realistic they make the mouth and teeth look: they sample the mouth interior from earlier parts of the video and composite it onto the neutral still or loop. I was blown away by how real Trump's teeth looked, and then they explained this process and why it works.
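The retrieval step described above can be sketched roughly as a nearest-neighbor lookup: for each target expression, pick the previously seen source frame whose expression coefficients are closest, then reuse its mouth interior. This is a hypothetical simplification (the function name, the toy coefficient bank, and the plain L2 distance are my assumptions, not the paper's actual implementation):

```python
import numpy as np

def retrieve_mouth_frame(target_expr, source_exprs):
    """Return the index of the stored source frame whose expression
    coefficients are closest (L2 distance) to the target expression.
    A toy stand-in for mouth-interior retrieval in face reenactment."""
    dists = np.linalg.norm(source_exprs - target_expr, axis=1)
    return int(np.argmin(dists))

# Toy bank: 4 stored frames, each with 3 expression coefficients
# (e.g. jaw-open, smile, pucker) -- values are made up for illustration.
bank = np.array([[0.0, 0.1, 0.0],
                 [0.9, 0.0, 0.2],
                 [0.1, 0.8, 0.1],
                 [0.5, 0.5, 0.5]])

# The target expression is nearest to stored frame 2.
print(retrieve_mouth_frame(np.array([0.15, 0.75, 0.1]), bank))  # prints 2
```

The real system retrieves an actual image patch of the mouth for that frame and blends it into the rendered face, which is why the teeth look photographic rather than synthesized.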
This could be huge (yuuuge) in games and virtual spaces. Unreal 4 had a demo at GDC recently, and it seems we are approaching that era. [1]
One of the barriers in webcam meetings is the inability to make eye contact. It seems subtle, but I think it matters more than one might intuitively expect.
I've thought a lot about how to overcome this and came up with nothing better than cameras beneath the screen (which Apple seems to have worked on, though we have yet to see it: http://appleinsider.com/articles/09/01/08/apple_files_patent...). This technology could provide a competing solution.
The data-vacuuming companies (FB, Google, ad networks, etc.) collect the information about us required to know how to manipulate us, and technologies such as this one are the tools to actually get it done.
I don't think they want the world to see the source code in case it gets into the wrong hands, though the repercussions of this work are bound to hit us sometime. [1]
Consider the massive rig required to perform the at-the-time groundbreaking performance capture for the game L.A. Noire: http://i.kinja-img.com/gawker-media/image/upload/s--y6fmsAIU... This is how far computer vision has come in five years.