Maybe I'm misunderstanding the code, but it looks like it's matching audio to vi...

derimagia · on Oct 21, 2018

I didn't take a deep dive into the code, but in order to train, it's going to need to be fed audio files along with the actual video/mouth shapes/etc. Essentially, it needs the audio to tell what reward to give back (i.e. whether it was right). Once it "learns", it wouldn't need the audio file.
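
To make the train-vs-use split concrete, here is a toy sketch in Python. It is not the linked project's code: the two "mouth features" (width, openness), the labels, and the nearest-centroid classifier are all assumptions swapped in purely for illustration. The point is only that the labels (which would come from the audio or a transcript) are consumed during training, while prediction takes mouth features alone.

```python
# Toy illustration only: NOT the linked project's code.
# Features, labels, and the nearest-centroid model are hypothetical.
from collections import defaultdict


def train(samples):
    """Build one centroid per label.

    samples: list of ((width, openness), label) pairs, where the label
    is assumed to come from the audio track or a transcript.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for (x, y), label in samples:
        sums[label][0] += x
        sums[label][1] += y
        counts[label] += 1
    return {
        label: (s[0] / counts[label], s[1] / counts[label])
        for label, s in sums.items()
    }


def predict(centroids, features):
    """Return the label of the closest centroid; no audio is used here."""
    x, y = features
    return min(
        centroids,
        key=lambda label: (centroids[label][0] - x) ** 2
        + (centroids[label][1] - y) ** 2,
    )


# Training: mouth features paired with labels derived from audio.
training_data = [
    ((0.9, 0.2), "ah"),
    ((0.8, 0.3), "ah"),
    ((0.1, 0.7), "oo"),
    ((0.2, 0.8), "oo"),
]
model = train(training_data)

# Inference: mouth features only.
print(predict(model, (0.85, 0.25)))
```

A real lipreading model would learn from sequences of video frames rather than two hand-picked numbers, but the data flow would presumably be the same: audio-derived labels go in at training time and are absent at prediction time.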

pavs · on Oct 21, 2018

In order to train, doesn't it have to match audio output to a video of mouth movement?

Doesn't deep learning imply training on sample results?

person_of_color · on Oct 21, 2018

Exactly. How is this "lipreading"? Clickbait.