
Personally, I'm just a little impressed that you can train an active agent to play a game using old-fashioned supervised learning on screen states and controller states, rather than relying on "action-oriented" techniques like reinforcement learning or online learning, or even a recurrent model.

It really shows how simple many control tasks actually are!
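For concreteness, the data side of that can be as simple as logging screenshots alongside the current pad state while a human plays. A rough Python sketch, with tooling I'm assuming purely for illustration (PIL's ImageGrab for capture, pygame for the joystick; the capture region, sample rate, control layout, and file layout are all made up):

    import csv, os, time
    import pygame
    from PIL import ImageGrab

    os.makedirs("frames", exist_ok=True)
    pygame.init()
    pygame.joystick.init()
    pad = pygame.joystick.Joystick(0)
    pad.init()

    with open("controls.csv", "w", newline="") as f:
        writer = csv.writer(f)
        for i in range(1000):                 # record a short play session
            pygame.event.pump()               # refresh joystick state
            # emulator window region -- coordinates are invented here
            frame = ImageGrab.grab(bbox=(0, 0, 640, 480))
            frame.save(f"frames/{i:05d}.png")
            # stick X/Y plus a few buttons; the real layout will differ
            row = [i, pad.get_axis(0), pad.get_axis(1)] + \
                  [pad.get_button(b) for b in range(3)]
            writer.writerow(row)              # one (frame, controller-state) pair
            time.sleep(0.05)                  # ~20 samples per second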




This is exactly what I wondered about. So what exactly is the function you are training for? Is it basically like "if the screen (showing the track) looks like this, apply these controls"?


A more accurate description of the function would be "given this picture of the screen, what is the most likely key the author was pressing in this situation" - no goals, no values, no optimization, just learning to imitate the actions performed by a human.
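In other words, it's plain supervised learning from pixels to controller state. A minimal behavioral-cloning sketch in PyTorch (my framework choice, not the article's; the network shape, input resolution, output vector, and MSE loss are all illustrative assumptions):

    import torch
    import torch.nn as nn

    class PilotNet(nn.Module):
        def __init__(self, n_outputs=5):      # e.g. stick X/Y + 3 buttons (assumed layout)
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
                nn.Flatten(),
            )
            # work out the flattened feature size once with a dummy forward pass
            with torch.no_grad():
                n_feat = self.features(torch.zeros(1, 3, 66, 200)).shape[1]
            self.head = nn.Sequential(
                nn.Linear(n_feat, 128), nn.ReLU(),
                nn.Linear(128, n_outputs),
            )

        def forward(self, x):                 # x: batch of downscaled screen frames
            return self.head(self.features(x))

    model = PilotNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()

    # placeholder batch; in practice this is the recorded (frame, controls) data
    frames = torch.rand(32, 3, 66, 200)
    controls = torch.rand(32, 5)

    for step in range(100):
        opt.zero_grad()
        loss = loss_fn(model(frames), controls)   # "what would the human press here?"
        loss.backward()
        opt.step()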

Coincidentally, one of the neural network components in AlphaGo (the supervised-learning policy network) did pretty much the same thing, i.e. it attempted to guess what a human player would usually play in a given situation, purely from the board position and nothing else.


In TFA it says that he was training a supervised learner to predict the control state from the screen state. So yes, "if the screen looks like this, apply these controls", and that can play Mario Kart 64.
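And at play time the loop just inverts the recording step: grab the screen, run the network, emit the predicted controls. A sketch continuing the PilotNet example above; how the controls actually reach the emulator depends entirely on the setup, so that part is a hypothetical stub:

    import torch
    from PIL import ImageGrab
    from torchvision import transforms

    preprocess = transforms.Compose([
        transforms.Resize((66, 200)),   # must match the training resolution assumed above
        transforms.ToTensor(),
    ])

    def send_to_emulator(controls):
        # hypothetical stub: a real setup would drive a virtual joystick
        # that the emulator reads; printing stands in for that here
        print(controls)

    model = PilotNet()                  # the network from the training sketch above
    model.load_state_dict(torch.load("pilot.pt"))
    model.eval()

    with torch.no_grad():
        while True:
            frame = ImageGrab.grab(bbox=(0, 0, 640, 480))   # same assumed window region
            x = preprocess(frame.convert("RGB")).unsqueeze(0)
            send_to_emulator(model(x).squeeze(0).tolist())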



