Learning by playing (deepmind.com)
139 points by stablemap on March 1, 2018 | hide | past | favorite | 8 comments



I've been occasionally pestering family and some colleagues over the last six months to a year about the idea that the "playfulness" of children is why they learn so quickly, and that it is relevant to ML :P

Children build up a continuously more sophisticated "hypothesis" of how the world works, and are always inspired (by some neural process?) to explore the limits of that hypothesis by perturbing things in new little fun ways ... throw away that ball ... upwards, downwards, sideways ... bite it, taste it, sit on it, stand on it.

I guess there is some randomness in which new experiments get chosen, but they are also based on the continuously improved understanding of the world. Realizing that a ball is bouncy opens up a world of new "fun" experiments to try, and so on and so on.

Well, pretty obvious to every parent of course :D ... but in any case a really major and important area to learn from, I think.


From my understanding they can train the agent to accomplish a fairly complex task, by first "training" the agent to be able to accomplish simpler tasks that are perhaps only slightly related to the final goal. If you want to learn to run, first learn the more basic skills of balancing, standing up and walking. The agent appears to decide what task it wants to try to pursue, but still receives signals about all the tasks available to it, and has a way of planning and following through on what future tasks to carry out.
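To make that concrete for myself, here is a toy sketch of the idea as I understand it. It is not the paper's algorithm: the task names, the integer "skill level" state, the 50% success chance, and the epsilon-greedy scheduler are all made up for illustration. The two points it tries to show are that a scheduler picks which task to practise next, and that every transition is stored with the rewards of all tasks, not just the one being practised.

```python
import random
from collections import defaultdict

# Hypothetical task list: the last one is the hard "main" task.
TASKS = ["balance", "stand", "walk", "run"]
MAIN_TASK = TASKS[-1]

def rewards_for_all_tasks(state):
    """Toy reward vector: task number i pays 1.0 once the skill level exceeds i."""
    return {task: 1.0 if state >= level + 1 else 0.0
            for level, task in enumerate(TASKS)}

def step(state, task, rng):
    """Toy dynamics: practising a task only helps if the agent is exactly
    at the skill level that task builds on, and then only half the time."""
    level = TASKS.index(task)
    if state == level and rng.random() < 0.5:
        return state + 1
    return state

def choose_task(value, rng, epsilon=0.3):
    """Assumed epsilon-greedy scheduler over tasks (a stand-in, not the paper's)."""
    if rng.random() < epsilon:
        return rng.choice(TASKS)
    return max(TASKS, key=lambda t: value[t])

def run(episodes=5, steps=20, seed=0):
    rng = random.Random(seed)
    replay = []                  # transitions labelled with ALL task rewards
    value = defaultdict(float)   # running mean of main-task reward per scheduled task
    count = defaultdict(int)
    for _ in range(episodes):
        state = 0                # every episode starts with no skills
        for _ in range(steps):
            task = choose_task(value, rng)
            next_state = step(state, task, rng)
            rewards = rewards_for_all_tasks(next_state)
            # Store the signal for every task, so data gathered while
            # practising one task can be reused when learning the others.
            replay.append((state, task, next_state, rewards))
            # Score the scheduled task by how much main-task reward followed it.
            count[task] += 1
            value[task] += (rewards[MAIN_TASK] - value[task]) / count[task]
            state = next_state
    return replay, dict(value)

if __name__ == "__main__":
    replay, value = run()
    print(len(replay), "transitions stored")
    print("scheduler estimates:", value)
```

A real version would of course learn a policy per task from the replay data instead of using a hand-written `step`, but the bookkeeping has the same shape.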

I wonder if there's any way to automatically generate useful, simple goals?


Co-author of the paper here. That is a good high-level summary of the approach :).

Generating useful "simple"/"low-level" goals automatically indeed is an interesting avenue for pushing this further.


Hi Tobias, thanks for dropping into the thread.

When this kind of software becomes mature enough that a consultant can install a robotic arm on a factory line, and quickly (several hours) train it to do the job of a factory line worker, there will be a massive economic incentive to do so.

How far do you think we are from this level of maturity? What are the remaining steps required to reach that level?


Maybe something akin to this: https://blog.openai.com/interpretable-machine-learning-throu...

My thinking is you could train the first generation without simpler tasks but it would pick up crude skills along the way. Have the first generation generate tasks that convey the basic skills it learned and use that to teach the second generation.

Edit: Or just feed the generated tasks back in and not train a new neural net...


I know this has to do with machine learning for robots, but play was central to Plato's Socratic method, and I vaguely remember that in the Republic it was suggested that children start to learn by playing. Perhaps we should call him the philosopher Play-Doh. (Sorry, I couldn't resist. I'll show myself the door.)


Wait a second, it's pronounced Plateau, like what a learning agent would reach eventually.

The long O is obviously French. And actually it's transliterated Platon (Πλάτων), but who knows, maybe the English knew that ν (nu) and υ (upsilon) are easy to confuse. I was joking, but hey, who knew that reading the wiki article on "ν" I would discover that it was on occasion optional, used to link two words.


Truly fascinating, I really commend the article for being easy to... grasp. The virtual animations and the video (https://www.youtube.com/watch?v=mPKyvocNe_M) of the real-life robot arm really made the concept crystal clear.

I bet it was exciting to see SAC-X graduate to each concept.

Did it ever surprise you?



