
There is not enough reinforcement volume to learn anything complex.

If a child gets an external reward for every waking moment until she is 12 years old, that's just 4.2 million signals.
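
A quick back-of-the-envelope check of that figure. The comment does not say how a "moment" is counted; the sketch below assumes 16 waking hours per day and one reward signal per waking minute, which happens to reproduce roughly 4.2 million. Both numbers are assumptions, not something stated in the thread.

```python
# Assumptions (not stated in the comment): 16 waking hours per day,
# and one external reward signal per waking minute.
YEARS = 12
DAYS_PER_YEAR = 365
WAKING_HOURS_PER_DAY = 16
SIGNALS_PER_HOUR = 60  # one signal per minute

total_signals = YEARS * DAYS_PER_YEAR * WAKING_HOURS_PER_DAY * SIGNALS_PER_HOUR
print(f"{total_signals:,}")  # 4,204,800
```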

Reinforcement learning works for fine motor control and other tasks where the feedback loop is tight and immediate. Reinforcement and conditioning can also modulate high-level cognition and behavior, but they are not the secret sauce of learning.




I could not follow your argument. Could you elaborate? (Honest question)


Reinforcement learning is learning by interacting with an environment. An RL agent learns from the consequences of its actions.

Biological systems don't live long enough to get enough feedback to learn complex behavior through consequences. An animal or human must be able to generalize and categorize what it has learned correctly, without external feedback teaching it how to derive the function that does so.

For example, if you want to learn how to tie a complex knot by trial and error, you might have to try it a million times if you improve your behavior mainly through the consequences of your actions. In practice you probably try only 5-10 times before you learn to do it, and it involves pausing and looking at the problem. There is some kind of unsupervised model building happening that does not involve external input.


Ok, I see what you mean now.

But it seems like (or one could misunderstand you in a way that) you see those concepts as mutually exclusive. I would assume a combination of reinforcement learning and unsupervised (and supervised) learning.

Rats have been trained to detect landmines and then go back to their trainers and show them the mine. This is complex behaviour that was taught using reinforcement (at least at the top level). There will be some unsupervised learning going on in the rat's brain at a lower level. But it is complex behaviour, and it was learnt through reinforcement.


>see those concepts as mutually exclusive.

I certainly don't. Reinforcement or conditioning is part of it, but it's not the cake.



