
This sounds ambitious.

I wonder if they can also address the following problem. Currently, deep learning toolkits need thousands of training images to learn to classify images of, e.g., dogs and cats. A human, in contrast, could learn the difference between a dog and a cat by looking at just a single example (or perhaps a few). So right now, deep learning is too much "simple" pattern matching, and too little real "AI".




I'm not convinced that a person who's never seen animals before could tell all future dogs and cats apart from a single training example. Humans draw upon a lifetime of learning and experience to achieve this 'one-shot learning' capability.

If you take a pre-trained convnet (which, by analogy, is like a person who has had 'life experience' of looking at objects) and extract activations for unseen object categories, in many cases you CAN one-shot-learn these new categories. Try feeding the activations into an SVM, or use the L2 distance between test images and the one-shot exemplar image.
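To make that concrete, here's a minimal sketch of the pipeline described above, assuming PyTorch/torchvision. The backbone choice (ResNet-18), the file paths, and the nearest-exemplar rule are illustrative assumptions, not anything specific from the comment:

    # One-shot classification using a pre-trained convnet as a feature
    # extractor: classify a query image by the L2-nearest exemplar in
    # feature space. Paths and model choice are placeholders.
    import torch
    import torchvision.models as models
    import torchvision.transforms as T
    from PIL import Image

    # Pre-trained ResNet-18; replace the classifier head with Identity
    # so the forward pass returns 512-d penultimate-layer activations.
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()
    backbone.eval()

    # Standard ImageNet preprocessing.
    preprocess = T.Compose([
        T.Resize(256), T.CenterCrop(224), T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406],
                    std=[0.229, 0.224, 0.225]),
    ])

    def embed(path):
        # Map an image file to its feature vector.
        img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
        with torch.no_grad():
            return backbone(img).squeeze(0)

    # One exemplar per unseen category (hypothetical example paths).
    exemplars = {"dog": embed("dog_exemplar.jpg"),
                 "cat": embed("cat_exemplar.jpg")}

    def classify(path):
        # Pick the label whose exemplar has the smallest L2 distance.
        q = embed(path)
        return min(exemplars,
                   key=lambda name: torch.dist(q, exemplars[name]).item())

    print(classify("test_image.jpg"))

Swapping the nearest-exemplar rule for an SVM fit on the exemplar features works the same way; with a single exemplar per class the L2 rule is the simpler baseline.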

On top of this, there's a lot of work on memory-augmented nets and meta-learning for learning new categories on the fly.


I'd argue that it's less beneficial to learn new categories than it is to simply recognize when categories differ between samples.

For example, with bears, I personally know only of black bears and polar bears. I can be a little more detailed with fish, but with dogs there are dozens of "different" (easily recognizable) types within the same category of "dog".


Anyone who has raised a child would tell you that a human requires years of training, looking at countless examples, to learn the difference between a dog and a cat.

Of course, the disparity between deep neural nets and human brains remains unknown. A human learns the difference between a cat and a dog while simultaneously learning so many other things, yet a neural net learns only the difference between a cat and a dog. We don't know how much we don't know.


One-shot learning is such an active area of research that there's a long Wikipedia page[1] about it.

I think the SOTA is probably [2], which came out of DeepMind. There's still a way to go before it matches ResNet performance on ImageNet (or even human performance on any real task), though. A sketch of the episodic one-shot setup such methods train on follows the links.

[1] https://en.wikipedia.org/wiki/One-shot_learning

[2] https://arxiv.org/abs/1605.06065
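For context on what memory-augmented and meta-learning methods like [2] actually train on, here's a minimal sketch of episodic N-way, one-shot sampling, assuming plain Python over a list of (image, label) pairs; the function and variable names are illustrative, not the paper's actual code:

    # Build one N-way, 1-shot episode: for each of n_way sampled
    # classes, one labeled "support" exemplar and one held-out "query"
    # item that the learner must classify from the support set alone.
    import random
    from collections import defaultdict

    def make_episode(examples, n_way=5):
        # examples: list of (image, label) pairs; assumes at least
        # n_way classes with two or more examples each.
        by_label = defaultdict(list)
        for img, label in examples:
            by_label[label].append(img)
        eligible = [c for c, xs in by_label.items() if len(xs) >= 2]
        classes = random.sample(eligible, n_way)
        support, query = [], []
        for c in classes:
            shot, test = random.sample(by_label[c], 2)
            support.append((shot, c))  # the single one-shot exemplar
            query.append((test, c))    # to be classified from support
        return support, query

Training over many such episodes, each with freshly sampled classes, is what pushes these models to learn "how to learn" a category from one example rather than memorizing fixed classes.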



Keep in mind that it literally takes human beings years before they can perform basic intelligence tasks. I do agree that AI right now is too focused on pattern matching over large data sets, but DeepMind has definitely been exploring other ways to think about memory and attention in artificial neural networks, and those tend to be more biologically inspired.



