The fact that the brain uses very little power and yet manages to solve really hard problems means that whatever it's doing is very efficient. The fact that ANNs need terabytes of data and petaflops of processing power and still can only show rudimentary aptitude in mechanical tasks means they're not very efficient at all. Not that anyone ever called ANNs "efficient" (I'm not talking about backprop- but about iterating through covariance matrices). But if they were as efficient as the brain, they'd now be way, way smarter than us.
We know from undergraduate Comp Sci that there are problems that simply cannot be solved except with efficient algorithms. The fact that the brain is doing something terribly efficient is a big hint that whatever it's doing requires that kind of efficiency (because evolution, hand waving, hand waving). ANNs are nothing like that - they're practically brute force.
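To put rough numbers on the brute-force point (my own toy arithmetic, nothing from the thread): an exponential-time search stops being feasible around the point where a polynomial-time algorithm is still trivial.

    # Toy arithmetic (illustrative numbers only): exponential "try everything"
    # search vs. a polynomial-time algorithm on the same problem size.
    for n in (20, 40, 60, 80):
        brute = 2 ** n      # brute-force search over all combinations
        efficient = n ** 3  # a hypothetical polynomial-time algorithm
        print(f"n={n:>2}: brute force ~{brute:.1e} steps, efficient ~{efficient:.1e} steps")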
So how then can anyone expect that we're going to solve the hard problems the brain can, with ANNs?
This is just an outdated view of the state-of-the-art. It's understandable, given that it's outdated by maybe six months, iff you're willing to go with preprints.
A hobbyist looking for something plug-and-play will still generally want lots of data; the cutting edge is not exactly "curl|bash"-able. But the papers coming out this year have been dispatching what I thought would be entire areas of study in a dozen pages, one after another after another.
Not only do I think it's a "when" and not an "if", I think the timelines people throw around date to "ancient" times - meaning, a few years ago. Given where we are right now, what we should be asking is whether "decades" should be plural.
I don't see how your endearing enthusiasm is supported by the paper you reference.
It's a paper, so I won't be doing it justice by tl;dr'ing it in three sentences, but in short:
a) One-shot/meta-learning is not a new thing; the paper references work by Seb. Thrun from 1998 [1]. Hardly a six-month-old revolution that's taking the world by storm.
b) There are serious signs that they are overfitting like crazy, and
c) their approach requires few examples but they must be presented hundreds of thousands of times before performance improves. That's still nowhere near the speed or flexibility of human learning.
Also, did you notice they had to come up with a separate encoding scheme, because "learning the weights of a classifier using large one-hot vectors becomes increasingly difficult with scale" [2]? I note that this is a DeepMind paper. If something doesn't scale for them, you can bet it doesn't scale, period.
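For a back-of-the-envelope sense of why that happens (my own sketch, not the paper's actual encoding, and the hidden-layer width is an arbitrary made-up number): any weight matrix that touches a one-hot vector grows linearly with the number of classes, while a fixed-length code stays small.

    # Rough sketch (mine, not the paper's scheme): parameter count of a layer
    # reading/writing one-hot label vectors vs. a fixed-length (e.g. binary) code.
    import math

    hidden = 200  # hypothetical hidden-layer width, chosen for illustration
    for n_classes in (10, 1_000, 100_000):
        one_hot_dim = n_classes                     # one slot per class
        code_dim = math.ceil(math.log2(n_classes))  # compact code length
        print(f"{n_classes:>7} classes: one-hot weights = {hidden * one_hot_dim:,}, "
              f"fixed-code weights = {hidden * code_dim:,}")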
So, not seeing how this is heralding the one-shot-learning/meta-learning revolution that I think you're saying it does.
___________
[1] Their reference is: Thrun, Sebastian. Lifelong learning algorithms. In Learning to learn, pp. 181–209. Springer, 1998.
[2] Things are bad enough that they employ this novel encoding even though it does not ensure that a class will not be shared across different episodes, which will have caused some "interference". This is a really bad sign.
"This is just an outdated view of the state-of-the-art. It's understandable, given that it's outdated by maybe six months, iff you're willing to go with preprints."
That's an understandable, but probably incorrect, view that comes from focusing too much on claims in state-of-the-art publications without the wider context of history & brain function. The problem the parent is referring to also includes the general "common sense" we build up over time from an extremely diverse set of experiences - developed despite tons of curveballs, and with the ability to throw curveballs of our own. New knowledge is incorporated into that framework pretty smoothly. An early attempt to duplicate that was the Cyc project's database of common sense. Per Minsky, there have been maybe just five or six such efforts total, with most AI researchers not thinking common sense is important. Those last words alone told me to be pessimistic.
Whereas the only computer capable of doing what they are trying to do uses a diverse set of subsystems specialized to do their jobs well. A significant amount of it seems dedicated to establishing common sense that ties all experiences together. The architecture is capable of long-term planning, reacting to stuff, and even doing nothing when that makes sense. It teaches itself these things based on sensory input. It does it all in real-time with what appears to be a mix of analog and digital-like circuits, in a tiny amount of space and on a tiny energy budget. And despite all this, it still takes over a decade of diverse training data to become effective enough to do stuff like design & publish ANN schemes. :)
There's hardly anything like the brain being done in the ANN research I've seen. The cutting-edge stuff that's made HN is a pale imitation with a small subset of capabilities, trying to make one thing do it all. The pre-print you posted is also rudimentary compared to what I described above. Interestingly, the brain also makes use of feedback designs, whereas most of what I see shared here (like in the late 90's) is feed-forward, as if trying to avoid exploring the most effective technique that already solved the problem. The linked paper included.
They all just seem to be going in fundamentally wrong directions. Such directions will lead to nice local maxima but miss the global maximum by a long shot. Might as well backtrack while they're ahead if they want the real thing.
Sorry - should have quoted the specific thing I was calling out-of-date, which was from a few comments up-thread:
> Brains (read: humans) can learn from very few examples and in very little time. [...] Those are all things that ANNs have proven quite incapable of doing, as has any other technology you might want to think of.
Note that I said "with reasonable efficiency", not "with some huge number of inputs".
That's because we don't need to represent arbitrary functions; we only need to represent the class of functions that can be efficiently represented in a human brain, and that's pretty much the same class. Note that we can also implement any boolean component with only a few neurons in an NN, and using an RNN gives us working memory as well, so we can implement any sort of digital processor with reasonable efficiency in an RNN (where "reasonable efficiency" means "a linear multiple of the number of components in the original circuit").
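A toy sketch of that claim (my own illustration, not from the paper discussed upthread): a single threshold neuron computes NAND, which is universal for boolean circuits, and a one-neuron recurrence gives you a set/reset latch, i.e. a bit of working memory.

    # Toy sketch (mine, not from the linked paper): NAND from one threshold
    # neuron, and a one-bit memory cell from one recurrent threshold neuron.
    def step(z):
        return 1 if z > 0 else 0

    def nand(x1, x2):
        # weights [-2, -2], bias +3: fires unless both inputs are 1
        return step(-2 * x1 - 2 * x2 + 3)

    def latch(set_bits, reset_bits):
        # the neuron's own output feeds back with weight 1; set/reset override it
        state, history = 0, []
        for s, r in zip(set_bits, reset_bits):
            state = step(1 * state + 2 * s - 2 * r - 0.5)
            history.append(state)
        return history

    print([nand(a, b) for a in (0, 1) for b in (0, 1)])           # -> [1, 1, 1, 0]
    print(latch(set_bits=[1, 0, 0, 0], reset_bits=[0, 0, 1, 0]))  # -> [1, 1, 0, 0]

Since NAND is universal and the latch holds state across time steps, chaining these is in principle enough to emulate any digital circuit, at a cost linear in the number of gates - which is all the "reasonable efficiency" parenthesis above is claiming.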
>> That's because we don't need to represent arbitrary functions; we only need to represent the class of functions that can be efficiently represented in a human brain, and that's pretty much the same class.
The problem is that to learn a function from examples you need the right kind of examples, and for human cognitive faculties it's very hard to get that.
For instance, take text- text is a staple in machine-learning models of language... but it is not language. It's a bunch of symbols that are only intelligible in the context of an already existing language faculty. In other words, text means nothing unless you already understand language, which is why although we can learn pretty good models of text, we haven't made much progress in learning models of language. Computers can generate or recognise language pretty damn well- but when it comes to understanding it... Well, we haven't even convincingly defined that task, let alone being able to train anything, RNN or whatever, to perform it.
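To make the "model of text vs. model of language" distinction concrete, here's a toy word-bigram model (my own throwaway example, not anything from the literature): it captures surface statistics of the symbols while understanding nothing at all.

    # Throwaway illustration (mine): a word-level bigram "model of text".
    # It learns which symbol tends to follow which, and can generate
    # plausible-looking strings, without anything resembling understanding.
    import random
    from collections import defaultdict

    def train_bigram(text):
        words = text.split()
        table = defaultdict(list)
        for a, b in zip(words, words[1:]):
            table[a].append(b)   # record each observed successor
        return table

    def generate(table, start, n=10):
        out = [start]
        for _ in range(n):
            followers = table.get(out[-1])
            if not followers:
                break
            out.append(random.choice(followers))
        return " ".join(out)

    corpus = "the brain uses very little power and the brain solves very hard problems"
    print(generate(train_bigram(corpus), "the"))

Swap the bigram table for an RNN and you get a much better model of the same kind - but it's still a model of the text, trained only on the symbols.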
You can see similar issues with speech or image processing, where btw RNNs have performed much better than with language.
So, just because RNNs can learn functions in principle it doesn't mean that we can really reproduce human behaviour in practice.