The fact that the brain uses very little power and yet manages to solve really hard problems means that whatever it's doing is very efficient. The fact that ANNs need terabytes of data and petaflops of processing power and still can only show rudimentary aptitude in mechanical tasks means they're not very efficient at all. Not that anyone ever called ANNs "efficient" (I'm not talking about backprop- but about iterating through covariance matrices). But if they were as efficient as the brain, they'd now be way, way smarter than us.
We know from undergraduate Comp Sci that there are problems that simply cannot be solved except with efficient algorithms. The fact that the brain is doing something terribly efficient is a big hint that whatever it's doing requires that kind of efficiency (because evolution, hand waving, hand waving). ANNs are nothing like that - they're practically brute force.
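To put rough numbers on the brute-force point (my own toy arithmetic, nothing from the thread): an exponential-time search stops being feasible around the point where a polynomial-time algorithm is still trivial.

    # Toy arithmetic (illustrative numbers only): exponential "try everything"
    # search vs. a polynomial-time algorithm on the same problem size.
    for n in (20, 40, 60, 80):
        brute = 2 ** n      # brute-force search over all combinations
        efficient = n ** 3  # a hypothetical polynomial-time algorithm
        print(f"n={n:>2}: brute force ~{brute:.1e} steps, efficient ~{efficient:.1e} steps")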
So how then can anyone expect that we're going to solve the hard problems the brain can, with ANNs?
This is just an outdated view of the state-of-the-art. It's understandable, given that it's outdated by maybe six months, iff you're willing to go with preprints.
A hobbyist looking for something plug-and-play will still generally want lots of data; the cutting edge is not exactly "curl|bash"-able. But the papers coming out this year have been dispatching what I thought would be entire areas of study in a dozen pages, one after another after another.
Not only do I think it's a "when" and not an "if", I think the timelines people throw around date to "ancient" times - meaning, a few years ago. Given where we are right now, what we should be asking is whether "decades" should be plural.
I don't see how your endearing enthusiasm is supported by the paper you reference.
It's a paper, so I won't be doing it justice by tl;dr'ing it in three sentences, but in short:
a) One-shot/meta-learning is not a new thing; the paper references work by Seb. Thrun from 1998 [1]. Hardly a six-month-old revolution that's taking the world by storm.
b) There are serious signs that they are overfitting like crazy, and
c) their approach requires few examples but they must be presented hundreds of thousands of times before performance improves. That's still nowhere near the speed or flexibility of human learning.
Also, did you notice they had to come up with a separate encoding scheme, because "learning the weights of a classifier using large one-hot vectors becomes increasingly difficult with scale" [2]? I note that this is a DeepMind paper. If something doesn't scale for them, you can bet it doesn't scale, period.
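For a back-of-the-envelope sense of why that happens (my own sketch, not the paper's actual encoding, and the hidden-layer width is an arbitrary made-up number): any weight matrix that touches a one-hot vector grows linearly with the number of classes, while a fixed-length code stays small.

    # Rough sketch (mine, not the paper's scheme): parameter count of a layer
    # reading/writing one-hot label vectors vs. a fixed-length (e.g. binary) code.
    import math

    hidden = 200  # hypothetical hidden-layer width, chosen for illustration
    for n_classes in (10, 1_000, 100_000):
        one_hot_dim = n_classes                     # one slot per class
        code_dim = math.ceil(math.log2(n_classes))  # compact code length
        print(f"{n_classes:>7} classes: one-hot weights = {hidden * one_hot_dim:,}, "
              f"fixed-code weights = {hidden * code_dim:,}")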
So, not seeing how this is heralding the one-shot-learning/meta-learning revolution that I think you're saying it does.
___________
[1] Their reference is: Thrun, Sebastian. Lifelong learning algorithms. In Learning to learn, pp. 181–209. Springer, 1998.
[2] Things are bad enough that they employ this novel encoding even though it does not ensure that a class will not be shared across different episodes, which will have caused some "interference". This is a really bad sign.
"This is just an outdated view of the state-of-the-art. It's understandable, given that it's outdated by maybe six months, iff you're willing to go with preprints."
That's an understandable, but probably incorrect, view that comes from focusing too much on claims in state-of-the-art publications without the wider context of history & brain function. The problem the parent is referring to also includes the general "common sense" we build up over time from an extremely diverse set of experiences - developed despite tons of curveballs, and with the ability to throw curveballs of our own. New knowledge is incorporated into that framework pretty smoothly. An early attempt to duplicate that was the Cyc project's database of common sense. Per Minsky, there have been maybe just five or six such efforts total, with most AI researchers not thinking common sense is important. Those last words alone told me to be pessimistic.
Whereas the only computer capable of doing what they are trying to do uses a diverse set of subsystems specialized to do their jobs well. A significant amount of it seems dedicated to establishing common sense that ties all experiences together. The architecture is capable of long-term planning, reacting to stuff, and even doing nothing when that makes sense. It teaches itself these things based on sensory input. It does it all in real-time with what appears to be a mix of analog and digital-like circuits, in a tiny amount of space and on a tiny energy budget. And despite all this, it still takes over a decade of diverse training data to become effective enough to do stuff like design & publish ANN schemes. :)
There's hardly anything like the brain being done in the ANN research I've seen. The cutting-edge stuff that's made HN is a pale imitation with a small subset of capabilities, trying to make one thing do it all. The pre-print you posted is also rudimentary compared to what I described above. Interestingly, the brain also makes use of feedback designs, whereas most of what I see shared here (like in the late 90's) is feed-forward, as if trying to avoid exploring the most effective technique that already solved the problem. The linked paper included.
They all just seem to be going in fundamentally wrong directions. Such directions will lead to nice local maxima but miss the global maximum by a long shot. Might as well backtrack while they're ahead if they want the real thing.
Sorry - should have quoted the specific thing I was calling out-of-date, which was from a few comments up-thread:
> Brains (read: humans) can learn from very few examples and in very little time. [...] Those are all things that ANNs have proven quite incapable of doing, as has any other technology you might want to think of.
Note that I said "with reasonable efficiency", not "with some huge number of inputs".
That's because we don't need to represent arbitrary functions; we only need to represent the class of functions that can be efficiently represented in a human brain, and that's pretty much the same class. Note that we can also implement any boolean component with only a few neurons in an NN, and using an RNN gives us working memory as well, so we can implement any sort of digital processor with reasonable efficiency in an RNN (where "reasonable efficiency" means "a linear multiple of the number of components in the original circuit").
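A toy sketch of that claim (my own illustration, not from the paper discussed upthread): a single threshold neuron computes NAND, which is universal for boolean circuits, and a one-neuron recurrence gives you a set/reset latch, i.e. a bit of working memory.

    # Toy sketch (mine, not from the linked paper): NAND from one threshold
    # neuron, and a one-bit memory cell from one recurrent threshold neuron.
    def step(z):
        return 1 if z > 0 else 0

    def nand(x1, x2):
        # weights [-2, -2], bias +3: fires unless both inputs are 1
        return step(-2 * x1 - 2 * x2 + 3)

    def latch(set_bits, reset_bits):
        # the neuron's own output feeds back with weight 1; set/reset override it
        state, history = 0, []
        for s, r in zip(set_bits, reset_bits):
            state = step(1 * state + 2 * s - 2 * r - 0.5)
            history.append(state)
        return history

    print([nand(a, b) for a in (0, 1) for b in (0, 1)])           # -> [1, 1, 1, 0]
    print(latch(set_bits=[1, 0, 0, 0], reset_bits=[0, 0, 1, 0]))  # -> [1, 1, 0, 0]

Since NAND is universal and the latch holds state across time steps, chaining these is in principle enough to emulate any digital circuit, at a cost linear in the number of gates - which is all the "reasonable efficiency" parenthesis above is claiming.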
>> That's because we don't need to represent arbitrary functions; we only need to represent the class of functions that can be efficiently represented in a human brain, and that's pretty much the same class.
The problem is that to learn a function from examples you need the right kind of examples, and for human cognitive faculties it's very hard to get that.
For instance, take text- text is a staple in machine-learning models of language... but it is not language. It's a bunch of symbols that are only intelligible in the context of an already existing language faculty. In other words, text means nothing unless you already understand language, which is why although we can learn pretty good models of text, we haven't made much progress in learning models of language. Computers can generate or recognise language pretty damn well- but when it comes to understanding it... Well, we haven't even convincingly defined that task, let alone being able to train anything, RNN or whatever, to perform it.
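To make the "model of text vs. model of language" distinction concrete, here's a toy word-bigram model (my own throwaway example, not anything from the literature): it captures surface statistics of the symbols while understanding nothing at all.

    # Throwaway illustration (mine): a word-level bigram "model of text".
    # It learns which symbol tends to follow which, and can generate
    # plausible-looking strings, without anything resembling understanding.
    import random
    from collections import defaultdict

    def train_bigram(text):
        words = text.split()
        table = defaultdict(list)
        for a, b in zip(words, words[1:]):
            table[a].append(b)   # record each observed successor
        return table

    def generate(table, start, n=10):
        out = [start]
        for _ in range(n):
            followers = table.get(out[-1])
            if not followers:
                break
            out.append(random.choice(followers))
        return " ".join(out)

    corpus = "the brain uses very little power and the brain solves very hard problems"
    print(generate(train_bigram(corpus), "the"))

Swap the bigram table for an RNN and you get a much better model of the same kind - but it's still a model of the text, trained only on the symbols.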
You can see similar issues with speech or image processing, where btw RNNs have performed much better than with language.
So, just because RNNs can learn functions in principle it doesn't mean that we can really reproduce human behaviour in practice.