> This is not just a matter of scaling up to larger networks.
> The answer is simply that our brains are more than neural networks.
At the risk of wasting time arguing against mysticism, there is no evidence for either of these statements. (Well, the latter is technically true, but not in the way I think you mean. There's no particular reason an NN couldn't do anything a brain does.) The only thing we can say with confidence is that the OP's model focuses more on rhyme than content, which is true for a lot of popular rappers as well.
>> There's no particular reason an NN couldn't do anything a brain does.
Brains (read: humans) can learn from very few examples and in very little time. Despite that, we learn a rich context that is flexible enough to constantly incorporate new knowledge and general enough to transfer learning across diverse domains.
Those are all things that ANNs have proven quite incapable of doing, as has any other technology you might want to think of.
You don't need to reach for a mystical explanation, either. Our technology is far, far less advanced than the current hype cycle will have you believe. Thinking that we can reproduce the function of the human brain with computers is what is the real mystical belief.
our brain is a 20W meat computer. i'd say that believing it can't be reproduced is quite mystical indeed. it's a matter of time; it'll be a 20MW factory-sized supercomputer at first, but it'll be done. not saying it'll happen in this decade, or the next, but this century must be it, assuming humanity makes it to 2100s.
I agree it can be reproduced over time. I encourage that they do everything they can to do that. It's worth a Manhattan project just because of all the side benefits it will probably lead to. We now have computation and storage for it, too. One ANN scheme even printed neurons in analog form on an entire wafer then packed the whole thing without cutting it. Because brain-like architectures let you do such things. :)
Now for the problem: that's not what most of them are doing. Instead, they're intentionally avoiding how the brain does reasoning and asynchronous/analog implementation to devise weaker techniques built on synchronous, digital implementations in tinier spaces. They try to make up for this weakness by throwing massive amounts of computation at it but it's already clear the algorithms themselves are what has to change. Ideally, we'd start experimenting with every version of the brain's own algorithms and structures for specific types of activities in brain structures we're pretty sure perform such activities. We might accidentally discover the right stuff for certain problems. Tie them together over time.
That's not what they're doing though. So, they will have to independently invent an entirely new scheme that matches the brain's capabilities with techniques mostly opposite of what it relied on for those capabilities. Looks like a losing proposition to me. They might achieve it but I'd rather the money clone the brain or its architectural style.
The fact that the brain uses very little power and yet manages to solve really hard problems means that whatever it's doing is very efficient. The fact that ANNs need terabytes of data and petaflops of processing power and still can only show rudimentary aptitude in mechanical tasks means they're not very efficient at all. Not that anyone ever called ANNs "efficient" (I'm not talking about backprop- but about iterating through covariance matrices). But if they were as efficient as the brain, they'd now be way, way smarter than us.
We know from undergraduate Comp Sci that there are problems that can simply not be solved, except with efficient algorithms. The fact that the brain is doing something terribly efficient is a big hint that whatever it's doing requires it to be that (because evolution hand waving hand waving). ANNs are nothing like that - they're practically brute force.
So how then can anyone expect that we're going to solve the hard problems the brain can, with ANNs?
This is just an outdated view of the state-of-the-art. It's understandable, given that it's outdated by maybe six months, iff you're willing to go with preprints.
A hobbyist looking for something plug-and-play will still generally want lots of data; the cutting edge is not exactly "curl|bash"-able. But the papers coming out this year have been dispatching what I thought would be entire areas of study in a dozen pages, one after another after another.
Not only do I think it's a "when" and not an "if", I think the timelines people throw around date to "ancient" times - meaning, a few years ago. Given where we are right now, what we should be asking is whether "decades" should be plural.
I don't see how your endearing enthusiasm is supported by the paper you reference.
It's a paper, so I won't be doing it justice by tl;dr'ing it in three sentences but, in short:
a) One-shot/ meta learning is not a new thing; the paper references work by Seb. Thrun from 1998 [1]. Hardly a six-month old revolution that's taking the world by storm.
b) There are serious signs that they are overfitting like crazy, and
c) their approach requires few examples but they must be presented hundreds of thousands of times before performance improves. That's still nowhere near the speed or flexibility of human learning.
Also, did you notice they had to come up with a separate encoding scheme, because "learning the weights of a classifier using large one-hot vectors becomes increasingly difficult with scale" [2]? I note that this is a DeepMind paper. If something doesn't scale for them you can betcha it doesn't scale, period.
So, not seeing how this is heralding the one-shot-learning/ meta-learning revolution that I think you're saying it does.
___________
[1] Their reference is: Thrun, Sebastian. Lifelong learning algorithms. In Learning to learn , pp. 181–209. Springer, 1998.
[2] Things are bad enough that they employ this novel encoding even though it does not ensure that a class will not be shared across different episodes, which will have caused some "interference". This is a really bad sign.
"This is just an outdated view of the state-of-the-art. It's understandable, given that it's outdated by maybe six months, iff you're willing to go with preprints."
That's an understandable, but probably incorrect, view that comes from focusing on claims in state-of-the-art publications too much without the wider context of history & brain function. The problem parent is referring to also includes the general, "common sense" that we build up over time with extreme diversity of experiences that is developed despite tons of curveballs & able to create them ourselves. New knowledge is incorporated into that framework pretty smoothly. An early attempt to duplicate that was Cyc project's database of common sense. There's maybe just five or six total per Minsky with most AI researchers not thinking it's important. Those last words told me to be pessimistic already.
Whereas, the only computer capable of doing what they are trying to do uses a diverse set of subsystems specialized to do their jobs well. A significant amount of it seems dedicated to establishing common sense tying all experiences together. The architecture is capable of long-term planning, reacting to stuff, and even doing nothing when that makes sense. It teaches itself these things based on sensory input. It does it all in real-time with what appears to be a mix of analog and digital-like circuits in a tiny amount of space and energy. And despite this, it still takes over a decade of diverse training data to become effective enough to do stuff like design & publish ANN schemes. :)
There's hardly anything like the brain being done in ANN research that I've seen. The cutting-edge stuff that's made HN is pale imitation with small subset of capabilities trying to make one thing do it all. The pre-print you posted is also rudimentary compared to what I described above. Interestingly, the brain also makes use of feedback designs where most I see shared here (like in the late 90's) was feed-forward as if trying to avoid exploring the most effective technique that already solved the problem. Like the linked paper did.
They just all seem to be going in fundamentally wrong directions. Such directions will lead to nice, local maxima but miss the global maxima by a long shot. Might as well backtrack while they're ahead if they want the real thing.
Sorry - should have quoted the specific thing I was calling out-of-date, which was from a few comments up-thread:
> Brains (read: humans) can learn from very few examples and in very little time. [...] Those are all things that ANNs have proven quite incapable of doing, as has any other technology you might want to think of.
Note that I said "with reasonable efficiency", not "with some huge number of inputs".
That's because we don't need to represent any function; we need to represent the class of functions that can be efficiently represented in a human brain as well, which is pretty much the same. Note that we can also implement any boolean component with only a few neurons in an NN, and using an RNN gives us working memory as well, so we can implement any sort of digital processor with reasonable efficiency in an RNN (where "reasonable efficiency" means "a linear multiple of the number of components in the original circuit").
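To make the "few neurons per boolean component" point concrete, here's a rough sketch using hand-picked weights on simple threshold units. The names (`neuron`, `half_adder`) and the specific weights are mine, chosen for illustration only; nothing here is learned, and it says nothing about how a trained NN would arrive at such weights.

```python
def neuron(weights, bias):
    """Build one threshold unit: outputs 1 when sum(w*x) + bias > 0, else 0."""
    def fire(*inputs):
        total = sum(w * x for w, x in zip(weights, inputs))
        return int(total + bias > 0)
    return fire

# One neuron per basic gate (weights picked by hand, not trained).
AND = neuron([1, 1], -1.5)
OR = neuron([1, 1], -0.5)
NOT = neuron([-1], 0.5)
NAND = neuron([-1, -1], 1.5)

def XOR(a, b):
    # XOR is not linearly separable, so it takes two layers: three neurons total.
    return AND(OR(a, b), NAND(a, b))

def half_adder(a, b):
    """Return (sum_bit, carry_bit) built only from threshold neurons."""
    return XOR(a, b), AND(a, b)

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, half_adder(a, b))
```

Each gate costs a constant number of neurons (one for AND/OR/NOT/NAND, three for XOR), which is where the "linear multiple of the number of components" figure comes from. Adding recurrence for state is a separate step that this sketch doesn't cover.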
>> That's because we don't need to represent any function; we need to represent the class of functions that can be efficiently represented in a human brain as well, which is pretty much the same.
The problem is that to learn a function from examples you need the right kind of
examples, and for human cognitive faculties it's very hard to get that.
For instance, take text- text is a staple in machine-learning models of
language... but it is not language. It's a bunch of symbols that are only
intelligible in the context of an already existing language faculty. In other
words, text means nothing unless you already understand language, which is why
although we can learn pretty good models of text, we haven't made much progress
in learning models of language. Computers can generate or recognise language
pretty damn well- but when it comes to understanding it... Well, we haven't even
convincingly defined that task, let alone being able to train anything, RNN or
whatever, to perform it.
You can see similar issues with speech or image processing, where btw RNNs have
performed much better than with language.
So, just because RNNs can learn functions in principle it doesn't mean that we
can really reproduce human behaviour in practice.
Brains take 2+ years of constant training before they start to do much of anything we would associate with strong AI, and another 10 or so years of constant training before they can do anything worth spending money on. I'm not sure how you call that "very few examples". Brains do have some high-level inference facilities that work on smaller data sets, but the support hardware for that appears to be genetically coded to a large degree, and we can make computers do a lot of that sort of stuff too. No reason we couldn't make a big NN do the same.
> Thinking that we can reproduce the function of the human brain with computers is what is the real mystical belief.
No, not really. Most physicists believe that physics is either computable or approximable to below the noise floor. Thinking otherwise requires some sort of mystical religious belief about non-physical behavior.
>> Brains take 2+ years of constant training before they start to do much of anything we would associate with strong AI, and another 10 or so years of constant training before they can do anything worth spending money on. I'm not sure how you call that "very few examples".
You're talking about human brains. The brains of, say, gazelles, are ready for surviving in an extremely hostile environment a few minutes after they are born. See for example [1]. Obviously they can't speak or do arithmetic, but they can navigate their surroundings with great competence, find sustenance (even just their mothers' teat) and avoid danger.
That's already far, far beyond the capabilities of current AI and if I could make a system even half that smart I'd be the most famous woman on the planet. Honestly. And also, the richest. And most powerful. Screw Elon Musk and his self-driving cars- I'd rule the world with my giant killer robots of doom :|
Also- "very few examples": that's the whole "poverty of the stimulus" argument. In short, babies learn to speak without ever hearing what we would consider enough language. Noam Chomsky used that to argue for an innate "universal grammar" but there must be at least some learning performed by babies before they learn to speak their native language, and they manage it after hearing only very, very little of it.
Are you saying that brains will eventually be possible to copy with computers? In a thousand years, with completely different computers, maybe. Why not. But with current tech, forget about it.
General consensus is that this is hard-wired genetic behavior. It's mildly impressive, but nothing that we think we couldn't do on a computer with enough time and effort.
> In short, babies learn to speak without ever hearing what we would consider enough language.
All known humans who were deprived of social contact during early development were unable to learn speech later on. Babies get a ton of language stimulus; I'm not sure where you're getting "what we would consider enough".
> In a thousand years, with completely different computers, maybe.
We're only a few orders of magnitude off from standard COTS computer equipment being able to match the throughput you would expect from a human brain doing one "useful" thing per neuron at several kHz (which is probably a gross overestimation). Even if we decided to do a full neurophysiological simulation for every neuron in the brain, that only adds a few more orders of magnitude of required compute power.
We expect to hit $1/(TFLOP/s) over the next 20 years or so, and there's physically no way the brain is doing more than a (PFLOP/s), unless neurons are doing some insane amount of work at a sub-neuronal level (which, I admit, is possible, but quite unlikely).
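For anyone who wants to check the back-of-envelope numbers, here's a rough sketch. The inputs are assumptions on my part: roughly 86 billion neurons (a commonly cited estimate), "several kHz" taken as 5 kHz, and one "useful thing" counted as one FLOP. Change any of them and the result shifts accordingly.

```python
# All three constants are assumptions for a back-of-envelope estimate.
NEURONS = 8.6e10           # commonly cited estimate: ~86 billion neurons
RATE_HZ = 5e3              # "several kHz", taken here as 5 kHz
DOLLARS_PER_TFLOPS = 1.0   # the projected $1/(TFLOP/s) price point

def brain_ops_per_second(neurons=NEURONS, rate_hz=RATE_HZ):
    """One 'useful' operation per neuron per cycle."""
    return neurons * rate_hz

def cost_dollars(ops_per_second, dollars_per_tflops=DOLLARS_PER_TFLOPS):
    """Hardware cost at a given price per TFLOP/s of throughput."""
    return ops_per_second / 1e12 * dollars_per_tflops

if __name__ == "__main__":
    ops = brain_ops_per_second()
    print(f"{ops:.2e} ops/s = {ops / 1e15:.2f} PFLOP/s")
    print(f"about ${cost_dollars(ops):,.0f} at $1/(TFLOP/s)")
```

Under these assumptions the estimate lands below 1 PFLOP/s. A full neurophysiological simulation would multiply the per-neuron cost by some large factor that this sketch doesn't try to guess.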
I would propose a long-term bet, but I'm not sure what the conditions would be.
I haven't seen evidence of that. They're given a few things to start with. Then they seem to apply a hyper-effective scheme for learning on raw data that combines personal exploration (unsupervised) and societal guidance (supervised). It then takes these brains nearly two decades of training data & experiences to become effective in the real-world. Virtually everything people say about what ANN's might accomplish leaves off that last part that was critical to the superior architecture they're trying to compete with.