I understand and empathize with the skepticism, or rather the criticism of the hand-wringing, around the implications of current deep learning methods.
However, as someone who builds them for vision applications, I'm increasingly convinced that some form of ANN will underlie AGI - what the author calls a universal algorithm.
If we assume that general intelligence comes from highly trained, highly connected single processors (neurons) with a massive and complex sensor system, then replicating that neuron is step one - which, arguably, is what we are building, albeit comparatively crudely, with ANNs.
If you compare at a high level how infants learn and how we train RNN/CNNs they are remarkably similar.
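Concretely, all I mean by crudely replicating the neuron and the infant-like training loop is something like this toy sketch (Python, made-up numbers, purely illustrative): a weighted sum pushed through a nonlinearity, corrected every time it guesses wrong, much as an adult names objects for an infant.

```python
import numpy as np

def neuron(x, w, b):
    """A single artificial 'neuron': weighted sum of inputs
    passed through a nonlinearity (a sigmoid here)."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

rng = np.random.default_rng(0)
w, b = rng.normal(size=3), 0.0

# Show an example, let it guess, correct it - structurally like
# an adult naming objects ("this is a [thing]") for an infant.
examples = [(np.array([1.0, 0.0, 1.0]), 1.0),   # is a [thing]
            (np.array([0.0, 1.0, 0.0]), 0.0)]   # is not a [thing]

for _ in range(100):
    for x, target in examples:
        error = target - neuron(x, w, b)   # the "correction" signal
        w += 0.5 * error * x               # nudge weights toward the answer
        b += 0.5 * error

print([round(float(neuron(x, w, b)), 2) for x, _ in examples])  # approaches [1.0, 0.0]
```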
I think the author, and the ML crowd in general, focus too much on unsupervised learning as being pivotal for AGI.
In fact, if you look again at biological models, the bulk of animal learning is supervised training in the strict technical sense. Just look at feral children studies as proof of this.
Where the author detours too much is in assuming the academic world would have proven a broader scope for ANNs if it were there. In fact, research priorities across the board are not focused on general intelligence, and most machine learning programs explicitly forbid this research for graduate students, as it's not productive over the timeline of a program.
Bengio and others are, I think, on the right track, focusing on the question of ANNs towards AGI, and I think it will start producing results as our training methods improve.
First, I'm curious where you thought I was focused on unsupervised learning? It certainly didn't cross my mind when I was writing this --- I was (implicitly) strictly talking about supervised machine learning.
My post is actually in support of what the latter part of your comment says, in a roundabout way. In general, the people who are making huge strides in deep learning (Bengio, Hinton, and LeCun are obviously the big three) understand the capabilities and, maybe more importantly, the limitations of DL. My main point is that the ML community at large is actually not on the same page as the experts, and that causes many more problems.
I want us, as a community, to stop treating deep learning any differently than any other ML algorithm --- to have a consensus, based on scientific facts, about the possibilities and limitations thus far. If we, "the experts", don't understand these things about our own algorithms, how can we expect the rest of the world to understand them?
> I want us, as a community, to stop treating deep learning any differently than any other ML algorithm --- to have a consensus, based on scientific facts, about the possibilities and limitations thus far. If we, "the experts", don't understand these things about our own algorithms, how can we expect the rest of the world to understand them?
I agree. It's interesting watching the "debate" around deep learning. All the empirical results are available for free online, yet there's so much misinformation and confusion. If you're familiar with the work, you can fill in the blanks on where things are headed. For instance, in 2011, I think it became clear that RNNs were going to become a big thing, based on work from Ilya Sutskever and James Martens. Ilya was then doing his PhD and is now running OpenAI, doing research backed by a billion dollars.
The pace of change in deep learning is accelerating. It used to be fairly easy for me to stay current with new papers that were coming out; now I have a backlog. To a certain extent, it doesn't matter what other people think; much of the debate is just noise. I don't know what AGI is. If it's passing the Turing test, we're pretty close, 3 years max, maybe by the end of the year. Anything more than that is too metaphysical and prone to interpretation. But there have been a bunch of benchmark datasets/tasks established now. ImageNet was the first one that everyone heard about, I think, but sets like COCO, 1B words, and others have come out since then and established benchmarks. Those benchmarks will keep improving, pursuing those improvements will lead to new discoveries re: "intelligence as computation", and something approximately based on "deep learning" will drive it for a while.
> If it's passing the Turing test, we're pretty close, 3 years max
Well yes? If a Turing test you realize the simulation of some idiot in the online chat, it has long been there - and nobody wants. But the system, which can lead a meaningful conversation, today there is no trace. And there is even no harbingers of its occurrence.
<- This was translated by Google Translate from a piece of perfectly intelligible and grammatically correct text in another language. If this is the state of the art in machine translation, how on Earth can you expect a machine that can converse at a human level in three years?
Sadly, I read the Google-translated text and it read like it was written by someone for whom English was a second language. I didn't realize it was an "example" until I read your next section. So it had me fooled.
You could probably replace 90% of YouTube comments with a simple trigram-based chat bot and no one would notice. But that's hardly a good measure of AI quality.
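For what it's worth, the "simple trigram-based chat bot" I have in mind is nothing fancier than this sketch (Python, toy corpus; a real one would be trained on actual comment text):

```python
import random
from collections import defaultdict

def train_trigrams(tokens):
    """Map each pair of consecutive words to the words seen after it."""
    model = defaultdict(list)
    for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
        model[(a, b)].append(c)
    return model

def babble(model, seed, max_words=20):
    """Generate text by repeatedly sampling a plausible next word."""
    a, b = seed
    out = [a, b]
    for _ in range(max_words):
        followers = model.get((a, b))
        if not followers:
            break
        c = random.choice(followers)
        out.append(c)
        a, b = b, c
    return " ".join(out)

# Toy corpus standing in for scraped comment text.
tokens = "this video is so cool this video is the best video ever".split()
print(babble(train_trigrams(tokens), ("this", "video")))
```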
Your comment does, though, illustrate the main problem with the Turing test: it depends on too many factors and assumptions that have nothing to do with the AI itself.
A good AGI test should be constructed in such a way that any normal person passes it with 100% certainty and no trivial system can pass it at all.
Over the past 10 years (deep learning got going in ~2006, really), the state of the art has improved at an exponential rate, and that's not slowing down. There are plenty of reasons to be bullish. Or at least, now seems like a bad time to leave the field.
I read those papers when they came out. Correct me if I'm wrong, but they were not peer-reviewed.
The first one looks very impressive from the examples they've provided, but extraordinary claims require extraordinary proof. I will believe it only when I see an interactive demo. It's been nearly a year, and I haven't seen it surface in a real product or usable prototype of any sort. Why?
Somehow all the papers that have "deep neural" stuff get 1/100th of the scrutiny applied to other AI research. I don't see anyone hyping up MIT's GENESIS system, for example.
The second paper has a really weird experiment setup. The point of one-shot learning is to be able to extract information from a limited number of examples. The authors, however, pretrain the network on a very large number of examples highly similar to the test set. Whether or not their algorithm is impressive depends on how well it is able to generalize, and they're not really testing generality -- at all. Again, why?
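To spell out what I'd consider a fair protocol: the classes used for pretraining and the classes used for the one-shot test must be disjoint, so the score measures generalization rather than memorization. Schematically (hypothetical class IDs):

```python
# A one-shot evaluation that actually tests generality: the model may
# see many examples of the pretraining classes, but the one-shot
# classes must be entirely unseen until test time.
all_classes = list(range(50))
pretrain_classes = all_classes[:40]   # abundant labelled examples allowed
oneshot_classes = all_classes[40:]    # one example each, never seen before
assert set(pretrain_classes).isdisjoint(oneshot_classes)
```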
> I'm curious where you thought I was focused on unsupervised learning?
I suppose I should have stated better that I was speaking more broadly; however, the discussion about hand training being what holds the discipline back implies, I think, that unsupervised learning is the key.
Perhaps I misunderstood your writing, but I think there is a nuance there that isn't well discussed.
> In fact, if you look again at biological models, the bulk of animal learning is supervised training in the strict technical sense. Just look at feral children studies as proof of this.
Feral children are incapable of learning a language as adults. But children don't require input (or feedback) from an existing language in order to "learn" a language; a community of children is capable of developing a full natural language without exposure to adult language. The problem the feral children encounter is not that no one teaches them to speak, it's that they're alone.
This is much as if you gave a neural network "supervised training" by letting it interact with several other untrained copies of itself, but no coded data. Does that meet your definition of "supervised training"?
(Obviously, a language created by an isolated group of children will be a new language rather than a known one. Children learning an existing language do need input, but there are some interesting quirks to the process -- they don't seem to need feedback, or use it if they get it, and in the natural course of things the input they get is 100% "correct" and 0% "incorrect". Try training a neural net to discriminate dogs from non-dogs when you can't code anything as "not a dog".)
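The nearest off-the-shelf analogue to having no negatives at all is one-class learning, where you fit a boundary around the positive examples alone; a minimal sketch using scikit-learn's OneClassSVM (synthetic features, purely illustrative):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Feature vectors for "dog" examples only; nothing is ever coded "not a dog".
dog_features = rng.normal(loc=1.0, scale=0.3, size=(200, 8))

# Fit a boundary around the positive class by itself.
clf = OneClassSVM(nu=0.1, gamma="scale").fit(dog_features)

# New points are scored against that boundary: +1 = "dog-like", -1 = outlier.
queries = rng.normal(loc=-1.0, scale=0.3, size=(5, 8))
print(clf.predict(queries))  # expected: mostly -1
```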
> Does that meet your definition of "supervised training"?
I think as a second layer, yes.
To your original statement though: being alone is the same as no one teaching them to talk.
Adults, and other children, in fact teach infants to talk. It might not seem explicit (though there are very explicit programs for infant language development) but in fact it works the same way. See my other comment about object segmentation and classification.
> To your original statement though: being alone is the same as no one teaching them to talk.
No, it's not. You can have one without the other. Being alone implies no one teaching them to talk. However, children who are not alone still learn to talk despite no one teaching them.
> However, children who are not alone still learn to talk despite no one teaching them.
This is where we disagree. The distinction is that you probably see training/teaching as intentional - but it's not always. You can (and do) learn just by watching and listening without it being an explicit procedural training.
Again, congenitally deaf children who cannot learn any language spoken by the adults around them (since they can't perceive the speech) nevertheless create sign languages among themselves when they are part of a community of deaf children, and those sign languages display the full complexity and expressivity of any natural language. These de novo languages cannot have been taught to the children because they didn't exist to be taught. They are a product of the interaction between children, a side effect.
The same thing happens among the children of pidgin-speaking communities -- they receive input that does not belong to any language and, acting as a group, construct a full language from that. (This process is called creolization.) But the case of deaf children is especially clear because they don't even receive the malformed linguistic input that creolizing children do.
> If you compare at a high level how infants learn and how we train RNN/CNNs they are remarkably similar.
That doesn't sound correct. Infants engage in active inference to resolve their uncertainties, engage in transfer learning between different domains by default, and use their transfer-learning abilities to learn new categories or domains from even just one example.
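The closest standard ML trick to that one-example ability is nearest-prototype classification in a transferred feature space; here's a toy sketch where a random projection stands in for an embedding pretrained on other domains (all names and numbers made up):

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(size=(16, 8))  # stand-in for a pretrained feature extractor

def embed(x):
    """Map a raw input into the transferred feature space."""
    return np.tanh(x @ W)

# One labelled example per brand-new category: the "one shot".
prototypes = {
    "zebra": embed(rng.normal(size=16)),
    "okapi": embed(rng.normal(size=16)),
}

def classify(x):
    """Assign a new input to the nearest one-shot prototype."""
    return min(prototypes,
               key=lambda name: np.linalg.norm(embed(x) - prototypes[name]))

print(classify(rng.normal(size=16)))
```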
So I would say: kind of. But at its root, the mechanisms are based on having a social network of people touching, talking, playing, etc., that does the training - though it might not look the same.
A perfect example I use all the time, which comes much later in development but is relevant, is when a 1.5-year-old points at something and asks if it's a [thing]. The initial training was people segmenting (holding it, moving it in different ways) and classifying (calling it [thing], asking for [thing], telling the subject it is [thing]) the object, which was then reinforced/weighted by asking if [thing] was indeed [thing].
So it doesn't look like explicit training, but it follows all the same steps.
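In ML terms, that pointing-and-asking loop is essentially uncertainty-based active learning: the learner queries a label for whatever it is least sure about. A rough sketch (Python, synthetic data, illustrative only):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Toy world: 2-D "objects"; the label says whether each one is a [thing].
X = rng.normal(size=(500, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Seed with a few adult-labelled examples ("this is / is not a [thing]").
labelled = list(np.where(y == 1)[0][:3]) + list(np.where(y == 0)[0][:3])
model = LogisticRegression().fit(X[labelled], y[labelled])

for _ in range(20):
    # "Point at" the object the learner is least sure about...
    probs = model.predict_proba(X)[:, 1]
    uncertainty = np.abs(probs - 0.5)
    uncertainty[labelled] = np.inf        # don't re-ask about known objects
    query = int(np.argmin(uncertainty))
    # ...and the "adult" answers yes or no (the true label is revealed).
    labelled.append(query)
    model = LogisticRegression().fit(X[labelled], y[labelled])

print("accuracy after asking:", model.score(X, y))
```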