Artificial intelligence pioneer says we need to start over (axios.com)
211 points by elsewhen on Sept 15, 2017 | 150 comments



Machine learning may be nearing its ceiling.

The history of AI goes in cycles. Someone has a good idea which solves some problems, followed by "strong AI Real Soon Now" enthusiasm, followed by that idea hitting its ceiling. AI has been through search, backtracking, the General Problem Solver, hill-climbing, and expert systems. Each was overhyped at the time, and each hit its ceiling.

The big difference this time is that the ceiling with machine learning is high enough for large-scale profitable applications. That wasn't the case with the previous rounds. AI used to be a dinky field - about 30-50 people each at Stanford, CMU, and MIT, plus a few tiny groups elsewhere. Now it's a huge field with big companies and big profits. That makes it self-sustaining.

Hinton has a point in that we're missing something. Back-propagation is an extremely inefficient method, especially since the slower you do it, the better it seems to work. More generally, most of machine learning is "turn the problem into an optimization problem and bang on it really hard with lots of compute power". This works for a useful class of problems. But it has limits.

How long until the next big idea? The last "AI winter", after expert systems, was 15 years.


>The big difference this time is that the ceiling with machine learning is high enough for large-scale profitable applications. That wasn't the case with the previous rounds.

My company (Fortune 500) depends on expert systems, some of which date back to the '80s AI boom. Doing what we do, at the scale we do it, would simply not be possible without them.

Lots of companies make a lot of money with expert systems, but they have no incentive to fund AI research since expert systems have hit a clear capability ceiling.


> Lots of companies make a lot of money with expert systems, but they have no incentive to fund AI research since expert systems have hit a clear capability ceiling.

I'm not sure I follow the reasoning here. Is the suggestion that AI cannot do better than expert systems?


I think the point is that just because companies are making money off of AI advances, that doesn't necessarily mean we won't hit another AI winter once we hit a ceiling with current techniques.


Okay but what's the relevance of expert systems to this argument?


This isn't the first round of AI that's been useful for "large-scale profitable applications"


Expert systems were one of the supposedly unprofitable previous AI booms that preceded an AI winter.

My point is, they were and are profitable, but clearly that didn't keep winter at bay.


What's a good resource for learning about expert systems? I'm curious about this "dead end" but most of what I've found online has been about the ai winter and not the technical details of expert systems.


The Wikipedia article isn't a bad start. https://en.wikipedia.org/wiki/Expert_system

I have previously built an expert system: expert human knowledge codified as a set of executable, interrelated rules. A rule engine is what executes that knowledge in an expert system. The Rete algorithm is another good place to start.

https://en.m.wikipedia.org/wiki/Business_rules_engine https://en.m.wikipedia.org/wiki/Inference_engine https://en.m.wikipedia.org/wiki/Rete_algorithm

When I was first learning, I downloaded and played with CLIPS, a pretty cool old expert system tool built by NASA. https://en.m.wikipedia.org/wiki/CLIPS
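
If it helps to see the shape of the thing, here's a toy forward-chaining rule engine in Python. It's a minimal sketch of the idea (facts plus if-then rules fired until nothing new can be derived), not CLIPS and not a real Rete implementation; the car-diagnosis facts and rules are made up for illustration.

    # Toy forward-chaining rule engine (illustrative only, not CLIPS or Rete).
    # Facts are strings; each rule fires when all of its conditions are known,
    # adding its conclusion to the fact base. Repeat until nothing new appears.

    facts = {"engine_cranks", "battery_ok", "no_spark"}   # made-up observations

    rules = [
        ({"engine_cranks", "no_fuel"}, "fill_tank"),
        ({"engine_cranks", "battery_ok", "no_spark"}, "suspect_ignition_coil"),
        ({"suspect_ignition_coil"}, "replace_ignition_coil"),
    ]

    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)      # the rule "fires"
                changed = True

    print(facts)   # includes the derived recommendations

Rete's contribution, roughly, is avoiding this naive re-matching of every rule against every fact on each pass, which is what makes large rule bases tractable.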


:) The incentive all good corporate robots need to fund research is generally completed research funded by the public. We know this. Don't worry yourself too much. We'll call you when we need a freeloader.


It's not my corporation. It's just the one devouring my soul.


>How long until the next big idea? The last "AI winter", after expert systems, was 15 years.

Oddly enough, neural nets languished in the frozen tundra for 50+ years if you take it back to perceptrons (mid 60's)[1]. Minsky and Papert's takedown of perceptrons set the field back years. If you want to be more strict, then it's been 31 years since Rumelhart and McClelland published their book[2], based on their earlier work that Hinton co-authored. They laid out multilayer NNs with back-propagation. I got hold of the book in the UK in '87 and used it to code a multi-layer back-propagation network. It was a machine that would read aloud given some text, even for words it had not seen before (good luck with through, though, and trough). The problems were data and processing speed. I had to hand-code features and wait for overnight training runs on VAXen and later a Sun-3 clocked at 16.67MHz[3]. It took big data and brute force to get the glimpses of magic that we see now. The fundamentals have been around for a long time.

It may be the case that one of the many approaches that fell by the wayside will make a comeback à la NNs. NNs added more layers and tweaked backprop, plus data and speed. It may be that expert systems will rejuvenate as something "new", with machines writing their own rules and somehow getting over showstoppers like the unmanageability of large rule bases. It's probably going to be some combination of things that we already know about, along with yet more data and processing power, that will get us to the next level. The odds are that whatever that beast is will be self-training and self-regulating in a way analogous to back-prop but with logic thrown in. Good luck to us all in inventing such things, and in trying to understand what they are really doing.

[1]https://en.wikipedia.org/wiki/Perceptron [2]https://mitpress.mit.edu/books/parallel-distributed-processi... [3]https://en.wikipedia.org/wiki/Sun-3


Each was overhyped at the time, and each hit its ceiling.

That each was over-hyped I can buy. I'm less convinced of the latter part of that statement. Take ANNs. You would have thought that they had hit their ceiling at one time as well, but eventually the confluence of more compute power, more data, and better algorithms resulted in the recent resurgence. I don't see any reason to think that some of the other existing AI techniques might not also experience an explosion in utility, if combined with massive increases in compute power, more data, and better algorithms as well.


I think that machine learning, particularly deep learning, will increasingly be used in combination with some of the older AI techniques to produce hybrid systems, which will solve previously intractable problems. AlphaGo is a great example; of course it used deep learning, but it also used classical search, as well as reinforcement learning.

The point is, progress isn't linear but exponential: each new technique multiplies our capabilities.


I think that machine learning, particularly deep learning, will increasingly be used in combination with some of the older AI techniques to produce hybrid systems, which will solve previously intractable problems.

This is what I believe as well. I've been doing some work lately on some older approaches to automated abductive inference, with an eye towards doing some of this kind of hybridization. No results to report yet, but I do think there's some meat on this bone.


I feel that Numenta will break the ceiling that we're coming to now. I have quite a few upcoming personal projects that I plan to use their neural network (HTM) structure to enhance, the key being enhance.

The mindset that I feel is lacking in the ML/AI community is that we should be creating systems that can be augmented and improved using AI, but the product itself should stand alone without using any AI at all. This is similar to how JIT compilation can improve the machine code as the program runs. Part of how our own brains work is that the cerebral cortex doesn't really have any direct control over the body at all; it can only watch, identify patterns, and project intentions to the rest of the brain/body. It is a passive structure initially: it observes both the environment and how the lower structures interact with that environment. Only after it's observed and learned the "commands" can it begin to influence how the body moves around and interacts with the world. You could look at the CLI terminal as an example: on its own I can interact with and use it perfectly fine. But let's say I added some pattern-matching and prediction AI that could make really good suggestions about what command I will type next; at some point you could promote these suggestions so that they always execute when the AI predicts you would use them. This is exactly how the brain works.

Most of my ideas for applied AI revolve around improving our HID interfaces to computers themselves. For example, I'd like to reduce our chances of developing RSI related to keyboard/mouse/trackpad usage, and I feel that chorded keyboards are more ergonomic. But chorded keyboards are slower to use and harder to learn. The mapping of the chords is where I plan to apply some AI. Initially I'm going to randomize how chords are bound to character inputs; these random bindings will use many fingers to input one character. As the user learns and uses the chords to input characters, the AI can identify common patterns and promote those patterns into chords that use fewer and fewer fingers, thus improving the user's potential input speed. The next place to use AI in this project would be autocorrection, and using the user's input context (program, filetype, time, surrounding text) to continue improving these bindings and making completion suggestions.
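
For what it's worth, here is a toy Python sketch of that "promotion" idea as I read it: count how often each character actually gets typed and hand the most frequent ones the chords with the fewest fingers. The key names and the one-shot batch remapping are invented for illustration; a real version would presumably adapt incrementally as the user types.

    from collections import Counter
    from itertools import combinations

    # Hypothetical illustration: rank characters by observed frequency, then
    # assign the cheapest chords (fewest fingers) to the most frequent ones.

    def remap_chords(typed_text, keys="asdfjkl;"):
        # Enumerate chords from cheapest (1 finger) to most expensive.
        chords = []
        for n in range(1, len(keys) + 1):
            chords.extend(combinations(keys, n))

        freq = Counter(c for c in typed_text.lower() if c.isalpha())
        ranked = [ch for ch, _ in freq.most_common()]
        return {char: chord for char, chord in zip(ranked, chords)}

    print(remap_chords("the quick brown fox jumps over the lazy dog"))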


From Yann LeCun. ---

Numenta, Vicarious, NuPic, HTM, CLA, etc.

Jeff Hawkins has the right intuition and the right philosophy. Some of us have had similar ideas for several decades. Certainly, we all agree that AI systems of the future will be hierarchical (it’s the very idea of deep learning) and will use temporal prediction.

But the difficulty is to instantiate these concepts and reduce them to practice. Another difficulty is grounding them on sound mathematical principles (is this algorithm minimizing an objective function?).

I think Jeff Hawkins, Dileep George and others greatly underestimated the difficulty of reducing these conceptual ideas to practice.

As far as I can tell, HTM has not been demonstrated to get anywhere close to state of the art on any serious task.

I still think [Vicarious] is mostly hype. There is no public information about the underlying technology. The principals don’t have a particularly good track record of success. And the only demo is way behind what you can do with “plain vanilla” convolutional nets (see this Google blog post and this ICLR 2014 paper and this video of the ICLR talk).

HTM, NuPIC, and Numenta received a lot more publicity than they deserved because of the Internet millionaire / Silicon Valley celebrity status of Jeff Hawkins.

But I haven’t seen any result that would give substance to the hype.

Don’t get fooled by people who claim to have a solution to Artificial General Intelligence, who claim to have AI systems that work “just like the human brain”, or who claim to have figured out how the brain works (well, except if it’s Geoff Hinton making the claim). Ask them what error rate they get on MNIST or ImageNet.

---- Reference:

http://fastml.com/yann-lecuns-answers-from-the-reddit-ama/


MNIST error rates are noise these days, but otherwise agree. These guys have been around for the entire deep learning boom of the last decade, and have never produced results to match their handwaving predictions. They're the Moller Skycar of the field.


That implies we have hit, or at least know, the exact capacity ceiling for, in particular, recurrent/dynamic networks (I do agree that with static networks it seems more clear-cut). While it will not be general AI, I'd assume, things like the neural Turing machine, pointer networks, and many other open research areas leave room to speculate that we have not yet "found" that ceiling.


Back in the nineteen-nineties I'd built a neural-net creator using a method other than back-propagation that was able to create nets with about a hundred neurons that played tic-tac-toe well or perfectly (depending on the net created that day). This was on a 12 MHz 286 using what I called static point multiplication (based on shift instructions, not multiplication instructions). So I went to Toronto to see Professor Hinton, to share. My not being a big back-propagation fan was one thing that divided us before meeting him, but otherwise I thought we'd be pretty much on the same page.

It didn't go well. I happened to mention that I thought parallel programming would be a big help speeding up neural nets, and that I'd toured such a multi-processor computer in Alberta. He blew up. I never got to explain what I'd done with neural nets. He tossed me out of his office summarily and wouldn't listen to another word I had to say - somehow he'd gotten the idea in his head that I was a salesman for the company whose machine I'd seen, which machine was too weird to be considered anyway, and absolutely nothing I could say would persuade him that I wasn't a salesman. So the nets I'd created and methods other than back-propagation never got mentioned. Weirdest meeting of my life, and the end of my desire to do anything with neural nets, since other academic departments were even more violently rejecting of the idea back then.


I'd like to hear more about your method. I've looked at a number of different ways of building neural nets over the years and all have their advantages and disadvantages, but if you have something completely different then I'd be thrilled to hear more. I agree with Hinton that we really need to rethink the way we're doing things.


What's the idea? Sounds interesting? Publish a paper on it? Or put up a blog post?


My method was evolutionary, to put it too simply - something not completely undiscussed even at the time, and it's undoubtedly happening in one way or another now. I also designed a method for parallel processing dedicated to neural nets that seemed much more efficient than just "more processors"; I've no idea whether NVIDIA et al have caught up with that design or passed it by now.


This was mid-1980s, not 1990s. Hence the 80286 CPU. A bit of a cold caused a brain cramp for me.


Obviously a clickbait headline, few people think current tech will lead to AGI, but while we all wait on a new breakthrough in AI, we can build much more impressive systems with what we have now than even a few years ago.

I went to a talk from people who had built more scalable training algorithms for Restricted Boltzmann Machines (which Hinton was a pioneer in) a few months ago and asked them why they had chosen to use RBMs. Besides the waffly answers about how generative models were more interpretable than discriminative models, the real answer seems to have been that this was a research program that had started when RBMs were still in vogue, their funding was tied to it, and once it ran out they were going to switch to investigating Generative Adversarial Networks like everyone else.


I don't think that's correct. There are a lot of people who should know better who think that current ML technology will lead to AGI.


Let me guess. You think so because we don't know what general intelligence is, but the current approaches of ML surely can't be extended to produce it.


We have some simple model problems, not even requiring general intelligence, that ML can't do. To steal from Hofstadter: let's say I have a function that maps strings to strings. I'll give you an example: "abc" -> "abd". What does "efg" map to? And how do you train the current generation of ML to solve those problems in one shot like you just did?

I don't want to knock machine learning. There's a huge set of very interesting problems that it does solve, problems that were basically intractable till now and whose applications we haven't even scratched the surface of. But the existing techniques have understandable restrictions on the scope of problems they can solve.

I would also not be surprised if the best models included current ML strategies in some way, for example in supervisor threads or in the initial processing of complicated input data into semantic-ish vector-valued forms.


Just ML is never going to lead to an AGI. There are lots of other things that need to be first understood about intelligence, common sense reasoning, cognition and other things which we don't have a clue of (though some think they know it all), to build an AGI.


Just ML is never going to lead to an AGI.

I think there's a chance it could in principle. Or at least, I think it could yield something close. If human intelligence is largely rooted in a neural network with billions of neurons and trillions of synapses, then it seems reasonable to think that, in principle, a sufficiently deep/wide ANN could show "intelligence".

But here's the rub. Doing that would be silly. We already know how to make a neural network with human like intelligence using billions of neurons and trillions of synapses... it just takes a man, a woman, and about 9 months. :-)

Also consider the power consumption of an ANN of that scale, given the current state of things.

So yeah, to achieve anything truly AGI-like, which doesn't need its own nuclear power plant, I suspect we're going to need other approaches in addition to today's ML/DL stuff.


I think it's very unlikely. Keep in mind that the genesis of the current round of AI algorithms has largely been enabled by 1) easy access to parallel computing (e.g. GPUs, clustering solutions), and 2) large amounts of ingestible data. Truly spectacular AGI will emerge when you can do "one shot" learning. With the exception of zero-shot translation (which is a phenomenal result) we haven't seen very much along those lines.


Among those who believe in AGI are Tesla, Google, Baidu, Microsoft, Apple, etc.


Brief description of the issue for non-experts:

Supervised learning

You can judge the output of your network against ground truth. You say that's a cat? Nope, it's a dog! And then slightly adjust your network so it's less likely to give that wrong answer in the future. How exactly you adjust the network is what backpropagation describes (in combination with something called a learning rate).
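
To make that concrete, here's a toy supervised learner in Python/numpy: a single sigmoid unit nudged by gradient descent so that wrong answers become less likely. The data and labels are synthetic, and with only one layer this is really just the delta rule; full backpropagation is the same idea chained through multiple layers.

    import numpy as np

    # Toy supervised learning: guess, measure how wrong the guess was against
    # the ground-truth labels, and nudge the weights to reduce that error.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 2))                 # two made-up features per example
    y = (X[:, 0] + X[:, 1] > 0).astype(float)     # ground-truth labels

    w, b, lr = np.zeros(2), 0.0, 0.1              # weights, bias, learning rate
    for _ in range(500):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))    # the network's current guesses
        grad_w = X.T @ (p - y) / len(y)           # direction of the error
        grad_b = np.mean(p - y)
        w -= lr * grad_w                          # small step against the error
        b -= lr * grad_b

    print("training accuracy:", np.mean((p > 0.5) == y))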

Unsupervised learning

You need to learn without someone telling you what the answer is. Most of the time for biological intelligence there isn't an oracle describing the truth at every moment of life to judge actions/decisions against. If you don't have someone telling you you've made a mistake, how can you know when to adjust your network? And if you don't know what the truth is, exactly how to adjust the network becomes tricky.

Somehow biological brains work without that oracle, and there are lots of ideas about how they do that. But right now, for artificial intelligence, none of those ideas has been shown to work so amazingly well that it has taken off like backprop has in the supervised learning world.

Hinton wants to find that amazing algorithm.


You need to learn without someone telling you what the answer is. Most of the time for biological intelligence there isn't an oracle describing the truth at every moment of life to judge actions/decisions against.

There is: death without reproducing. If an organism dies quickly but leaves children, it was biologically successful. If it doesn't, then its evolutionary branch goes away.

I'm not sure this can be simulated, but it might be worth trying. Biological brains are the product of a billion years of the above strategy, so hopefully it's feasible to do the same thing.

Don't people already do that? Well, no. The strategy implies some sort of physics simulation with general rules, not a specific ruleset tuned to your domain.


>> Most of the time for biological intelligence there isn't an oracle describing the truth at every moment of life to judge actions/decisions against. [emphasis mine]

> There is: Death, without reproducing.

The problem is that it's not describing the truth at every moment of life, just the total approximate fitness of that lifeform by the end of its lifetime. It would be time consuming and cost-inefficient to simulate many possible configurations of actors until the end of their lifetimes.


Yet it's the only strategy that we're absolutely certain produces intelligent life.


Well, that it did produce intelligence once; not that it necessarily always does.

Although I do tend to think that, given enough time (some millions of years), other intelligence will evolve on earth, from parrots or elephants or dolphins or octopuses or something. Though there's no evidence in other species of the kind of brain growth spurt that led to us, and it's not settled why we had it.


Are you absolutely certain we qualify as intelligent life?

edit: This comment isn't a joke, I mean this seriously. Modeling a system after ourselves may not be the best idea if we're not the best possible system.


It's a good question, and I think the answer is - up to a point.

We're biased towards individualism and the illusion of personal intelligence making clever decisions using its own resources, but in fact our real intelligence is collective.

Individually, most humans are a little bit smarter than other animals, but not much.

Humans are also excellent mimics. We learn by copying, and a lot of what we consider intelligence is copied behaviour and copied beliefs - often applied unconsciously.

So we've created systems - spoken language, writing, science, technology - that can externalise, collect, and preserve the best of our learning, so the species as a whole, including average humans, can get the benefit of it.

This works fine as far as it does, but our political and economic systems are still founded on animal logic (i.e. personal advantage no matter the cumulative long-term cost) instead of conscious collective intelligence amplification and species optimisation.

Based on this, AI won't get far unless someone invents an analogous way to externalise and abstract learning, so different AI projects can bootstrap each other, and there are common systems for abstraction and generalisation.

Unsupervised learning isn't going to be all that interesting unless it can do that. It'll be a fun toy for certain classes of problems, but it won't be anywhere close to an AGI.


An interesting question, but if we make a program that makes wry jokes about what it means to be alive or argues persuasively about its own freedoms, the world will notice.


You can define 'intelligent life' however you want. No matter how high you set the bar though, an algorithm that reproduces the intelligence of the average human and can be run in parallel at speeds unhindered by biological constraints would provide immense power, wouldn't it?


I think I'm more scared of creating an intelligent life form that isn't modeled after us. Maybe an irrational fear, but one I have regardless.


Of course you need some objective function and feedback about how well you're doing wrt that. In RL this is true too. The feedback needed for backpropagation might be, for each action you take in life, what your chance of death is right after that action


That's a different problem. Some things are hard wired by biology, but individual organisms learn during their lives in spite of vague feedback. That's what we're trying to replicate.


Having just raised a human from newborn to walking, talking kid, I have to say that her life so far has been chock full of oracles guiding her learning.

Long before language, babies spend hours every day just interacting with the physical environment, which is "true": the toy is here, not there. The body weighs a certain amount. If you move the head this way, you'll roll over. If you move it that way, no rolling. Etc.

And parents provide a different kind of guidance. We react differently to different sounds, actions, and movements of the child. Even if these are not "truth" philosophically, they represent emotional truth for the child.

I'm just a guy who reads things about AI on the Internet, but I have to say it sometimes seems to me like AI folks have funny ideas of what "intelligence" is, and how it develops. The description of "unsupervised" learning above doesn't comport with any human experience I recognize. IME, humans receive constant immediate feedback on everything we do in life.


What you've described for unsupervised learning is really closer to reinforcement learning, where you get rewards without any information. True unsupervised learning is more like "figure out the clusters of these n-dimensional vectors".


Well your teachers, parents, society tell you what is right and wrong so we really aren't learning unsupervised.


Things like pain, hunger, and pleasure are also for the most part generated outside of the "learning" parts of the brain, so they're similar to a pre-defined utility function/oracle.


Yes, a 1yo child has a good way of getting feedback from the environment when learning to walk. They fall and it hurts!


But who teaches the previous generations and how? Chicken or egg first?

I think the origin is communication and memory. When a being (X) signals to another being (Y) the origins of its pain (signal of increased death probability) and X and Y can remember this cause, they can begin to teach.


Very little of what humans learn is done "unsupervised".


This is codifying "instinct" which presumably has already been done at least once (via DNA in some species).


Pain and pleasure are the oracles. Maybe not at every moment, but within a given lag.


Well, the algorithm is death and reproduction, isn't it?

If the output of your networks performs well enough when it confronts the real world, you get to reproduce before you die, and some of the essence of your network gets to live for another generation.

If not, you die, and your network doesn't get to contribute to the next generation.


>> Most of the time for biological intelligence there isn't an oracle describing the truth at every moment of life to judge actions/decisions against.

Is it possible that your conscious self is the oracle that provides feedback to subconscious networks?


No, why would "your conscious self" know any of the answers? The problems biological organisms face aren't even well defined enough to have "answers" generally.


People learn many things by being told a set of rules and/or objectives. Over time they can self-evaluate their performance against those rules and objective.

I suppose an animal may have a need - it feels hungry. If it can satisfy that need then it should reinforce whatever it did to get not-hungry. I suppose that doesn't require conscious awareness. But at some level of complexity we have to be able to plan more complex actions (hunting, tracking) and be able to self-evaluate performance on the sub-tasks used to get food.


Solomonoff induction captures the algorithm, but it cannot be implemented. Perhaps that means the mind is non-algorithmic.


Biological brains try to predict the immediate future: what they will sense next. That's the ground truth.


I wager it has something to do with prediction and rules


Actually:

>>Most of the time for biological intelligence there isn't an oracle describing the truth at every moment of life

This is false for most of learning of human intelligence. Most of schooled human intelligence looks like supervised learning.

Innate intelligence and other animal intelligence looks like pre-trained models and some reinforcement learning.


"Schooled human intelligence"? What does that even mean?

I think you are severely underestimating humans, and probably animals too.

People have multiple modes of learning. You can try to get an A in school ("optimize" for the objective function), but you also might be skeptical of what the teacher tells you. Likewise you might be skeptical of what you read in a book.

You can also correct the teacher based on common sense and what you already know.

Currently, a computer can do no such thing. I think he's right that there is a fundamental difference.

Humans also learn from extremely small amounts of data. (Sometimes they learn badly, but they can change their minds, or not if they still manage to navigate the world.)


>> Humans also learn from extremely small amounts of data. (Sometimes they learn badly, but they can change their minds, or not if they still manage to navigate the world.)

How?

What does this mean? That 18 years of schooling is a "small amount of data"? Do you think you show 2 pictures of an airplane to a child and they know what an airplane is? Do you even have kids?


Except that by the age you reach school you already have an amazing wealth of knowledge about the world and how it works, particularly the kind of commonsense knowledge that AI has always had enormous trouble acquiring. See Moravec's paradox.


5 year olds "know amazing wealth of knowledge about the world and how it works"?? Que? How many 5 year olds do you know or interact with?


Your statement is exactly the incarnation of Moravec's paradox. You don't see the amazing things that a 5yr old (even a 2yr old) can do, exactly because these things are completely obvious to us. They become much less obvious once you want your robot to be able to do these same things. At that point these things become horribly difficult.


You're severely underestimating children (or over-estimating the current state of AI). 5 year-olds have better natural language skills than any AI in existence, better motor and navigation skills than the most advanced DARPA robot, better mood recognition capabilities, etc.

You appear to be thinking of just facts and figures. AI is not that.


When did I mention AI?


How many robots have you interacted with?

5-year old humans have picked up a lot. For instance, I knew about the dangers that hot surfaces, sharp objects, angry cats, and big sisters represented.

I had learned to walk, run, swim, ride a bicycle, and catch a ball.


Ahh yes who doesn't remember their days in school doing gradient descent and backprop on millions of samples of text describing the solar system.


Can we go with 'incomplete' rather than 'false'? :) One of the most interesting areas of study for me is early childhood motor learning which is a fascinating combination of unsupervised, reinforced, and imitation learning. There are even significant intrinsic knowledge (instinctual/reflexive/morphological) effects. Also I really like to encompass non-human learning when talking about this stuff to remind myself of all the work nature did before we got to cultural transmission of knowledge.


Please describe what "unsupervised learning" a child may be doing.


Learning to crawl. Learning to perceive depth. Learning to walk. Learning to pick up objects. Learning to hold balance. Learning to hold balance while picking up an object. Learning to handle liquids (such as food or drink). Learning to avoid dangerous situations. Learning to ride a bike. Learning to see. Learning to interact with other agents. And so on and so on.

The list is almost endless. In all of those cases, if there is any supervision it is minimal.


Even before: learning to swallow in the womb. There's definitely no reinforcement from other humans involved in that, and it's fundamental that they learn it in the womb because it would be too late to learn it with milk.


That's been passed down over 3.5 billion years to form the DNA that creates that part of the brain in an infant. AKA pre-trained.


Not really, they develop it by sucking on their thumbs, drinking amniotic fluid, etc. You can see it develop, and you can see how premature babies have issues with it. Adding "breathe" to sucking and swallowing is even harder for preemies.

Starting to suck when they feel something on their lips _is_ a reflex though, it isn't learnt.


reflex = pre-trained?


More or less, yes. They usually disappear before the first year. Many later reappear in a more refined and voluntary way, which is learned rather than pre-trained.


All those things you described can be explained by having a pre-trained model built from DNA. Babies aren't a blank slate. Why would you even think they are? Do you think a baby learns to breathe from blank-slate intelligence?


If what you are saying were true, then why do all these skills require time to develop? If they are pre-programmed in the DNA, why are they not enabled on the day of birth? Clearly, from an evolutionary point of view, an infant that is already able to walk and perceive depth would have vastly superior chances of survival. Why all the brain development, and the much-enhanced plasticity in the early stage of development? Why do neural receptive fields change rapidly in the first months of life and depend so much on the stimulus received?

Clearly there is stuff pre-programmed in the DNA, and some animals, for example, are able to hold balance right after birth. But apparently most of this stuff, in the case of humans, is instructions for how to build a learning system, rather than an encoding of the complexity of the world in a strand of DNA. I grant you, evolution did not have the time to encode ways to handle Lego bricks, and yet 2-year-olds spontaneously acquire the ability to manipulate these objects way beyond what any contemporary robot can do.


I hope it's clear that the answer lies somewhere between these two extremes. Most likely, a number of prototypical reflexes are instilled that quickly make the "right" decision available for a given situation. Think of it as not knowing the objective function per se, but starting in a place, and with a gradient vector already pointed in such a way, that we quickly reach the optimal location in action space with relatively few exploratory trials.


Yes, the walking movement is one of a newborn's 9 primitive reflexes. [1] However, developing the fine control skills required to balance while walking is a skill that babies learn, and it takes a long time to do so.

[1] https://en.m.wikipedia.org/wiki/Primitive_reflexes


> If what you are saying was true, then why all these skills require time to develop? If they are pre-programmed in the DNA, why are they not enabled on the day of birth?

I'd argue that we see clear examples of that in animals that do require those skills for immediate survival. We ought to think about it as a consequence of natural selection rather than a 'necessity' that is somehow intelligently pre-programmed for survival. It's far more likely that the path to early survival traits was mutational (accidental) rather than supervised.


Language acquisition. Sure, parents will point at an object and say, "Ball", "Chair", "Food", etc. But they don't do this for the conjunctions and interjections and prepositions. Nor for most of the 10,000 words a child will learn in the first few years of life.

Children learn the rules of grammar long before they realize there are any rules at all. Tenses and plurals, etc.


It's amazing how little people on this site know or understand intelligence, artificial and human.

You would be surprised how much infants learn just from listening to what's around them, how long it takes to learn that language, and how the brain's way of interpreting language was shaped through so many years of evolution to be highly tuned to learn language fast.


We're arguing past each other then. I have no doubt that evolution is the essential thing that made us capable of unsupervised learning. That's not the central point of the debate for me. What I'm talking about is the problem of replicating that ability (unsupervised learning) in AI systems. We don't have the luxury of billions of years of evolution to do it. So we need to engineer another way.

I understood your previous comments as diminishing the sheer amount of unsupervised learning we do as humans.


[flagged]


This comment seriously violates the guidelines: https://news.ycombinator.com/newsguidelines.html. Please stop and re-read them.


I think innate intelligence / animal intelligence is developed by an evolutionary algorithm. You don't need to do backprop for it. If the network is wrong, the animal just dies.


I would agree with this approach. In evolution, it would be the equivalent of a conditional lethal mutation. In humans and even non-humans, many behaviors are learned through a form of adaptive behavior that oftentimes becomes a form of abductive reasoning. What is learned is "good enough" to serve as ground truth until there is evidence to contradict it.


What you call "evolutionary algorithm" is just a pre-trained model your DNA created. That took 3.5 billion years to get to this point.


Everything Hinton says strongly echoes Kuhn's philosophy of the scientific paradigm. I guess it's not too much of a surprise that he thinks this is the case, but it also shows how intelligent (and shockingly humble) he is to make such a statement.

I wish I'd heard of the conference, dammit!


I think it's great that top researchers are actually learning from Kuhn's observations. I have hope that humanity can indeed learn from its earlier historical mistakes, but it requires us to be very aware of them in order to collectively avoid them.


I agree.

Mr Hinton was somewhat vague and brief in his statements here, though. What do you think the crisis point is that he's reached, exactly? (The crisis must be sure enough for the man who introduced the paradigm in the first place to have caught on, yet the change of thought process unclear enough for him to recognize that it will be somebody newer who has to introduce something else.)


I think his quote is out of context. If he is saying that back propagation networks alone are insufficient for AGI, that is neither surprising nor controversial. Hinton has never been working on the AGI problem. It sounds more like the author is making use of the quotes he/she got from Hinton in the most clickbaity way possible.


I get what you're saying, but I don't think he was speaking explicitly about AGI. To me it didn't sound like he was leaving behind the applied problems he has worked on. Maybe I'm reading too much between the lines, but so far there haven't been any major successes in unsupervised learning models, and he's suggesting here that rather than continue with the models that have had success up to this point, it would be more worth the energy to start again at a lower level, rethinking the problem, in order to achieve similar results unsupervised.

This may be where my limited knowledge is my handicap, but I didn't think you would need an AGI in order to achieve unsupervised learning. There would still be an input and an output as there is now, but instead of massive datasets used in training, the models could simply experience the problem in real time, adjusting and learning based upon new results. Of course unsupervised learning models could contribute greatly to the pursuit of AGI, but I didn't think unsupervised learning presupposed AGI (which is what I am taking from the article, or rather the article abstract, here).


> Maybe I'm reading too much between the lines, but so far there haven't been any major successes in un-supervised learning models

Word2vec, to name one, a variant of which is Transformer, Google's new machine translation framework, which beats the pants off every other approach. The system it is replacing is also unsupervised.


Transformer is heavily reliant on supervised learning. Not sure where you're getting that from.


I was talking about word2vec.


I hadn't heard of that at all! Thanks. I'll check it out.


Do you mean the theory of scientific revolutions?


That's the one. I'm actually just finishing it up now, so it's fresh enough in my mind that it jumps out at me in application.


I'd be careful reading it: Kuhn is excellent at building narratives, and what is more narrative friendly than revolution? In reality science is far more messy than he acknowledges.

Unfortunately while I think many of Kuhn's observations are interesting, I'm confused what you think scientists might learn from them. After all, the work in question is science, not philosophy of science, and frankly it seems a lot of time like the latter gets in the way of work (see: Popper's radical empiricism). Poetic? Sure. Useful? Difficult to see how.... Mostly his work seems useful for bolstering the credibility of speculative Popular Science articles (or Axios in this case).


Oh well, I didn't think I'd applied any subjective notion to my recognition that it sounds like the paradigmatic event that Kuhn describes. And having recently gone through the book, it was the first thing I noticed.

My admiration for Hinton there lay separate from my remark about Kuhn.

I'm not quite sure where you got that I think scientists must learn from Kuhn's observations. But I will say, that while it's kept me away from my other studies (for the obvious displacement of reading one thing and not the other), I don't see the harm in indulging in such reflection. But my background isn't engineering, mathematics, or physical sciences before my more recent work. It's in humanities and the arts where multidisciplinary studies are encouraged. I suppose I'll probably always carry that with me. If nothing else, perspective is a good thing to have -- and not all thoughts, reflections, or ideas entertained need be engaged.

At any rate, the book was recommended and I try to remain open minded.


> In reality science is far more messy than he acknowledges.

Funny you should say that. I take Kuhn's whole point to be that science is far messier than the empiricists acknowledge. What is the complication that you think Kuhn overlooks?


So it says "In 1986, Geoffrey Hinton co-authored a paper that, four decades later, is central to the explosion of artificial intelligence." Is it just me that considers that 3 decades? I am curious which is the typo. Was it 76? or 3 decades?


I was trying to rationalize it this way:

1980's - first decade

1990's - second decade

2000's - third decade

2010's - fourth decade

I guess, technically, the third decade of age for this paper ended in 2016, making 2017 the fourth.

But this seems like silly ways to rationalize it. Genuinely seems like a typo.


1986 was 31 years ago. Maybe not decades in the sense of "1990s" being a decade. But certainly 3 periods of 10 years.


It is fantastic to hear this from a seasoned academic about his own field. People get trapped in orthodoxy and become unconsciously unwilling to explore bold ideas in favor of minor tweaks. The framework of thinking itself becomes precious, but big advances usually come from discarding the old framework in favor of something radically new. The kind of humility and bold thinking needed for that kind of change is rare and hard to foster (especially while still remaining grounded in empiricism).


I currently work in DNA analysis. I wholeheartedly agree.


That article is pretty light on details. I wonder if he pointed towards a specific form of unsupervised learning.

Anyway it's pretty funny in light of an intro I remembered from one of his old papers:

"It would be truly wonderful if randomly connected neural networks could turn themselves into useful computing devices by using some simple rule to modify the strength of synapses. This was the hope that lay behind the original Hebb learning rule and it is the vision that has driven neural network modelers for half a century. Initially, researchers tried simulating various rules to see what would happen. After a decade or two of messing around, researchers realized that there was a much better way to explore the space of possible learning rules: First write down an objective function [...] and then use elementary calculus to derive a learning rule that will improve the objective function." [1]

ie. backprop

So actually backprop was the solution to all that initial "messing around" with unsupervised rules. Though of course to be fair (if I understand correctly) those rules had very little to do with modern "unsupervised learning" methods (e.g. autoencoders, which still rely on backprop or similar optimization).

[1] http://www.cs.toronto.edu/~fritz/absps/hebbdot.pdf published in 2003
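
As a concrete instance of that recipe (a standard one, not specific to that paper): take a single linear unit y = w·x and the squared-error objective E = ½(t − y)². Then ∂E/∂w = −(t − y)x, so gradient descent gives the delta rule Δw = η(t − y)x. It even looks Hebb-ish (input times an error signal), but it falls out of the objective rather than being guessed; backprop is the same calculation pushed through multiple layers with the chain rule.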


At first glance, the obvious solution seems to be to create intelligence the same way nature did: some sort of evolution. Some sort of algorithm where multiple "networks" mutate and reproduce in some way, in response to some fitness function. Mutations that make a network more fit result in an increased reproductive rate, while other mutations decrease that rate.

Evolutionary algorithms have been around for a while but haven't really taken off. Maybe the problem is that you can't get from zero to intelligence with a single fitness function.

Think about it: the "fitness function" for our single-called ancestors was not how well they could thrive in a post-industrial service economy. Nor was it how well they could thrive as hunter-gatherers in the African Savannah. It was a very, very different fitness function.

To get from zero to humans, nature had to evolve single-celled organisms, then mitochondria, then basic multi-celled organisms, then animals that lived in the ocean, then amphibious ones, then land-dwelling ones, then mammals, then apes, then early humans, then modern humans.

By the way, I left out the overwhelming majority of the steps.

At each step, the evolutionary pressures on these organisms were wildly different. Each set of pressures was necessary in order for the next layer of complexity to evolve.

I think that if we want to evolve an intelligent entity, we would probably have to do it like this.
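
For readers who haven't seen one, a bare-bones evolutionary loop looks something like the Python sketch below. The "genome" and fitness function are made-up stand-ins (distance to a target vector), and that single fixed fitness function is exactly what the staged, changing pressures described above would replace.

    import random

    # Bare-bones evolutionary loop: a population of "genomes", a fitness
    # function, selection of the fittest, and mutation.

    def fitness(genome):
        target = [1.0, -2.0, 0.5]                 # hypothetical target
        return -sum((g - t) ** 2 for g, t in zip(genome, target))

    population = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(50)]

    for generation in range(200):
        population.sort(key=fitness, reverse=True)
        survivors = population[:10]                                # selection
        children = [[g + random.gauss(0, 0.1) for g in random.choice(survivors)]
                    for _ in range(40)]                            # mutation
        population = survivors + children

    print("best genome:", max(population, key=fitness))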


But nature also evolved a bunch of other organisms alongside humans that are fit in their own way. How do you evolve for just the (human level) intelligent entity and not a cockroach or rat?

Point is that you're going to end up with a massive ecosystem of organisms if you could fully simulate an evolutionary history. It's not even clear that human level intelligence would evolve. It's only happened once in our planet's history. We might be flukes.

You might have to run the massive simulation a thousand times and somehow be able to pinpoint the human-level intelligence when it does succeed.


Excellent points!

One would have to figure out which evolutionary pressures to apply in order to select for the desired traits. For example, at some point in the past, humans and rats had a common ancestor. That ancestor had two offspring. One offspring's descendants faced a series of evolutionary pressures that led to rats. The other's descendants faced a different series of pressures, leading to humans. The designer of a multi-stage evolutionary algorithm could choose which evolutionary pressures to apply, and when to apply them. The designer could thus shape the evolution of the program in the desired direction.

The trick is knowing what pressures to apply in order to get the desired results. How could one do that? I have no idea! But if someone can figure it out, perhaps they could mimic what nature has blindly done.

EDIT: As for the worry of ending up with a menagerie of programs, there are a couple of things that I think could help. First, the designer must be careful not to take the natural-selection analogy too far. The real world has numerous diverse environments, each with myriad niches; and in each niche, different evolutionary pressures shape organisms. This sort of algorithm need not provide all these diverse niches. It could apply a single set of pressures to all the programs in it, pushing them either in the desired direction or to extinction. Second, if the designer could identify evolutionary states that are a "dead end" and unlikely to evolve in the desired direction, they could remove the dead-end programs. Whether such dead-end states can be reliably identified is anyone's guess.


Sounds like the making of a really interesting game, at the very least.


What makes you think that our brains or intelligence are the most efficient or most effective? We happened to evolve a brain that allows us to think in ways we consider sophisticated. This doesn't mean that it's a great system. We also evolved to have a spine, which is a terrible design. There are a ton of other examples of evolution designing things that are terrible and ineffective but get the job done. Is that what we're striving for?


Why pooh-pooh ideas if you're not sure either? Let's explore, combine, and mutate the iterations that each party thinks may be successful.


You're entirely right, of course! Evolution certainly does not create anything optimal.

My point is that perhaps we understand evolution better than we understand how to create a mind from scratch.


How to create intelligent systems is already encoded in human DNA. You just have to find a way to decode it.


To be fair, reinforcement learning can be done with neural networks and can be seen as a form of "unsupervised learning". The results aren't as spectacular (yet?) as what we've seen in supervised learning, but it has potential.

I think the main problem is that we're incredibly dependent on advances in computing power to make advances in deep learning. IMHO, we've only made incremental software progress in using deep networks since the advent of convolutional networks.


If you are interested in biologically plausible models of cognition check out 'vector symbolic architectures' and 'associative memory' research of the 90s.


Biological plausibility is just silly. What it mostly does is put artificial constraints on the problem that don't need to be there. It's all mathematics. However you want to interpret or metaphorize the mathematics is up to each math-phobic field, but constraining yourself to what randomly evolved serves little good.


While it's likely true that there are other ways to approach the problem, we're as yet unsure of how to get there. The existing biological model has at least proven itself to work to some level, at least insofar as Descartes has led us to believe.


I would bet a lot of money that engineers testing model after model in a state of flow, without any concern for biological validity, will yield a better result than any biologically inspired thing will. Time efficiency is also an important factor here. The model that will win is the one that is first to "market" (academic paper).


Can't say I'm disagreeing with you. In fact, I'm not. To illustrate, I think it is far more likely we'll have "Jarvis" before we are building our own analogues of the human brain (or better).

I do want it to happen, though. And I think it will, in spite of whatever model wins. It's just too interesting a problem. How that takes shape, I can't dream of predicting. (e.g. How soon will our understanding of the use of organics in computing advance to such a level?)


Biological plausibility is only silly if you assume that biology has NOT, in ~4 billion years of evolution, managed to explore the design space and arrive at a good general maximum/peak design for intelligent processes, and is not merely stuck on the first local peak design encountered.

It assumes that outside of this biological design space lies both lower peaks and much higher unexplored peaks.

While it is likely that somewhere in the universe there are higher peaks of design schemes for intelligence, it is also likely that those peaks are difficult to reach from here (4B+ years of evolution hasn't done it yet). It's also possible that what the Fermi Paradox is showing us is that there is no substantially better general design schema.

So, the more successful approach would probably be to seek designs for the first Artificial General Intelligence in a design space that has already demonstrably succeeded, then perhaps use that AGI to search for other, better design spaces.


1) One trouble with imitating biology is that stochastically driven processes tend to achieve maxima by exploiting environmental circumstances that are not intended or considered to be part of the design constraint. See for example the genetic-algorithm-designed tone differentiator circuit on an FPGA, which achieved extreme efficiency, but the experimenters could not explain the circuit's behavior based on known theory; moreover, the circuit ceased to function correctly when tested outside of the lab it was 'grown' in.

2) Biology has to satisfy many constraints which are not hard limitations on artificial brains, such as:

* Thermodynamics concerns, surface-area to volume ratio

* A tightly controlled operating temperature of 97-99 F

* Weight (wetware brains must float in CSF because they belong to moving animals)

* Size


Point 1, we know. Just look at all the kinds of insects.

Point 2, this reads as a list of advantages compared to our current computers.


It's not silly to try and mimic the only known model that has ever worked. Lots of modern technology is inspired by "randomly evolved" mechanics, it's a perfectly reasonable place to start.


If the objective is to simulate general intelligence then you want to copy the natural one as closely as possible. Simulating is easier than inventing.


False dichotomy. The alternative is not blind trial and error.



What we lack is an AI equivalent of a theory of aerodynamics. At the moment we're just brute-forcing random solutions when we have perfectly good systems we should be trying to reverse engineer: biological brains. This will likely require genetic & evolutionary algorithms, a developmental genotype-phenotype mapping, and competitive ecological systems for open-ended evolution. Right now we have back-propagation, which is effectively reinforcement learning--just one piece of the puzzle. I actually applied to Hinton's lab in 2009 for a postdoc position to address these very issues. He said they didn't have any money and were not taking new postdocs.


I've never understood this about "AI". Why do we think that our brains or animal brains are the most effective models for intelligence? Just because we evolved to have a brain that works a certain way doesn't mean it's the most efficient or even efficient at all. Of course, I don't have much experience with AI other than what I've read.


Do you have much experience with brains? Are you aware of their ratio of computational capacity to energy requirements, which is as yet unmatched?

Not to mention that, as a system, the brain is highly efficient in reusing modular structures for increasingly complex tasks.

I have a feeling when people mention "AI modelled after a brain" they consider a simplification of neuronal dynamics to be how to achieve that. But that is a very naïve way to do it. Looking at it from the system level is the more fruitful approach, IMHO.


I don't know. I think backprop is probably utilized a lot in biological networks. Isn't that why we take tests in school? Obviously backprop doesn't make sense in an unsupervised setting. There's no label to backprop on.

But here's an example:

I see a stove eye is black when cold, then when I see it turn red, I touch it. Ow, it hurts. That's supervised learning. Don't touch things that are glowing red when they don't normally glow red. Now, I see an iron pole glowing red. It's not a spiral like the stove eye, but it's not normally glowing red and now it is. I'd better not touch it. I can deduce that these two objects are made out of a similar material since they're normally black or gray and now glow red. I can also deduce that something glowing red means it's hot. That's unsupervised learning.

To me, unsupervised learning is looking at a fluffy object with four legs, eyes, a nose, tail, it moves, etc. and knowing that it's some kind of animal. Creating groups of things like k-means. Supervised learning is your mom telling you that this one example is a tiger. Unsupervised learning is understanding the delta between your one labeled example and the rest of the examples in your animal group is largest when the animal is NOT a tiger.
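
For anyone who hasn't seen it, here is roughly what that k-means style of unsupervised grouping looks like in Python; the 2-D points are synthetic stand-ins for "animals", and no labels are used anywhere.

    import random

    # Toy k-means: unsupervised grouping of unlabeled 2-D points. Nothing
    # ever tells the algorithm which group is which, only how many to find.

    def kmeans(points, k, iters=20):
        centers = random.sample(points, k)
        for _ in range(iters):
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda i: (p[0] - centers[i][0]) ** 2
                                                    + (p[1] - centers[i][1]) ** 2)
                clusters[nearest].append(p)
            for i, cluster in enumerate(clusters):
                if cluster:  # move each center to the mean of its members
                    centers[i] = (sum(p[0] for p in cluster) / len(cluster),
                                  sum(p[1] for p in cluster) / len(cluster))
        return centers

    points = ([(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)] +
              [(random.gauss(5, 1), random.gauss(5, 1)) for _ in range(50)])
    print(kmeans(points, k=2))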


>Isn't that why we take tests in school?

Maybe not. I remember (but don't have on hand) a study where people who studied a novel concept for 1 hour and then were tested on it for 1 hour, without seeing the results of the testing, recalled the content better than those who studied it for 2 hours. Granted, lots of variants of such a study should be performed before drawing strong conclusions, but it may be that back-propagation isn't much used.

>I see a stove eye is black when cold, then when I see it turn red, I touch it. Ow, it hurts. That's supervised learning.

Yet most people learn to avoid glowing hot things without having to touch them. I can be told that something red that isn't normally red is a danger, and I learn without ever having to touch one and be told my touching it was the wrong course of action. There are also interesting things that happen with sensation vs. perception. Ever burn yourself with really hot metal that wasn't glowing? When I did, my fingers perceived extreme cold, not heat. You can also do the reverse by getting a rod made to look like it is glowing hot, chilling it until it is really cold, and hitting someone with it. They'll perceive being burned.


I think an important part of this is the label that the network creates itself. So, in your example of "four legs and fluffy = 'animal'", that label of "animal" must be a completely computer-created label. Not necessarily using the word "animal", or using any kind of identifier that humans would use. A completely artificial grouping, or domain, that is identified by the computer on its own, and maintained and stored as a potentially useful label for future use.


I think unsupervised learning is fundamentally connected with the mission/goal of the AI. Whatever mission is given to the AI, it needs to start learning the information landscape by itself, make classifications, and use that knowledge to make predictions and act accordingly to optimize the outcome of the mission it's given.


But you can't even 'give an AI a mission' without a model of the world! And, if you expect the AI to achieve its mission in a way that's acceptable to you, you better be supplying it with a big enough model for it to reasonably be able to hit that tiny target in the space of possibilities.

Take self-driving cars as an example. Do you not think that any AI that will be able to do this will be essentially 'given' a huge set of knowledge before it's expected to learn anything by itself?


But, when you revise the model based on analyzing the predictions of the previous run and adapt to the positive result, isn't that supervised learning with back-propagation?


I should point out that OpenAI put out a blog post / paper suggesting that evolution strategies can perform similarly to backpropagation-based reinforcement learning for Atari/Q-learning tasks:

https://blog.openai.com/evolution-strategies/
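
The core of the linked evolution-strategies idea fits in a few lines. This is a stripped-down Python sketch with a made-up reward function standing in for an Atari environment: sample parameter perturbations, score them, and step toward the reward-weighted average of the noise, with no gradients taken through the model.

    import numpy as np

    # Stripped-down evolution strategies (illustrative sketch, not OpenAI's code).

    def reward(params):
        target = np.array([0.5, 0.1, -0.3])       # hypothetical stand-in task
        return -np.sum((params - target) ** 2)

    params = np.zeros(3)
    npop, sigma, alpha = 50, 0.1, 0.03            # population, noise scale, step size

    for step in range(300):
        noise = np.random.randn(npop, params.size)
        rewards = np.array([reward(params + sigma * n) for n in noise])
        rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)  # normalize
        params = params + alpha / (npop * sigma) * (noise.T @ rewards)

    print("learned params:", params)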


It's time we began to incorporate meso- to large-scale structure into neural nets. The human brain is able to accomplish what it can because of the way it organizes neurons into clusters and sub-organs (thalamus, Broca's area, etc.).


I think it's silly to assume that modeling AI after how a brain works is the right approach. We happened to develop the brain we have over many random iterations over billions of years. I don't understand why we think the human brain is the optimal model or method for intelligence. I guess it gives us something we kinda sorta understand to use as an example, but it seems silly to think our brain and how it works is a good model.


I agree with the article. Throwing more cycles at a problem using the existing models isn't the answer. Better, more efficient models are needed.

And yes, I'm working toward that goal.


Can you share something about your work? Or how I may get involved?


Not yet ready to share, but I will put out a bunch of data when the time is right.

The hardest part is finding interesting challenges for the AI to conquer.


There was this tutorial on how to implement a more biologically accurate neuron in JavaScript.

https://medium.com/javascript-scene/how-to-build-a-neuron-ex...

I asked the author how to put it into use and how to train it, because clearly this biological neuron is not a smooth function, but I didn't get any response.


Does anyone have a more substantive link?


They also need to start over on all that wiring behind Dr Hinton in the photo.


I like this guy!

Yes back-propagation can help classify and predict many things.

But semantic information needs more research. For now the state of the art is Cyc.



