Why 2015 Was a Breakthrough Year in Artificial Intelligence (bloomberg.com)
120 points by JohnHammersley on Jan 1, 2016 | 66 comments



I am old enough to remember three A.I. booms and two intervening "winters". The first boom was picking all the low-hanging fruit like playing checkers, moving blocks, and solving word problems. The power of computers in the 60s and 70s was pitiful. Then the easy stuff was done for a while.

The second boom was expert systems and logic supercomputers. Those systems never worked that well, and A.I. went into a long sleep.

Now it's supercomputer data mining and greatly improved neural networks.


It's different this time!


I think GP is saying it actually is different this time.


I don't think there's going to be a winter this time. The current boom may only take us so far, but there are a couple more booms on the horizon that in all likelihood will be cusping well before this one is finished.


Care to give any examples of those upcoming booms?


Well it really is. Though they are all called "A.I.", they are wildly different fields with little in common.


All computer science is AI, including before computers existed. Boole even called his logic the "Laws of Thought". It's just the goal-posts that move.


Back in 2005 I suggested that it would be interesting to build a system that would play any game without being given directions (that blog post has since disappeared from the web). This, and not image recognition, would mean intelligence, I thought at the time: you can't truly tell a donkey from a dog without having some general knowledge about the world donkeys and dogs (and your AI) are living in.

The academic image-recognition machine, though, seems unstoppable, and yes, it does seem to improve over time. I honestly don't know what the limits of "dumb" image recognition are in terms of quality, but calling it AI still doesn't make sense to me.


AGI is the idea of creating general intelligence, which almost certainly requires some reinforcement learning (the way dogs and humans learn: by having a goal and an internal reward system).

Deepmind's Atari player is based on deep reinforcement learning, where increasing the score represents a reward:

https://m.youtube.com/watch?v=EfGD2qveGdQ
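
For a concrete sense of what "increasing the score represents a reward" means mechanically, here is a minimal tabular Q-learning sketch in Python. It is not DeepMind's deep network, just the underlying reinforcement-learning loop; the `env` object with reset()/step() and the `actions` list are hypothetical placeholders.

    import random
    from collections import defaultdict

    # Minimal tabular Q-learning sketch: the only training signal is the change
    # in score returned by the environment, mirroring "reward = score increase".
    # The `env` object (reset()/step()) and `actions` list are hypothetical.

    ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1   # learning rate, discount, exploration
    Q = defaultdict(float)                   # Q[(state, action)] -> expected return

    def choose_action(state, actions):
        if random.random() < EPSILON:        # explore occasionally
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(state, a)])  # otherwise act greedily

    def train_episode(env, actions):
        state = env.reset()
        done = False
        while not done:
            action = choose_action(state, actions)
            next_state, score_delta, done = env.step(action)  # reward = score change
            best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
            # Standard Q-learning update toward reward + discounted future value.
            Q[(state, action)] += ALPHA * (score_delta + GAMMA * best_next
                                           - Q[(state, action)])
            state = next_state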

I used to believe the moving goalpost idea, that AI is anything that isn't yet possible. I now disagree.

I saved myself a lot of energy by avoiding the question "what counts as AI?" entirely and switching to the question "what counts as AGI?", which is a term with a clearer threshold:

https://en.m.wikipedia.org/wiki/Artificial_general_intellige...


> but calling it AI still doesn't make sense to me.

That's the core problem of AI: no matter what progress is made, it's instantly called "not AI anymore", while the goalpost of what AI is keeps getting pushed out to "not this". The real issue, of course, is that we don't know how intelligence actually works, so it's impossible to set a fixed goalpost for when AI is truly achieved.


> The real issue, of course, is that we don't know how intelligence actually works, so it's impossible to set a fixed goalpost for when AI is truly achieved.

No, the real issue is that people still think, intuitively, that there's a little homunculus in their head making the decisions. Each time we build something that doesn't look like a homunculus, we've failed at attaining "real" intelligence...


> impossible to set a fixed goalpost for when AI is truly achieved

We have a kind of fixed goalpost in human intelligence. When computers are worse at something than humans, like chess, it's thought of as intelligence; when they get better, it's ticked off as just an algorithm. AI researchers gradually tick off abilities: chess long ago, image recognition happening now, general reasoning at some point in the future.


Making decisions is not that complicated, nor is it interesting. Your iPhone can make a decision to kill a process that takes too many system resources while in the background. What's more interesting (and what seems to be the main function of the so-called homunculus) is being aware of your own location in space and time, as well as remembering previous locations. In other words, having some model of the world and knowing your place in it is what computers haven't achieved yet in any meaningful way.


So, map building is now what will truly define AI until it's also just another technology.


How is map building AI? It's a pretty mechanical process. Start somewhere, make some measurements, move along, repeat. At what point is there any notion of intelligence involved?


It's deeper than you describe...

https://en.wikipedia.org/wiki/Simultaneous_localization_and_...

... but I agree that there's nothing in particular that distinguishes it from other problems.
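
To make "deeper than measure, move, repeat" a bit more concrete, here is a toy one-dimensional sketch of the SLAM idea: odometry drifts, so the robot has to keep its pose estimate and its landmark map mutually consistent. All the numbers, the single landmark, and the 50/50 correction split are made-up illustrative assumptions, not a real SLAM algorithm such as EKF-SLAM.

    import random

    # Toy 1-D SLAM-flavoured sketch: estimate the robot's position and a single
    # landmark's position from noisy motion and noisy range measurements.
    # Noise levels, the landmark, and the 0.5 blending weights are assumptions.

    true_pose, true_landmark = 0.0, 10.0
    est_pose, est_landmark = 0.0, 8.0            # start with a wrong map

    for step in range(20):
        # Move forward 1 unit; dead reckoning alone would drift over time.
        true_pose += 1.0
        est_pose += 1.0 + random.gauss(0, 0.2)

        # Measure the range to the landmark (also noisy).
        measured_range = (true_landmark - true_pose) + random.gauss(0, 0.1)

        # Split the disagreement between correcting the pose and correcting the map.
        innovation = measured_range - (est_landmark - est_pose)
        est_pose -= 0.5 * innovation
        est_landmark += 0.5 * innovation

    # Pose and map end up mutually consistent, though without an absolute
    # reference a common offset can remain -- part of what makes SLAM hard.
    print(est_pose, true_pose, est_landmark, true_landmark)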

My comment was satirical in nature. SLAM is an interpretation of what the parent comment had described:

"[B]eing aware of your own location in space and time, as well as remembering previous locations. In other words, having some model of the world and knowing your place in it[.]".

There is a general pattern of statements of the form "We'll only really have AI when computers X", followed by computers being able to X, followed by everyone concluding that X is just a simple matter of engineering like everything else we've already accomplished. As my AI prof put it, ages ago, "AI is the study of things that don't work yet."


Or it could be that a system capable of reasoning its way to doing X would be intelligent, but you can also teach to the test, so to speak, and build a system that does X without being generalizable, thus satisfying X without being intelligent.


> Or it could be that a system capable of reasoning its way to doing X would be intelligent, but you can also teach to the test, so to speak, and build a system that does X without being generalizable, thus satisfying X without being intelligent.

Which is exactly what we do with many kids today; it makes you wonder how many times we might invent AI and not know it because we don't raise it correctly, so it appears too dumb to be considered a success.


That's sort of where a lot of people have arrived, to be sure, with distinguishing the notion of an "artificial general intelligence" from other things in the AI bucket.


Your iPhone is aware of its own location in space and time and remembers previous locations as well.


I don't think the iPhone has a model of the world with its own body in it. You could program that too, and once you also add the ability to move/act according to a certain goal, you have an intelligent agent. However, making your system more or less useful is what takes the task of AI to levels of complexity we can't reach yet. Compare ants and dogs: the former is achievable in terms of simulation but not interesting, the latter can be potentially useful but is already too complex for us to implement.


Control theory is a thing. Causal induction is a thing.


Control theory: I worked with some smart folks when it came to designing plant control systems. All of it was human labor. There was zero intelligence on the part of the tools, and all parameters had to be figured out by the person designing the control strategy.

Causal induction: sounds interesting until you dig in and realize everything is non-computable.

So what exactly is your point?


>Causal induction: sounds interesting until you dig in and realize everything is non-computable.

Wait, what?


Here you go: https://en.wikipedia.org/wiki/Solomonoff%27s_theory_of_induc.... Follow the chain a little bit and you'll come to things like: http://www.hutter1.net/publ/aixiaxiom2.pdf. Super interesting but very impractical.


Causal induction is a different thing from Solomonoff Induction.

https://cocosci.berkeley.edu/tom/papers/tbci.pdf


That's basically what I'm saying in the previous sentence.


The reason behind that is probably that a large enough collection of interconnected algorithms executing simultaneously may be indistinguishable from intelligence. The questions then become 'what is large enough' and 'which algorithms', and this is where we may be off by much more than our most optimistic guesses, and that still leaves the possibility of being flat-out wrong about this.


> that a large enough collection of interconnected algorithms executing simultaneously may be indistinguishable from intelligence

Or is intelligence.


AI is like magic.

If you don't know how it works -- it looks like magic. It can tell a donkey from a horse, it can play checkers, diagnose a patient etc.

After you are told the trick -- it is just A*, or the Rete algorithm, or a multi-layer NN. That makes it less magical, and it becomes just another algorithm.
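
To make the "it is just A*" point concrete, here is a bare-bones A* sketch in Python on a grid of walkable cells; the grid, start/goal, and Manhattan heuristic are illustrative choices of mine, not anything from the thread.

    import heapq

    # Bare-bones A* on a 2-D grid: a priority queue ordered by
    # cost-so-far + heuristic. The grid/start/goal below are assumptions.

    def a_star(walkable, start, goal):
        def h(p):                               # Manhattan-distance heuristic
            return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

        frontier = [(h(start), 0, start, [start])]   # (f, g, node, path)
        seen = set()
        while frontier:
            f, g, node, path = heapq.heappop(frontier)
            if node == goal:
                return path
            if node in seen:
                continue
            seen.add(node)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nxt = (node[0] + dx, node[1] + dy)
                if nxt in walkable and nxt not in seen:
                    heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
        return None

    cells = {(x, y) for x in range(5) for y in range(5)} - {(2, 1), (2, 2), (2, 3)}
    print(a_star(cells, (0, 2), (4, 2)))        # routes around the wall at x = 2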


There isn't a lot in the world of AI that seems worthy of the I. Image recognition, though, while not a part of our conscious reasoning, is a very strong part of our brains. It continues to advance because it is immediately profitable.


That's actually starting to happen. DeepMind built an AI that can learn to beat the best human players on many Atari games after just a few hours of playing and learning. And of course it uses all that advanced image recognition stuff; that has always been the hard part. The actual game-playing part is just a simple reinforcement learning neural network that is put on top of it.

The reason it's AI is that it isn't specific to image recognition. Deep learning is very general: the same algorithms work just as well at speech recognition, or translation, or controlling robots, etc. Image recognition is just the most popular application.


I've read elsewhere that they trained it on whether the score went up. For it to really be a general game-playing AI, it should be able to figure out the goals of games without scoring systems, such as adventure games.


That is, unfortunately, impossible. All AIs need some kind of reward function, an incentive to do things. Without that they have no reason to do anything.


So your thesis is that AIs will never reach human intelligence, which can indeed figure out goals of video games on its own?


It might be able to figure them out, though it has no reason to care.

Certainly, a newborn baby given a video game controller would not be able to figure it out.


It's important to realize that the AI fails completely at other Atari games.


Specifically, the ones that require any sort of identification of state. It kicks butt at hand-eye coordination tasks, and it is awesome that it can learn those hand-eye tasks automatically, but higher-order reasoning is obviously out. AI makes progress every year, and when that progress on individual tasks is coalesced into a coherent single entity, we will have what the average person calls intelligence.


Yes, but that's because the specific ANN the DeepMind researchers used doesn't have any state. IIRC it's given the last 2 or 3 frames as input. No doubt that makes the learning easier (I should mention that reinforcement learning is used, unlike most tasks ANNs are applied to). But recently, stateful ANNs (typically based on Long Short-Term Memory (LSTM) networks, which date back to 1997) have become more and more popular. I would like to see someone make another attempt at Atari games with such a stateful network; probably it's already been done, actually.
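
For illustration, here is roughly what "last few frames as input" versus a stateful (recurrent) network looks like in code; the frame shapes, the stack size of 4, and the toy tanh update are assumptions for the sketch, not DeepMind's actual architecture.

    from collections import deque
    import numpy as np

    STACK = 4                                   # how many recent frames to stack

    # (1) Stateless network + frame stacking: the "memory" lives in the input.
    frames = deque(maxlen=STACK)

    def stacked_input(new_frame):               # new_frame: 2-D array (H, W)
        frames.append(new_frame)
        while len(frames) < STACK:              # pad at the start of an episode
            frames.append(new_frame)
        return np.concatenate(list(frames), axis=0)   # shape (STACK*H, W)

    # (2) Stateful (LSTM-like) network: memory lives in a hidden state carried
    # across frames, so a single frame per step suffices. This toy update uses
    # a plain tanh recurrence rather than real LSTM gates.
    hidden = np.zeros(128)

    def recurrent_step(new_frame, W_in, W_rec): # W_in: (128, H*W), W_rec: (128, 128)
        global hidden
        hidden = np.tanh(W_in @ new_frame.ravel() + W_rec @ hidden)
        return hidden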


This is true. The reason there's a scene in the movie, 2001, in which HAL plays chess is that at the time it was thought that playing chess well required real human intelligence.

But as soon as chess programs got good, we all took them for granted.


That's actually trivially doable and has been done for video games and board games.


One that can play ANY game, not programmed to play a specific one.


Yes, it's trivial. For board games, all AIs essentially use alpha-beta pruning plus a scoring function, and an approximation to that scoring function can be auto-generated using standard machine learning techniques.

For (classic) video games it's actually somewhat similar, but rather than feeding in a board as the input, you feed in a bitmap of the display (sometimes at a lower resolution, using compression techniques to reduce input features) and optimize the moves made to maximize the score at any point rather than at end-game.
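
A minimal sketch of that board-game recipe (negamax-style alpha-beta on top of a scoring function); `score`, `legal_moves`, and `apply_move` are hypothetical per-game helpers you would supply, and in the machine-learned variant `score` would be the learned approximation.

    # Minimal negamax-style alpha-beta sketch. `score`, `legal_moves`, and
    # `apply_move` are hypothetical per-game helpers. `score` is assumed to
    # return the evaluation from the point of view of the side to move.

    def alphabeta(board, depth, alpha, beta, score, legal_moves, apply_move):
        moves = legal_moves(board)
        if depth == 0 or not moves:
            return score(board)                 # leaf: fall back to the evaluation
        best = float("-inf")
        for move in moves:
            # Negamax: the opponent's best result is the negation of ours.
            value = -alphabeta(apply_move(board, move), depth - 1, -beta, -alpha,
                               score, legal_moves, apply_move)
            best = max(best, value)
            alpha = max(alpha, value)
            if alpha >= beta:                   # prune: opponent won't allow this line
                break
        return best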


I see no AI breakthroughs. I see image processing.


Image recognition requires AI. It used to be believed that it was simple. A famous AI researcher in the 50's once sent a bunch of grad students to solve it over the summer. They then started to realize just how complex and impossible the task was.

60 years later, we have finally made general purpose learning algorithms, vaguely inspired by the brain, which are just powerful enough to do it. And because they are general purpose, they can also do many other things as well. Everything from speech recognition, to translating sentences, or even controlling robots. Image recognition is just one of many benchmarks that can be used to measure progress.


Relevant xkcd: http://xkcd.com/1425/


Thanks to that, I found http://xkcd.com/1428/. That is genius.


I do find it inspiring. I studied neural networks at CMU and at Toronto. But if you look at the false positives that these CNN architectures generate, you'll have to agree that there is no "intelligence" present.


Image processing appears pretty central to human thought hence phrases like 'throw some light on the problem' and 'get the full picture', if you see what I mean.


Not really. One could be deaf or blind from birth, yet fully intelligent. It's more a matter of getting sensory information from the world into the brain. That information may be visual, auditory, tactile, etc. The brain is very plastic in this manner.


The processing of that non-visual input will still be performed by the visual cortex.


It certainly won't be in sighted individuals, and I doubt it can be adapted as you suggest in blind people. The visual cortex is quite specialized, and performs quite low-level image processing, e.g. detection of edges and corners, as well as movement. It does not do any object recognition. That's done elsewhere. For example, the fusiform gyrus is involved in facial recognition.

A good introduction to how the visual cortex works can be found in Eye, Brain, And Vision: http://hubel.med.harvard.edu/book/bcontex.htm


For example, echolocation (in humans, not bats) is done in the visual cortex[1]. The visual cortex also shows activity during dreams and imagination. Blind people do use their visual cortex.

[1] Sorry, don't have a cite, but there are a lot of results on Google talking about it: https://archive.is/iDE27


This doesn't surprise me. They are using their visual cortex to process sensory data (auditory images) which substitute for sight.


I see what you did there.


AI now stands for Analyzing Images


I saw a number of meaningful advances in computing technology last year, but the term "AI breakthroughs" is becoming meaningless these days.


2700 deep learning projects at Google... What are they doing, besides the obvious? What's a good ball-park estimate of the number of "projects at Google"?


2700 doesn't strike me as remotely crazy. I'm guessing that many of those projects serve the same goal; e.g., for their speech system, they have: acoustic modeling; language modeling; named-entity recognition; intent classification; domain classification; grapheme-to-phoneme conversion; language detection; wake-word detection. This ignores other stuff that happens around speech (for example, I know they were using a CRF to label different types of numbers for training their spoken-to-written-form converters, which AFAIK are still using WFSTs, although at this point I wouldn't be shocked if both of those systems were converted to DNNs). So let's take an estimate of 10 DNNs for their speech systems. Per language, so make that 200 DNNs to support 20 languages. This ignores that they have separate models for YouTube, voice search (one model for on-device and a cloud-side model), and voicemail.

Their machine translation system probably has a similar # of DNNs, and there you have to deal with language pairs, rather than single languages. Let's call it another 400.

That's two side-projects. Then you pull in query prediction, driverless cars, all kinds of infrastructure modeling, spam detection, all of the billions of things that are happening in ads, recommendations; and I haven't really even mentioned search yet. Honestly, if I'm right in assuming that the cited figure is really "# of DNNs that do different things", then I'm surprised it's not higher.
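
Spelling out that back-of-envelope arithmetic (every number below is just the rough guess from the comment, not a real figure):

    # Fermi estimate from the comment above; all counts are guesses.
    models_per_speech_language = 10      # acoustic, language model, NER, intent, ...
    languages = 20
    speech_models = models_per_speech_language * languages    # ~200

    translation_models = 400             # language *pairs*, so roughly double

    # ~600 from just two "side-projects", before ads, search, spam, cars, ...
    print(speech_models + translation_models)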


Why are there no units on those graphs?


I'll believe it's AI when the speech recognition error rates finally start dropping again.


Speech recognition has barely improved since the 1990s.

We had Dragon NaturallySpeaking on a 133MHz Win95 PC (offline, of course). After training it for about 10 minutes, it worked as well as or better than Ford's Sync car assistant (offline) and Siri/Google Now/Cortana. Then again, all these services licensed the Nuance speech technology, which Nuance got by buying the company behind the Dragon NaturallySpeaking software. The Ford board computer runs WinCE on only 233MHz and is still sold in many 2016 Ford cars around the world. And with cloud hosting, to scale the service, each user gets only a small slice of total CPU time anyway.

What I want is offline speech recognition software on my mobile devices! Do I have to install Win95 in an emulator on my smartphone just so my multi-core, high-end smartphone can do what a Pentium 1 could do in 1996? My hope is in open-source projects, though most such OSS projects are university projects with little documentation on how to build the speech model, little community, an outdated site, code written in Java 1.4, and no GitHub page. There is definitely a need for a good, competitive C/C++ (native code) TTS and speech recognition project.


> Speech recognition has barely improved since the 1990s.

I find that hard to believe; do you have any citations for that, or is it just your gut feeling?

A cursory search shows a 26% error rate[1] for Dragon NaturalSpeaking in the year 2000 (beaten by IBM in the same report at 17%).

By May 2015, if Sundar Pichai is to be believed, Google has an 8% error rate[2]. In my book, going from 26% to 8% (or even 17% to 8%) is far from "barely improved".

1. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC79041/#!po=1.562... Table 3, General Vocabulary

2. http://venturebeat.com/2015/05/28/google-says-its-speech-rec...


> By May 2015, if Sundar Pichai is to be believed, Google has an 8% error rate[2].

Much of Google's stuff is for search-term recognition only. Its functionality on general dictation is nowhere near that good.


This is an old article. Posted December 12, 2015.


That's less than a month ago?




