AI winter is well on its way (piekniewski.info)
993 points by wei_jok on May 30, 2018 | 492 comments



I was recently "playing" with some radiology data. With untrained eyes I had no chance of identifying diagnoses myself, something that probably takes years for a decent radiologist to master. Just by using DenseNet-BC-100-12 I ended up with 83% ROC AUC after a few hours of training. In 4 out of 12 categories this classifier beat the best-performing human radiologists. Now the very same model, with no change other than adjusting the number of categories, could be used in any image classification task, likely with state-of-the-art results. I was surprised when I applied it to another, completely unrelated dataset and got >92% accuracy right away.
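For the curious, this kind of experiment boils down to roughly the following sketch (hedged: torchvision's densenet121 stands in for DenseNet-BC-100-12, and the dataset/loader are placeholder names, not the radiology data):

  import torch
  import torch.nn as nn
  import torchvision
  from sklearn.metrics import roc_auc_score

  NUM_CLASSES = 12  # number of diagnostic categories (assumption)

  # Pretrained backbone, new classification head for the target categories.
  model = torchvision.models.densenet121(pretrained=True)
  model.classifier = nn.Linear(model.classifier.in_features, NUM_CLASSES)

  criterion = nn.BCEWithLogitsLoss()  # multi-label findings per image
  optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

  def mean_roc_auc(model, loader):
      """Macro-averaged ROC AUC over a validation loader."""
      model.eval()
      ys, ps = [], []
      with torch.no_grad():
          for x, y in loader:
              ps.append(torch.sigmoid(model(x)).cpu())
              ys.append(y.cpu())
      return roc_auc_score(torch.cat(ys).numpy(), torch.cat(ps).numpy(), average="macro")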

If you think this is a symptom of AI winter, then you are probably wasting time on outdated/dysfunctional models, or models that aren't suited for what you want to accomplish. Look e.g. at Google Duplex (better voice synthesis than the Vocaloid I use for making music): it pushed the state of the art to unbelievable levels in hard-to-address domains. I believe the whole software industry will spend the next 10 years gradually bringing these concepts into production.

If you think Deep (Reinforcement) Learning is going to solve AGI, you are out of luck. If, however, you think it's useless and won't bring us anywhere, you are guaranteed to be wrong. Frankly, if you are working with Deep Learning daily, you are probably not seeing the big picture (i.e. how horrible the methods used in real life are, and how easily you can get a very economical 5% benefit just by plugging Deep Learning in somewhere in the pipeline; this might seem like little, but managers would kill for 5% of extra profit).


AI winters are a result of a massive disparity between the expectations of the general public and the reality of where the technology currently sits. Just like an asset bubble, the value of the industry as a whole pops as people collectively realize that AI, while not being worthless, is worth significantly less than they thought.

Understand that in pop-sci circles over the past several years the general public has been exposed to stories warning about the singularity from well-respected people like Stephen Hawking and Elon Musk (http://time.com/3614349/artificial-intelligence-singularity-...). Autonomous vehicles are on the roads and Boston Dynamics is showing very real robot demonstrations. Deep learning is breaking records in what we thought was possible with machine learning. All of this progress has excited an irrational exuberance in the general public.

But people don't have a good concept of what these technologies can't do, mainly because researchers, business people, and journalists don't want to tell them--they want the money and attention. But eventually the general public wises up to the unfulfilled expectations and turns its attention elsewhere. Here we have the AI winter.


I'd clarify that there is a specific delusion that any data scientist straight out of some sort of online degree program can go toe to toe with the likes of Andrej Karpathy or David Silver with the power of "teh durp lurnins'." And the predictable disappointment arising from the craptastic shovelware they create is what's finally bringing about the long overdue correction.

Further, I have repeatedly heard people who should know better, with very fancy advanced degrees, chant variants of "Deep Learning gets better with more data" and/or "Deep Learning makes feature engineering obsolete" as if they are trying to convince everyone around them as well as themselves that these two fallacious assumptions are the revealed truth handed down to mere mortals by the 4 horsemen of the field.

That said, if you put your ~10,000 hours into this, and keep up with the field, it's pretty impressive what high-dimensional classification and regression can do. Judea Pearl concurs: https://www.theatlantic.com/technology/archive/2018/05/machi...

My personal (and admittedly biased) belief is that if you combine DL with GOFAI and/or simulation, you can indeed work magic. AlphaZero is strong evidence of that, no? And the author of the article in this thread is apparently attempting to do the same sort of thing for self-driving cars. I wouldn't call this part of the field irrational exuberance, I'd call it amazing.


> Deep Learning makes feature engineering obsolete

I think even if you avoid constructing features, you are basically doing a similar process where a single change in a hyper-parameter can have significant effects:

- internal structure of a model (what types of blocks are you using and how do you connect them, what are they capable of together, how do gradients propagate?)

- loss function (great results come only if you use a fitting loss function)

- category weights (i.e. improving under-represented classes)

- image/data augmentation (a self-driving car won't work at all without significant augmentation)

- properly set-up optimizer

The good thing here is that you can automate optimization of these to a large extent if you have a cluster of machines and a way to orchestrate meta-optimization of slightly changed models. With feature engineering you just have to do all the work upfront, thinking about what might be important, and often you simply miss important features :-(
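A minimal sketch of what such meta-optimization could look like, assuming plain random search over a few of the knobs above (build/train details are hidden behind a hypothetical train_and_score callback; a real setup would farm each trial out to the cluster):

  import random

  SEARCH_SPACE = {
      "loss":         ["cross_entropy", "focal"],
      "class_weight": [None, "balanced"],
      "augmentation": ["light", "heavy"],
      "lr":           [1e-2, 1e-3, 1e-4],
  }

  def sample_config():
      return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

  def random_search(n_trials, train_and_score):
      """train_and_score(cfg) trains a slightly changed model and returns e.g. validation ROC AUC."""
      best_cfg, best_score = None, float("-inf")
      for _ in range(n_trials):
          cfg = sample_config()
          score = train_and_score(cfg)
          if score > best_score:
              best_cfg, best_score = cfg, score
      return best_cfg, best_score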


Yep, and in doing so, you just traded "feature engineering" for graph design and data prep, no? And that's my response to these sorts. And their usual response to me is to grumble that I don't know what I'm doing. I've started tuning them out of my existence because they seem to have nothing to contribute to it.


It's a huge difference in terms of the time invested to create something that performs well. Hand crafted feature engineering is better for some tasks but for quite a few of them automated methods perform very well indeed (at least, better than I expected).


> if you have a cluster of machines and a way to orchestrate meta-optimization of slightly changed models

Curious if there is any good-quality open source project for this...


I'm not aware of existing code/projects that do this, but try looking into neural architecture search; it should be useful: https://github.com/markdtw/awesome-architecture-search


> But eventually the general public wises up to the unfulfilled expectations and turns its attention elsewhere. Here we have the AI winter.

And more importantly, business and government leaders wise up and turn off the money tap.


This is why it's wise for researchers and business leaders to temper expectations. Better a constant modest flow of money into the field than a boom-bust cycle with huge upfront investment followed by very bearish actions.


I think the problem is, that's absolutely against the specific interests of university departments, individual researchers, and newspapers - even if it's in the interest of the field as a whole.


Prisoner's Dilemma


That requires super rational agents in the game theory sense...


> AI winters are a result of a massive disparity between the expectations of the general public and the reality of where the technology currently sits.

I think they also happen when the best ideas in the field run into the brick wall of insufficiently developed computer technology. I remember writing code for a perceptron in the '90s on an 8-bit system with 64 KB of RAM - it's laughable.

But right now compute power and data storage seem plentiful, so rumors of the current wave's demise appear exaggerated.


I wonder, though, what will happen with the demise of Moore's law... can we simply go with increased parallelism? How much can that scale?


That part will be harder than we can imagine.

Most of the software world will have to move to something like Haskell or another functional language. As of now the bulk (almost all) of our people are trained to program in C-based languages.

It won't be easy. There will be renewed high demand for software jobs.


I don't think Haskell/FP is the solution either... Even if it allows some beautiful, straightforward parallelization in Spark for typical cases, more advanced cases become convoluted, require explicit caching, and decrease performance significantly unless some nasty hacks are involved (resembling the cut operator in Prolog). I guess the bleeding edge will always be difficult, and one should not restrict their choices to a single paradigm.


I wish GPUs were 1000x faster... Then I could do some crazy magic with Deep Learning instead of waiting weeks for training to be finished...


That's more a matter of budget than anything else. If your problem is valuable enough, spending the money in a short time-frame rather than waiting for weeks can be well worth the investment.


I cannot fit a cluster of GPUs into a phone where I could make magic happen real-time though :(


Hm. Offload the job to a remote cluster? Or is comms then the limiting factor?


It won't give us that snappy feeling; imagine learning things in milliseconds and immediately displaying them on your phone.


Jeez. That would be faster than protein-and-water-based systems, which up until now are still the faster learners.


somebody is working on photonics-based ML http://www.lighton.io/our-technology


> AI winters are a result of a massive disparity between the expectations of the general public and the reality of where the technology currently sits.

A symptom of capitalism and marketing trying to push shit they don't understand


I don't think the claim is that AI isn't useful. It's that it's oversold. In any case, I don't think you can tell much about how well your classifier is working for something like cancer diagnoses unless you know how many false negatives you have (and how that compares to how many false negatives a radiologist makes).


There are two sides to this:

- how good humans are at detecting cancer (hint: not very good), and whether having an automated system, even just as a "second opinion" next to an expert, might be useful

- there are metrics for capturing true/false positives/negatives one can focus on during learning optimization

From studies you might have noticed that expert radiologists have an F1 score of e.g. 0.45 and on average they score 0.39, which sounds really bad. Your system manages to push the average to 0.44, which might be worse than the best radiologist out there, but better than an average radiologist [1]. Is this really being oversold? (I am not addressing possible problems with overly optimistic datasets etc., which are real concerns.)

[1] https://stanfordmlgroup.github.io/projects/chexnet/
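To make the comparison concrete: those numbers are just per-reader F1 scores computed against whatever is taken as ground truth, e.g. (toy arrays, not the CheXNet data):

  import numpy as np
  from sklearn.metrics import f1_score

  y_true  = np.array([1, 0, 1, 1, 0, 1, 0, 0])    # ground-truth labels for one finding
  y_model = np.array([1, 0, 0, 1, 0, 1, 1, 0])    # model predictions
  y_rads  = [np.array([1, 0, 1, 0, 0, 0, 1, 0]),  # one row per radiologist
             np.array([0, 0, 1, 1, 0, 1, 0, 1])]

  print("model F1:", f1_score(y_true, y_model))
  print("mean radiologist F1:", np.mean([f1_score(y_true, y) for y in y_rads]))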


Alright. What is the cost of a false positive in that case?

The problem AI runs into is that with too much faith in the machine, people STOP thinking and believe the machine. Where you might get a .44 detection rate on radiology data alone, that radiologist with a .39, or a doctor, can consult alternate streams of information. The AI may still be helpful in reinforcing a decision to continue scrutinizing a set of problems.

AIs as we call them today are better referred to as expert systems. "AI" carries too much baggage to be thrown around willy-nilly. An expert system may beat out a human at interpreting large, unintuitive datasets, but they aren't generally testable, and like it or not, it will remain a tough sell in any situation where lives are on the line.

I'm not saying it isn't worth researching, but AI will continue to fight an uphill battle in terms of public acceptance outside of research or analytics spaces, and overselling or being anything but straightforward about what is going on under the hood will NOT help.


> The problem AI runs into is that with too much faith in the machine, people STOP thinking and believe the machine.

See https://youtu.be/R_rF4kcqLkI?t=2m51s

In medicine, I want everyone to apply appropriate skepticism to important results, and I don't want to enable lazy radiologists to zone out and press 'Y' all day. I want all the doctors to be maximally mentally engaged. Skepticism of an incorrect radiologist report recently saved my dad from some dangerous, and in his case unnecessary, treatment.


Or for a more mundane example: I tried to identify a particular plant by doing an image-based ID with Google. It was identified as a broomrape because the pictures only had non-green portions of the plant in question. It was ACTUALLY a member of the thistle family.


The problem could be fixed by asking doctors to put their diagnosis into the machine before the machine reveals what it thinks. Then, a simple Bayesian calculation could be performed based on the historical performance of that algorithm, all doctors, and that specific doctor, leading to a final number that would be far more accurate. All of the thinking would happen before the device polluted the doctor's cognitive biases.
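A toy version of that calculation, naively assuming the doctor's and the algorithm's calls are conditionally independent given the true state and that their historical sensitivity/specificity are known (all numbers made up):

  def posterior(prior, opinions):
      """opinions: list of (said_positive, sensitivity, specificity) tuples."""
      p_pos, p_neg = prior, 1.0 - prior
      for said_positive, sens, spec in opinions:
          if said_positive:
              p_pos *= sens          # P(says yes | disease)
              p_neg *= 1.0 - spec    # P(says yes | no disease)
          else:
              p_pos *= 1.0 - sens
              p_neg *= spec
      return p_pos / (p_pos + p_neg)

  # doctor says yes (sens 0.39, spec 0.95); model says yes (sens 0.44, spec 0.90); 5% prior
  print(posterior(0.05, [(True, 0.39, 0.95), (True, 0.44, 0.90)]))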


There is a problem with that approach: at some point hospital management starts rating doctors by how well their diagnoses match the automated ones, and punishing those who deviate too much, removing any incentive to be better/different. I wouldn't underestimate this; dysfunctional management exhibits these traits in almost any mature business.


No, it's a "second opinion", and the human doctors are graded on how well their own take differs from the computer's advice when the computer's advice differs from the ground truth.

And there's probably not even a boolean "ground truth" in complicated bio-medicine problems. Sometimes the right call is neither yes nor no, but: this is not like anything I've seen before, I can't give a decision either way, I need further tests.


Is there a prevailing approach to thinking about (accounting for?) false negatives in ground truth data? I'm new to this area, and the question is relevant for my current work. By definition, you simply don't know anything about false negatives unless you have some estimate of specificity in addition to your labeled data, but can anything be done?
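One partial answer, if you can get an outside estimate of the labelling process's sensitivity and specificity (e.g. from a re-read of a subsample): the Rogan-Gladen correction backs the true prevalence out of the apparent one. Numbers below are placeholders.

  def rogan_gladen(apparent_prevalence, sensitivity, specificity):
      # apparent = sens * true + (1 - spec) * (1 - true), solved for true
      return (apparent_prevalence + specificity - 1.0) / (sensitivity + specificity - 1.0)

  print(rogan_gladen(apparent_prevalence=0.12, sensitivity=0.80, specificity=0.95))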


I don't get the sentiment of the article either. I can't speak for researchers but software engineers are living through very exciting times.

  State of the art in numbers:
  Image Classification - ~$55, 9hrs (ImageNet)
  Object Detection - ~$40, 6hrs (COCO)
  Machine Translation - ~$40, 6hrs (WMT '14 EN-DE)
  Question Answering - ~$5, 0.8hrs (SQuAD)
  Speech recognition - ~$90, 13hrs (LibriSpeech)
  Language Modeling - ~$490, 74hrs (LM1B)
"If you think Deep (Reinforcement) Learning is going to solve AGI, you are out of luck" --

I don't know. Duplex equipped with a way to minimize its own uncertainties sounds quite scary.


Duplex was impressive but cheap street magic: https://medium.com/@Michael_Spencer/google-duplex-demo-witch...

Microsoft OTOH quietly shipped the equivalent in China last month: https://www.theverge.com/2018/5/22/17379508/microsoft-xiaoic...

Google has lost a lot of steam lately IMO. Facebook is releasing better tools and Microsoft, the company they nearly vanquished a decade ago, is releasing better products. Google does remain the master of its own hype though.


> Microsoft, the company they nearly vanquished a decade ago, is releasing better products.

Google nearly vanquished Microsoft a decade ago? Where can I read more about this bit of history :) ?

IMO, Axios [0] seem to do a better job of criticizing Google's Duplex AI claims, as they repeatedly reached out to their contacts at Google for answers.

0: https://www.axios.com/google-ai-demo-questions-9a57afad-9854...


I think they are overselling Google's contributions a bit. It was more "Web 2.0" that shook Microsoft's dominance in tech. Google was a big curator and pushed the state of the art. Google was built on a large network of commodity hardware; they were able to do that because of open source software. Microsoft licensing would have been prohibitive to such innovation. There was some reinforcement that helped Linux gain momentum in other domains like mobile and desktop. Google helped curate "Web 2.0" with developments/acquisitions like Maps and Gmail. When more of your life was spent on the web, the operating system meant less, and that's also why Apple was able to make strides with their platforms. People weren't giving up as much when they switched to Mac as they would have previously.

Microsoft was previously the gatekeeper to almost every interaction with software (roughly 1992 - 2002). I don't know of good books on it but Tim O'Reilly wrote quite a bit about Web 2.0.


My question was actually tongue-in-cheek, which I tried to communicate with the smiley face.

I'm quite familiar with Google's history and would not characterize them as having vanquished Microsoft.

For the most part, Microsoft doesn't need to lose for Google to win (except of course in the realm of web search and office productivity).


You're right, it was Steve Ballmer who nearly vanquished Microsoft at a time when Google was the company to work for in tech and kept doing amazing things. At least IMO.

Unfortunately, by the time of my brief stint at Google, the place was a professional dead-end where most of the hirees got smoke blown up their patooties at orientation about how amazing they were to be accepted into Google, only to be blindly allocated into me-too MVPs of stuff they'd read about on TechCrunch. All IMO of course.

That said, I met the early Google Brain team there and I apparently made a sufficiently negative first impression for one of their leaders to hold a grudge against me 6 years later, explaining at last who it was that had blacklisted me there. So at least that mystery is solved.

PS It was pretty obvious these were voice actors in a studio conversing with the AI. That is impressive, but speaking as a former DJ myself, when one has any degree of voice training, one pronounces words without much accent and without slurring them together. Google will likely never admit anything here: they don't have to.

But I will give Alphabet a point for Waymo being the most professionally-responsible self-driving car effort so far. Compare and contrast with Tesla and Uber.


My thoughts on AGI (at least in the sense of being indistinguishable from interaction with a human) are the same as my thoughts on extraterrestrial life: I'll believe it only when I see it (or at least when provided with proof that the mechanism is understood). This extrapolation on a sample size of one is something I don't understand. How is the fact that machine learning can do specific stuff better than humans different in principle than the fact that a hand calculator can do some specific stuff better than humans? On what evidence can we extrapolate from this to AGI?

We haven't found life outside this planet, and we haven't created life in a lab, therefore n=1 for assessing probability of life outside earth (which means we can't calculate a probability for this yet). Likewise, we haven't created anything remotely like animal intelligence (let alone human) and we have no good theory regarding how it works, so n=1 for existing forms of general intelligence.

Note that I'm not saying there can be no extraterrestrial life or that we will never develop AGI, just that I haven't seen any evidence at this point in time that any opinions for or against their possibility are anything more than baseless speculation.


This is what we know from Google about Duplex:

"To train the system in a new domain, we use real-time supervised training. This is comparable to the training practices of many disciplines, where an instructor supervises a student as they are doing their job, providing guidance as needed, and making sure that the task is performed at the instructor’s level of quality. In the Duplex system, experienced operators act as the instructors. By monitoring the system as it makes phone calls in a new domain, they can affect the behavior of the system in real time as needed. This continues until the system performs at the desired quality level, at which point the supervision stops and the system can make calls autonomously." --


If the dollar amounts refer to the training cost for the cheapest DL model, do you have references for them? A group of people at fast.ai trained an ImageNet model for $26, presumably after spending a couple hundred on getting everything just right: http://www.fast.ai/2018/04/30/dawnbench-fastai/


That's what you get with Google TPUs on reference models. The ImageNet numbers are from RiseML; the rest is from here - https://youtu.be/zEOtG-ChmZE?t=1079


"Just by using DenseNet-BC-100-12 I ended up with 83% ROC AUC after a few hours of training."

OK, but 83% ROC/AUC is nothing to be bragging about. ROC/AUC routinely overstates the performance of a classifier anyway, and even so, ~80% values aren't that great in any domain. I wouldn't trust my life to that level of performance, unless I had no other choice.

You're basically making the author's case: deep learning clearly outperforms on certain classes of problems, and easily "generalizes" to modest performance on lots of others. But leaping from that to "radiology robots are almost here!" is folly.


Yeah, but the point here was that radiologists on average fared even worse. 83% is not impressive, but it's better than what we have right now in the real world with real people, as sad as that is. Obviously, the best radiologists would outperform it right now, but average ones, likely stressed under a heavy workload, might not be able to beat it. And of course, this classifier probably works better than humans on certain visual structures, while other ones that humans detect more easily would slip through.

There is also a higher chance that the next state-of-the-art model will push it significantly over 83%, or over the best human radiologist at some point in the future, so it might not be very economical to train humans to become even better (i.e. to dedicate your life to focusing on radiology diagnostics only).


I think you're missing a very important part here that maybe you haven't considered: domain knowledge. I'm assuming your radiology images were hand-labeled by other radiologists. How did they come to that diagnosis? By only looking at the image? This was a severe limitation of the Andrew Ng paper on CheXNet for detecting pneumonia from chest X-rays. CheXNet was able to outperform radiologists at detecting pneumonia from chest X-rays, but the diagnosis of pneumonia is considered a clinical diagnosis that requires more information about the patient. My point is that while your results are impressive and indicative of where deep learning could help in medicine, these same results might be skewed since you're testing the model on hand-labeled data. What happens if you apply this in the real world at a hospital, where the radiologist gets the whole patient chart and your model only gets the X-ray?


There is a paper discussing higher-order interdependencies between diagnoses [1] on X-ray images (they seem to apply an LSTM to derive those dependencies). This could probably be extended to include data outside X-ray images. My take is that it's pretty impressive what we can derive from a single image; now if we have multiple low-level Deep Learning-based diagnostic subsystems and combine them via some glue (either Deep (Reinforcement) Learning, classical ML, a logic-based expert system, PGMs, etc.), we might be able to represent/identify diagnoses with much more certainty than any single individual M.D. possibly could (while also creating some blind spots that humans wouldn't leave unaddressed). It could be difficult to estimate the statistical properties of the whole system, though, but that's a problem with any complex system, including a group of expert humans.

The main critique for CheXNet I've read was focused on the NIH dataset itself, not the model. The model generalizes quite well across multiple visual domains, given proper augmentation.

[1] https://arxiv.org/abs/1710.10501
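As an illustration of the "glue" idea (not how the cited paper does it), the simplest version is just stacking: treat each subsystem's output probability as a feature and train a small combiner on top. The subsystem outputs below are random stand-ins:

  import numpy as np
  from sklearn.linear_model import LogisticRegression

  rng = np.random.default_rng(0)
  n_patients, n_subsystems = 1000, 3                        # e.g. X-ray, labs, history models
  subsystem_probs = rng.random((n_patients, n_subsystems))  # per-subsystem probability of a finding
  labels = (subsystem_probs.mean(axis=1) + 0.1 * rng.standard_normal(n_patients) > 0.5).astype(int)

  combiner = LogisticRegression().fit(subsystem_probs, labels)
  print(combiner.predict_proba(subsystem_probs[:5])[:, 1])  # combined probabilities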


"Yeah, but the point here was that radiologists on average fared even worse."

Except they don't. See the table in the original post. Also, comparing the "average" radiologist by F1 scores from a single experiment (as you've done in other comments here) is meaningless.

Unless my doctor is exactly average (and isn't incorporating additional information, or smart enough to be optimizing for false positive/negative rates relative to cost), comparison to average statistics is academic. But I don't really need to tell you this -- your comment has so many caveats that you're clearly already aware of the limits of the method.


This thread is a microcosm of this whole issue of overhyping.

On one hand, we have one commenter saying he can train a model to do a specific thing with a specific quantitative metric, to demonstrate how deep learning can be incredibly powerful/useful.

On the other hand, we have another commenter saying "But this won't replace my doctor!" and therefore deep learning is overhyped.

The two sides aren't even talking about the same thing.


Agree that the thread is a microcosm of the debate, but ironically, I'm not trying to say anything like "this won't replace my doctor".

That kind of hyperventilating stuff is easy to brush off. The problem with deep-learning hype is that comments like "my classifier gets a ROC/AUC score of 0.8 with barely any work!" are presented as meaningful. The difference between a 0.8 AUC and a usable medical technology means that most of the work is ahead of you.


Agreed. I think it comes down to the presentation/interpretation of results. The response to "My classifier gets score of X" can be either "wow, that's a good score for a classifier, this method has merit" or "but X is not a good measure of [actual objective]".

So I think it's come down to conflict between

1. What the author is trying to present
2. What an astute reader might interpret it as
3. What an astute reader might worry an uninformed reader might interpret it as

And my feeling is that, given all the talk about hype in pop-sci, we're actually on point 3 now, even when the author and reader are actually talking about something reasonable. Whereas personally I'm more interested in the research and interpretations from experts, which I find tend to be not so problematic.


> Unless my doctor is exactly average

Just to get back to this point: what if the vision system of your doctor is below average and you augment her by giving her a statistically better vision system, while allowing her to use the additional sources as she sees fit. Wouldn't that be an improvement? We are talking about the vision subsystem here, not the whole "reasoning package" human doctors possess.


Again, check that table. It says a lot:

https://stanfordmlgroup.github.io/competitions/mura/

On just about every test set, the model is beaten by radiologists. Even the mean performance is underwhelming.


I was referring mainly to this one (from the same group and it actually surpassed humans on average):

https://stanfordmlgroup.github.io/projects/chexnet/

In their paper they even used the "weaker" DenseNet-121 instead of DenseNet-169 for MURA/bones. The DenseNet-BC I tried is another refinement of the same approach.


Those are some sketchy statistics. The evaluation procedure is questionable (F1 against the other 4 as ground truth? Mean of means?), and the 95% CI overlap pretty substantially. Even if their bootstrap sampling said the difference is significant, I don't believe them.

Basically, I see this as "everyone sucks, but the AI maybe sucks a little less than the worst of our radiologists, on average"


What would be good metrics then? Of course metrics are just indicators that can be interpreted incorrectly. Still, we have to measure something tangible. What would you propose? I am aware of the limitations and would gladly use something better...

Some people mention the Matthews correlation coefficient, Youden's J statistic, Cohen's kappa, etc., but I haven't seen them in any Deep Learning paper so far and I bet they have large blind spots as well.
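For what it's worth, all three are a one-liner to compute once you have predictions; a quick sketch with toy arrays:

  import numpy as np
  from sklearn.metrics import matthews_corrcoef, cohen_kappa_score, confusion_matrix

  y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
  y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 0])

  mcc   = matthews_corrcoef(y_true, y_pred)
  kappa = cohen_kappa_score(y_true, y_pred)
  tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
  youden_j = tp / (tp + fn) + tn / (tn + fp) - 1  # sensitivity + specificity - 1

  print(mcc, kappa, youden_j)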


> Just by using DenseNet-BC-100-12 I ended up with 83% ROC AUC after a few hours of training

Of course! Using DenseNet-BC-100-12 to increase ROC AUC, it was so obvious!


Would you mind sharing which other, unrelated dataset you have used the model on?



I can't, unfortunately; it's proprietary stuff being plugged into an existing business right now.


The next winter will probably be about getting over that 92% across all domains.


Possibly, but will it be called an AI winter if, e.g., the average human has 88% accuracy and the best human 97%?


Yeah, this sounds extremely unlikely unless the other dataset has a fairly easy decision boundary. The kind of cross-domain transfer learning you seem to think deep neural networks have is nothing I've observed in my formal studies of neural networks.


How much of this can we pin on IBM's overhype of Watson?


ROC AUC is fairly useless when you have disparate costs in the errors. Try precision-recall.
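A quick synthetic illustration of why: with a ~0.1% positive rate, a weakly informative scorer still posts a respectable-looking ROC AUC while average precision (area under the precision-recall curve) stays far lower.

  import numpy as np
  from sklearn.metrics import roc_auc_score, average_precision_score

  rng = np.random.default_rng(0)
  y_true = (rng.random(100_000) < 0.001).astype(int)    # ~0.1% positives
  scores = rng.standard_normal(100_000) + 1.0 * y_true  # positives shifted by one sigma

  print("ROC AUC:          ", roc_auc_score(y_true, scores))            # around 0.76
  print("Average precision:", average_precision_score(y_true, scores))  # far lower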


I mentioned F1 in some later comment.


This is a deep, significant post (pardon pun etc).

The author is clearly informed and takes a strong, historical view of the situation. Looking at what the really smart people who brought us this innovation have said and done lately is a good start imo (just one datum of course, but there are others in this interesting survey).

DeepMind hasn't shown anything breathtaking since AlphaGo Zero.

Another thing to consider about AlphaGo and AlphaGo Zero is the vast, vast amount of computing firepower that this application mobilized. While it was often repeated that ordinary Go programs weren't making progress, this wasn't true - the best amateur programs had gotten to about 2 dan amateur using Monte Carlo Tree Search. AlphaGo added CNNs for its weighting function and enormous amounts of compute for its search, and got effectiveness up to best in the world, 9 dan professional (maybe 11 dan amateur for pure comparison). [1]

AlphaGo Zero was supposedly even more powerful and learned without human intervention. BUT it cost enormous amounts of compute, expensive enough that they released a total of only ten or twenty AlphaGo Zero games to the world, labeled "A great gift".

The author conveniently reproduces the chart of computing power versus results. Look at it, consider it. Consider the chart in the context of Moore's Law retreating. The problems of AlphaZero generalize, as described in the article.

The author could also have dived into the troubling question of "AI as an ordinary computer application" (what do testing, debugging, interface design, etc. mean when the app is automatically generated in an ad-hoc fashion) or "explainability". But when you can paint a troubling picture without these gnawing problems even appearing, you've done well.

[1] https://en.wikipedia.org/wiki/Go_ranks_and_ratings


>DeepMind hasn't shown anything breathtaking since AlphaGo Zero

They went on to make AlphaZero, a generalised version that could learn chess, shogi or any similar game. The chess version beat a leading conventional chess program 28 wins, 0 losses, and 72 draws.

That seemed impressive to me.

Also, they used loads of compute during training but not so much during play (5,000 TPUs vs 4 TPUs).

Also it got better than humans in those games from scratch in about 4 hours whereas humans have had 2000 years to study them so you can forgive it some resource usage.


It's not like humanity really needs another chess playing program 20 years after IBM solved that problem (but now utilizing 1000x more compute power). I just find all these game-playing contraptions really uninteresting. There are plenty of real-world problems to be solved of much higher practicality. Moravec's paradox in full glow.


The fact that it beat Stockfish9 is not what is impressive with AlphaZero.

What was impressive was the way Stockfish9 was beaten. AlphaZero played like a human player, making sacrifices for position that Stockfish thought were detrimental. When it played as white, the fact that it mostly started with the queen pawn (despite the king pawn being "best by test") and the way AlphaZero used Stockfish's pawn structure and tempo to basically remove a bishop from the game was magical.

Yes, since it's a game, it's "useless", but it allowed me (and I'm not the only one) to get a bit better at chess. It's not world hunger, not climate change, it's just a bit of distraction for some people.

PS: I was among the people thinking that genetic algorithms + deep learning were not enough to emulate human logical capacities; the AlphaZero vs Stockfish games made me admit I was wrong (even if I still think it only works inside well-defined environments).


Two observations:

Just because Fischer preferred 1. e4, it doesn't make it better than other openings. https://en.chessbase.com/post/1-e4-best-by-test-part-1

Playing like a human for me also means making human mistakes. A chess-playing computer playing like a 4000 rated "human" is useless, one that can be configured to play at different ELOs is more interesting, although most can do that and there's no ML needed, nor huge amounts of computing power.


> What was impressive was the way Stockfish9 was beaten.

Without its opening database and without its endgame tablebase?

Frankly, the Stockfish vs AlphaZero match was the beginning of the AI Winter in my mind. The fact that they disabled Stockfish's primary databases was incredibly fishy IMO and is a major detriment to their paper.

Stockfish's engine is designed to only work in the midgame of Chess. Remove the opening database and remove the endgame database, and you're not really playing against Stockfish anymore.

The fact that Stockfish's opening was severely gimped is not a surprise to anybody in the Chess community. Stockfish didn't have its opening database enabled... for some reason.


I think for most people, the research interest in games of various sorts, is not simply a desire for a better and better game contraption, a better mousetrap. But rather the thinking is, "playing games takes intelligence, what can we learn about intelligence by building machines that play games?"

Most games are also closed systems, and conveniently grokkable systems, with enumerable search spaces. Which gives us easily produceable measures of the contraptions' abilities.

Whether this is the most effective path to understanding deeper questions about intelligence is an open question.

But I don't think it's fair to say that deeper questions and problems are being foregone simply to play games.

I think most 'games researchers' are pursuing these paths because neither they themselves nor anyone else has put forth any other suggestion that makes them think, "hmm, that's a really good idea, that seems like it might be viable and there is probably something interesting we could learn from it."

Do you have any suggestions?


This is so true, I can't understand why people miss this. The games are just games. It's intelligence that is the goal.

And comparing AlphaGo Zero against those "other chess programs that existed for 30 years" is exactly missing the point also. Those programs were not constructed with zero knowledge. They were carefully crafted by human players to achieve the result. Are we also going to count all the brain processing power and the time spent by those researchers learning to play chess? AlphaGo Zero did not need any of that, beyond knowledge of the basic rules of the game. Who compares compute requirements for two programs that have fundamentally different goals and achievements? One is carefully crafted by human intervention. The other learns a new game without prior knowledge...


It shows something about the game, but it's clear that humans don't learn in the way that AlphaZero does, so I don't think that AlphaZero illuminated any aspect of human intelligence.


I think that fundamentally the goal of research is not necessarily human-like intelligence, just any high-level general intelligence. It's just that the human brain (and the rest of the body) has been a great example of an intelligent entity from which we could source a lot of inspiration. Whether the final result will share technical and structural similarity with a human (and how much), the future will tell.


In principle you are right. In practice we will see. My bet is that attempts that focused on the human model will bear more fruit in the medium term because we have huge capability for observation at scale now which is v. exciting. Obviously ethics permitting!


Not sure if I am reading you correctly but to me you basically are saying "we have no idea but we believe that one day it will make sense".

Sounds more like religion and less like science to me.

I guess we could argue until the end of the world that no intelligence will emerge from more and more clever ways of brute-forcing your way out of problems in a finite space with perfect information. But that's what I think.


But humans could learn in the same way that AlphaZero does. We have the same resources and the same capabilities, just running on million-year-old hardware. Humans might not be able to replicate the performance of AlphaZero, but that does not mean it is useless in the study of intelligence.


The problem is that outside perfect information games, most areas where intelligence is required have few obvious routes to allow the computer to learn by perfectly simulating strategies and potential outcomes. Cases where "intelligence" is required typically entail handling human approximations of a lot of unknown and barely known possibilities with an inadequate dataset, and advances in approaches to perfect information games which can be entirely simulated by a machine knowing the ruleset (and possibly actually perturbed by adding inputs of human approaches to the problem) might be at best orthogonal to that particular goal. One of the takeaways from AlphaGo Zero massively outperforming AlphaGo is that even very carefully designed training sets for a problem fairly well understood by humans might actually retard system performance...


I totally agree with you and share your confusion.

On the topic of the different algorithmic approaches, I find it so fascinating how different these two approaches actually end up looking when analyzed by a professional commentator. When you watch the new style with a chess commentator, it feels a lot like listening to the analysis of a human game. The algorithm has very clearly captured strategic concepts in its neural network. Meanwhile, with older chess engines there is a tendency to get to positions where the computer clearly doesn't know what it's doing. The game reaches a strategic point and the things it's supposed to do are beyond the horizon of moves it can compute by brute force. So it plays stupid. These are the positions where, even now, human players can still beat otherwise better-than-human old-style chess engines.


The thing is that you can learn new moves/strategies that were never thought of before in these games and still not understand anything about intelligence at all.


A favourite work by Rodney Brooks - "Elephants Don't Play Chess"

https://people.csail.mit.edu/brooks/papers/elephants.pdf


It's not like the research on games is at the expense of other more worthy goals. It is a well constrained problem that lets you understand the limitations of your method. Great for making progress. Alpha zero didn't just play chess well, it learned how to play chess well (and could generalize to other games). I'd forgive it 10000 times the resources for that.


> It is a well constrained problem

But attacking not-well-constrained problems is what's needed to show real progress in AI these days, right?


I'd say getting better sample efficiency is a bigger deal. It isn't like POMDPs are a huge theoretical step away from MDPs. But if you attach one of these things to a robot, taking 10^7 samples to learn a policy is a deal breaker. So fine, please keep using games for research.


>it learned how to play chess well

This. Learning to play a game is one thing. Learning how to teach computers to learn a game is another thing. Yes chess programs have been good before, but that's missing the point a little bit. The novel bit is not that it can beat another computer, but how it learned how to do so.


Big Blue relied on humans to do all the training. Alpha Go zero didn't need humans at all to do the training.

That's a pretty major shift for humanity.


It's Deep Blue, not Big Blue. The parameters used by its evaluation function were tuned by the system on games played by human masters.

But it's a mistake to think that a system learning by playing against itself is something new. Arthur Samuel's draughts (checkers) program did that in 1959.


Sorry, mix up, thanks for the correction.

It's not that it's new, it's that they've achieved it. Chess was orders of magnitude harder than draughts. The solution for draughts didn't scale to chess but Alpha Go zero showed that chess was ridiculously easy for it once it had learned Go.


Both Samuel's checkers program and Deep Blue used alpha-beta pruning for search, plus a heuristic evaluation function. Deep Blue's heuristic function was necessarily more complex because chess is more complex than draughts. I think the reason master chess games were used in Deep Blue instead of self-play was the existence of a large database of such games, and because so much of its performance came from being able to look ahead so far.
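For readers who haven't seen it, alpha-beta is a small tweak on minimax; a sketch, with `children` and `heuristic` as hypothetical game-specific callbacks:

  def alphabeta(state, depth, alpha, beta, maximizing, children, heuristic):
      """Depth-limited minimax that prunes branches which cannot change the result."""
      kids = children(state)
      if depth == 0 or not kids:
          return heuristic(state)
      if maximizing:
          value = float("-inf")
          for child in kids:
              value = max(value, alphabeta(child, depth - 1, alpha, beta, False, children, heuristic))
              alpha = max(alpha, value)
              if alpha >= beta:
                  break  # beta cutoff: the minimizer already has a better alternative
          return value
      value = float("inf")
      for child in kids:
          value = min(value, alphabeta(child, depth - 1, alpha, beta, True, children, heuristic))
          beta = min(beta, value)
          if alpha >= beta:
              break  # alpha cutoff
      return value

  # root call: alphabeta(start, depth=4, alpha=float("-inf"), beta=float("inf"),
  #                      maximizing=True, children=..., heuristic=...)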


> It's Deep Blue, not Big Blue.

Big Blue is fine - it's referring to the company and not the machine. From Wikipedia "Big Blue is a nickname for IBM"


I meant Deep Blue, but yeah Deep Blue was a play on Big Blue.


I guess there are reasons why researchers build chess programs: it is easy to compare performance between algorithms. When you can solve chess, you can solve a whole class of decision-making problems. Consider it as the perfect lab.


What is that class of decision-making problems? It's nice to have a machine really good at playing chess, but it's not something I'd pay for. What decision-making problems are there, in the same class, that I'd pay for?

Consider it as the perfect lab.

Seems like a lab so simplified that I'm unconvinced of its general applicability. Perfect knowledge of the situation and a very limited set of valid moves at any one time.


> What decision-making problems are there, in the same class, that I'd pay for?

an awful lot of graph and optimization problems. See for instance some examples in https://en.wikipedia.org/wiki/A*_search_algorithm
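For instance, a bare-bones A* over a dict-of-adjacency-lists graph (illustrative, not taken from the article); the heuristic must never overestimate the remaining cost for the result to be optimal:

  import heapq

  def a_star(graph, start, goal, heuristic):
      """graph: {node: [(neighbor, edge_cost), ...]}; returns (cost, path) or None."""
      frontier = [(heuristic(start), 0, start, [start])]
      best_cost = {start: 0}
      while frontier:
          _, cost, node, path = heapq.heappop(frontier)
          if node == goal:
              return cost, path
          for neighbor, edge_cost in graph.get(node, []):
              new_cost = cost + edge_cost
              if new_cost < best_cost.get(neighbor, float("inf")):
                  best_cost[neighbor] = new_cost
                  priority = new_cost + heuristic(neighbor)
                  heapq.heappush(frontier, (priority, new_cost, neighbor, path + [neighbor]))
      return None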


Perfect information problem solving is not interesting anymore.

Did they manage to extend it to games with hidden and imperfect information?

(Say, chess with fog of war also known as Dark Chess. Phantom Go. Pathfinding equivalent would be an incremental search.)

Edit: I see they are working on it, predictive state memory paper (MERLIN) is promising but not there yet.


Strongly disagree. There are a lot of approximation algorithms and heuristics in wide use - to the tune of trillions of dollars, in fact, when you consider transportation and logistics, things like asic place & route, etc. These are all intractable perfect info problems that are so widespread and commercially important that they amplify the effect of even modest improvements.

(You said problems, not games...)


Indeed, there are problems that you will be hard pressed to solve even with perfect information. But that is only a question of computational power, or of the problem not admitting efficient approximation (not in APX or co-APX).

The thing is, an algorithm that can work with fewer samples and robustly tolerating mistakes in datasets (also known as imperfect information) will be vastly cheaper and easier to operate. Less tedious sample data collection and labelling.

Working with lacking and erroneous information (without known error value) is necessarily a crucial step towards AGI; as is extracting structure from such data.

This is the difference between an engineering problem and research problem.


Perhaps a unifying way of saying this is: it's a research problem to figure out how to get ML techniques to the point they outperform existing heuristics on "hard" problems. Doing so will result in engineering improvements to the specific systems that need approximate solutions to those problems.

I completely agree about the importance of imperfect information problems. In practice, many techniques handle some label noise, but not optimally. Even MNIST is much easier to solve if you remove the one incorrectly-labeled training example. (one! Which is barely noise. Though as a reassuring example from the classification domain, JFT is noisy and still results in better real world performance than just training on imagenet.)


> Perfect information problem solving is not interesting anymore.

I guess in the same way as lab chemistry isn't interesting anymore ? (Since it often happens in unrealistically clean equipment :-)

I think there is nothing preventing lab research from going on at the same time as industrialization of yesterday's results. Quite on the contrary: in the long run they often depend on each other.


There’s plenty of interesting work on poker bots.


Poker bots actually deal with a (simple) game with imperfect information. It is not the best test because short memory is sufficient to win at it.

The real challenge is to devise a general algorithm that will learn to be a good poker player in thousands of games, strategically, from just a bunch of games played. DeepStack AI required 10 million simulated games. Good human players outperform it at intermediate training stages.

And then the other part is figuring out actual rules of a harder game...


I think chess may actually be the worst lab. Decisions in chess are made with perfect knowledge of the current state and future possibilities. Most decisions are made without perfect knowledge.


For chess, the future possibilities are so vast, you can't call them "perfect knowledge" with a straight face.


This is not what the terminology "perfect knowledge" means. Perfect knowledge (more often called "perfect information") refers to games in which all parts of the game state are accessible to every other player. In theory, any player in the game has access to all information contained in every game state up to the present and can extrapolate possible forward states. Chess is a very good example of a game of perfect information, because the two players can readily observe the entire board and each other's moves.

A good example of a game of imperfect information is poker, because players have a private hand which is known only to them. Whereas all possible future states of a chess game can be narrowed down according to the current game state, the fundamental uncertainty of poker means there is a combinatorial explosion involved in predicting future states. There's also the element of chance in poker, which further muddies the waters.

Board games are often (but not always) games of perfect and complete information. Card games are typically games of imperfect and complete information. This latter term, "complete information", means that even if not all of the game state is public, the intrinsic rules and structure of the game are public. Both chess and poker are complete, because we know the rules, win conditions and incentives for all players.

This is all to say that games of perfect information are relatively easy for a computer to win, while games of imperfect information are harder. And of course, games of incomplete information can be much more difficult :)


A human might not be able to, but a computer can. Isn't the explicit reason research shifted to using Go the fact that you can't just number crunch your way through it?


AlphaGo Zero did precisely that. Most of its computations were done on a huge array of GPUs. The problem with Go is that look-ahead is more of a problem than in Chess, as Go has roughly between five and ten times as many possible moves at each point in the game. So Go was more of a challenge, and master-level play was only made possible by advances in computer hardware.


Chess was already easy for computers. That's why Arimaa came to be.


When you can solve chess, you can solve a whole class of decision-making problems

If this were true, there would be a vast demand for grandmasters in commerce, government, the military... and there just isn’t. Poker players suffer from similar delusions about how their game can be generalised to other domains.


> Poker players suffer from similar delusions about how their game can be generalised to other domains.

Oh that's so true

Poker players in real life would give up more often than not, whenever they didn't know enough about a situation or didn't have enough resources for a win with high probability.

And people can call your bluff even if you fold.


Those traits seem to me like a thing most people desperately need ... Everyone being confident in their assessment of everything seems like one of major problems of today's population.


I think batmansmk doesn't mean "when X is good at chess, X is automatically good at lots of other things", but "the traits that make you a good chess player (given enough training) also make you good at lots of other things (given enough training)".


I might suspect (but certainly cannot prove) that the traits that make a human good at playing chess are very different from the traits that make a machine good at playing chess, and as such I don't think we can assume that the machine skilled chess player will be good at lots of other things in a way analogous to the human skilled chess player.


And Gaius's point stands in the face of this argument as well: chess is seen as such a weak predictor that playing a game of chess or requesting an official Elo rating isn't used for hiring screening, for instance.

I suspect that chess as a metagame is just so far developed that being "good at chess" means your general ability is really overtrained for chess.


Second world chess champion Emanuel Lasker spent a couple years studying Go and by his own report was dejected by his progress. Maybe he would have eventually reached high levels, but I've always found this story fascinating.


True, but I'd phrase it the other way around. The traits that make you (a human) good at general problem solving are also the traits that make you a good chess player. I do suspect, though, that there are some Chess-specific traits which boost your Chess performance but don't help much with general intelligence. (Consider, for example, the fact that Bobby Fischer wasn't considered a genius outside of his chosen field.)


Tell me about it. The brightest minds are working on ads, and we have AI playing social games.

Can AI make the world better? It can, but it won't, since we are humans, and humans will weaponize technology every chance they get. Of course some positive uses will come, but the negative ones will be incredibly destructive.


Just because you haven't seen humongous publicity stunts involving practical uses of AI doesn't mean they aren't being deployed. My company uses similar methods to warn hospitals about patients with a high probability of imminent heart attacks and sepsis.

The practical uses of these technologies don't always make national news.

I'm sure you would also have scoffed at the "pointless, impractical, wasteful use of our brightest minds" to make the Flyer hang in the air for 30 yards at Kitty Hawk.


Let's start with defining "better"


>>20 years after IBM solved that problem

We solved nothing.

IBM Deep Blue doesn't exactly think like humans do.

Most of our algorithms really are 'better brute force'.

https://www.theatlantic.com/magazine/archive/2013/11/the-man...


Exactly. To my not-very-well-informed self, even AlphaGo Zero is just a more clever way to brute-force board games.

Side observers are taking joy in the riskier plays that it made -- reminded them of certain grandmasters I suppose -- but that still doesn't mean AGZ is close to any form of intelligence at all. Those "riskier moves" are probably just a way to more quickly reduce the problem space anyway.

It seriously reminds me more and more of religion, the AI area these days.


>Also it got better than humans in those games from scratch in about 4 hours whereas humans have had 2000 years to study them so you can forgive it some resource usage.

Most humans don't live 2000 years. And realistically don't spend that much of their time or computing power on studying chess. Surely a computer can be more focused at this and the 4h are impressive. But this comparison seems flawed to me.


You're right, though the distinction with the parent poster is that AlphaGo Zero had no input knowledge to learn from, unlike humans (who read books, listen to other players' wisdom, etc). It's a fairly well known phenomenon that e.g. current era chess players are far stronger than previous eras' players, and this probably has to do with the accumulation of knowledge over decades, or even hundreds of years. It's incredibly impressive for software to replicate that knowledge base so quickly.


Not so much from the accumulation of knowledge, because players can only study so many games. The difference is largely because there are more people today, they have more free time, and they can play against high-level opponents sooner.

Remember, people reach peak play in ~15 years, but they don't necessarily keep up with advances.

PS: You see this across a huge range of fields, from running and figure skating to music: people simply spend more time and resources getting better.


But software is starting from the same base. To claim it isn't would be to claim that the computers programmed themselves completely (which is simply not true).


Sure, there is some base there, and a fair bit of programming existed in the structure of the implementation. However, the heuristics themselves were not hand-crafted, and this is very significant. The software managed to reproduce and beat the previous best (both human and the previous iteration of itself), completely by playing against itself.

So, in this sense, it's kind of like taking a human, teaching them the exact rules of the game and showing them how to run calculations, and then telling them to sit in a room playing games against themselves. In my experience from chess, you'd be at a huge disadvantage if you started with this zero-knowledge handicap.


> In my experience from chess, you'd be at a huge disadvantage if you started with this zero-knowledge handicap.

One problem is that we can't play millions of games against ourselves in a few hours. We can play a few games, grow tired, and then need to go do something else. Come back the next day, repeat. It's a very slow process, and we have to worry about other things in life. How much of one's time and focus can be used on learning a game? You could spend 12 hours a day, if you had no other responsibilities, I guess. That might be counter productive, though. We just don't have the same capacity.

If you artificially limited AlphaGo to human capacity, then my money would be on the human being a superior player.


All software starts with a base of 4 billion years of evolution and thousands of years of social progress and so on. But AlphaZero doesn't require knowledge of Go on top of that.


> The chess version beat a leading conventional chess program 28 wins, 0 losses, and 72 draws.

In an unequal fight, and the results are still not published. I'm not claiming that AlphaZero wouldn't win, but that test was pure garbage.


The results were published - https://arxiv.org/abs/1712.01815

I agree AlphaZero had fancier hardware and so it wasn't really a fair fight.


Stockfish is not designed to scale to supercomputing clusters or TPUs, and AlphaZero wasn't designed to account for how long it takes to make a move, so a fair fight was hard to arrange.


No, these are not full results. There are just 10 example games published. Where is the rest?


How was it not equal?


There's discussion here: https://chess.stackexchange.com/questions/19366/hardware-use... AlphaZero's hardware was faster and Stockfish was a year-old version with non-optimal settings. It was still an impressive win, but it would be interesting to do it again with a more level playing field.


And didn’t they just do all of this? It’s not like 5 years have passed. Does he expect results like this every month?


> Also it got better than humans in those games from scratch in about 4 hours whereas humans have had 2000 years to study them so you can forgive it some resource usage.

Few would care. Your examiner doesn't give you extra marks on a given problem for finishing your homework quickly.


oh wow, it can play chess. can it efficiently stack shelves in warehouses yet?


"It" can reduce power consumption by 15%.

https://deepmind.com/blog/deepmind-ai-reduces-google-data-ce...

Just because alpha zero doesn't solve the problem you want it to doesn't mean that advancements aren't being made that matter to someone else. To ignore that seems disingenuous.


There is no human that has studied any of those games for 2000 years. So I think you mean 4 hours versus an average human study time of 40 years.


I'm sure the same could be said for early computer graphics before the GPU race. You don't need Moore's Law to make machine learning fast, you can also do it with hardware tailored to the task. Look at Google's TPUs for an example of this.

If you want an idea of where machine learning is in the scheme of things, the best thing to do is listen to the experts. _None_ of them have promised wild general intelligence any time soon. All of them have said "this is just the beginning, it's a long process." Science is incremental and machine learning is no different in that regard.

You'll continue to see incremental progress in the field, with occasional demonstrations and applications that make you go "wow". But most of the advances will be of interest to academics, not the general public. That in no way makes them less valuable.

The field of ML/AI produces useful technologies with many real applications. Funding for this basic science isn't going away. The media will eventually tire of the AI hype once the "wow" factor of these new technologies wears off. Maybe the goal posts will move again and suddenly all the current technology won't be called "AI" anymore, but it will still be funded and the science will still advance.

It's not the exciting prediction you were looking for I'm sure, but a boring realistic one.


> Funding for this basic science isn't going away.

What makes this 3rd/4th boom in AI different?

In the previous AI winters, funding for this science went from plentiful to scarce.

I'm skeptical, with respect of course, of your statement, because it doesn't have anything backing it up other than that the field produces useful technologies. Wouldn't that imply that the previous AI approaches which experienced an AI winter (expert systems, and whatever else) didn't produce useful enough technologies to keep their funding?

I'm currently in the camp that thinks an AI Winter III is coming.

> _None_ of them have promised wild general intelligence any time soon.

The post talks about Andrew Ng's wild expectations about other things, such as the radiologist tweet. While that's not wild general intelligence, what the main article points to, and what I'm also thinking, is the outrageous speculation. Another example is Tesla's self-driving: it doesn't seem to be there yet, and perhaps we're hitting the point of over-promising like we did in the past, and then an AI winter happens because we've found the limit.


The previous AI winters were funded by speculative investments (both public research and industry) with the expectation that this might result in profitable technologies. And this didn't happen - yes, "the other previous AI which experience AI Winter (expert system, and whatever else) didn't produce useful enough technologies to have funding", the technologies developed didn't work sufficiently well to have widespread adoption in the industry; there were some use cases but the conclusion was "useful in theory but not in practice".

The current difference is that the technologies are actually useful right now. It's not about promised or expected technologies of tomorrow, but about what we have already researched, about known capabilities that need implementation, adoption, and lots of development work to apply it in lots and lots of particular use cases. If the core research hits a dead end tomorrow and stops producing any meaningful progress for the next 10 or 20 years, the obvious applications of neural-networks-as-we're-teaching-them-in-2018 work sufficiently well and are useful enough to deploy them in all kinds of industrial applications, and the demand is sufficient to employ every current ML practitioner and student even in absence of basic research funding, so a slump is not plausible.


I've recently had a number of calls from recruiters about new startups in the UK in the AI space, some of them local and some of them extensions of US companies. Some of them were clearly less speculative (tracking shipping and footfall for hedge funds) while others were certainly more speculative sounding. The increase of the latter gives me the impression that there is a bit of speculation going on at the moment.


A lot of this is because there is a somewhat mis-informed (which we will be polite and not call 'gullible') class of investors out there, primarily in the VC world, that thinks that most AI is magic pixie dust and so 'we will use AI/DL' and 'we will do it on the blockchain' has become the most recent version of 'we will do it on the web' in terms of helping get funding. Most of these ventures will flame out in 6-12 months and the consequences of this are going to be the source of the upcoming AI winter OP was talking about.


Strangely enough he didn’t speak at all about waymo self driving cars that are already hauling passengers without a safety driver. Given that he needs to hide the facts that go against his narrative I don’t really think that what he is convinced of will become reality.


In a very confined area. He mentions similar issues with Tesla's coast-to-coast autopilot ride: The software is not general enough yet to handle it. That seems to be the case for Waymo as well.


And how is this a failure of AI? The most optimistic opinions on when we would see autonomous cars put them in the 2020s. Instead, we have had autonomous cars hauling people on the streets without any safety driver since 2017. And if everything goes according to their plan, they will launch a commercial service by the end of the year in several US cities. To me it seems a resounding success, not a failure.


> The most optimistic opinions on when we would see autonomous cars put them in the 2020s.

Sure, keep moving timelines. It's what makes you money in the area. I am sure when around mid-2019 hits, it will suddenly be "most experts agree that the first feasible self-driving cars will arrive circa 2025".

You guys are hilarious.


> BUT it cost petabytes and petabytes of flops, expensive enough that they released a total of ten or twenty Alpha Go Zero game to the world

Training is expensive but inference is cheap enough for Alpha Zero inspired bots to beat human professionals while running on consumer hardware. DeepMind could have released thousands of pro-level games if they wanted to and others have: http://zero.sjeng.org/


Bleh, no it isn't.

I am 100% in agreement with the author on the thesis: deep learning is overhyped and people project too much.

But the content of the post is in itself not enough to advocate for this position. It is guilty of the same sins: projection and following social noises.

The point about increasing compute power however, I found rather strong. New advances came at a high compute cost. Although it could be said that research often advances like that: new methods are found and then made efficient and (more) economical.

A much stronger rebuttal of the hype would have been based on the technical limitations of deep learning.


> A much stronger rebuttal of the hype would have been based on the technical limitations of deep learning.

I'm not even sure how you'd go about doing that. You could use information theory to debunk some of the more ludicrous claims, especially ones that involve creating "missing" information.

One of the things that disappoints me somewhat with the field, which I've arguably only scratched the surface of, is just how much of it is driven by headline results which fail to develop understanding. A lot of the theory seems to be retrofitted to explain the relatively narrow result improvement and seems only to develop the art of technical bullshitting.

There are obvious exceptions to this and they tend to be the papers that do advance the field. With a relatively shallow resnet it's possible to achieve 99.7% on MNIST and 93% on CIFAR10 on a last-gen mid-range GPU with almost no understanding of what is actually happening.
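
For anyone curious what a "shallow resnet" amounts to in practice, here is a minimal sketch of the kind of residual block involved (my own illustration in PyTorch with hypothetical layer sizes, not the exact model referred to above):

    import torch
    import torch.nn as nn

    # Minimal residual block: two 3x3 convs plus a skip connection.
    class ResidualBlock(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)

        def forward(self, x):
            out = torch.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            return torch.relu(out + x)  # the skip connection is the whole trick

Stack a handful of these with a classifier head and, as claimed above, you can get competitive numbers without ever asking why it works.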

There's also low-hanging fruit that seems to have been left on the tree. Take OpenAI's paper on reparameterizing weights so that you have a normalized direction vector and a scalar magnitude. This makes intuitive sense to anybody familiar with high-dimensional spaces, since nearly all of the volume of a hypersphere lies near the surface. That this works in practice is great news, but it leaves many questions unanswered.
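
A minimal sketch of that reparametrisation, w = g * v / ||v||, assuming a single linear layer in PyTorch (I believe torch.nn.utils.weight_norm wraps essentially the same idea):

    import torch
    import torch.nn as nn

    # Decouple the weight's direction (v) from its magnitude (g).
    class WeightNormLinear(nn.Module):
        def __init__(self, in_features, out_features):
            super().__init__()
            self.v = nn.Parameter(torch.randn(out_features, in_features))  # direction
            self.g = nn.Parameter(torch.ones(out_features))                # magnitude

        def forward(self, x):
            w = self.g.unsqueeze(1) * self.v / self.v.norm(dim=1, keepdim=True)
            return x @ w.t()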

I'm not even sure how many practitioners are thinking in high dimensional spaces or aware of their properties. It feels like we get to the universal approximation theorem and just accept that as evidence that they'll work well anywhere and then just follow whatever the currently recognised state of the art model is and adapt that to our purposes.


> A much stronger rebuttal of the hype would have been based on the technical limitations of deep learning.

Who's to say we won't improve this though? Right now, nets add a bunch of numbers and apply arbitrarily-picked limiting functions and arbitrarily-picked structures. Is it impossible that we find a way to train that is orders of magnitude more effective?


To me, it's a bit like the question "Who's to say we wont find a way to travel faster than the speed of light?", by which I mean that in theory, many things are possible, but in practice, you need evidence to consider things likely.

Currently, people are projecting and saying that we are going to see huge AI advances soon. On what basis are these claims made? Showing the fundamental limitations of deep learning would show that we have no idea how to get there. No idea how to get there yet, indeed, just as we have no idea how to do time travel yet.


Overhyped? There are cars driving around Arizona without safety drivers as I type this.

The end result of this advancement for our world is earth-shattering.

On the high compute cost: there is some truth to that, but we have also seen advances in silicon to support it. Look at WaveNet, which uses 16k passes through a DNN yet is offered at scale and at a competitive price; that kind of proves the point.


The brain most likely has much more than a petaflop of computing power and it takes at least a decade to train a human brain to achieve the grandmaster level on an advanced board game. In addition, as the other comment says, they learn from hundreds or thousands of years of knowledge that other humans have accumulated and still lose to AlphaZero with mere hours of training.

Current AIs have limitations but, at the tasks they are suited for, they can equal or exceed humans with years of experience. Computing power is not the key limit since it will be made cheaper over time. More importantly, new advances are still being made regularly by DeepMind, OpenAI, and other teams.

https://www.quora.com/Roughly-what-processing-power-does-the...

Unsupervised Predictive Memory in a Goal-Directed Agent

https://arxiv.org/abs/1803.10760


Sure, but have you heard about Moravec's paradox? And if so, don't you find it curious that over the 30 years of Moore's law exponential progress in computing almost nothing improved on that side of things, and we kept playing fancier games?


to save some clicks:

https://en.wikipedia.org/wiki/Moravec%27s_paradox

Moravec's paradox is the discovery by artificial intelligence and robotics researchers that, contrary to traditional assumptions, high-level reasoning requires very little computation, but low-level sensorimotor skills require enormous computational resources.


Yes, I am familiar with it.

What do you think of recent papers and demos by teams from Google Brain, OpenAI, and Pieter Abbeel's group on using simulations to help train physical robots? Recent advances are quite an improvement over those from the past.


I'm skeptical, and side with Rodney Brooks on this one. First, reinforcement learning is incredibly inefficient. And sure, humans and animals have forms of reinforcement learning, but my hunch is that it operates on an already incredibly semantically relevant representation and utilizes a forward model. That model is generated by unsupervised learning (which is far more data efficient). Actually I side with Yann LeCun on this one; see some of his recent talks. But Yann is not a robotics guy, so I don't think he fully appreciates the role of a forward model.

Now, using models for RL is the obvious choice, since trying to teach a robot a basic behavior with RL alone is just absurdly impractical. But the problem here is that when somebody builds that model (a 3D simulation), they put in a bunch of stuff they think is relevant to represent reality. And that is the same trap as labeling a dataset: we only put in the stuff which is symbolically relevant to us, omitting a bunch of low-level things we never even perceive.

This is a longer subject, and an HN comment is not enough to cover it, but there is also something about complexity. Reality is not just more complicated than simulation, it is complex, with all the consequences of that. Every attempt to put a human-filtered input between the AI and the world will inherently lose that complexity, and ultimately the AI will not be able to immunize itself to it.

This is not an easy subject and if you read my entire blog you may get the gist of it, but I have not yet succeeded in verbalizing it concisely to my satisfaction.


What, no progress six months after achieving a goal thought impossible even just a few years ago? Pack it up boys, it's all over but the crying.


I was thinking just that when reading the paragraphs about the Uber accident. There's absolutely nothing indicating that future progress is not possible, precisely because of how absurd it seems right now.


In retrospect it might seem that the Japanese were partially right in pursuing "high performance" computing with their fifth-generation project [1], but the AlphaZero results are impressive beyond the computing performance achieved. It was a necessary element, but not the only one.

[1] https://mobile.nytimes.com/1992/06/05/business/fifth-generat...


> petabytes and petabytes of flops

Why not petaflops of bytes then?


>> Makov Tree Search

You mean Monte Carlo Tree Search, which is not at all like Ma(r)kov chains. You're probably mixing it up with Markov decision processes though.

Before criticising something it's a good idea to have a solid understanding of it.


We very well might be in a deep-learning 'bubble' and the end of a cycle... but I don't think this time around it's really the end for a long while; more likely a pivot point.

The biggest minds everywhere are working on AI solutions, and there's also a lot going on in medicine/science to map brains; if we can merge neuroscience with computer science we might have more luck with AI in the future...

So we could have a drought for a year or two, but there will be more research and more breakthroughs. This won't be like the AI winters of the past where the field lay dormant for 10+ years, I don't think.


Moore's law (or at least, the diminishing one) is not relevant here because these are not single threaded programs. Google put 8x on their TPUv2 -> v3 upgrade; parallel matrix multiplies at reduced precision are a long way away from any theoretical limits, as I understand it.


Totally agree but why on earth down voted?

The first generation TPUs used 65536 very simple cores.

In the end you have only so many transistors you can fit, and there are options for how to arrange and use them.

You might support very complex instructions and data types and end up with four cores, or you might support only 8-bit ints and very, very simple instructions and use 65,536 cores.

In the end what matters is the joules to get something done.

We can clearly see that we have big improvements by using new processor architectures.


A different take by Google’s cofounder, Sergey Brin, in his most recent Founders’ Letter to investors:

“The new spring in artificial intelligence is the most significant development in computing in my lifetime.”

He listed many examples below the quote.

“understand images in Google Photos;

enable Waymo cars to recognize and distinguish objects safely;

significantly improve sound and camera quality in our hardware;

understand and produce speech for Google Home;

translate over 100 languages in Google Translate;

caption over a billion videos in 10 languages on YouTube;

improve the efficiency of our data centers;

help doctors diagnose diseases, such as diabetic retinopathy;

discover new planetary systems; ...”

https://abc.xyz/investor/founders-letters/2017/index.html

An example from another continent:

“To build the database, the hospital said it spent nearly two years to study more than 100,000 of its digital medical records spanning 12 years. The hospital also trained the AI tool using data from over 300 million medical records (link in Chinese) dating back to the 1990s from other hospitals in China. The tool has an accuracy rate of over 90% for diagnoses for more than 200 diseases, it said.“

https://qz.com/1244410/faced-with-a-doctor-shortage-a-chines...


Hi, author here:

Well first off: letters to investors are among the most biased pieces of writing in existence.

Second: I'm not saying connectionism did not succeed in many areas! I'm a connectionist by heart! I love connectionism! But that being said, there is a disconnect between expectations and reality. And it is huge. And it is particularly visible in autonomous driving. And it is not limited to the media or CEOs, but has made its way into top researchers. And that is a dangerous sign, which historically preceded a winter event...


I agree that self-driving has been overhyped over the past few years. The problem is harder than many people realize.

The difference between the current AI renaissance and the past pre-winter AI ecosystems is the level of economic gain realized by the technology.

The late 80s-early 90s AI winter, for example, resulted from the limitations of expert systems which were useful but only in niche markets and their development and maintenance costs were quite high relative to alternatives.

The current AI systems do something that alternatives, like Mechanical Turks, can only accomplish with much greater costs and may not even have the scale necessary for global massive services like Google Photos or Youtube autocaptioning.

The spread of computing infrastructure and connectivity into the hands of billions of global population is a key contributing factor.


> The difference between the current AI renaissance and the past pre-winter AI ecosystems is the level of economic gain realized by the technology

I would argue this is well discounted by the level of investment made against the future. I don't think the winter depends on the amount somebody makes today on AI, but rather on how much people expect to make in the future. If these don't match, there will be a winter. My take is that there is a huge bet against the future, and if DL ends up bringing only as much profit as it does today, interest will die very, very quickly.


Because there is a dearth of experts and a lack of deep technical knowledge among many business people, there are still a great many companies that have not yet started investing in deep learning or AI despite potential profits based on current technology. Non-tech sectors of the economy are probably underinvesting at the moment.

This is analogous to the way electricity took decades to realize productivity gains in the broad economy.

That said, the hype will dial down. I am just not sure the investment will decrease soon.


While I agree there is underinvestment in non-tech sectors, I don't see why that would change such that they start using deep learning. There are lots of profitable things in non-tech sectors that could be done with linear regression but aren't.


There are lots of things in the non-tech sector that can be automated with simple vanilla software but isn't. To use AI instead, you need to have 1) sophisticated devs in place, 2) a management that gets the value added, 3) lots of data in a usable format, 4) willingness to invest & experiment. Lots of non-tech businesses are lacking one if not all of these.


This. And at the end of the day, deep learning is just a more sophisticated version of linear regression. (To listen to some people talking, you'd think if a machine just curve-fits enough data-points, it'll suddenly wake up and become self-aware or something? The delusion is just unbelievable!)
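
To make that kinship concrete, here is a toy sketch of my own (not from the article): a "network" with no hidden layers and an identity activation, trained by gradient descent, is literally linear regression.

    import numpy as np

    # Toy data: y = X @ w_true + noise
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))
    y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

    # A "zero-hidden-layer network" trained with plain gradient descent
    w = np.zeros(3)
    for _ in range(2000):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= 0.1 * grad
    print(w)  # recovers roughly [2, -1, 0.5], same as ordinary least squares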


Just like it did for ecommerce? Expectations were wildly inflated, there was a bust, the market readjusted, and value was created.


Only after the dot-com bust killed off all the weaklings ... there was a brief "cold-snap" after 2001 ... then Web 2.0 happened.

So I guess we're waiting for something similar to happen with AI and then get AI 2.0?


> I agree that self-driving has been overhyped over the past few years. The problem is harder than many people realize.

The current road infrastructure (markings, signs) has been designed for humans. Once it has been modernized to better aid self-driving systems, we probably don't need "perfect" AI.


But the current signs designed for humans work well. They're machine readable (traffic sign detection is available from basically all manufacturers), can (usually) be understood without prior knowledge, and don't need much change over decades. I think there are few examples of messages designed for computers that are easy to understand independent of the system or manufacturer; ASCII-encoded text files are the only thing that comes to mind.


Hi, why does your analysis speak only about the companies that are not doing so well in self-driving, leaving out Waymo's success story? They have already been hauling passengers without a safety driver since last October, apparently without the slightest problem; otherwise we would have heard plenty in the news, as happened with the Tesla and Uber accidents. Isn't it a bit too convenient to leave out the facts that contradict your hypothesis?


Is this happening inside google campus or in a city?



Where are the Waymo cars running? Everywhere? Are they still veering into buses to avoid sandbags?


Phoenix, and they plan to start a commercial service by the end of the year in several US cities.


Making cars that drive safely on current, busy roads is a very difficult task. It is not surprising that the current systems do not do that (yet); it is surprising to me how well they still do. The fact that my phone understands my voice and my handwriting and does on-the-fly translation of menus and simple requests is a sign of major progress, too.

AI is overhyped and overfunded at the moment, which is not unusual for a hot technology (synthetic biology; dotcoms). Those things go in cycles, but the down cycles are seldom all out winters. During the slowdowns best technologies still get funding (less lavish, but enough to work on) and one-hit wonders die, both of which is good in the long run. My friends working in biology are doing mostly fine even though there are no longer "this is the century of synthetic biology" posters at every airport and in every toilet.


How can something be biased when it's listing facts?

Those are actual features that are available today to anyone, that were made possible by AI. Do you think it would be possible to type "pictures of me at the beach with my dog" without AI in such a short time frame? Or to have cars that drive themselves without a driver? These are concrete benefits of machine learning; I don't understand how that's biased.


How can something be biased when it's listing facts?

If there are 100 facts that indicate a coming AI winter, and Brin just talks up the 15 facts that indicate AI's unalloyed dominance, that's definitely biased.


First, what are said 100 facts? The article looks at fairly mundane metrics such as number of tweets or self-driving accidents...

Second, I'm not quite sure that's how it works. Like in mathematics, if your lemma is X, you can give a 100 examples of X being true, but I only need a single counter-example to break it.

In my opinion a single valid modern use-case of AI is enough to show that we're not in an AI winter. By definition an AI winter means that nothing substantial is coming out of AI for a long period of time, yet Brin listed that Google alone has had a dozen in the past few years.


> First, what are said 100 facts?

You cannot ask a generic question, then attack the answer based on absence of evidence for a specific example.


Can't speak for the other items, but these

>translate over 100 languages in Google Translate;

>caption over a billion videos in 10 languages on YouTube;

barely even work. Yeah, it's a difficult problem but it's not even close to being solved.


YouTube captioning in English works surprisingly well, the improvement over the last few years is huge. It still chokes on proper nouns but in general it mostly works.


I think it's a bit like self-driving cars in the sense that it's good enough to be impressive but not good enough to be actually usable everywhere. Of course self-driving is worse because people seldom die of bad captions.

Google's captioning works well when people speak clearly and in English. Google translate works well when you translate well written straightforward text into English. It's impressive but it's got a long way to go to reach human grade transcription and translation.

I think when evaluating these things people underestimate how long the tail of these problems is. It's always those pesky diminishing returns. I think it's true for many AI problems today, for instance it looks like current self-driving car tech manages to handle, say, 95% of situations just fine. Thing is, in order to be actually usable you want something that critical to reach something like 99.999% success rate and bridging these last few percent might prove very difficult, maybe even impossible with current tech.


What's important to remember, I think, is that we should not compare YouTube auto captions to human made captions, because auto captions were not created as a substitute for human made captions - if it wasn't for auto captioning, all these videos wouldn't get any captions at all. They may never be perfect, but they're not designed to be, they're creating new value on their own. And IMO they crossed the threshold of being usable, at least for English.


Mh, no it does not. It is just a source of hilarity apart from a few very specific cases (political speeches mostly, because of their slow pace, good English and pronunciation, I guess).

Every time I activate it I am in for a good laugh more than anything actually useful.


It works for general purpose videos. Transcripts of any kind appear to stop working whenever there's domain knowledge involved. That doesn't matter for most youtube videos but is crucial if you want to have a multi purpose translator/encoder.


A. Cooper had a nice example of this kind: a dancing bear. Sure, the fact that the bear dances is very amusing, but let it not distract us from the fact that it dances very, very badly.


Google Translate is way, way better than it used to be (at least German > English which I suppose is probably an easier task than many languages).


Google Translate of Goethe:

    Have now, ah! Philosophy,
    Law and medicine,
    And unfortunately also theology
    Thoroughly studied, with great effort.
    Here I am, I poor gate!
    And I'm as smart as before
Word for word it's not completely bad, but then it breaks when it has to translate 'Tor'. Google Translate is clueless because it is unable to derive that 'fool' is meant here.

It's unable to 'understand' that 'I poor gate' makes no sense at all.

Google Translate is the 'poor gate'.


On the other hand Deepl gives translations for news articles that are of such quality that it allows me to read international news as if it were local. Definitely useful.


DeepL is better, but it basically has the same problems. I understand both German and English, and I can easily detect where DeepL also shifts the meaning of sentences, sometimes even to the opposite. DeepL, like Google Translate, has no concept of 'is the meaning preserved?'.

You may think that you can now read German news, but in fact you would not know whether the sentence meaning has been preserved in the English translation. The words themselves might look as if the sentence makes sense, but the meaning is actually shifted: slight differences, but possibly also the complete opposite.

The translation also does not give you any indication where this might be and where the translation is based on weak training material or where there is some inference needed for a successful translation.


Not sure if I agree. If you have some knowledge of the language, it is still mostly easier to translate just the words you don't understand. Easy texts work; anything more complicated (e.g. science articles) not really.


" letters to investors are among the most biased pieces of writing in existence. "

Maybe true, but they are words about things which are either true or not true. It has nothing to do with where the words were shared. Saying they are in an investor letter and therefore not relevant seems very short-sighted.

But just looking at the last 12 months, it is folly to say we are moving toward an AI winter. Things are just flying.

Look at self-driving cars without safety drivers, or at something like Google Duplex, and there are so many other examples.


Of course Google (or any other company) isn't going to blatantly lie in a letter to investors (that kind of thing gets you sued), but it's pretty easy to spin words to sound more impressive than they actually are.

Using the list provided, one example

"caption over a billion videos in 10 languages on YouTube;" - This doesn't say how accurate the captions acutally are. In my experience youtube captioning even of english dialect isn't exactly great. For one example try turning on the captions on this https://www.youtube.com/watch?v=bQJrBSXSs6o

so it's true I'm sure to say they've captioned the videos AI based techniques, but that doesn't mean they're a perfected option.

Also (purely anecodtally) Google translate also isn't exactly perfect yet either...


... I don't think I understand that video even with my own ears. YouTube captioning has actually significantly improved from its previous hilarious state.


When I saw the Google demo of a CNN using video to split a single audio stream of two guys talking over each other, I became a believer.


Hey, a small piece of advice for the future: never build your belief entirely on a YouTube video of a demo. In fact, never build your belief based on a demo, period.

This is notorious with current technology: you can demonstrate anything. A few years ago Tesla demonstrated a driverless car. And what? Nothing. Absolutely nothing.

I'm willing to believe stuff I can test myself at home. If it works there, it likely actually works (though possibly needs more testing). But demo booths and youtube - never.


Your advice would be a lot more convincing if you had a youtube video of a demo to come with it. Just saying. :P


You can't test most of high end physics at home. I hope that doesn't mean you don't believe it!


A lab publishes an exciting result, what does the physics community do? Wait for replication. Or if you’re LHC, be very careful what you publish.


In theory, there are two separate experiments (ATLAS and CMS) on LHC because of this. In practice, they are probably not independent enough.


It's probably a good idea to ignore high-end physics results for a few years, though.

The BICEP2 fiasco is a good example why.


A rigorous evaluation with particular focus on where it doesn't work would be better.


Do you have a video of this?


https://youtu.be/ogfYd705cRs?t=6m54s

Edit: also, a blog post with more examples and a link to the related publication: https://ai.googleblog.com/2018/04/looking-to-listen-audio-vi...


Sorry, I went to bed!

The original google report was discussed here a few weeks ago:

https://news.ycombinator.com/item?id=16813766



> understand images in Google Photos

This is one of the areas I’m most enthusiastic about but … it’s still nowhere near the performance of untrained humans. Google has poured tons of resources into Photos and yet if I type “cat” into the search box I have to scroll past multiple pages of results to find the first picture which isn’t of my dog.

That raises an interesting question: Google has no way to report failures. Does anyone know why they aren’t collecting that training data?


They collect virtually everything you do on your phone. They probably notice that you scroll a long way after typing cat and so perhaps surmise the quality of search results was low.


Doesn’t that seem like a noisy signal since you’d have to disambiguate cases where someone was looking for a specific time/place and scrolling until they find it?

I’ve assumed that the reason is the same as why none of the voice assistants has an error reporting UI or even acknowledgement of low confidence levels: the marketing image is “the future is now” and this would detract from it.


> understand

what is this 'understand'?


Well, the way I see it: mostly, these are "improvements", huge ones, but still. They ride the current AI tech wave, take it and optimize apps with it.

Most of the things that people dream of and do marketing about need another leap forward, which we haven't seen yet (it'll come for sure).


Almost anything that has to do with image understanding is entirely AI. Good luck writing an algorithm to detect a bicycle in an image. This also includes disease diagnostic as most of those have to do with analyzing images for tumors and so on.

Also, while a lot of these can be seen as "improvements", in many cases, that improvement put it past the threshold of actually being usable or useful. Self-driving cars for example need to be at least a certain level before they can be deployed, and we would've never reached that without machine learning.


I agree, the effects can be very impressive. I meant that what is achievable is quite clear now and that we need a major innovation/step for the next leap.


>caption over a billion videos in 10 languages on YouTube;

Utterly useless. And I don't think it is improving.


This is less useless than you think. Captioning video could allow for video to become searchable as easily as text is now searchable. This could lead to far better search results for video and a leap forward in the way people produce and consume video content.


I think he is stating that the quality of the transcription is poor.


You don't need amazing transcription to search a video. A video about X probably repeats X multiple times, and you only really need to detect it properly once.

As for the users, sure, the transcription may not be perfect, but I'm sure if you were deaf and had no other way of watching a video, you would be just fine with the current quality of the transcription.


Often you need exactly that. Because it's the unique words the machine will get wrong. If you look for machine learning tutorials/presentations that mention a certain algorithm, the name of it must be correctly transcribed. At the moment, it appears to me that 95%+ of words work but exactly the ones that define a video often don't. But then again getting those right is hard, there's not much training data to base it on.


They mean useless in the end result. Of course having perfect captions could potentially allow indexable videos, but the case is that the captions suck. They're so bad in fact that it's a common meme on Youtube comments for people to say "Go to timestamp and turn on subtitles" so people can laugh at whatever garbled interpretation the speech recognition made.


Have you used/tried them recently? The improvement relative to 5 years ago is major.

At least in English, they are now good enough that I can read without listening to the audio and understand almost everything said. (There are still a few mistakes here and there but they often don’t matter.)


Yes, I've had to turn them off permanently. I felt I could often follow a video better with no sound than with the subtitles.

I tried to help a couple of channels with subtitling, and the starting point was just sooo far from the finished product. I would guess I left 10% of the auto-translation intact. Maybe it would have been 5% five years ago; when things are this bad, a 100% improvement is hard to notice.

It is super cool how easy it is to edit and improve the subtitles for any channel that allows it.


>understand almost everything said...

If by 'almost everything', you mean stuff that a non native English speaker could have understood anyway, then yes.


I'd say the current Youtube autocaptioning system is at an advanced nonnative level (or a drunk native one :)) and it would take years of intensive studying or living in an English-speaking country to reach it.

The vast majority of English learners are not able to caption most Youtube videos as well as the current AI can.

You underestimate the amount of time required to learn another language and the expertise of a native speaker. (Have you tried learning another language to the level you can watch TV in it?)

Almost all native speakers are basically grandmasters of their mother tongue. The training time for a 15-year-old native speaker could be approx. 10 hours * 365 days * 15 years = 54,750 hours, more than the time many professional pianists have spent on practice.


Not true. The problem with Google captioning and translate is that unlike a weak speaker it makes critical mistakes completely misunderstanding the point.

A weak speaker may use a cognate, idiom borrowed from their native tongue or a similar wrong word more often. The translation app produces completely illegible word salad instead.


I was talking exclusively about auto-captioning, which has >95% accuracy for reasonably clear audio. Automatic translation still has a long way to go, I agree.


To be honest, as the other child comment said, I too have noticed they have gotten way better in the last 5 years. Also, the words of which it isn't 100% sure are in a slightly more transparent gray than the other words, which kind of helps.


I disagree, even with the high error rate, it provides a lot of context. Also, a lot of comedy.


I find the auto captions pretty useful.


I'm a scientist from a field outside ML who knows that ML can contribute to science. But I'm also really sad to see false claims in papers. For example, a good scientist can read an ML paper, see claims of 99% accuracy, and then probe further to figure out what the claims really mean. I do that a lot, and I find that accuracy inflation and careless mismanagement of data mar most "sexy" ML papers. To me, that's what's going to lead to a new AI winter.


I'm in the same situation and it's really worrying.

Deep learning is the method of choice for a number of concrete problems in vision, nlp, and some related disciplines. This is a great success story and worthy of attention. Another AI winter will just make it harder to secure funding for something that may well be a good solution to some problems.


You hear Facebook telling the public and governments all the time how it "automatically blocks 99% of the terrorist content" with AI.

Nobody thought to ask: "How do you know all of that content is terrorist content? Does anyone check every video afterwards to ensure that all the blocked content was indeed terrorist content?" (assuming they even have an exact definition for it).


Also, how do they know how much terrorist content they aren't blocking (the 1%), since they by definition haven't found it yet?


I'm 100% convinced it can block 99% of all terrorist content that hasn't been effectively SEOed to get around their filters because that's just memorizing attributes of the training set data. Unfortunately, the world isn't a stationary system like these ML models (usually) require. I still get spam in my gmail account, nowhere near as much as I do elsewhere, but I still get it.


> Does anyone check every video afterwards to ensure that all the blocked content was indeed terrorist content?

They might not, but they could sample them to be statistically confident?


Yes, they have a test set. They don’t check every video, but they do check every video in a sample that is of statistically significant size.


FYI This post is about deep learning. It could be the case that neural networks stop getting so much hype soon, but the biggest driver of the current "AI" (ugh I hate the term) boom is the fact that everything happens on computers now, and that isn't changing any time soon.

We log everything and are even starting to automate decisions. Statistics, machine learning, and econometrics are booming fields. To talk about two topics dear to my heart, we're getting way better at modeling uncertainty (bayesianism is cool now, and resampling-esque procedures aged really well with a few decades of cheaper compute) and we're better at not only talking about what causes what (causal inference), but what causes what when (heterogeneous treatment effect estimation, e.g. giving you aspirin right now does something different from giving me aspirin now). We're learning to learn those things super efficiently (contextual bandits and active learning). The current data science boom goes far far far far beyond deep learning, and most of the field is doing great. Maybe those bits will even get better faster if deep learning stops hogging the glory. More likely, we'll learn to combine these things in cool ways (as is happening now).
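
As one tiny illustration of the "resampling-esque procedures" mentioned above (a toy example of my own), a bootstrap confidence interval that would have been painful decades ago is now a few lines and a fraction of a second:

    import numpy as np

    # Bootstrap 95% CI for a mean: trivial with cheap modern compute.
    rng = np.random.default_rng(0)
    data = rng.exponential(scale=2.0, size=500)  # stand-in dataset
    boot_means = [rng.choice(data, size=data.size, replace=True).mean()
                  for _ in range(10_000)]
    lo, hi = np.percentile(boot_means, [2.5, 97.5])
    print(f"mean={data.mean():.2f}, 95% bootstrap CI=({lo:.2f}, {hi:.2f})")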


Honestly as much as it is slightly irritating to see deep learning hogging all the glory, there's a lot of money being sloshed around and quite a bit of it is spilling over to non-deep learning too. Which is great. An AI winter may be coming, though I think it's at minimum several years off, since big enterprises are just getting started with the most hyped things. If the hype doesn't return on its promises enough for sustained investment (that's a rather big if since the low hanging fruit aren't yet all picked) then the companies and funding will eventually recede, maybe even trigger another winter, but just as it takes a while to ramp up, it will also take a while to course correct. In the meantime all the related areas get better funding and attention (and chance to positively contribute to secure further investment) that they'd otherwise not have since we'd still be stuck in the low funding model from the last winter.


I think the problem is the definition of AI. It appears most in the field define it as a superset of ML, encompassing all kinds of statistical methods and data analysis. For the general public, AI is a synonym for deep learning. When large companies speak about AI they always mean deep learning, never just a regression (probably also because many don't see a regression as intelligent). So AI in the public's perception could face a winter but much of the domain of machine learning would be unaffected.


>For the general public, AI is a synonym for deep learning

I'd contend for the general public, AI is a synonym for machines like: HAL; The Terminator; Star Trek's "Data"; the robots in the film "AI"; and so on.

We're nowhere remotely in the vicinity of that, and no-one even has any plausible ideas about how to start.

A random person outside of tech probably doesn't even know what deep learning is. They might have heard of it somewhere in passing.


Bayesian can be seen as a subset of deep learning or hell a superset.

AI is a superset, machine learning is a subset of AI, and most funding is in deep learning. Once deep learning hits its limit, I believe there will be an AI winter.

Maybe there will be hype around statistics (fingers crossed), which will lead to Bayesian methods and such.


>Bayesian can be seen as a subset of deep learning or hell a superset.

eh-hem

DIE, HERETIC!

eh-hem

Ok, with that out of my system, no, Bayesian methods are definitely not a subset of deep learning, in any way. Hierarchical Bayes could be labeled "deep Bayesian methods" if we're marketing jerks, but Bayesian methods mostly do not involve neural networks with >3 hidden layers. It's just a different paradigm of statistics.


My mentor was very, very adamant that Bayesian networks and hierarchical models are deep learning.

He sees the latent layer in a hierarchical model as the hidden layer; the Bayesian version just has stricter restrictions/assumptions on the network, whereas deep learning is dumber and less assuming. A few of my professors think that PGMs (probabilistic graphical models) are a superset of deep learning/neural networks.

This is where my thinking comes from.

IIRC, a paper has shown that gradient descent seems to behave like MCMC (blog with the paper link inside that led me to this conclusion: http://www.inference.vc/everything-that-works-works-because-...).

But I am not an expert in neural networks, nor do I know the topic well enough to say such a thing myself; I was deferring to the opinions of someone better than me. So I'll keep this in mind and hopefully one day have the time to do more research into this topic.

Thank you.


I think your link, and your mentor, are somewhat fundamentalist about their Bayesianism.


How can Bayesian stuff be seen as a subset or superset of deep learning?


I guess the point that digitalzombie is trying to make is that most of what we call AI or ML or even deep learning is simply an extension of statistics on computers.

Things like the German tank problem or the problem of hardening airplanes during WW2 have that very AI-esque feel to them, where you use data to build a model, then let incoming data change the model as needed.

Also, the whole business of "decision making" is either Bayesian or frequentist in nature. Most of these algorithms and most of this math existed long before the current boom.

It's just that the raw computing power and resources you have today make it possible to deal with large amounts of data to stress-test your models.


Forget about self-driving cars - the real killer application of deep learning is mass surveillance. There are big customers for that (advertising, policing, political technology - we had better get used to the term), and it's the only technique that can get the job done.

I sometimes think there really was no AI winter, as we got other technologies that implemented the ideas: SQL databases can be seen as an application of many ideas from classical AI - for example, SQL is a declarative language for defining relations among tables; you can have rules in the form of stored procedures; and it actually was a big break (paradigm shift is the term) in how you deal with data - the database engine has to do some real behind-the-scenes optimization work in order to get a workable representation of the data definition (which certainly borders on classical AI in complexity).

These boring CRUD applications are light-years ahead of how data was handled back in the beginning.


The BBC recently requested information about the use of facial recognition from UK police forces. Those that use facial recognition reported false positive rates of >95%. That led some to abandon the systems, others just use it as one form of pre-screening. Mass surveillance with facial recognition is nowhere near levels where it can be used unsupervised. And that's even before people actively try to deceive it.

For advertising, I'm also not sure if there's been a lot of progress. Maybe it's because I opted out too much but I have the feeling that ad targeting hasn't become more intelligent, rather the opposite. It's been a long time that I've been surprised at the accuracy of a model tracking me. Sure, targeted ads for political purposes can work very well but are nothing new and don't need any deep learning nor any other "new" technologies.

Where I really see progress is data visualisation. Often dismissed it can be surprisingly hard to get right and tools around that (esp for enterprise use) have developed a lot over recent years. And that's what companies need. No one's looking for a black-box algorithm to replace marketing, they just want to make some sense of their data and understand what's going on.


Aha, yeah I saw this in the news - a pretty classic case of people horribly misunderstanding statistics and/or misrepresenting the facts. Let's say one person out of 60 million has Sauron's ring. I use my DeepMagicNet to classify everyone and get 100 positive results. Only one is the ringbearer, so 99% of my alerts are wrong. Best abandon ship.
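
For anyone who wants the base-rate effect in numbers (hypothetical figures, not the actual UK ones): even a classifier with 99.9% specificity produces mostly false alarms when targets are rare.

    # Hypothetical: 50,000 faces scanned, 10 genuine persons of interest.
    crowd, suspects = 50_000, 10
    specificity, sensitivity = 0.999, 1.0  # assumed for illustration
    false_alarms = (crowd - suspects) * (1 - specificity)  # ~50
    true_hits = suspects * sensitivity                     # 10
    print(true_hits / (true_hits + false_alarms))          # ~0.17: most alerts are wrong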


I do think of myself as someone who can read statistics. They didn't call it a false positive rate; that's my interpretation. I don't have the article here, but they said that the system gave out ~250 alerts, of which ~230 turned out to be incorrect. It didn't specify at all how big the database of potential suspects was. The number of scanned faces was ~50-100k each time (a stadium). Nevertheless, the 230/250 is the issue here, simply because it destroys trust in the system. If the operator already knows that an alert has less than a 10% chance of being genuine, will they really follow up on every alert?


> Those that use facial recognition reported false positive rates of >95%.... Mass surveillance with facial recognition is nowhere near levels where it can be used unsupervised.

Is this deep neural networks with the latest technologies?

While yes, deep learning isn't going to solve everything, we'll probably see significant changes in the products available as the technology discovered over the past few years makes its way into the real world.

Most scanners that do OCR and most forms of facial recognition aren't using deep neural networks with transfer learning, YET.

This is not to say that discoveries will keep coming forever, though - winter is probably coming :)


A 95% false positive rate is extremely good for surveillance, as the cost of a false positive is low (wasted effort by the police). That means for every 20 people the police investigate, one is a target.


They didn't mention arrests but I think the database used was much larger. Probably all persons of interest. The Met (London Police) has been using experts in face recognition for years and their hit rate is much higher. Those people can also identify a high number of suspects on random CCTV while the software was used in stadiums where people tend to remain at one place for most of the time.

So 95% false positive can be better than nothing but it's nowhere near what trained policemen can do.


I disagree. Even if there are a number of big customers for mass surveillance, self driving cars fundamentally changes the platform of our economy for everyone.


That sounds like a US-centric PoV. For a large portion of (most?) Europeans, self-driving won't change much in their lives, definitely not anything major.


> Deepmind hasn't shown anything breathtaking since their Alpha Go zero.

Didn't this just happen? Maybe my timescales are off, but I've been thinking about AI and Go since the late 90s, and plenty of real work was happening before then.

Outside a handful of specialists, I'd expect another 8-10 years before the current state of the art is generally understood, much less effectively applied elsewhere.


I had the same response. AlphaZero was published like 5 months ago. Saying they've reached the end of the line because they haven't topped AlphaZero in six months is lame.


It also marked the end of a major multi-year project. With Deepmind moving that team to focus on other problems I wouldn't expect immediate results.


Also why does every single result has to be breathtaking? Here's a quick example, at IO they announced that their work on Android improved battery life by up to 30%. That's pretty damn impressive.


> Also why does every single result has to be breathtaking?

If you build the hype like say Andrew Ng it better be. Also if you consume more money per month than all the CS departments of a mid sized country, it better be.


In terms of hype you may be right, but it doesn't mean that if something doesn't live up to the hype of Andrew Ng or Elon Musk it won't still be pretty good.

For instance: even if Elon Musk doesn't colonize Mars but instead just builds the BFR, that would still be amazing; even if the BFR is never built but Falcon 9 becomes fully reusable, that would be great; even if Falcon 9 is never fully reusable, the fact that it cut the cost of launching to space is still pretty good.

Even if we never achieve any great breakthroughs with AGI, the fact that we have started to use transfer learning to diagnose human diseases is pretty amazing; the fact that a Japanese guy used TensorFlow on a Raspberry Pi to categorize real cucumbers by shape is amazing.

All of this stuff won't go away; people will not say "hey, let's just forget about this deep learning thing and put it in some dusty shelf, it's useless for now". Maybe it will take 20 or 50 more years, maybe it's a slow thaw, but how could this be a winter?


Exactly! Thank you. There's a delusion that every result needs to be Nobel-worthy, but the Nobel-prize-worthy discoveries are all founded upon the boring stuff we don't hear about.


Honestly I think the raspberry pi indicates the (short-term) future of AI. Most of the “easy” problems have been solved (image classification, game playing), but the hard ones like NLP are orders of magnitude more complex and therefore elusive.

I’m happy to leave the hard problems for the PhDs and the big tech researchers. Go nuts, folks.

In the meantime, the applications for small-scale, pre-trained neural networks seem limitless. Manufacturing, agriculture, retail, pretty much any industry could make use of portable neural networks.


I feel exactly the same way. Just wait 2-3 years before someone launches an embedded TPU and the sky will be the limit.


Because it's the only time we see it in action. The speech recognition of my Amazon Echo is still subpar (and it feels like it's getting worse each week) and ad targeting also hasn't really improved. Of all those claims that came with deep learning, Go was the only time where you really saw a result. I'm not sure which version of Android will bring the improved battery life (and which manufacturers) but I wouldn't be surprised if the 30% were a bit optimistic.

I get that a lot of services we use on a daily basis make use of deep learning to accomplish tasks. But I don't really see what has fundamentally changed over the past 5 years in the way I use services. Siri was introduced 7 years ago and while we have clearly made progress in voice recognition, it's nowhere close to what many had hoped.


Warning 23 year old CS grad angst ridden post:

I'm very sick of the AI hype train. I took a PR class for my last year of college, and they couldn't help but mention it. LG Smart TV ads mention it, Microsoft commercials, my 60 year old tech illiterate Dad. Do any end users really know what it's about? Probably not, nor should that matter, but it's very triggering to see something that was once a big part of CS turned into a marketable buzzword.

I get triggered when I can't even skim through the news without hearing Elon Musk and Stephen Hawking ignorantly claim AI could potentially take over humanity. People believe them because of their credentials, when professors who actually teach AI will say otherwise. I'll admit, I've never taken any course in the subject myself. An instructor I've had who teaches the course argues it doesn't even exist; it's merely a sequence of given instructions, much like any other computer program. But hey, people love conspiracies, so let their imagination run wild.

AI is today what Big Data was about 4 years ago. I do not look highly on any programmer that jumps bandwagons, especially for marketability. Not only is it impure in intention, it's foolish when there are 1000 idiots just like them over-saturating the market. Stick with what you love, even if it's esoteric. Then you won't have to worry about your career value.


You can finish a CS undergrad without taking any AI course? Or just haven't taken one yet? It's very helpful to go through even a tiny bit of AI: A Modern Approach to cut through a lot of the hype. What annoys me is that when people say "Machine Learning" these days they almost invariably mean deep learning, ignoring all the rest of AI.

> I can't even skim through the news without hearing Elon Musk and Stephen Hawking ignorantly claim AI could potentially take over humanity.

Have you considered that their claims may not in fact be ignorant, just the reporting around them? For some details perhaps you would start with this primer from a decade ago, section 4 & 5 if you're in a hurry: http://intelligence.org/files/AIPosNegFactor.pdf

Or if you want a professor's opinion, from one of the co-authors to the previously mentioned AI:AMA check out some of the linked pointers on his home page: http://people.eecs.berkeley.edu/~russell/


I know right, the school I went to wasn't exactly the best.

I skimmed through sections 4 & 5; the part about optimization processes was difficult to understand.

When I was in elementary school, I remember pitying the mentally disabled children, knowing their financial struggles were all but destined, so I connected with the g-factor definition. I really think general intelligence is more of an awareness across all clusters, whether social or cognitive. I've met tons of great students in math courses who simply cannot converse with the general public. I've also met tons of people on the streets of my city who would have a difficult time understanding high school algebra.

As for Section 5, I do think the rise of AI will remain completely within our grasp. I really should take a course on the subject before I sound like the people I'm criticizing for ignorance, but from a general perspective, I cannot see it getting outside of our control. As Eliezer said, we can make predictions, but only time will clear the fog.


> What annoys me is that when people say "Machine Learning" these days they almost invariably mean deep learning, ignoring all the rest of AI.

But that's not people's fault. Companies only say AI if they mean deep learning. I've yet to hear a company advertising AI if they accomplished it with a linear regression. Maybe experts should stop talking about AI and use specific terms instead (Deep Learning in the case of this article).


I hate it when people say deep learning instead of neural networks.


Your post has two different points, and I think they should be separated. Yes, there's an AI hype train, and it's pretty tiring. I'm starting to wonder when the first AI-enhanced shoes are going to come out. Or an AI enhanced blockchain, right?

That said, the rest of the rant about AI is less solid. Sure, AI today is fairly boring and run-of-the-mill data processing / optimisation stuff (I sort of know, though I only have an MSc in the topic). Much of the promise of near-future AI is that we can go from human-designed, or shall we say human-bootstrapped, AI to the self-bootstrapping kind. The fact that AlphaGo pretty much does this (in a very limited capacity), and accomplished something which classical programming and game "AI" couldn't, should show us that we're pretty close to this type of AI being highly effective. How exactly the future unfolds from there is anyone's guess, but outright calling it ignorant is... pretty ignorant, IMHO.


> I'm starting to wonder when the first AI-enhanced shoes are going to come out.

Wait no more:

The first connected cycling insole with artificial intelligence: https://www.digitsole.com/store/connected-insoles-cycling-ru...

The World's First Intelligent Sneaker: https://www.kickstarter.com/projects/141658446/digitsole-sma...


I never claimed AI was boring or run of the mill, just that it's not in my current interest. It's when I hear Hawking make claims like this that I call ignorance.

"Computers can, in theory, emulate human intelligence, and exceed it,' he said. 'Success in creating effective AI, could be the biggest event in the history of our civilization. Or the worst. We just don't know. So we cannot know if we will be infinitely helped by AI, or ignored by it and side-lined, or conceivably destroyed by it.”

I feel like I'm in a high school stoner circle, yeeesh!


So, Hawking's proposition is that AI can surpass human intelligence. Is this the part you're disagreeing with? It's one thing to disagree, and it's another to call it ignorant. I for one agree that it can / will surpass human intelligence; it's just a question of when.


Unless natural disaster and/or war resets humanity, it seems inevitable. One only needs to concede that there is (probably) nothing special about the biological nature of the brain.


That's my reasoning as well, regarding the biological part. On the one hand, it seems to be an incredibly energy-efficient solution, but it has several drawbacks as well (in a lighthearted fashion: lack of scalability and high-bandwidth networking, several billion years of legacy code, etc).

On the other hand, it's unclear whether computers will continue to get faster in the fashion that they are today for long enough. If not, then huge efficiency improvements would be necessary on the software side, though those do seem to be happening as well.


I think it's completely possible AI can surpass our intelligence. Whether it's outside of our awareness or control, not likely.


I'm in same boat, except now I have to deal with both this and blockchain.

It's pretty clear that everyone pushing this is being dishonest. Sometimes I wonder if they are intentionally being dishonest or if they just don't know what they don't know. Very few people are doing useful, true machine learning, and the applications are very specific, each with its own set of quirks. It's just to make some quick money and get out.

What somehow never gets said amid all this hype is that you need to hire good software devs to build a great solution to a problem. All these buzzword-driven ideas tend to confuse a bunch of people and die out after wasting everyone's money. In my region I keep seeing business suits pushing "solve X with Y" ideas with no justification and no engineers behind them.

Yesterday I found out I have to sit through yet another meeting to explain why "blockchain" does not translate into a full solution to a superficial, poorly thought-out problem.

The only reason I've had a voice in my region with all this noise is because I've made things people see that actually work.


I work for a consultancy and I was brought in to talk about ‘our blockchain strategy’, because I was one of the few people at the company who actually read the white papers and had been invested in it. This was towards the end of the last bull run. I think they expected me to tell them to go all in on it, but I essentially said it was a bunch of bullshit hype and a solution in search of a problem and then I didn’t get invited to any more meetings. A few weeks later they announce we’re going to start selling block chain solutions right as the crypto market crashes.

Meanwhile, in the real world, they can’t even figure out basics like containers and CI/CD, which we’re actually dealing with in our actual contract.


> Very few people are doing useful true machine learning, and the applications are very specific with its own set of quirks.

I worked on a nudity detector in 2017. Deep learning works, and is useful. Although you are right, it's very specific and quirky.

I found "How HBO's Silicon Valley built Not Hotdog" article very interesting, because it's basically the same problem. They found MobileNet better than SqueezeNet and ELU better than ReLU. You know what? We found SqueezeNet better than MobileNet and ReLU better than ELU for our problem and data. Who know why.

https://medium.com/@timanglade/how-hbos-silicon-valley-built...
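
That "try both and see which wins on your data" step is cheap these days. A rough sketch of swapping backbones for a small binary classifier with torchvision (nothing here is from either project; the training loop and data loading are omitted):

    import torch.nn as nn
    from torchvision import models

    def build_binary_classifier(backbone: str = "mobilenet_v2"):
        """Return an ImageNet-pretrained backbone with a fresh 2-class head."""
        # (newer torchvision versions prefer the weights= argument over pretrained=True)
        if backbone == "mobilenet_v2":
            model = models.mobilenet_v2(pretrained=True)
            model.classifier[1] = nn.Linear(model.last_channel, 2)
        elif backbone == "squeezenet1_1":
            model = models.squeezenet1_1(pretrained=True)
            model.classifier[1] = nn.Conv2d(512, 2, kernel_size=1)
            model.num_classes = 2
        else:
            raise ValueError(backbone)
        return model

    # Fine-tune both on the same data and keep whichever validates better.
    candidates = {name: build_binary_classifier(name)
                  for name in ("mobilenet_v2", "squeezenet1_1")}

The only way to settle MobileNet vs SqueezeNet (or ReLU vs ELU) for a given dataset really is to run the comparison.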


But this is normal with any new technology -- there used to be drive-in movie theaters, and people once imagined putting miniature nuclear reactors in everyday appliances; that doesn't mean there's a car winter coming or a nuclear winter coming (pun intended).

It's absolutely human nature to dream up crazy new ways of using things; most of those ways probably don't work out in the end.


The history of humanity is full of examples of people using technology to prevail over other groups of people. Applications of AI and ML will be no exception: computer vision, game theory, autonomous systems, material design, espionage, cryptography... You name it.

Supraintelligent AI is not required to cause severe problems.


I'm sure they have, but none of those fields have been exploited for marketability, at least nowhere near this degree.


Not sure what you mean, but for the most part AI and ML is being integrated into systems where there's too much data to be processed manually... like satellite/aerial imagery, tapped internet backbones, security cameras, etc.


AI taking over is the biggest threat facing humanity, but I don't think Hawking ever claimed it was imminent; it's likely 1000+ years away.

It should be obvious why a superior intelligence is something dangerous.


"it's likely 1000+ years away."

That seems like a pretty high number when you consider the exponential rate of technological advancement.

1000 years in the future is probably going to be completely unrecognizable to us given the current rate of change in society/tech.


If the response to AI existential threats was "it's too far in the future" then you'd have a point. But usually it's just "AI will obey its programming" or "AI will never be as smart as us".


True exponentials are very rare. What we are used to calling exponential is often at most quadratic, and even then the trend usually holds only over some limited time window.

I wouldn't dare to make prediction about mankind 1000 years from now based on a relatively short time window of technological growth. There are many huge obstacles in the way.


Even the present would be mostly unrecognizable to someone who lived 1000 years ago.

1000 years from now will certainly be even stranger to us.


The day we create a superior intelligence will be the greatest day of humanity. It will be fantastic if there's something beyond humanity. A logical, evolutionary conclusion to us.

All our Darwinian ancestors never had the capacity for intellectual fear of their superior successors...

Hopefully we as a species can get beyond our fear.


That's at least a logical belief, but it's still a threat, because we don't know how it will behave and we'll be helpless to control it.


It would be great if people feared climate change the same way they fear AI takeover.


Climate change is not an existential threat - primitive humans survived for 200+ kyrs when temps were 10°C cooler and thrived during warmer temps.


They weren’t reliant on agriculture to sustain their civilisation.

Sure, the human race is unlikely to die out unless we get to mass-exinction levels of climate change, but our civilisation is a lot more fragile.


If agriculture is the biggest worry, it’s no big deal. Not only will more land enter temperate zones, but I don’t doubt we’ll be able to breed heat resistant varieties fast enough to keep up with 0.05C annual changes. That’s nothing.


The biggest threat facing humanity is 1000+ years away?


AI is already taking over humanity. The news you read, the products you buy, the advice you take, the friends you meet are all partially or fully supported by machine-learned algorithms.

Personally, I do not doubt AI-based methods are changing our language, our communication patterns and our transport infrastructure.

The problem with the statement "AI will take over humanity" is actually in:

- What exactly is AI? There are many definitions. Most researchers adopt the weakest forms, whereas the general public adopts the strongest form.

- What exactly is 'take-over'? Does this mean: in control? Like a dictator is in control over a country? Or: adopting us as slaves? As a gradual change, when does it 'take-over'? At 50%? Does this need to be a conscious action by an AI actor, or would an evolutionary transition suffice?

- What exactly is humanity? I would go for the definition: "the quality or state of being human", but most people probably read in it: "the human race". In the former case, technology is a part of the quality of being human. In Heideggerian fashion, we become the technology and the technology becomes us. Technology, and AI as part of it, has been taking over humanity since we started permanently adjusting our environments.


> People believe them because of their credentials, when professors who actually teach AI will say otherwise

While some experts like Andrew Ng are sceptical of AI risk, there are lots of others like Stuart Russell who are concerned.

Here is a big list of quotes from AI experts concerned about AI risk: http://slatestarcodex.com/2015/05/22/ai-researchers-on-ai-ri...


I mean at least you don't have to worry about Stephen Hawking bothering you any more...


>I get triggered

You should be more considerate than to be throwing around this term.


Author's reasons:

1. Hype dies down (which is really good! It means the chance of a burst is actually lower!)

2. "Doesn't scale" is a false claim. DL methods have scaled MUCH better than any other ML algorithms in recent history (scaling an SVM is no small task). Scaling DL methods is much easier compared to traditional ML algorithms, since the work can be naturally distributed and aggregated.

3. Partially true. But self-driving is a sophisticated area in itself; DL is only one part of it, so it can't really take full credit for its potential future success or full blame for its ultimate downfall.

4. Gary Marcus isn't an established figure in DL research.

An AI winter will ultimately come, but only because people will become more informed about DL's strengths and limits, and thus smarter at telling what is BS and what is not. AGI is likely not going to happen with DL alone, but that in no way means it is a winter. DL has revolutionized the paradigm of machine learning itself; the shift is now complete, it will stay for a very, very long time, and its successor is likely to build upon it rather than subvert it completely.


Author here: I'm using deep learning daily so I have a bit of an idea on what I'm talking about.

1) Not my point. Hype is doing very well, but the narrative is beginning to crack, which is actually indicative of a coming burst...

2) DL does not scale very well. It does scale better than other ML algorithms, but only because those did not scale at all. If you want to know what scales very well, look at CFD (computational fluid dynamics). DL is nowhere near that ease of scaling.

3) Self-driving is the poster child of the current "AI revolution", and it is where by far the most money is allocated. So if that falls, the rest of DL does not matter.

4) Not that this matters, does it?


CFD is good at using big machines "efficiently", but the cost of DNS scales as the cube of the Reynolds number which will never be tractable for most engineering problems. Apart from niche basic research on the edge of tractability, all the effort goes into modeling (RANS, DES, wall, etc.) to deliver statistically calibrated estimates of functionals of interest at feasible cost. Those methods actually don't "scale as well" (though the state of research is ahead of commercial software), but also don't need to because they can solve the problem in less time with less hardware. This situation is actually pretty similar to your DL analogy where more hardware provides diminishing returns for solving the actual problem.


The scaling argument in the article doesn't make any sense. There are rhetorical queries like "does this model with 1000x as many parameters work 1000x as well?" but what it means to scale or perform are not clearly or consistently defined - let alone defined in a way that would make your point about the utility of the advances.

OpenAI's graph shows new architectures being used with more parameters because people are innovating on architecture and scale at the same time. Arguing that old methods "failed to scale" is like arguing that processor development was a failure because Intel had to develop a 486 instead of making a 386 work with more transistors (or more something).

And what does CFD have to do with anything, except maybe an odd attempt to argue from authority? Can you formalize from CFD a notion of "scaling well" that anyone else agrees is useful for measuring AI research?


CFD was merely used as an example of something that does scale well. I'm not sure it was the best example, since CFD isn't very common. But basically you have a volume mesh and each cell iterates on the Navier-Stokes equations. So if you have N processor cores, you break the mesh into N pieces, each of which gets processed in parallel. Doubling the number of cores allows you to process double the amount in the same time, minus communication losses (each section of the mesh needs to communicate the results on its boundary to its neighbors).
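
To make the halo-exchange point concrete, here's a toy 1-D version of that decomposition (plain NumPy, simulating the ranks serially instead of with MPI): each chunk does its local stencil update, and only one boundary value per side has to be shared with a neighbour, which is why the work parallelizes almost perfectly.

    import numpy as np

    def heat_step(u, alpha=0.1):
        """One explicit diffusion update on an array with one ghost cell per side."""
        new = u.copy()
        new[1:-1] = u[1:-1] + alpha * (u[:-2] - 2 * u[1:-1] + u[2:])
        return new

    n, ranks = 64, 4
    full = np.sin(np.linspace(0, np.pi, n))             # whole "mesh" on one machine
    chunks = [c.copy() for c in np.split(full, ranks)]  # the same mesh split across ranks

    for _ in range(100):
        full = heat_step(np.pad(full, 1))[1:-1]         # reference single-domain solve

        # Halo exchange: each rank only needs its neighbours' edge cells.
        padded = []
        for i, c in enumerate(chunks):
            left = chunks[i - 1][-1] if i > 0 else 0.0
            right = chunks[i + 1][0] if i < ranks - 1 else 0.0
            padded.append(np.concatenate(([left], c, [right])))
        chunks = [heat_step(p)[1:-1] for p in padded]

    print(np.allclose(np.concatenate(chunks), full))    # True: splitting changes nothing

Each rank communicates a fixed two values per step no matter how big its chunk is; that is the property being called "scales well" here.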

I don't fully understand the graph, but it looks like his point is that AlphaGo Zero uses 1e5 times as many resources as AlexNet, but does not produce anywhere near 1e5 times better results. We saw that with CFD, 1e5 more cores resulted in 1e5 times better results (= it scales). The assertion is that DL's results are much less than 1e5 times better, hence it does not scale.

Basically the argument is:

1. CFD produces N times better results given N times more resources [this is implied, requires a knowledge of CFD]. That is, f(a*x) = a * f(x), or equivalently f(a*x) = b * a * f(x) with b = 1.

2. Empirically, we see that DL has used 1e5 more resources, but is not producing 1e5 times better results. [No quantitative analysis of how much better the results are is given]

3. Since DL has f(a * x) = b * a * f(x), where b < 1, DL does not scale. [Presumably b << 1 but the article did not give any specific results]

This isn't a very rigorous argument and the article left out half the argument, but it is suggestive.
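
A quick back-of-the-envelope helper for that b, with purely illustrative numbers (the article gives none):

    def scaling_efficiency(resource_ratio, improvement_ratio):
        """b in f(a*x) = b * a * f(x): 1.0 means perfect scaling, << 1 means poor."""
        return improvement_ratio / resource_ratio

    print(scaling_efficiency(1e5, 1e5))  # CFD-style ideal scaling -> 1.0
    print(scaling_efficiency(1e5, 10))   # hypothetical "10x better with 1e5x compute" -> 1e-4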


Thanks for that, that is essentially my point. Agreed it is not very rigorous, but it gets the idea across. By scalable we'd typically mean "you throw more GPUs at it and it works better by some measure". Deep learning does that only in extremely specific domains, e.g. games and self-play as in AlphaGo. For the majority of other applications it is architecture-bound or data-bound. You can't just throw in more layers and more basic DL primitives and expect better results. You need more data, and more PhD students to tweak the architecture. That is not scalable.


More compute -> more precision is just one field's definition of scalable... Saying that DNNs can't get better just by adding GPUs is like complaining that an apple isn't very orange.

To generalize notions of scaling, you need to look at the economics of consumed resources and generated utility, and you haven't begun to make the argument that data acquisition and PhD student time hasn't created ROI, or that ROI on those activities hasn't grown over time.

Data acquisition and labeling is getting cheaper all the time for many applications. Plus, new architectures give ways to do transfer learning or encode domain bias that let you specialize a model with less new data. There is substantial progress and already good returns on these types of scalability which (unlike returns on more GPUs) influence ML economics.


OK, the definition of scalable is crucial here and it causes lots of trouble (this is also response to several other posts so forgive me if I don't address your points exactly).

Let me try once again: an algorithm is scalable if it can process bigger instances by adding more compute power.

E.g. I take a small perceptron and train it on a Pentium 100, then take a perceptron with 10x the parameters on a Core i7 and get better output by some monotonic function of the increase in instance size (it is typically a sublinear function, but that is OK as long as it is not logarithmic).

DL does not have that property. It requires modifying the algorithm, modifying the task at hand, and so on. And it is not that it requires some tiny tweak; it requires quite a bit of tweaking. I mean, if you need a scientific paper to build a bigger instance of your algorithm, that algorithm is not scalable.

What many people here are talking about is whether an instance of the algorithm can be created (by a great human effort) in a very specific domain to saturate a given large compute resource. And yes, in that sense deep learning can show some success in very limited domains. Domains where there happens to be a boatload of data, particularly labeled data.

But you see there is a subtle difference here, similar in some sense to difference between Amdahl's law and Gustafson's law (though not literal).

The way many people (including investors) understand deep learning is this: you build a model A, show it a bunch of pictures, and it understands something from them. Then you buy 10x more GPUs, build a model B that is 10x bigger, show it those same pictures, and it understands 10x more from them. Look, I and many people here understand this is totally naive. But believe me, I have talked to many people with big $ who have exactly that level of understanding.
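
One way to pin the definition down empirically: keep the algorithm fixed, grow only the model size (standing in for "more compute"), and see what the extra capacity actually buys. A toy sketch with scikit-learn (the widths and dataset are arbitrary, nothing from the article):

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for width in (8, 80, 800):  # "10x more parameters", twice over
        clf = MLPClassifier(hidden_layer_sizes=(width,), max_iter=2000, random_state=0)
        clf.fit(X_tr, y_tr)
        print(f"width={width:4d}  test accuracy={clf.score(X_te, y_te):.3f}")

Under the definition above, "scalable" would mean that last number keeps improving meaningfully as the width (and the hardware to train it) grows; in practice the curve tends to flatten quickly unless you also change the data or the architecture.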


I appreciate the engagement in making this argument more concrete. I understand that you are talking about returns on compute power.

However, your last paragraph about how investors view deep learning does not describe anyone in the community of academics, practitioners and investors that I know. People understand that the limiting inputs to improved performance are data, followed closely by PhD labor. Compute power is relevant mainly because it shortens the feedback loop on that PhD labor, making it more efficient.

Folks investing in AI believe the returns are worth it due to the potential to scale deployment, not (primarily) training. They may be wrong, but this is a straw man definition of scalability that doesn't contribute to that thesis.


You’re arguing around the point here.

Almost all research domains live on a log curve; a little bit gets you a lot to start with, but eventually you exhaust the easy solutions and a lot of work gets you very little improvement.

You’re arguing we haven’t reached the plateau at the top yet, but you’ve offered no meaningful evidence that is the case.

There are real world indicators that we are reaching diminishing returns for investment in compute and research now.

The ‘winter’ becomes a thing when it becomes apparent to investors that their financial bets are based off nothing more concrete than opinions like yours, when they don’t work out.

Are we there yet? Not sure, myself, I think we can get some more wins from machine generated architectures... but I can’t see any indication that the ‘winter’ isn’t coming sooner or later.

Investment is massively outstripping returns right now... we’ll just have to see if that calms down gradually, or pops suddenly.

History does not have a good story to tell about responsible investors behaving in a reasonable manner and avoiding crashes.


Thanks for taking the time to render the more specific argument! I still don't think this is suggestive in a way that should influence readers. Here are some ways in which a naive "10x resources != 10x improvement" argument can err:

- Improvement is hard to define consistently. Sometimes, improving classification accuracy by 0.5% means reducing error by 20%, and enables economic applications that have 100x the value or frequency of use (see the quick arithmetic sketch after this comment).

- Resources used in training can be amortized over billions of times the same model is reused (much more cheaply). So even achieving an epsilon improvement in the expected utility of each inference can justify a massive increase in training cost.

- Some other notions of "better results" or "less expensive" include amount of training data required, social fairness of results, memory required or power used during inference, and so on. And there are major advances in current research on each of these better formalized axes!

That last bit is what is so frustrating in reading an article like this. The author is sweeping aside with vague arguments a great deal of work that has been written and justified to a much much higher standard of rigor (not just the VCs we all like to snark about). Readers should beware of trusting a summary like this without engaging directly with the source material.
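
The quick arithmetic sketch promised in the first bullet, e.g. going from 97.5% to 98.0% accuracy:

    acc_before, acc_after = 0.975, 0.980          # +0.5 points of accuracy...
    err_before, err_after = 1 - acc_before, 1 - acc_after
    print(f"relative error reduction: {(err_before - err_after) / err_before:.0%}")  # ...is 20% less error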


I certainly encourage everybody to consult the source material! Man, this is a blog, opinion by default not perfect.

But when I hear the keyword "major advances" I'm highly suspicious. I have already seen so many such "major advances" that never went beyond a self-citing clique.


As a very concrete "major advance" consider Google Translate's tiny language models [1] that can beam to your phone, live in a few megabytes, and translate photographed text for you with low power usage. This was done with incredibly expensive centralized training, but checks every meaningful box for "scalable" AI.

[1] https://ai.googleblog.com/2015/07/how-google-translate-squee...


This is a seriously flawed depiction of CFD.


2) Why do you think DL doesn't scale? I am curious. It can easily leverage thousands of GPUs, training on 300 million images (https://ai.googleblog.com/2017/07/revisiting-unreasonable-ef...). No other method comes close to leveraging that amount of computational power. I don't really know about CFD, but at least in ML land, dealing with ML problems, DL is very scalable, second maybe only to random-forest-style algorithms, which effectively share nothing.

3) It does matter. In fact, the most valuable startups around DL are CV-based startups, though they are mainly located in China.


What do you think of advances like those in major DeepMind papers? They seem to represent significant shifts in capabilities once matured.

Here's a recent example:

Unsupervised Predictive Memory in a Goal-Directed Agent

https://news.ycombinator.com/item?id=17177442


This paper is amazing, and exactly what I was thinking of posting in response to that part of the article. The amount of research Deepmind is putting out is astonishing, and even if you are paying attention it's hard to keep up with it all. Here are just a few papers I've been looking at from the last few months, maybe none of them are advancements on the level of AlphaGo & Zero but they still show significant progress in a wide variety of areas.

https://arxiv.org/abs/1802.10542

https://arxiv.org/abs/1802.07740

https://www.nature.com/articles/s41586-018-0102-6

https://arxiv.org/abs/1804.09401

https://arxiv.org/abs/1805.06370

https://arxiv.org/abs/1802.03006

https://arxiv.org/abs/1804.08617

https://arxiv.org/abs/1802.01561

And there are many others besides these, not to mention all the significant research being done by everyone else who isn't at DeepMind. The author's idea that interest in and development of these topics is dying down, or that DeepMind is running out of meaningful research to do, just seems uninformed.


"Author here: I'm using deep learning daily so I have a bit of an idea on what I'm talking about."

It's very weak to appeal to authority. The only real argument I can find against DL/ML/AI at the moment is the continuing appeal to authority by PhDs who have zero engineering knowledge, zero business sense, and zero understanding of risk assessment.


I’m not sure how the hype wagon started but I for one am glad it’s about to pop.

I am working on (founded) a startup, and while we have AI on the roadmap about a year out, it isn't something that's central to our product. (We already use some ML techniques, but I wouldn't confidently boast it's the same thing as AI.)

Cue an informal lunch with a VC guy who takes a look, says we’re cool and tells us just to plaster the word AI in more places - he was sure we could raise a stupendous sum of cash doing that.

As an AI enthusiast I was bothered by this. We have everyone and their mother hyping AI into areas it’s not even relevant in, let alone effective at.

A toning down would be healthy. We could then focus on developing the right technology slowly and without all the lofty expectations to live up to.


> ...just to plaster the word AI in more places...

I bet that's how the new "AI-assisted" Intellisense in Visual Studio got greenlit:

https://blogs.msdn.microsoft.com/visualstudio/2018/05/07/int...

If an AI-infested text editor isn't a sure sign that the bubble is going to pop soon then I don't know ;)


I think part of this ridiculousness may just be that the common lexicon doesn't have enough words for AI. "AI-assisted Intellisense" at the moment seems to boil down to a marginally novel way of ordering autocomplete suggestions, and like...

It's not wrong? It's neat, it's potentially useful, and it's powered by something under the umbrella of "AI". The problem is that that umbrella is gigantic, and covers everything from the AI system providing routes for the AI system in a self-driving truck on AI-provided schedules for a hypothetical mostly-automated shipping business, to a script I whipped up in ten minutes to teach something 2+2 and literally nothing else.

So we get to the nonsense position where there isn't a better way to describe a minor improvement to what is essentially the ordering of a drop-down list except by comparison to the former example.


AI winter is not on its way. We constantly get new breakthroughs and there's no end in sight. For example, in the last year a number of improvements in GANs were introduced. This is really huge, since GANs are able to learn a dataset's structure without explicit labels, and this is a large bottleneck in applying ML more widely.

IMO, we are far away from AGI, but even current technologies applied widely will lead to many interesting things.
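
To make the "no explicit labels" point concrete, here is about the smallest GAN that still does the GAN thing: both networks only ever see unlabeled samples (a 1-D Gaussian here), yet the generator ends up reproducing the data distribution. A rough PyTorch sketch, not tuned for anything:

    import torch
    import torch.nn as nn

    G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))               # generator
    D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid()) # discriminator
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    bce = nn.BCELoss()

    for step in range(3000):
        real = torch.randn(64, 1) * 0.5 + 3.0     # the unlabeled "dataset": N(3, 0.5)
        noise = torch.randn(64, 8)
        fake = G(noise)

        # Discriminator learns real -> 1, fake -> 0 (the only "label" is real vs generated).
        opt_d.zero_grad()
        d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
        d_loss.backward()
        opt_d.step()

        # Generator learns to fool the discriminator.
        opt_g.zero_grad()
        g_loss = bce(D(G(noise)), torch.ones(64, 1))
        g_loss.backward()
        opt_g.step()

    samples = G(torch.randn(1000, 8))
    print(samples.mean().item(), samples.std().item())  # mean should drift toward ~3.0; the spread is usually rougher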


I sure agree there are many interesting things going on, there is no question about that. But most of them are toy problems focused on some restricted domain, while a huge bag of equally interesting real-world problems is sitting untouched. And let me tell you, all those VCs that put in probably way north of $10B are not looking forward to more NIPS papers or yet another style-transfer algorithm.


>Also most of them are toy problems focused in some restricted domains, while a huge bag of equally interesting real world problems is sitting untouched.

It always starts with toy problems. Recognizing pictures from imagenet was also a toy problem back then.


"Deepmind hasn't shown anything breathtaking since their Alpha Go zero"

... what about when the Google assistant near perfectly mimicked a human making a restaurant reservation .... the voice work was done at DeepMind.

All the problems in AI haven't been solved yet? Well no, of course not. Limitations exist and our solutions need to be evolved.

I think perhaps the biggest constraint is requiring huge amounts of training data to solve problem X. Humans simply don't need that, which must be some indication that what we're doing isn't quite right.


what about when the Google assistant near perfectly mimicked a human making a restaurant reservation

Any sufficiently advanced technology is indistinguishable from a rigged demo


The Google Duplex demo was definitely very impressive. The only caveat I can think of is that they didn't say how often it works. That might have been the one success out of hundreds of calls.


> what about when the Google assistant near perfectly mimicked a human making a restaurant reservation

That seemed pretty staged.


> Humans simply don't need that, which must be some indication that what we're doing isn't quite right.

Not really. DNNs are much simpler and need to be much more specialized towards specific tasks than the human brain. They're more like a nematode that was optimized by evolution for millions of years to tell cats and dogs apart because it preferentially infects dogs. It's the only thing it does. It does it via some shortcuts and it won't be able to learn chess without a radical redesign.

Not to mention that primate brains take some time to get bootstrapped. A toddler has to be fed visual input for many months before they can be left unsupervised for reasonable amounts of time. And have you seen what those "optical illusion" things do? It's a miracle that those humans can self-navigate at all considering those failure modes!


Disclaimer: I am a lay technical person and don't know much about AI.

I find this article somewhat condescending. I look at all the current development as stepping stones to progress, not an overnight success that does everything flawlessly. I imagine the future might be some combination of different solutions, and what the author proposes may or may not play a part in it.


I agree. Overhype is annoying, but it happens with every technological advance. So does the inevitable backlash jump from "some people claim too much" to "this whole field is a bubble with no real gains". Both are cliched viewpoints that give a false sense of being "in the know" without helping anyone navigate change effectively.


it's not a stepping stone, if you look closely it's a dead end


I don't see how systematically accurate image classifiers and facial recognition systems built on deep learning is a 'dead end'. Products are products. If deep learning has led to actual profits in actual companies, it's not a dead end. As to whether this leads to AGI is a completely different question.


The point is that the profits may not be as grand as the current level of hype may indicate

Edit: additionally it could be a dead end because the hype tends to narrow the directions we explore with ML. If everyone is obsessing about DL, we could be infuriatingly ignoring other research directions right under our noses.


What do you see exactly when you look closely and how do you know it is dead end?


An AI "winter" is a long period in which [edit: funding is cut because...] researchers are in disbelief about having a path to real intelligence. I think that is not the case at this time, because we have (or approaching) adequate tools to rationally dismiss that disbelief. The current AI "spring" has brought back the belief that connectionism may by the ultimate tool to explain the human brain. I mean you can't deny that DL models of vision look eerily like the early stages in visual processing in the brain (which is a very large part of it). Even if DL researchers lose their path in search for "true AI", the neuroscientists can keep probing the blueprint to find new clues to its intelligence. Even AI companies are starting to create plausible models that link to biology. So at this time, it's unlikely that progress will be abandoned any time soon.

E.g. https://arxiv.org/abs/1610.00161 https://arxiv.org/abs/1706.04698 https://www.ncbi.nlm.nih.gov/pubmed/28095195


No, AI winter was when the AI people oversold the tech, then failed to deliver, and lost their funding. This is well documented in histories of the field.


I think the scientific pessimism preceded the funding cuts:

https://en.wikipedia.org/wiki/AI_winter


The brain has a number of functional parts that we don't understand all that well. Research on the brain hits a wall every now and then, but you never hear the phrase "Neuroscience Winter".

We're starting to train models that match biological brain behavior, at least in some crude functional/structural sense. Maybe this is just a string of coincidences, but my guess is that discoveries of analogous biological/model components will continue to happen, and we'll be able to learn more about AI and the brain by linking related phenomena in vitro and in silico.

A few more recent examples:

https://deepmind.com/blog/grid-cells/

https://www.nature.com/articles/nature04485


Neuroscience has been in constant winter, mainly because the methods are too crude and small-scale: observing 100 neurons out of billions makes it impossible to tell the whole story. The DeepMind papers are interesting, but they are just a start imho. It may be that they are focusing on a mere coincidence. Nevertheless it is exciting to see progress in that direction; that's why I don't think DL research is going to lose steam soon. It would require a major show-stopper discovery (like the Minsky-Papert critique of perceptrons). Tesla crashes are no such thing.

It seems CS people are sick and tired of the hype, but i feel neuroscientists are now warming up to it.


I'd like to see neuroscience take more of a fundamental role in grounding/situating deep learning approaches. VGG is often mentioned as being roughly analogous to the visual cortex, but it differs in important ways. There are all kinds of "why" questions there. E.g., why does deep learning work better with pooling and activations like ReLU, while biological models rely on inhibitory mechanisms? Theory on why certain activations work better than others in DL is a little weak imho. Right now ML practitioners just throw a lot of parameter combinations at the wall and see what sticks. That's fine, but it's not really indicative of a robust understanding of model behavior.
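
The "throw combinations at the wall" workflow really is only a few lines, which is part of why it wins over theory in practice. A sketch with an arbitrary small search space and synthetic data (nothing here is anyone's actual setup):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

    grid = GridSearchCV(
        MLPClassifier(max_iter=2000, random_state=0),
        param_grid={
            "activation": ["relu", "tanh", "logistic"],   # which nonlinearity "sticks"
            "hidden_layer_sizes": [(32,), (64,), (64, 64)],
        },
        cv=3,
    )
    grid.fit(X, y)
    print(grid.best_params_, round(grid.best_score_, 3))

Whatever wins here tells you nothing about why it won, which is exactly the gap a more neuroscience-grounded theory might fill.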


I'm actually of the opposite opinion: let the two fields evolve by their own Darwinian processes, as this will yield more interesting results. DL itself has created its own scientific questions and puzzles that may lead to important discoveries (which may transfer to neuroscience), e.g. "why does batch normalization / dropout work?"


That’s not what the AI Winter was. It was when anything that used the term “AI” (including academic research) was unfundable.


From what I’ve read of the last AI winters, funding was more centralised, & projects were larger and fewer. Hardware was scarcer.

These days anyone with a few dollars to spend on compute time and some free software can do machine learning. I can’t see that going away.


Author here: seriously, I'm on the front page for the second day in a row!?

The sheer viral popularity of this post, which really was just a bunch of relatively loose thoughts, indicates that there is something in the air regarding an AI winter. Maybe people are really sick of all that hype pumping...

Just a note: I'm a bit overwhelmed so I can't address all the criticism. One thing I would like to state, however, is that I'm actually a fan of connectionism. I think we are doing it naively though, and instead of focusing on the right problems we inflate a hype bubble. There are applications where DL really shines and there is no question about that. But in the case of autonomy and robotics we have not even defined the problems well enough, not to mention solving anything. Unfortunately, those are the areas where most of the bets/expectations sit, and therefore I'm worried about the winter.


Do you think there's a balance to be struck between connectionism and computationalism?


The fact you use the word connectionism makes you alright in my book :)


The argument is that self-driving won't work because Uber and Tesla had well-publicized crashes. But I don't see how this tells us anything about other, apparently more cautious companies like Waymo. There seem to be significant differences in technology.

More generally, machine learning is a broad area and there's no reason to believe that different applications of it will all succeed or all fail for similar reasons. It seems more likely there will be more winners along with many failed attempts.


The argument is that self-driving won't work because Uber and Tesla had well-publicized crashes. But I don't see how this tells us anything about other, apparently more cautious companies like Waymo. There seem to be significant differences in technology.

Yes. I've been saying this for a while. Waymo's approach is about 80% geometry, 20% AI. Profile the terrain, and only drive where it's flat. The AI part is for trying to identify other road users and guess what they will do. When in doubt, assume worst case and stay far away from them.

I was amazed that anyone would try self-driving without profiling the road. Everybody in the DARPA Grand Challenge had to do that, including us, because it was off-road driving and you were not guaranteed a flat road. The Google/Waymo people understood this. Some of the others just tried dumping the raw sensor data into a deep learning system and getting out a steering wheel angle. Not good.
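
The geometry part really is that unglamorous: bin the range returns into ground cells and only call a cell drivable if it's flat enough. A toy sketch (synthetic points and made-up thresholds; real systems obviously do far more):

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic lidar returns: mostly flat ground, plus a raised obstacle in one corner.
    pts = rng.uniform(0, 20, size=(5000, 2))            # x, y in metres
    z = rng.normal(0, 0.02, size=5000)                  # flat road with 2 cm of noise
    z[(pts[:, 0] > 15) & (pts[:, 1] > 15)] += 0.25      # a curb-sized bump

    cell = 1.0          # 1 m grid cells
    max_spread = 0.10   # call a cell drivable if heights vary by < 10 cm

    ix = (pts[:, 0] // cell).astype(int)
    iy = (pts[:, 1] // cell).astype(int)
    drivable = np.ones((20, 20), dtype=bool)
    for cx in range(20):
        for cy in range(20):
            heights = z[(ix == cx) & (iy == cy)]
            if heights.size and heights.max() - heights.min() > max_spread:
                drivable[cx, cy] = False

    print(drivable.sum(), "of 400 cells look flat enough to drive on")

No learning anywhere in that loop, which is the point: the learned part sits on top, for classifying whatever is occupying the non-flat cells.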


A lot of companies fear the ship's leaving without them. They try to rush ahead without thinking and then crash. That's what it feels like every time a car company or another tech company says they're going to build the next self-driving car. I don't know why, but I always feel like Waymo is already 10,000 miles ahead.


Seriously. Frankly, based on Uber's culture I would have been surprised if they didn't kill at least one person with their self-driving efforts. It's a total non-data point. The fact that Uber got as far as they did without killing anyone is strong evidence that the problem is tractable.

As for Tesla - Tesla isn't even trying to make proper self-driving cars. Tesla's goal has always been assisted driving. However you feel about that, it's really not relevant to the success or failure of self-driving cars.

OP can't possibly have been ignorant of the fact that Waymo is the clear leader here with a substantial head start, and a proven record (and an actual fleet of self driving cars now on the road), and yet he chose not to mention it. That really undermines his credibility for me - he seems clearly more interested in making his point than in accurately engaging with reality.


Honestly, I think this is a good thing for both AI researchers and AI practitioners. One man's AI winter is another man's stable platform.

While the number of world-shattering discoveries using DL may be on the decline (ImageNet, Playing Atari, Artistic Style Transfer, CycleGAN, DeepFakes, Pix2Pix etc), now both AI researchers and practitioners can work in relative peace to fix the problem of the last 10%, which is where Deep Learning has usually sucked. 90% accuracy is great for demos and papers, but not even close to useful in real life (as the Uber fiasco is showing).

As an AI practitioner, it was difficult to simply keep up with the latest game-changing paper (I have friends who call 2017 the Year of the GAN!), only to later discover new shortcomings of each. Of course, you may say, why bother keeping up? And the answer is simply that when we are investing time to build something that will be in use 5-10 years from now, we want to ensure the foundation is built upon the latest research, and the way most papers talk about their results makes you believe they are best suited for all use cases, which is rarely the case. But when the foundation itself keeps moving so fast, there is no stability to build upon at all.

That and what jarym said is perfectly true as well.

The revolution is done; now it's time for the evolution of these core ideas into actual value generation, and I for one am glad about that.


AI winter? Hardly. Current methods have only been applied to a very tiny fraction of problems that they can help solve. And, this trend will only accelerate until computing resources become too expensive.

As long as there is ROI, AI projects will continue to be financed, top thinkers around the world will be paid to do more research, and engineers will implement the most recent techniques into their products and services to stay competitive. This is a classic feedback system that results in exponential progress.


This seems overly negative. Take just the opening argument: companies were saying that a fully self-driving car was very close, but "this narrative begins to crack."

Yet here they are self driving https://www.youtube.com/watch?v=QqRMTWqhwzM&feature=youtu.be and you should be able to hail one as a cab this year https://www.theregister.co.uk/2018/05/09/self_driving_taxis_...


You sure about that? https://www.youtube.com/watch?v=8IqpUK5teGM

Tesla self-driving cars have crashed too. Arrogant people like Elon Musk are giving a bad name to the hardworking AI developers who are actually trying to make self-driving cars fault-proof.


That vid of the Uber crash - just because one company has a crap product doesn't mean they are all bad. Waymo is about the only one that seems about ready to go and I think partly because they don't just count on deep learning.


Like I said, Uber and Tesla have both had reported crashes. If two companies of that status, with their funding abilities, are potentially dangerous, I don't doubt that any smaller company would be too.


The video is PR.

So far, no independent journalist or reviewer was allowed to test the system with the exception of a few tightly controlled rides.

Don't take it too seriously.


I thought there would be more of a backlash / winter onset when people realize that Alexa is so annoying to deal with (you basically have to learn a set of commands) because AI isn't that clever yet. Also when people realize that autocorrect took a dive at handling edits once Google put a neural net in charge. (No! Stop deleting random words and squishing spaces during edits.)

In other words, I figured it would be the annoyance at things that "should be easy by now" that would get Joe CEO thinking, "Hm. Maybe this isn't such a good investment," once measurements are made and reliable algorithmic results attract and keep more users than narrowly trained, finicky AIs.

I don't want there to be an AI winter, and it won't be as bad as before. There are a lot of applications for limited-scope image recognition and other tasks that we couldn't do before. Unfortunately, I do agree with the post that winter is on its way.


The OP is obviously not keeping up with the field and has a lot to learn about the scientific approach. He basically uses the count of tweets from Andrew Ng and crashes from risk-taking companies as indicators of an "AI winter". He should have tried looking into metrics such as the number of papers, the number of people getting into the field, the number of dollars in VC money, the number of commercial products using DL/RL, etc. But you see, that's a lot of work, and your conclusion might not align with whatever funky title you had in mind. Being an armchair opinion guy throwing out link-bait titles is much easier.


I'll happily read your next post where you include all of those. In fact, the amount of VC money spent in this field would only support my claim. And the number of papers is irrelevant: there were thousands of papers about Hopfield networks in the 90s, and where are they all now? You see, all the things you point out are the surface. What really matters is that self-driving cars crash and kill people, and no one has any idea how to fix that.


I think the most important question is what 'winter' really means in this context. New concepts in AI tend to follow the hype cycle, so the disillusionment will certainly come. One issue is that the general public sees the amazing things Tesla or Google do with deep learning and extrapolates, thinking we're on the brink of creating artificial general intelligence. The disappointment will be even bigger if DL fails to deliver on promises like self-driving cars.

Of course the situation now is different than 30 years ago because AI has proved to be effective in many areas so the research won't just stop. The way I understand this 'AI winter' is that deep learning might be the current local maximum of AI techniques and will soon reach the dead end where tweaking neural networks won't lead to any real progress.


AI winter is not "on its way". There is AI hype and anti-AI hype, and then there is actual practice. This article is anti-AI hype, just as bad as its opposite. In practice there are tons of useful applications. We haven't even begun to apply ML and DL to all the problems laying around us, some of which are quite accessible and impactful.

The hype cycle will pass with time, when we learn to align our expectations with reality.


> AI winter is not "on its way". There is AI hype and anti-AI hype, and then there is actual practice. This article is anti-AI hype, just as bad as its opposite. In practice there are tons of useful applications.

I don't think the article was saying that AI isn't useful, but just that deep learning specifically is not an AI panacea, and that the current hype around AI is on its way out. The hype dying down and the associated buzzwords starting to repel money instead of attract it is all that's meant by AI winter, I believe, not that we'll run out of places where the techniques would be useful.


Give us a list then, because what I’m seeing is shoehorning “AI” into everything but not with significant results.


It's not my job to supplement the press or to do paper reading for you. If you're really interested in AI and don't just read the clickbait press, then open Arxiv and look there.

http://www.arxiv-sanity.com/

If, on the other hand, your opinion that AI is in a winter has been already decided without reading the latest scientific papers, then there's nothing I can say to you that will change your mind.


I think you will see the "hype" cycle last for a very, very long time.

It is a very different thing when the computer can start doing things that only humans could do.

Look at all the drama after Google did the Duplex demo.

We have barely even got started with self driving cars.

But ultimately it is about money and the profits that can be made from AI/ML are just huge. So that will fuel the hype.

But AI also changes the calculus of companies competing. Before, a product was sold and deteriorated over time.

Now a product is sold and gets better over time, increasing moats and making it much more difficult to compete. So AI becomes a far more valuable thing than perhaps anything we have had in the past.


I think we will have an AI winter once we see the true limitations standing between us and a level 5, fully autonomous self-driving car. The other thing we will see happen is the deflation of the AdTech bubble. Once both of these events occur, that should start the AI winter.


I agree. The AdTech sphere is keeping the current hype alive more than anything else. There's some obvious imbalances in AdTech that should lead it to a damning end soon enough.


AI and machine learning are tools. Like any other tool, they're perfect for some problems and don't work well for others. Pick the right tool for the problem that you are working on. Don't follow the hype and don't use AI/ML just for the sake of using it.



I think a lot of GOFAI approaches ought to be revisited to see whether they benefit from the new perceptual and decision capabilities of Deep Learning systems. Alex Graves's papers are particularly good at this.

Things like this reinforcement learner for theorem proving are pretty exciting possibilities. https://arxiv.org/pdf/1805.07563v1.pdf


I can recommend his new book, "The Book of Why", very highly. Even though I am very familiar with Bayes nets, I discovered that a lot of progress has been made on that side of AI.


There's a lot of good stuff coming from research in AI these days. Still, I think the author's right.

As with the onset of the previous AI winter a generation ago, the problem is this: Once a problem gets solved (be it OCR or Bayesian recommendation engines or speech recognition or autocomplete or whatever) it stops being AI and starts being software.

As for self-driving cars: I recently took a highway trip in my Tesla Model S. I love adaptive cruise control and steering assistance: they reduce driver workload greatly. But, even in the lab-like environment of summertime limited access highways, driverless cars are not close. Autosteer once misread the lane markings and started to steer the car into the side of a class 8 truck. For me to sit in the back seat and let the car do all the work, that kind of thing must happen never.

Courtesy is an issue. I like to exit truck blind spots very soon after I enter them, for example. Autosteer isn't yet capable of shifting slightly to the left or right so a driver ahead can see the car in a mirror. Maybe when everything is autonomous that won't be an issue. But how do we get there?

Construction zones are problems too: lane markings are confusing and sometimes just plain wrong, and the margin for error is much less. Maybe the Mobileye rig in my car can detect orange barrels, but it certainly doesn't detect orange temporary speed limit signs.

This author is right. AI is hype-prone. The fruits of AI generally function as they were designed, though, once people stop overselling them.


While I basically agree, really it ought to be called "AI autumn is well on its way", since I'm not sure we're into actual winter (i.e. dramatic reduction in $$ available for research) quite yet. But, probably soon.


Author here, yeah, it is the autumn. But I guess not many people would recognize the meaning, winter on the other hand is not ambiguous...


True.


"it is striking that the system spent long seconds trying to decide what exactly is sees in front (whether that be a pedestrian, bike, vehicle or whatever else) rather than making the only logical decision in these circumstances, which was to make sure not to hit it."

That is striking. It always sort of bothered me that AI is really a big conglomeration of many different concepts. What people are working on is deep learning for machines, but we think that means "replicating human skill/behavior". It's not. Machines will be good at what they are good at, and humans good at what they're good at. It's an uphill battle if your expectation is for a machine that processes like a human, because the human brain does not process things like computer architectures do.

Now, if some aspiring scientist wanted to skip all that and really try to replicate (in a machine) how the human brain does things, I think such a person would be starting from a very different perspective than even modern AI computing.


That's why Augmented Intelligence is a better term. It doesn't conjure up visions of Skynet or HAL 9000 run amok. Nor does it promise utopian singularity right around the corner.

It just means better tools to increase human capacity. But it's not nearly as good at getting headlines in the media.


We call it deep learning, but it is deep pattern matching. Extremely useful, but don't expect it to result in thinking machines.


Are our brains magic? If they aren't then surely they must be doing something that we can reproduce. We've built so many things that we considered "thinking machines" in the recent past (realistic speech synthesis, image recognition and captioning, human-level translation, elaborate recommender systems, robust question answering) on "deep pattern recognition".


Brains are not magic, and will be reproduced eventually, but DNNs are a fundamentally weaker architecture and won't be enough. Neural nets can solve some problems that brains can solve easily and lots of other ML methods couldn't solve, which is great. But the space of problems that brains can solve and neural nets can't is still rich, and will remain so until better methods are developed.


I believe we can make thinking machines. And "deep pattern matching" will be part of it. But it will need to fit in another architecture, most likely a lot like (deep) reinforcement learning. Probably with strong focus on modeling and predicting the environment. But also with some kind of goal modeling.

See also for this not-magic: https://psyarxiv.com/387h9


The discussion on radiology is extremely sloppy.

Andrew Ng claimed human level performance on one radiology task (pneumonia). This claim seems to hold up pretty well as far as I can tell. Then the person criticizing him on twitter posts results on a completely different set of tasks which are just baseline results in order to launch a competition. These results are already close to human level performance, and after the competition it's very possible they will exceed human level performance.

Yes it's true that doing well at only Pneumonia doesn't mean that the nets are ready to replace radiologists. However, it does mean that we now have reason to think that all of the other tasks can be conquered in a reasonably short time frame such that someone going into the field should at least consider how AI is going to shape the field going forward.
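
For anyone unfamiliar with how these "beats the radiologist" comparisons are usually made: you compute the model's ROC curve and check whether the human reader's single sensitivity/specificity operating point falls below it. A hedged sketch with entirely made-up numbers (synthetic labels; this reflects the general method, not the actual study):

    import numpy as np
    from sklearn.metrics import roc_auc_score, roc_curve

    rng = np.random.default_rng(0)

    # Synthetic test set: 1 = pneumonia, 0 = not; model scores are just noisy labels.
    y_true = rng.integers(0, 2, size=1000)
    scores = y_true + rng.normal(0, 0.8, size=1000)

    auc = roc_auc_score(y_true, scores)
    fpr, tpr, _ = roc_curve(y_true, scores)

    # A (made-up) human reader operating point: 75% sensitivity at 90% specificity.
    human_sens, human_spec = 0.75, 0.90
    model_sens_at_human_spec = np.interp(1 - human_spec, fpr, tpr)

    print(f"model AUC: {auc:.3f}")
    print(f"model sensitivity at {human_spec:.0%} specificity: {model_sens_at_human_spec:.2f} "
          f"vs human {human_sens:.2f}")

The honest version of the comparison also needs confidence intervals and multiple readers, which is exactly where a lot of the sloppiness creeps in.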


Well, the breathless hype around deep learning (with and without reinforcement learning) is bound to subside sooner or later, and attendance at staid academic conferences like NIPS will sooner or later revert to a smaller group of academics and intellectuals who are truly interested in the subject over the long term.[a] That much is certain.

But we're still in the early stages of a gigantic wave of investment over the next decade or two, as organizations of all sizes find ways to use deep learning in a growing number of applications. Most small businesses, large corporations, nonprofits, and governments are not using deep learning for anything yet.

[a] https://twitter.com/lxbrun/status/908712249379966977


Well, now that the cat's out of the bag in regards to AI/ML, we can all get in on the ground floor of the next hype wave - quantum computing!


IMHO quantum computing is as overhyped as cold fusion and shares some of its properties. Until "quantum supremacy" arrives, or something else shows a real speedup, we won't hear much from it.


Cold Fusion was outright scientific misconduct. I'm not optimistic about QC working as intended, but I think the hope around it is honest.


Stopped reading after the first half. The evidence for the idea that deep learning is failing is that DeepMind hasn't produced anything revolutionary since AlphaGo Zero, which was published not even a year ago? And that performance doesn't scale linearly with the number of parameters? And speculation about why LeCun made a certain career decision? Not very convincing.


You're forgetting that Andrew Ng is tweeting 30% less this year! Isn't that enough to convince even the staunchest critic?


Only tangentially related to the article, but it's always struck me as a little unethical that Demis Hassabis' name goes on every paper written by DeepMind. No one produces that much research output.


No, but wait! We're just on the verge of replacing doctors! ;-)

There's still a lot of space for the improvement of "curve-fitting" AI in the workplace. The potential of existing tech is far from being thoroughly exploited right now. I believe the next big improvements will come more from better integration in the workplace (or road system) than new scientific advances, so that might seem less sexy. But I also believe this will be a sufficient impetus to drive the field forward for the years to come.


I would not call it the “AI winter”. If you look at what people have called AI over time, the definition and the approaches have evolved (sometimes drastically) over time.

Instead of being stuck on the fact that deep learning and the current methods seem to have hit a limit, I'm actually excited that this opens the door to experimenting with other approaches that may or may not build on top of what we call AI today.


Yeah, the problem is that deep learning sucked up a bunch of money - essentially took a loan against the future in the form of VC investments. And if that loan does not get paid back, for the next few years you may not be able to afford to explore all that other stuff.


Technically, VCs are not loaning you money; they're betting on you. It's true that maybe they'll be more reluctant to place big bets, but as with everything in life, VC optimism is cyclical.


Sure, it is not technically a loan. But it carries the same sentiment change when it blows up. People get extremely cautious, to the point of skipping some really good ideas. And not just the VCs that made the bets, but everyone else too. Fear spreads just as effectively as hype.


Perhaps it'd be more correct to call it a "Strong AI Winter". We're no closer to "aware" machines. We've simply gotten very good at automating tasks that were once difficult to automate.


A friend who's more optimistic about Strong AI once said that the ML going on today will probably serve the purpose of driving the peripheral sense organs of a future AI. Although it stretches a bit what's possible today, I could see that. I would call this a win if it ends up happening, although I still believe we're hundreds of years away from Strong AI.


I'm inclined to agree with your friend.

This ability of DL to convert streams of raw noisy data into labeled objects seems like exactly what's needed to solve an intelligent agent's perceptual grounding problem, where an agent that's new to the world must bootstrap its perception systems, converting raw sensory input into meaningful objects with physical dynamics. Only then can the agent reason about objects and better understand them by physical interaction and exploration. This is one of the areas where symbolic AI failed hardest, but DL does best.

With some engineering, it's easy to imagine how active learning could use DL to ground robot senses - much like an infant human explores the world for the first year of life, adding new labels and understanding their dynamics as it goes.

I suspect the potential for DL's many uses will continue to grow and surprise us for at least another decade. If we've learned anything from the past decade of DL, it's that probabilistic AI is surprisingly capable.


There never was a Strong AI Summer to begin with.


This reminds me of a recent Twitter thread [1] from Zachary Lipton (new machine learning faculty at CMU) arguing that radiologists have a more complex job than we, as machine learning enthusiasts, think.

[1] https://mobile.twitter.com/zacharylipton/status/999395902996...


I think all talk about computer intelligence and learning is bullshit. If I'm right, then AI is probably the most /dangerous/ field in computer science because it sounds just likely enough that it lures in great minds, just like a sitcom startup idea[0].

[0] http://paulgraham.com/startupideas.html


You could actually make a reasonable argument for the opposite of a winter, that we are heading into an unprecedented AI boom.

The article's main argument for a winter is that deep learning is becoming played out. But this misses the once-in-history event of computer hardware reaching approximate parity with, and overtaking, the computing power of the human brain. I remember writing about that for my university entrance exam 35 years ago and have been following things a bit since, and the time is roughly now. You can make a reasonable argument that the computational equivalent of the brain is about 100 TFLOPS, which was hard to access or simply unavailable in the past, but you can now rent a 180 TFLOP TPU from Google for $6.50/hr. While the current algorithms may be limited, there are probably going to be loads of bright people trying new stuff on the newly powerful hardware, perhaps including the author's PVM, and some of that will likely get interesting results.
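Taking the comment's own figures at face value (the 100 TFLOPS brain estimate and the $6.50/hr TPU price come from the comment, not from verified benchmarks), the back-of-the-envelope arithmetic looks like this:

    # Rough cost sketch using the numbers quoted above; both inputs are estimates.
    brain_tflops = 100.0          # claimed brain-equivalent compute
    tpu_tflops = 180.0            # quoted Cloud TPU throughput
    tpu_price_per_hour = 6.50     # quoted rental price, USD

    cost_per_brain_hour = tpu_price_per_hour * brain_tflops / tpu_tflops
    print(f"~${cost_per_brain_hour:.2f} per 'brain-equivalent' hour of raw compute")
    print(f"~${cost_per_brain_hour * 24 * 365:,.0f} per year, running continuously")

That works out to roughly $3.60 an hour, which is the point being made: raw compute on that scale is no longer the bottleneck, even if the algorithms are.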


Deep learning may not be the complete answer to AGI, but it's moving down the right path. Computers, though, are still years or decades away from approaching human brain power and efficiency, so my take is that the current AI hype is 10 years too early - a good time to get in.


> but it’s moving down the right path

Time will tell. I think DL is amazing, but it is not the right path towards solving problems such as autonomy. I think if you enter this field today, you should definitely take a look at methods other than DL. I actually spent a few years reading neuroscience. It was painful, and I certainly can't say I learned how the brain works, but I'm pretty certain it has nothing to do with DL.


A lot of people said the same thing in the '90s about "Neural Networks". Senior colleagues of mine have vivid memories of consultants coming in then, saying much the same thing: "in 10 years this will completely revolutionize your industry (manufacturing)."

When I first started working in 2004, "data mining" was the big thing and it was going to solve all our problems. Nowadays I'm hearing the same thing again about "Machine Learning".

It's pretty natural to be skeptical when people make big promises and it ends up being a lot of hot air.


Great essay, but I think the "Deep learning (does not) scale" section is missing an important point.

There are many ways to think about scale.

If you think about a learned skill then that skill actually scales extremely well to other machines and thus to other industries that might benefit from the same skill.

The primary problem with technology is that society doesn't implement it as fast as it gets developed, so you will have these natural bottlenecks where society can't actually absorb the benefits fast enough.

In other words, Deep Learning scales as long as society can absorb it and apply it.


Has anyone done something genuinely useful with ML/AI/whatever outside of advertising or stock trading? I am genuinely curious whether it has really been applied to real commercial applications.


Improvements in search, translation, image recognition and categorization, voice recognition, and text to speech off the top of my head. I'm sure there are a lot more.


Yeah, but those are all pretty terrible for the actual end consumer. They might be cool technologies, but at the end of the day, I am a user that hates dealing with them. IVRs are a terrible experience. Image recognition is iffy at best. Text to speech is terrible. In 10 years maybe they will have it hashed out... just like 10 years ago, or 20 years ago.


I'd have to disagree. While IVR sucks (most current implementations don't use ML by the way), image recognition and categorization is at or better than human levels in most cases now. Cutting edge TTS is now nearly indistinguishable from a human. Just check out some samples [1]. And while translation still sucks, ML based translation is still far better than previous approaches.

[1] https://www.theverge.com/2018/3/27/17167200/google-ai-speech...


What are you talking about? Google Assistant's text to speech is entirely ML driven and sounds great; WaveNet by DeepMind is almost indistinguishable from human speech.


Have you spent real money on something because it used Deep Learning?

I haven't.

Maybe indirectly - ads have become more targeted, but we can't be sure how much. It might just be a standard small optimization.


A lot of problems that involve risk or uncertainty, and that have measurable inputs. Things like market segmentation, customer retention and conversion, identifying potential bad debts and fraud, targeting products, identification of product features, etc.

Not as sexy as the headlines, and less likely to involve DNN, but still profitable.


Sure, the thing is overhyped, but the problem is that we cannot be sure about the next big thing. The advances are slow, but then a giant step forward happens all of a sudden.

Everyone's jaw dropped when they saw the first self-driving car video or when AlphaGo started to win. This was totally unthinkable 10 years ago.

Some guy may come up with a computer model that incorporates together intentionality, some short term/long term memory, and some reasoning, who knows?


AI is favorable for big companies to better scale their services. It seems that Facebook has also faced AI scaling drawbacks, and they are developing their own AI hardware for it: https://www.theverge.com/2018/4/18/17254236/facebook-designi...


I think AI has a lot to offer industry right now in areas where you don't need good worst-case performance (e.g., information retrieval, optimization, biology). The big problems in terms of application start appearing when you try to remove humans from the loop completely. That's not even close to possible yet, but that doesn't mean the economic utility of even current AI is close to being maximized.


I know this is about the state of Deep Learning, but I'd like to point out:

While autonomous driving systems aren't perfect, statistically they are much better at driving than humans. Tesla's autonomous system has had, what, 3 or 4 fatal incidents? Out of the thousands of cars on the road that's less than 0.001%.

There will always be a margin of error in systems engineered by man, just hopefully moving forward fewer and fewer fatal ones.


Depending upon your statistical sources, US traffic fatalities are around 1.25-1.50 per 100 million miles. [1] All forms of real-world autonomous driving, across all manufacturers worldwide, are still somewhere below 200 million miles, conservatively estimated. [2] [3] Between the Tesla and Uber fatalities, by these rough back-of-the-envelope numbers (sketched out after the references below), autonomous driving of various grades still has roughly 2X the human fatality rate. Maybe 1X if you squint at the numbers hard enough, but likely not orders of magnitude lower. I don't anticipate rapid legislative and insurance liability protections for autonomous systems until we see orders-of-magnitude differences on a per-100-million-miles-driven basis, and that will take time.

Waymo racks up about 10,000 miles per day across about 600 vehicles spread over about 25 cities. [4] Roughly 3.6 million miles per year if they stay level, but they're anticipated to rapidly add more vehicles to their fleet. In the US alone, about 3.22 trillion miles were driven in 2016. [5] I don't know what a statistically valid sample size would be based on that (I get nonsensical results below 2000 miles, so I'm doing something stupid), though. If Waymo puts two orders of magnitude more cars out there, they'll still "only" rack up about 365 million miles per year, and not all the miles on the same version of software.

[1] https://en.wikipedia.org/wiki/Transportation_safety_in_the_U...

[2] https://www.theverge.com/2016/5/24/11761098/tesla-autopilot-...

[3] https://www.theverge.com/2017/5/10/15609844/waymo-google-sel...

[4] https://medium.com/waymo/waymo-reaches-5-million-self-driven...

[5] https://www.npr.org/sections/thetwo-way/2017/02/21/516512439...
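A back-of-the-envelope version of the comparison above, using one plausible reading of the figures in this comment (roughly 4 fatalities across ~150 million autonomous and driver-assisted miles); the inputs are rough estimates, not audited data:

    # Rough sketch of the fatality-rate comparison; inputs are estimates from the comment.
    human_rate = 1.35               # US fatalities per 100M miles, midpoint of 1.25-1.50
    av_fatalities = 4               # Tesla + Uber incidents, roughly
    av_miles_millions = 150.0       # estimated autonomous/assisted miles driven

    av_rate = av_fatalities / (av_miles_millions / 100.0)   # per 100M miles
    print(f"Autonomous (all grades): ~{av_rate:.2f} fatalities per 100M miles")
    print(f"Human drivers:           ~{human_rate:.2f} fatalities per 100M miles")
    print(f"Ratio:                   ~{av_rate / human_rate:.1f}x")

Small changes to the assumed mileage or fatality count swing the ratio between roughly 1X and 3X, which is why the conclusion above is hedged.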


Woah, I was prepared to be all gung-ho for this post, given that I've suspected the winter was going to be here for quite a while now. But strangely, this post actually had the opposite effect on me. The winter will probably come one day, but is this all the evidence the poster can find? Andrew Ng tweeting less and a claim that DNNs don't scale, based on flimsy data, is not at all convincing to me.


Is this AI Winter 2.0? I was hopeful that logic programming would have developed more and spread to a larger audience at this point.


As a beginner in the deep learning space, I am a bit baffled by the claim that "you need a lot of computational power". Good models learn fast, so if a potential model looks promising on a local machine, one can do the training on gcloud for $100 on high-end machines. Where am I wrong in this line of thinking?


No, you are absolutely right. And modern transfer learning improves this even more in many domains.
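A minimal sketch of the kind of transfer learning being referred to, assuming a PyTorch/torchvision setup; the backbone, class count, and learning rate are illustrative choices, not recommendations:

    # Reuse an ImageNet-pretrained backbone and train only a small new head,
    # which keeps the compute (and the cloud bill) modest.
    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for param in model.parameters():
        param.requires_grad = False                          # freeze the pretrained backbone

    num_classes = 5                                          # set to your own task
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # new trainable head

    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()
    # Training loop elided: only model.fc receives gradient updates.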


Thank god. We're definitely not ready, and perhaps could never be ready, for true general-purpose AI.


This is something I have always wondered about AI and its promises. Sometimes the last 1% is the hardest, or can even be impossible. Self-driving cars, in particular, are a good case. We can solve 99% of the use cases, but achieving fully autonomous vehicles might be just out of reach.


But they're getting more and more data every year, right? All those almost-millions of Teslas running around could provide enough video input for the training data.

Besides "Good software takes 10 years", according to Joel Spolsky. As I see it, we're, what 5 year into ML.


5 years, ha! Wow, the rebranding has worked well.



> Nvidia car could not drive literally ten miles without a disengagement.

From the same source as the author cites, that's because their test runs are typically 5 miles and resuming manual control at the end of a test counts as a disengagement.


Deep learning was a noticeable improvement over previous neural models, sure. But deep learning is not the entire field of AI and ML. There has been more stuff going on, like Neural Turing Machines and differentiable neural computers.


We are beginning to see some sweet differentiable embeddings of discrete things like stacks and context-free grammars. This is where deep learning gets really fun, because it is learning to program.



For me, Google is attacking on 2 main fronts: 1. quantum computing, 2. machine learning/AI.

If they are able to combine the two - a big if, though - the cost analysis for AI will change quite dramatically.


Number of tweets as reliable data points? Very dubious. Simple explanation: They are busy working, so less time to tweet.

Maybe they're working on something so cool, that the AI winter may not even come. Sure, there's a lot of marketing-speak around AI at the moment.

But this wave of AI seems a lot stronger, with better fundamentals, than 20 years ago. At the very least, we have the hardware to actually RUN NNs cost-effectively now, as opposed to grinding your system to a halt back then.

Before AlphaGo, it wasn't even clear when a computer could beat a top professional at go, let alone crush humans at the game - lower-bound guesses were 50 years.



Low-hanging fruit is scarce now. With 3 orders of magnitude difference in power (megawatts versus a few watts), clearly this is not the right way to reach the treetop.


/shrug people need time to research;

Anyway, I also don't get what the issue is with the radiology model. It is already that good?! This is impressive. One model is close to well-trained experts.

Just today I had a small idea for a new product, based on what Google was showing with the capability to distinguish two people talking in parallel.

At the last Google I/O I was impressed because, in comparison to previous years, ML created better and more impressive products.

I listened for years to keynotes about big data and was never impressed. Now I hear about ML and I'm getting more and more impressed.


Google I/O is a developer conference with an emphasis on marketing Google's own products and tools. We have to take the news from it with a grain of salt.


If only there were some technology that might enable us to discern patterns so that we could better predict fluctuations in demand for AI software.


Truly, I agree.

I've long been interested in learning about AI and deep learning, but to this day haven't done much that truly excites me within the field. It feels more or less impossible to make anything significant without Google-scale databases and Google-scale computers. AI really does make it easier for the few to jump far ahead, leaving everyone else behind.

I also agree that a lot of the news around AI is just hype.

Honestly, I've yet to see anything practical come out of AI.

But hey, if something eventually does, I'm all for it.


Speaking of Google-scale databases, I've always felt the unhyped hero of deep learning was the datasets themselves. I'd be happy to see one less book about deep learning and one more book on how you can gather and build a quality dataset yourself. Granted, I think gathering and preparing the data is a broader and more difficult task in many ways.


> be Google


I'm kinda curious now where those giant datasets will come from, now that there's a big push for privacy, with things like GDPR preventing some random researcher from just buying data off whatever data-mining corp is most relevant to their AI's purpose.


Many traditional businesses (e.g. banks, insurance) collect customer data internally, and don’t share it with anyone. That’s not likely to change much. For many of these businesses, the customer is other businesses, and privacy rules aren’t even applicable.

You won’t hear about these projects outside industry specific publications, if at all.


I’m kinda curious now which researchers have been buying PII from data mining corps.


Cambridge Analytica comes to mind.


Bet your house on it if it's "well on its way".


Yeah, this is a dumb article. Number of tweets by Andrew Ng? Really? All those articles denying the reality of the revolution brought by AI have an emotional basis, but I don't understand what it is. Are they feeling threatened? Or is it an undergrad/early-20s thing, like a complete lack of understanding of the dynamics coupled with abnormally strong opinions?


This reinforces the need to benchmark any 'human expert equivalent' project against the wattage of the human brain.


How much of this can we pin on IBM's overhype of Watson?


Yawn. Contrarianism is easy and this article offers little. The real-world application you're speaking of has a comically small amount of data (a few million miles?). You hear about a handful of accidents that still average out to better-than-human performance, and suddenly the sky is falling.

When machine learning stops successfully solving new problems daily, then maybe a thread like this will be warranted.


Without being an expert, just from reading articles, it seems to me that some people wish for an AI winter. It makes them feel better somehow.


Oh, I thought "AI winter" would refer to a state of ruin after AI had come into existence and destroyed everything, analogous to nuclear winter.


AI Winter is a very well-known term in the industry referring to a general lack of funding of AI research, after the last time AI was overhyped.


I guess it could be used for anything that experiences a low level of interest, then.


A real Winter is a lack of warmth. An AI winter is a lack of ______


If we would stop calling this stuff "AI" it would make all our lives a lot easier, but people can't resist.

When computers first came on the scene, a lot of people had a very poor conception of what it was the human mind did, computationally. So when computers turned out to be good at challenging "intellectual" tasks for humans like chess and calculus, many were duped into thinking that computers were somehow on a similar level to human brains and "AI" was just around the corner. The reality was that one of the most important tasks the human brain performs (contextualization, categorization, and abstraction) was taken for granted. We've since discovered that task to be enormously computationally difficult, and one of the key roadblocks towards "true AI" development.

Now, of course, we're at it again. We have the computational muscle to make inference engines that work nothing like the human brain good at tasks that are difficult to program explicitly (such as image and speech recognition) and we've built other tools that leverage huge data sets to produce answers that seem very human or intelligent (using bayesian methods, for example). We look at this tool and too many say "Is this AI?" No, it might be related to AI, but it's just a tool. Meanwhile, because of all the AI hype people overpromise on neural networks / "deep learning" projects and people get lazy about programming. Why bother sitting down for 15 minutes to figure out the right SQL queries and post processing when you can just throw your raw data at a neural network and call it the future?

One of the consistently terrible aspects of software development as a field is that it continues to look for shortcuts and continues to shirk the basic responsibilities of building anything (e.g. being mindful of industry best practices, understanding the dangers and risks of various technologies and systems and being diligent in mitigating them, etc.) Instead the field consistently and perversely ignores all of the hard-won lessons of its history. Consistently ignores and shirks its responsibilities (in terms of ethics, public safety, etc.) And consistently looks for the short cut and the silver bullet that will allow them to shirk even the small vestiges of responsibility they labor under currently. There's a great phrase on AI that goes: "machine learning is money laundering for bias", which points to just one facet among so many of what's wrong with "AI" as it's practiced today. We see "AI" used to sell snake oil. We see "AI" used to avoid responsibility for the ethical implications inherent in many software projects. We see "AI" integrated into life critical systems (like self-driving cars) without putting in the effort to ensure it's robust or protect against its failures, with the result being loss of life.

AI is just the latest excuse by software developers to avoid responsibility and rigor while cashing checks in the meantime. At some point this is going to become obvious and there is going to be a backlash. Responsible developers should be out in front driving for accountability and responsibility now instead of waiting until a hostile public forces it to happen.


I've always understood the claim that deep learning scales to be a claim about deployment and use of trained models, not about training. The whole point is that you can invest (substantial) resources upfront to train a sufficiently good model, but then the results of that initial investment can be used with very small marginal costs.
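To put the train-once/deploy-cheaply point in code (a sketch only; the checkpoint path and input shape are placeholders, not part of any real project):

    # Training is the expensive, one-time part. Serving a trained model is a cheap forward pass.
    import torch
    from torchvision import models

    model = models.resnet18()                      # same architecture used during training
    # model.load_state_dict(torch.load("trained_model.pt"))  # hypothetical checkpoint from the training run
    model.eval()

    with torch.no_grad():                          # inference only: no gradients, low marginal cost
        batch = torch.randn(1, 3, 224, 224)        # one image-sized input
        prediction = model(batch).argmax(dim=1)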

OP's argument on this front seems disingenuous to me.

His focus on Uber and Tesla (while not even mentioning Waymo) is also truly strange. Uber's practices and culture have historically been so toxic that their failures here are truly irrelevant, and Tesla isn't even in the business of making actual self-driving cars.

I'm the first to argue that right now AI is overhyped, but this is just sensationalist garbage from the other end of the spectrum.


Hi, it appears that "sensationalist garbage" triggered quite a bit of a discussion. This is typically indicative that the topic is "sensitive". Perhaps because many people feel the winter coming as well. Maybe, maybe not, time will tell.

And FYI, Tesla is in the business of making self-driving cars. If you read the article, you might learn that Tesla is actually the first company to sell that option to customers. You can go to their website right now and check that out.

Uber, like it or not, is one of the big players in this game. I agree they may have a somewhat toxic culture, but I guarantee you there are plenty of really smart people there who know exactly the state of the art. And their failure is therefore indicative of that state of the art.

I also omitted Cruise Automation and a bunch of other companies, perhaps because they have more responsible backup drivers who have so far avoided fatal crashes. But I analyze the California DMV disengagement reports in another post if you care to look. And by no means are any of these cars safe for deployment yet.


> Hi, it appears that "sensationalist garbage" triggered quite a bit of a discussion.

Yes. Sensationalist.

> I also omitted Cruise automation and a bunch of other companies, perhaps because they have more responsible backup drivers that so far avoided fatal crashes.

So your explicit reason for omitting Waymo, as I understand it, is that it didn't support your argument?


> Yes. Sensationalist.

Yes, perhaps. But I'm entitled to my opinion just as you are entitled to yours. And time will tell who was right.

> So your explicit reason for omitting Waymo, as I understand it, is that it didn't support your argument?

You see, when you make any argument, you always omit the infinite number of things that don't support it and focus on the few things that do. The fact that something does not support my argument, does not mean it contradicts it.

You might also note that this is not a scientific paper, but an opinion. Yes, nothing more than an opinion. May I be wrong? Sure. And yet this opinion appears to be shared by quite a few people, and makes a bunch of other people feel insecure. Perhaps there is something to it? We will see.

But in the worst case it will make some people think a bit and make an argument either for or against it. I may learn today a good argument against it, that will make me think about it more and perhaps I will change my opinion, or I'll be able to defend it.

So far you have not provided such an argument, but I wholeheartedly encourage you to do so.


This is a list of your phrases in this comment that I find, in my opinion, condescending.

> And time will tell who was right.

> You see, when you make any argument

> You might also note that this is not a scientific paper, but an opinion. Yes, nothing more than an opinion.

> And yet this opinion appears to be shared by quite a few people, and makes a bunch of other people feel insecure. Perhaps there is something to it? We will see.

> So far you have not provided such an argument

I immediately identified this same tone in your paper. In your argumentation, you quite aggressively hinted that people who don't share your views are not very intelligent. You also have a tendency to present your statements as prophetic, which appeared multiple times both in the paper and in this comment.

These observations put me on guard about your arguments, which I found mostly weak and sometimes made in bad faith. I flagged as such the Twitter argument, which analyses the frequency of A. Ng's tweets and denounces his "outrageous claims", with an example where the AI score is overall only 0.025 less accurate than a practitioner.

I also thought that you used a different (your own) definition of scaling than most and used it to make an argument, which was therefore unconvincing (but the parent said that already).

Overall, to me, this was not a very pleasant read, and I dislike the fact that you attack the hype around machine learning by enjoying the polarization that comes with anti-hype articles such as yours. I also don't think that making people feel insecure is such a great indicator that what you're saying is relevant or prophetic.

I hope this helps your prophecies: https://www.physics.ohio-state.edu/~kagan/AS1138/Lectures/Go... ;)


> You see, when you make any argument, you always omit the infinite number of things that don't support it and focus on the few things that do.

No. When I make an argument, I try to omit the infinite number of things I think are unlikely to be important, and focus on the few things that I think are most important whether they support my position or not.

Everyone's fallible, and I do my share of focusing too much on points that support my position over more important counterpoints, but I see that as a failing, not as the reasonable thing to do.


One of the sillier articles on HN in a while. Waymo has cars, as I type this, driving around Arizona without safety drivers.

People were freaked out by the Google demo of Duplex a couple of weeks ago as it was just too human sounding.

I can give so many other examples. One is foundational: the voice used with Google Duplex is generated by a DNN at 16k samples per second in real time, and they are able to offer it at a competitive price.

That was done by creating the TPU 3.0 silicon. The old way of piecing speech together was NOT compute intensive, whereas doing it with a DNN requires proprietary hardware to be able to offer it at a price competitive with the old way.

But what else can be done when you can run 16k samples per second through a DNN in real time? Things have barely even gotten started, and they are flying right now. All you have to do is open your eyes.

DNN - Deep Neural Network.


It's the same story again, like the exaggeration of IoT's influence 5 years ago. The whole thing is hyped up to raise money from investors and attract customers instead of actually building a superior product.


It's 99% marketing, and places like HN and Reddit eat it up and try to hype it up even more. When you confront these characters about the basis on which they claim AI will solve whatever problem or evolve to whichever point, they only reply, "it'll only keep getting better (given time, data, resources, brains, etc.)"

It's a buzzword people brainlessly use to fetishize technological progress without understanding the inherent limitations of the technology or the actual practicality and real-life results outside of crafted demos or specific problem domains (for example, AlphaGo beating a grandmaster has almost no bearing on a problem like speech cognition).

It's turned me off a lot from reading about advances in the field, because I know, as with a lot of science press releases, that most of it is empty air that won't really have a bearing on the actual software I use (I've watched the past two Google I/Os, where pretty much every presentation mentions AI, but the Android experience still remains relatively stale).


Deep Recession ‘18


Winter is coming.


[flagged]


Overweight people drive fine. Being a military test pilot is a completely different thing.


[flagged]


The difference between blockchain advocates and the weavers from the tale "The Emperor's New Clothes" is that the weavers knew they were bullshitting the king. Sadly, blockchains will not revolutionize mankind.


As the saying goes, never attribute to malice what can be adequately explained by extreme stupidity - and by the first-worlders' googols of prosperity, sustained by silent injustices and extreme malice that is literally not recognized, and certainly not acted upon, by like all of them.


That's not entirely correct. People repeatedly like to take hard yes/no stances on blockchain because they seem to confuse the issue with cryptocurrencies. Cryptocurrencies are quite obviously overhyped and self-destructing. But blockchain is actually being implemented in many effective ways around the world - just in areas that you wouldn't generally hear about. Blockchain has its place, just not at the current moment.


I don't know of any problem where blockchains are the best solution - it's possible there are some, I just don't know of them.

I'm just talking about the very inflated social/political expectations, the "blockchain will change everything" mantra: I don't see this happening.


How can we know if a solution is effective if we have not studied it thoroughly enough? There might be use cases for blockchain that have yet to be probed. It's certainly being taken seriously in the finance industry, especially in terms of financial records-keeping and accounting for governance by regulatory bodies.


> There might be use cases for blockchain that have yet to be probed.

This is like saying "you can't prove god does not exist". It is true but pointless.

> It's certainly being taken seriously in the finance industry, especially in terms of financial records-keeping and accounting for governance by regulatory bodies.

Precisely the tasks where there are so many solutions that are both more efficient and time-proven! With so much hype around the idea, the industry has to take a peek, but sooner or later they will face reality: the king is naked.


What a relief.


The inconvenient but amazing truth about deep learning is that, unlike neural networks, the brain does not learn complex patterns. It can see new complex patterns and objects instantly without learning them. Besides, there are not enough neurons in the brain to learn every pattern we encounter in life. Not even close.

The brain does not model the world. It learns to see it.


This post is very uninformed.

"It can see new complex patterns and objects instantly without learning them."

Except, it doesn't. It is clearly false. When animals grow up in an environment without certain patterns, they will be unable to see these patterns (or complex combinations of these) at a later stage. We see complex patterns as combinations of patterns we have seen before and semantically encode them as such. This is very similar to how neural networks work at the last fully connected layers.

"Besides, there are not enough neurons in the brain to learn every pattern we encounter in life."

There is a lot of self-similarity in our environment. Compression algorithms (and NN auto-encoders) are able to leverage this self-similarity to encode information in a very small number of data-points / neurons.
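A toy illustration of that compression point, assuming a PyTorch-style autoencoder with arbitrary layer sizes:

    # A wide input is squeezed through a narrow bottleneck and reconstructed;
    # the self-similarity/structure in the data is what lets the small code suffice.
    import torch
    import torch.nn as nn

    autoencoder = nn.Sequential(
        nn.Linear(784, 32), nn.ReLU(),      # encoder: 784-dim input -> 32-dim code
        nn.Linear(32, 784), nn.Sigmoid(),   # decoder: reconstruct the input
    )

    x = torch.rand(16, 784)                                       # batch of flattened images
    reconstruction_loss = nn.functional.mse_loss(autoencoder(x), x)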

"The brain does not model the world. It learns to see it."

Except, it doesn't. Your brain continually makes abstractions of the world. When you 'see' the world you see a (lossy) compressed version of it, compressed towards utility. Similar to how MP3 compression works: the information gain of higher frequencies is low, so your brain can safely filter these out.


We learn to see patterns, but we see through physical and cultural action patterns that are simply present, not learned.

It’s like a river flowing... yes, the water molecules each “discover” their path, but the path of the river is a property of the landscape. It is not learned.


OK. See you around.


You’re close.

“learns to act in it”

We don’t even see the world; we see a hyperdimensional action space. Anything we can’t relate analogically back to some embodied action will be literally invisible to us.



