AI winter is well on its way (piekniewski.info)
993 points by wei_jok on May 30, 2018 | 492 comments



I was recently "playing" with some radiology data. With untrained eyes I had no chance of identifying diagnoses myself, something that probably takes years for a decent radiologist to master. Just by using DenseNet-BC-100-12 I ended up with 83% ROC AUC after a few hours of training. In 4 out of 12 categories this classifier beat the best-performing human radiologists. Now the very same model, with no change other than adjusting the number of categories, could be used in any image classification task, likely with state-of-the-art results. I was surprised when I applied it to another, completely unrelated dataset and got >92% accuracy right away.
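For the curious, this kind of experiment boils down to roughly the following sketch (hedged: torchvision's densenet121 stands in for DenseNet-BC-100-12, and the dataset/loader are placeholder names, not the radiology data):

  import torch
  import torch.nn as nn
  import torchvision
  from sklearn.metrics import roc_auc_score

  NUM_CLASSES = 12  # number of diagnostic categories (assumption)

  # Pretrained backbone, new classification head for the target categories.
  model = torchvision.models.densenet121(pretrained=True)
  model.classifier = nn.Linear(model.classifier.in_features, NUM_CLASSES)

  criterion = nn.BCEWithLogitsLoss()  # multi-label findings per image
  optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

  def mean_roc_auc(model, loader):
      """Macro-averaged ROC AUC over a validation loader."""
      model.eval()
      ys, ps = [], []
      with torch.no_grad():
          for x, y in loader:
              ps.append(torch.sigmoid(model(x)).cpu())
              ys.append(y.cpu())
      return roc_auc_score(torch.cat(ys).numpy(), torch.cat(ps).numpy(), average="macro")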

If you think this is a symptom of AI winter, then you are probably wasting time on outdated/dysfunctional models, or models that aren't suited for what you want to accomplish. Look e.g. at Google Duplex (better voice synthesis than the Vocaloid I use for making music): it pushed the state of the art to unbelievable levels in hard-to-address domains. I believe the whole software industry will spend the next 10 years gradually bringing these concepts into production.

If you think Deep (Reinforcement) Learning is going to solve AGI, you are out of luck. If, however, you think it's useless and won't bring us anywhere, you are guaranteed to be wrong. Frankly, if you are working with Deep Learning daily, you are probably not seeing the big picture (i.e. how horrible the methods used in real life are, and how easily you can get a very economical 5% benefit just by plugging Deep Learning in somewhere in the pipeline; this might seem like little, but managers would kill for 5% of extra profit).


AI winters are a result of a massive disparity between the expectations of the general public and the reality of where the technology currently sits. Just like an asset bubble, the value of the industry as a whole pops as people collectively realize that AI, while not being worthless, is worth significantly less than they thought.

Understand that in pop-sci circles over the past several years the general public has been exposed to stories warning about the singularity from well-respected people like Stephen Hawking and Elon Musk (http://time.com/3614349/artificial-intelligence-singularity-...). Autonomous vehicles are on the roads and Boston Dynamics is showing very real robot demonstrations. Deep learning is breaking records in what we thought was possible with machine learning. All of this progress has excited an irrational exuberance in the general public.

But people don't have a good concept of what these technologies can't do, mainly because researchers, business people, and journalists don't want to tell them--they want the money and attention. But eventually the general public wises up to the unfulfilled expectations and turns its attention elsewhere. Here we have the AI winter.


I'd clarify that there is a specific delusion that any data scientist straight out of some sort of online degree program can go toe to toe with the likes of Andrej Karpathy or David Silver with the power of "teh durp lurnins'." And the predictable disappointment arising from the craptastic shovelware they create is what's finally bringing about the long overdue correction.

Further, I have repeatedly heard people who should know better, with very fancy advanced degrees, chant variants of "Deep Learning gets better with more data" and/or "Deep Learning makes feature engineering obsolete" as if they are trying to convince everyone around them as well as themselves that these two fallacious assumptions are the revealed truth handed down to mere mortals by the 4 horsemen of the field.

That said, if you put your ~10,000 hours into this, and keep up with the field, it's pretty impressive what high-dimensional classification and regression can do. Judea Pearl concurs: https://www.theatlantic.com/technology/archive/2018/05/machi...

My personal (and admittedly biased) belief is that if you combine DL with GOFAI and/or simulation, you can indeed work magic. AlphaZero is strong evidence of that, no? And the author of the article in this thread is apparently attempting to do the same sort of thing for self-driving cars. I wouldn't call this part of the field irrational exuberance, I'd call it amazing.


> Deep Learning makes feature engineering obsolete

I think even if you avoid constructing features, you are basically doing a similar process where a single change in a hyper-parameter can have significant effects:

- internal structure of a model (what types of blocks are you using and how do you connect them, what are they capable of together, how do gradients propagate?)

- loss function (great results come only if you use a fitting loss function)

- category weights (i.e. improving under-represented classes)

- image/data augmentation (a self-driving car won't work at all without significant augmentation)

- properly set-up optimizer

The good thing here is that you can automate optimization of these to a large extent if you have a cluster of machines and a way to orchestrate meta-optimization of slightly changed models. With feature engineering you just have to do all the work upfront, thinking about what might be important, and often you simply miss important features :-(
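A minimal sketch of what such meta-optimization could look like, assuming plain random search over a few of the knobs above (build/train details are hidden behind a hypothetical train_and_score callback; a real setup would farm each trial out to the cluster):

  import random

  SEARCH_SPACE = {
      "loss":         ["cross_entropy", "focal"],
      "class_weight": [None, "balanced"],
      "augmentation": ["light", "heavy"],
      "lr":           [1e-2, 1e-3, 1e-4],
  }

  def sample_config():
      return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

  def random_search(n_trials, train_and_score):
      """train_and_score(cfg) trains a slightly changed model and returns e.g. validation ROC AUC."""
      best_cfg, best_score = None, float("-inf")
      for _ in range(n_trials):
          cfg = sample_config()
          score = train_and_score(cfg)
          if score > best_score:
              best_cfg, best_score = cfg, score
      return best_cfg, best_score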


Yep, and in doing so, you just traded "feature engineering" for graph design and data prep, no? And that's my response to these sorts. And their usual response to me is to grumble that I don't know what I'm doing. I've started tuning them out of my existence because they seem to have nothing to contribute to it.


It's a huge difference in terms of the time invested to create something that performs well. Hand crafted feature engineering is better for some tasks but for quite a few of them automated methods perform very well indeed (at least, better than I expected).


> if you have a cluster of machines and a way to orchestrate meta-optimization of slightly changed models

Curious if there is any good-quality open source project for this...


I'm not aware of existing code/projects that do this, but try looking into neural architecture search; it should be useful: https://github.com/markdtw/awesome-architecture-search


> But eventually the general public wises up to the unfulfilled expectations and turns its attention elsewhere. Here we have the AI winter.

And more importantly, business and government leaders wise up and turn off the money tap.


This is why it's wise for researchers and business leaders to temper expectations. Better a constant modest flow of money into the field than a boom-bust cycle with huge upfront investment followed by very bearish actions.


I think the problem is, that's absolutely against the specific interests of university departments, individual researchers, and newspapers - even if it's in the interest of the field as a whole.


Prisoner's Dilemma


That requires super rational agents in the game theory sense...


> AI winters are a result of a massive disparity between the expectations of the general public and the reality of where the technology currently sits.

I think they also happen when the best ideas in the field run into the brick wall of insufficiently developed computer technology. I remember writing code for a perceptron in the '90s on an 8-bit system with 64 KB of RAM - it's laughable.

But right now compute power and data storage seem plentiful, so rumors of the current wave's demise appear exaggerated.


I wonder, though, what will happen with the demise of Moore's law... can we simply go with increased parallelism? How much can that scale?


That part will be harder than we can imagine.

Most of the software world will have to move to something like Haskell or another functional language. As of now the bulk (almost all) of our people are trained to program in C-based languages.

It won't be easy. There will be renewed high demand for software jobs.


I don't think Haskell/FP is the solution either... Even if it allows some beautiful, straightforward parallelization in Spark for typical cases, more advanced cases become convoluted, require explicit caching, and decrease performance significantly unless some nasty hacks are involved (resembling the cut operator in Prolog). I guess the bleeding edge will always be difficult, and one should not restrict their choices to a single paradigm.


I wish GPUs were 1000x faster... Then I could do some crazy magic with Deep Learning instead of waiting weeks for training to be finished...


That's more a matter of budget than anything else. If your problem is valuable enough, spending the money in a short time-frame rather than waiting for weeks can be well worth the investment.


I cannot fit a cluster of GPUs into a phone where I could make magic happen real-time though :(


Hm. Offload the job to a remote cluster? Or is comms then the limiting factor?


It won't give us that snappy feeling; imagine learning things in milliseconds and immediately displaying them on your phone.


Jeez. That would be faster than protein-and-water-based systems, which up until now are still the faster learners.


somebody is working on photonics-based ML http://www.lighton.io/our-technology


> AI winters are a result of a massive disparity between the expectations of the general public and the reality of where the technology currently sits.

A symptom of capitalism and marketing trying to push shit they don't understand


I don't think the claim is that AI isn't useful. It's that it's oversold. In any case, I don't think you can tell much about how well your classifier is working for something like cancer diagnoses unless you know how many false negatives you have (and how that compares to how many false negatives a radiologist makes).


There are two sides to this:

- how good humans are at detecting cancer (hint: not very good), and whether having an automated system, even just as a "second opinion" next to an expert, might be useful

- there are metrics for capturing true/false positives/negatives one can focus on during learning optimization

From studies you might have noticed that expert radiologists have an F1 score of e.g. 0.45 and on average they score 0.39, which sounds really bad. Your system manages to push the average to 0.44, which might be worse than the best radiologist out there, but better than an average radiologist [1]. Is this really being oversold? (I am not addressing possible problems with overly optimistic datasets etc., which are real concerns.)

[1] https://stanfordmlgroup.github.io/projects/chexnet/
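To make the comparison concrete: those numbers are just per-reader F1 scores computed against whatever is taken as ground truth, e.g. (toy arrays, not the CheXNet data):

  import numpy as np
  from sklearn.metrics import f1_score

  y_true  = np.array([1, 0, 1, 1, 0, 1, 0, 0])    # ground-truth labels for one finding
  y_model = np.array([1, 0, 0, 1, 0, 1, 1, 0])    # model predictions
  y_rads  = [np.array([1, 0, 1, 0, 0, 0, 1, 0]),  # one row per radiologist
             np.array([0, 0, 1, 1, 0, 1, 0, 1])]

  print("model F1:", f1_score(y_true, y_model))
  print("mean radiologist F1:", np.mean([f1_score(y_true, y) for y in y_rads]))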


Alright. What is the cost of a false positive in that case?

The problem AI runs into is that with too much faith in the machine, people STOP thinking and believe the machine. Where you might get a .44 detection rate on radiology data alone, that radiologist with a .39, or a doctor, can consult alternate streams of information. The AI may still be helpful in reinforcing a decision to continue scrutinizing a set of problems.

AIs as we call them today are better referred to as expert systems. "AI" carries too much baggage to be thrown around willy-nilly. An expert system may beat out a human at interpreting large, unintuitive datasets, but they aren't generally testable, and like it or not, it will remain a tough sell in any situation where lives are on the line.

I'm not saying it isn't worth researching, but AI will continue to fight an uphill battle in terms of public acceptance outside of research or analytics spaces, and overselling or being anything but straightforward about what is going on under the hood will NOT help.


> The problem AI runs into is that with too much faith in the machine, people STOP thinking and believe the machine.

See https://youtu.be/R_rF4kcqLkI?t=2m51s

In medicine, I want everyone to apply appropriate skepticism to important results, and I don't want to enable lazy radiologists to zone out and press 'Y' all day. I want all the doctors to be maximally mentally engaged. Skepticism of an incorrect radiologist report recently saved my dad from some dangerous, and in his case unnecessary, treatment.


Or for a more mundane example: I tried to identify a particular plant by doing an image-based ID with Google. It was identified as a broomrape because the pictures only had non-green portions of the plant in question. It was ACTUALLY a member of the thistle family.


The problem could be fixed by asking doctors to put their diagnosis into the machine before the machine reveals what it thinks. Then, a simple Bayesian calculation could be performed based on the historical performance of that algorithm, all doctors, and that specific doctor, leading to a final number that would be far more accurate. All of the thinking would happen before the device polluted the doctor's cognitive biases.
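A toy version of that calculation, naively assuming the doctor's and the algorithm's calls are conditionally independent given the true state and that their historical sensitivity/specificity are known (all numbers made up):

  def posterior(prior, opinions):
      """opinions: list of (said_positive, sensitivity, specificity) tuples."""
      p_pos, p_neg = prior, 1.0 - prior
      for said_positive, sens, spec in opinions:
          if said_positive:
              p_pos *= sens          # P(says yes | disease)
              p_neg *= 1.0 - spec    # P(says yes | no disease)
          else:
              p_pos *= 1.0 - sens
              p_neg *= spec
      return p_pos / (p_pos + p_neg)

  # doctor says yes (sens 0.39, spec 0.95); model says yes (sens 0.44, spec 0.90); 5% prior
  print(posterior(0.05, [(True, 0.39, 0.95), (True, 0.44, 0.90)]))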


There is a problem with that approach: at some point hospital management starts rating doctors by how well their diagnoses match the automated ones, and punishing those who deviate too much, removing any incentive to be better/different. I wouldn't underestimate this; dysfunctional management exhibits these traits in almost any mature business.


No, it's a "second opinion", and the human doctors are graded on how well their own take differs from the computer's advice when the computer's advice differs from the ground truth.

And there's probably not even a boolean "ground truth" in complicated bio-medicine problems. Sometimes the right call is neither yes nor no, but: this is not like anything I've seen before, I can't give a decision either way, I need further tests.


Is there a prevailing approach to thinking about (accounting for?) false negatives in ground truth data? I'm new to this area, and the question is relevant for my current work. By definition, you simply don't know anything about false negatives unless you have some estimate of specificity in addition to your labeled data, but can anything be done?
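One partial answer, if you can get an outside estimate of the labelling process's sensitivity and specificity (e.g. from a re-read of a subsample): the Rogan-Gladen correction backs the true prevalence out of the apparent one. Numbers below are placeholders.

  def rogan_gladen(apparent_prevalence, sensitivity, specificity):
      # apparent = sens * true + (1 - spec) * (1 - true), solved for true
      return (apparent_prevalence + specificity - 1.0) / (sensitivity + specificity - 1.0)

  print(rogan_gladen(apparent_prevalence=0.12, sensitivity=0.80, specificity=0.95))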


I don't get the sentiment of the article either. I can't speak for researchers but software engineers are living through very exciting times.

  State of the art in numbers:
  Image Classification - ~$55, 9hrs (ImageNet)
  Object Detection - ~$40, 6hrs (COCO)
  Machine Translation - ~$40, 6hrs (WMT '14 EN-DE)
  Question Answering - ~$5, 0.8hrs (SQuAD)
  Speech recognition - ~$90, 13hrs (LibriSpeech)
  Language Modeling - ~$490, 74hrs (LM1B)
"If you think Deep (Reinforcement) Learning is going to solve AGI, you are out of luck" --

I don't know. Duplex equipped with a way to minimize its own uncertainties sounds quite scary.


Duplex was impressive but cheap street magic: https://medium.com/@Michael_Spencer/google-duplex-demo-witch...

Microsoft OTOH quietly shipped the equivalent in China last month: https://www.theverge.com/2018/5/22/17379508/microsoft-xiaoic...

Google has lost a lot of steam lately IMO. Facebook is releasing better tools and Microsoft, the company they nearly vanquished a decade ago, is releasing better products. Google does remain the master of its own hype though.


> Microsoft, the company they nearly vanquished a decade ago, is releasing better products.

Google nearly vanquished Microsoft a decade ago? Where can I read more about this bit of history :) ?

IMO, Axios [0] seem to do a better job of criticizing Google's Duplex AI claims, as they repeatedly reached out to their contacts at Google for answers.

0: https://www.axios.com/google-ai-demo-questions-9a57afad-9854...


I think they are overselling Google's contributions a bit. It was more "Web 2.0" that shook Microsoft's dominance in tech. Google was a big curator and pushed the state of the art. Google was built on a large network of commodity hardware; they were able to do that because of open source software. Microsoft licensing would have been prohibitive to such innovation. There was some reinforcement that helped Linux gain momentum in other domains like mobile and desktop. Google helped curate "Web 2.0" with developments/acquisitions like Maps and Gmail. When more of your life was spent on the web, the operating system meant less, and that's also why Apple was able to make strides with their platforms. People weren't giving up as much when they switched to Mac as they would have previously.

Microsoft was previously the gatekeeper to almost every interaction with software (roughly 1992 - 2002). I don't know of good books on it but Tim O'Reilly wrote quite a bit about Web 2.0.


My question was actually tongue-in-cheek, which I tried to communicate with the smiley face.

I'm quite familiar with Google's history and would not characterize them as having vanquished Microsoft.

For the most part, Microsoft doesn't need to lose for Google to win (except of course in the realm of web search and office productivity).


You're right, it was Steve Ballmer who nearly vanquished Microsoft at a time when Google was the company to work for in tech and kept doing amazing things. At least IMO.

Unfortunately, by the time of my brief stint at Google, the place was a professional dead-end where most of the hirees got smoke blown up their patooties at orientation about how amazing they were to be accepted into Google, only to be blindly allocated into me-too MVPs of stuff they'd read about on TechCrunch. All IMO of course.

That said, I met the early Google Brain team there and I apparently made a sufficiently negative first impression for one of their leaders to hold a grudge against me 6 years later, explaining at last who it was that had blacklisted me there. So at least that mystery is solved.

PS It was pretty obvious these were voice actors in a studio conversing with the AI. That is impressive, but speaking as a former DJ myself, when one has any degree of voice training, one pronounces words without much accent and without slurring them together. Google will likely never admit anything here: they don't have to.

But I will give Alphabet a point for Waymo being the most professionally-responsible self-driving car effort so far. Compare and contrast with Tesla and Uber.


My thoughts on AGI (at least in the sense of being indistinguishable from interaction with a human) are the same as my thoughts on extraterrestrial life: I'll believe it only when I see it (or at least when provided with proof that the mechanism is understood). This extrapolation on a sample size of one is something I don't understand. How is the fact that machine learning can do specific stuff better than humans different in principle than the fact that a hand calculator can do some specific stuff better than humans? On what evidence can we extrapolate from this to AGI?

We haven't found life outside this planet, and we haven't created life in a lab, therefore n=1 for assessing probability of life outside earth (which means we can't calculate a probability for this yet). Likewise, we haven't created anything remotely like animal intelligence (let alone human) and we have no good theory regarding how it works, so n=1 for existing forms of general intelligence.

Note that I'm not saying there can be no extraterrestrial life or that we will never develop AGI, just that I haven't seen any evidence at this point in time that any opinions for or against their possibility are anything more than baseless speculation.


This is what we know from Google about Duplex:

"To train the system in a new domain, we use real-time supervised training. This is comparable to the training practices of many disciplines, where an instructor supervises a student as they are doing their job, providing guidance as needed, and making sure that the task is performed at the instructor’s level of quality. In the Duplex system, experienced operators act as the instructors. By monitoring the system as it makes phone calls in a new domain, they can affect the behavior of the system in real time as needed. This continues until the system performs at the desired quality level, at which point the supervision stops and the system can make calls autonomously." --


If the dollar amounts refer to the training cost for the cheapest DL model, do you have references for them? A group of people at fast.ai trained an ImageNet model for $26, presumably after spending a couple hundred on getting everything just right: http://www.fast.ai/2018/04/30/dawnbench-fastai/


That's what you get with Google TPUs on reference models. The ImageNet numbers are from RiseML; the rest is from here - https://youtu.be/zEOtG-ChmZE?t=1079


"Just by using DenseNet-BC-100-12 I ended up with 83% ROC AUC after a few hours of training."

OK, but 83% ROC/AUC is nothing to be bragging about. ROC/AUC routinely overstates the performance of a classifier anyway, and even so, ~80% values aren't that great in any domain. I wouldn't trust my life to that level of performance, unless I had no other choice.

You're basically making the author's case: deep learning clearly outperforms on certain classes of problems, and easily "generalizes" to modest performance on lots of others. But leaping from that to "radiology robots are almost here!" is folly.


Yeah, but the point here was that radiologists on average fared even worse. 83% is not impressive, but it's better than what we have right now in the real world with real people, as sad as that is. Obviously, the best radiologists would outperform it right now, but average ones, likely stressed under a heavy workload, might not be able to beat it. And of course, this classifier probably works better than humans on certain visual structures, while other ones that humans detect more easily would slip through.

There is also a higher chance that the next state-of-the-art model will push it significantly over 83%, or over the best human radiologist at some point in the future, so it might not be very economical to train humans to become even better (i.e. to dedicate your life to focusing on radiology diagnostics only).


I think you're missing a very important part here that maybe you haven't considered: domain knowledge. I'm assuming your radiology images were hand-labeled by other radiologists. How did they come to that diagnosis? By only looking at the image? This was a severe limitation of the Andrew Ng paper on CheXNet for detecting pneumonia from chest X-rays. CheXNet was able to outperform radiologists at detecting pneumonia from chest X-rays, but the diagnosis of pneumonia is considered a clinical diagnosis that requires more information about the patient. My point is that while your results are impressive and indicative of where deep learning could help in medicine, these same results might be skewed since you're testing the model on hand-labeled data. What happens if you apply this in the real world at a hospital, where the radiologist gets the whole patient chart and your model only gets the X-ray?


There is a paper discussing higher-order interdependencies between diagnoses [1] on X-ray images (they seem to apply an LSTM to derive those dependencies). This could probably be extended to include data outside X-ray images. My take is that it's pretty impressive what we can derive from a single image; now if we have multiple low-level Deep Learning-based diagnostic subsystems and combine them via some glue (either Deep (Reinforcement) Learning, classical ML, a logic-based expert system, PGMs, etc.), we might be able to represent/identify diagnoses with much more certainty than any single individual M.D. possibly could (while also creating some blind spots that humans wouldn't leave unaddressed). It could be difficult to estimate the statistical properties of the whole system, though, but that's a problem with any complex system, including a group of expert humans.

The main critique for CheXNet I've read was focused on the NIH dataset itself, not the model. The model generalizes quite well across multiple visual domains, given proper augmentation.

[1] https://arxiv.org/abs/1710.10501
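As an illustration of the "glue" idea (not how the cited paper does it), the simplest version is just stacking: treat each subsystem's output probability as a feature and train a small combiner on top. The subsystem outputs below are random stand-ins:

  import numpy as np
  from sklearn.linear_model import LogisticRegression

  rng = np.random.default_rng(0)
  n_patients, n_subsystems = 1000, 3                        # e.g. X-ray, labs, history models
  subsystem_probs = rng.random((n_patients, n_subsystems))  # per-subsystem probability of a finding
  labels = (subsystem_probs.mean(axis=1) + 0.1 * rng.standard_normal(n_patients) > 0.5).astype(int)

  combiner = LogisticRegression().fit(subsystem_probs, labels)
  print(combiner.predict_proba(subsystem_probs[:5])[:, 1])  # combined probabilities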


"Yeah, but the point here was that radiologists on average fared even worse."

Except they don't. See the table in the original post. Also, comparing the "average" radiologist by F1 scores from a single experiment (as you've done in other comments here) is meaningless.

Unless my doctor is exactly average (and isn't incorporating additional information, or smart enough to be optimizing for false positive/negative rates relative to cost), comparison to average statistics is academic. But I don't really need to tell you this -- your comment has so many caveats that you're clearly already aware of the limits of the method.


This thread is a microcosm of this whole issue of overhyping.

On one hand, we have one commenter saying he can train a model to do a specific thing with a specific quantitative metric, to demonstrate how deep learning can be incredibly powerful/useful.

On the other hand, we have another commenter saying "But this won't replace my doctor!" and therefore deep learning is overhyped.

The two sides aren't even talking about the same thing.


Agree that the thread is a microcosm of the debate, but ironically, I'm not trying to say anything like "this won't replace my doctor".

That kind of hyperventilating stuff is easy to brush off. The problem with deep-learning hype is that comments like "my classifier gets a ROC/AUC score of 0.8 with barely any work!" are presented as meaningful. The difference between a 0.8 AUC and a usable medical technology means that most of the work is ahead of you.


Agreed. I think it comes down to the presentation/interpretation of results. The response to "My classifier gets score of X" can be either "wow, that's a good score for a classifier, this method has merit" or "but X is not a good measure of [actual objective]".

So I think it's come down to conflict between

1. What the author is trying to present
2. What an astute reader might interpret it as
3. What an astute reader might worry an uninformed reader might interpret it as

And my feeling is that, given all the talk about hype in pop-sci, we're actually on point 3 now, even when the author and reader are actually talking about something reasonable. Whereas personally I'm more interested in the research and interpretations from experts, which I find tend to be not so problematic.


> Unless my doctor is exactly average

Just to get back to this point: what if the vision system of your doctor is below average and you augment her by giving her a statistically better vision system, while allowing her to use the additional sources as she sees fit. Wouldn't that be an improvement? We are talking about the vision subsystem here, not the whole "reasoning package" human doctors possess.


Again, check that table. It says a lot:

https://stanfordmlgroup.github.io/competitions/mura/

On just about every test set, the model is beaten by radiologists. Even the mean performance is underwhelming.


I was referring mainly to this one (from the same group and it actually surpassed humans on average):

https://stanfordmlgroup.github.io/projects/chexnet/

In their paper they even used the "weaker" DenseNet-121 instead of DenseNet-169 for MURA/bones. The DenseNet-BC I tried is another refinement of the same approach.


Those are some sketchy statistics. The evaluation procedure is questionable (F1 against the other 4 as ground truth? Mean of means?), and the 95% CI overlap pretty substantially. Even if their bootstrap sampling said the difference is significant, I don't believe them.

Basically, I see this as "everyone sucks, but the AI maybe sucks a little less than the worst of our radiologists, on average"


What would be good metrics then? Of course metrics are just indicators that can be interpreted incorrectly. Still, we have to measure something tangible. What would you propose? I am aware of the limitations and would gladly use something better...

Some people mention the Matthews correlation coefficient, Youden's J statistic, Cohen's kappa, etc., but I haven't seen them in any Deep Learning paper so far and I bet they have large blind spots as well.
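For what it's worth, all three are a one-liner to compute once you have predictions; a quick sketch with toy arrays:

  import numpy as np
  from sklearn.metrics import matthews_corrcoef, cohen_kappa_score, confusion_matrix

  y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
  y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 0])

  mcc   = matthews_corrcoef(y_true, y_pred)
  kappa = cohen_kappa_score(y_true, y_pred)
  tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
  youden_j = tp / (tp + fn) + tn / (tn + fp) - 1  # sensitivity + specificity - 1

  print(mcc, kappa, youden_j)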


> Just by using DenseNet-BC-100-12 I ended up with 83% ROC AUC after a few hours of training

Of course! Using DenseNet-BC-100-12 to increase ROC AUC, it was so obvious!


Would you mind sharing which other, unrelated dataset you have used the model on?



I can't, unfortunately; it's proprietary stuff being plugged into an existing business right now.


The next winter will probably be about getting over that 92% across all domains.


Possibly, but will it be called an AI winter if, e.g., the average human has 88% accuracy and the best human 97%?


Yeah, this sounds extremely unlikely unless the other dataset has a fairly easy decision boundary. The kind of cross-domain transfer learning you seem to think deep neural networks have is nothing I've observed in my formal studies of neural networks.


How much of this can we pin on IBM's overhype of Watson?


ROC AUC is fairly useless when you have disparate costs in the errors. Try precision-recall.
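A quick synthetic illustration of why: with a ~0.1% positive rate, a weakly informative scorer still posts a respectable-looking ROC AUC while average precision (area under the precision-recall curve) stays far lower.

  import numpy as np
  from sklearn.metrics import roc_auc_score, average_precision_score

  rng = np.random.default_rng(0)
  y_true = (rng.random(100_000) < 0.001).astype(int)    # ~0.1% positives
  scores = rng.standard_normal(100_000) + 1.0 * y_true  # positives shifted by one sigma

  print("ROC AUC:          ", roc_auc_score(y_true, scores))            # around 0.76
  print("Average precision:", average_precision_score(y_true, scores))  # far lower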


I mentioned F1 in some later comment.


This is a deep, significant post (pardon pun etc).

The author is clearly informed and takes a strong, historical view of the situation. Looking at what the really smart people who brought us this innovation have said and done lately is a good start imo (just one datum of course, but there are others in this interesting survey).

DeepMind hasn't shown anything breathtaking since AlphaGo Zero.

Another thing to consider about AlphaGo and AlphaGo Zero is the vast, vast amount of computing firepower that this application mobilized. While it was often repeated that ordinary Go programs weren't making progress, this wasn't true - the best amateur programs had gotten to about 2 dan amateur using Monte Carlo Tree Search. AlphaGo added CNNs for its weighting function and enormous amounts of compute for its search, and got effectiveness up to best in the world, 9 dan professional (maybe 11 dan amateur for pure comparison). [1]

AlphaGo Zero was supposedly even more powerful and learned without human intervention. BUT it cost enormous amounts of compute, expensive enough that they released a total of only ten or twenty AlphaGo Zero games to the world, labeled "A great gift".

The author conveniently reproduces the chart of computing power versus results. Look at it, consider it. Consider the chart in the context of Moore's Law retreating. The problems of AlphaZero generalize, as described in the article.

The author could also have dived into the troubling question of "AI as an ordinary computer application" (what do testing, debugging, interface design, etc. mean when the app is automatically generated in an ad-hoc fashion) or "explainability". But when you can paint a troubling picture without these gnawing problems even appearing, you've done well.

[1] https://en.wikipedia.org/wiki/Go_ranks_and_ratings


>DeepMind hasn't shown anything breathtaking since AlphaGo Zero

They went on to make AlphaZero, a generalised version that could learn chess, shogi or any similar game. The chess version beat a leading conventional chess program 28 wins, 0 losses, and 72 draws.

That seemed impressive to me.

Also, they used loads of compute during training but not so much during play (5,000 TPUs vs 4 TPUs).

Also it got better than humans in those games from scratch in about 4 hours whereas humans have had 2000 years to study them so you can forgive it some resource usage.


It's not like humanity really needs another chess playing program 20 years after IBM solved that problem (but now utilizing 1000x more compute power). I just find all these game-playing contraptions really uninteresting. There are plenty of real-world problems to be solved of much higher practicality. Moravec's paradox in full glow.


The fact that it beat Stockfish9 is not what is impressive with AlphaZero.

What was impressive was the way Stockfish9 was beaten. AlphaZero played like a human player, making sacrifices for position that Stockfish thought were detrimental. When it played as white, the fact that it mostly started with the queen pawn (despite the king pawn being "best by test") and the way AlphaZero used Stockfish's pawn structure and tempo to basically remove a bishop from the game was magical.

Yes, since it's a game, it's "useless", but it allowed me (and I'm not the only one) to get a bit better at chess. It's not world hunger, not climate change, it's just a bit of distraction for some people.

PS: I was among the people thinking that genetic algorithms + deep learning were not enough to emulate human logical capacities; the AlphaZero vs Stockfish games made me admit I was wrong (even if I still think it only works inside well-defined environments).


Two observations:

Just because Fischer preferred 1. e4, it doesn't make it better than other openings. https://en.chessbase.com/post/1-e4-best-by-test-part-1

Playing like a human for me also means making human mistakes. A chess-playing computer playing like a 4000 rated "human" is useless, one that can be configured to play at different ELOs is more interesting, although most can do that and there's no ML needed, nor huge amounts of computing power.


> What was impressive was the way Stockfish9 was beaten.

Without its opening database and without its endgame tablebase?

Frankly, the Stockfish vs AlphaZero match was the beginning of the AI Winter in my mind. The fact that they disabled Stockfish's primary databases was incredibly fishy IMO and is a major detriment to their paper.

Stockfish's engine is designed to only work in the midgame of Chess. Remove the opening database and remove the endgame database, and you're not really playing against Stockfish anymore.

The fact that Stockfish's opening was severely gimped is not a surprise to anybody in the Chess community. Stockfish didn't have its opening database enabled... for some reason.


I think for most people, the research interest in games of various sorts, is not simply a desire for a better and better game contraption, a better mousetrap. But rather the thinking is, "playing games takes intelligence, what can we learn about intelligence by building machines that play games?"

Most games are also closed systems, and conveniently grokkable systems, with enumerable search spaces. Which gives us easily produceable measures of the contraptions' abilities.

Whether this is the most effective path to understanding deeper questions about intelligence is an open question.

But I don't think it's fair to say that deeper questions and problems are being foregone simply to play games.

I think most 'games researchers' are pursuing these paths because neither they themselves nor anyone else has put forth any other suggestion that makes them think, "hmm, that's a really good idea, that seems like it might be viable and there is probably something interesting we could learn from it."

Do you have any suggestions?


This is so true, I can't understand why people miss this. The games are just games. It's intelligence that is the goal.

And comparing AlphaGo Zero against those "other chess programs that existed for 30 years" is exactly missing the point also. Those programs were not constructed with zero knowledge. They were carefully crafted by human players to achieve the result. Are we also going to count all the brain processing power and the time spent by those researchers learning to play chess? AlphaGo Zero did not need any of that, beyond knowledge of the basic rules of the game. Who compares compute requirements for two programs that have fundamentally different goals and achievements? One is carefully crafted by human intervention. The other learns a new game without prior knowledge...


It shows something about the game, but it's clear that humans don't learn in the way that AlphaZero does, so I don't think that AlphaZero illuminated any aspect of human intelligence.


I think that fundamentally the goal of research is not necessarily human-like intelligence, just any high-level general intelligence. It's just that the human brain (and the rest of the body) has been a great example of an intelligent entity from which we could source a lot of inspiration. Whether the final result will share technical and structural similarity with a human (and how much), the future will tell.


In principle you are right. In practice we will see. My bet is that attempts that focused on the human model will bear more fruit in the medium term because we have huge capability for observation at scale now which is v. exciting. Obviously ethics permitting!


Not sure if I am reading you correctly but to me you basically are saying "we have no idea but we believe that one day it will make sense".

Sounds more like religion and less like science to me.

I guess we could argue until the end of the world that no intelligence will emerge from more and more clever ways of brute-forcing your way out of problems in a finite space with perfect information. But that's what I think.


But humans could learn in the same way that AlphaZero does. We have the same resources and the same capabilities, just running on million-year-old hardware. Humans might not be able to replicate the performance of AlphaZero, but that does not mean it is useless in the study of intelligence.


The problem is that outside perfect information games, most areas where intelligence is required have few obvious routes to allow the computer to learn by perfectly simulating strategies and potential outcomes. Cases where "intelligence" is required typically entail handling human approximations of a lot of unknown and barely known possibilities with an inadequate dataset, and advances in approaches to perfect information games which can be entirely simulated by a machine knowing the ruleset (and possibly actually perturbed by adding inputs of human approaches to the problem) might be at best orthogonal to that particular goal. One of the takeaways from AlphaGo Zero massively outperforming AlphaGo is that even very carefully designed training sets for a problem fairly well understood by humans might actually retard system performance...


I totally agree with you and share your confusion.

On the topic of the different algorithmic approaches, I find it so fascinating how different these two approaches actually end up looking when analyzed by a professional commentator. When you watch the new style with a chess commentator, it feels a lot like listening to the analysis of a human game. The algorithm has very clearly captured strategic concepts in its neural network. Meanwhile, with older chess engines there is a tendency to get to positions where the computer clearly doesn't know what it's doing. The game reaches a strategic point and the things it's supposed to do are beyond the horizon of moves it can compute by brute force. So it plays stupid. These are the positions where, even now, human players can still beat otherwise better-than-human old-style chess engines.


The thing is that you can learn new moves/strategies that were never thought of before in these games and still not understand anything about intelligence at all.


A favourite work by Rodney Brooks - "Elephants Don't Play Chess"

https://people.csail.mit.edu/brooks/papers/elephants.pdf


It's not like the research on games is at the expense of other more worthy goals. It is a well constrained problem that lets you understand the limitations of your method. Great for making progress. Alpha zero didn't just play chess well, it learned how to play chess well (and could generalize to other games). I'd forgive it 10000 times the resources for that.


> It is a well constrained problem

But attacking not-well-constrained problems is what's needed to show real progress in AI these days, right?


I'd say getting better sample efficiency is a bigger deal. It isn't like POMDPs are a huge theoretical step away from MDPs. But if you attach one of these things to a robot, taking 10^7 samples to learn a policy is a deal breaker. So fine, please keep using games for research.


>it learned how to play chess well

This. Learning to play a game is one thing. Learning how to teach computers to learn a game is another thing. Yes chess programs have been good before, but that's missing the point a little bit. The novel bit is not that it can beat another computer, but how it learned how to do so.


Big Blue relied on humans to do all the training. Alpha Go zero didn't need humans at all to do the training.

That's a pretty major shift for humanity.


It's Deep Blue, not Big Blue. The parameters used by its evaluation function were tuned by the system on games played by human masters.

But it's a mistake to think that a system learning by playing against itself is something new. Arthur Samuel's draughts (checkers) program did that in 1959.


Sorry, mix up, thanks for the correction.

It's not that it's new, it's that they've achieved it. Chess was orders of magnitude harder than draughts. The solution for draughts didn't scale to chess but Alpha Go zero showed that chess was ridiculously easy for it once it had learned Go.


Both Samuel's checkers program and Deep Blue used alpha-beta pruning for search, plus a heuristic evaluation function. Deep Blue's heuristic function was necessarily more complex because chess is more complex than draughts. I think the reason master chess games were used in Deep Blue instead of self-play was the existence of a large database of such games, and because so much of its performance came from being able to look ahead so far.
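For readers who haven't seen it, alpha-beta is a small tweak on minimax; a sketch, with `children` and `heuristic` as hypothetical game-specific callbacks:

  def alphabeta(state, depth, alpha, beta, maximizing, children, heuristic):
      """Depth-limited minimax that prunes branches which cannot change the result."""
      kids = children(state)
      if depth == 0 or not kids:
          return heuristic(state)
      if maximizing:
          value = float("-inf")
          for child in kids:
              value = max(value, alphabeta(child, depth - 1, alpha, beta, False, children, heuristic))
              alpha = max(alpha, value)
              if alpha >= beta:
                  break  # beta cutoff: the minimizer already has a better alternative
          return value
      value = float("inf")
      for child in kids:
          value = min(value, alphabeta(child, depth - 1, alpha, beta, True, children, heuristic))
          beta = min(beta, value)
          if alpha >= beta:
              break  # alpha cutoff
      return value

  # root call: alphabeta(start, depth=4, alpha=float("-inf"), beta=float("inf"),
  #                      maximizing=True, children=..., heuristic=...)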


> It's Deep Blue, not Big Blue.

Big Blue is fine - it's referring to the company and not the machine. From Wikipedia "Big Blue is a nickname for IBM"


I meant Deep Blue, but yeah Deep Blue was a play on Big Blue.


I guess there are reasons why researchers build chess programs: it is easy to compare performance between algorithms. When you can solve chess, you can solve a whole class of decision-making problems. Consider it as the perfect lab.


What is that class of decision-making problems? It's nice to have a machine really good at playing chess, but it's not something I'd pay for. What decision-making problems are there, in the same class, that I'd pay for?

Consider it as the perfect lab.

Seems like a lab so simplified that I'm unconvinced of its general applicability. Perfect knowledge of the situation and a very limited set of valid moves at any one time.


> What decision-making problems are there, in the same class, that I'd pay for?

an awful lot of graph and optimization problems. See for instance some examples in https://en.wikipedia.org/wiki/A*_search_algorithm
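For instance, a bare-bones A* over a dict-of-adjacency-lists graph (illustrative, not taken from the article); the heuristic must never overestimate the remaining cost for the result to be optimal:

  import heapq

  def a_star(graph, start, goal, heuristic):
      """graph: {node: [(neighbor, edge_cost), ...]}; returns (cost, path) or None."""
      frontier = [(heuristic(start), 0, start, [start])]
      best_cost = {start: 0}
      while frontier:
          _, cost, node, path = heapq.heappop(frontier)
          if node == goal:
              return cost, path
          for neighbor, edge_cost in graph.get(node, []):
              new_cost = cost + edge_cost
              if new_cost < best_cost.get(neighbor, float("inf")):
                  best_cost[neighbor] = new_cost
                  priority = new_cost + heuristic(neighbor)
                  heapq.heappush(frontier, (priority, new_cost, neighbor, path + [neighbor]))
      return None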


Perfect information problem solving is not interesting anymore.

Did they manage to extend it to games with hidden and imperfect information?

(Say, chess with fog of war also known as Dark Chess. Phantom Go. Pathfinding equivalent would be an incremental search.)

Edit: I see they are working on it, predictive state memory paper (MERLIN) is promising but not there yet.


Strongly disagree. There are a lot of approximation algorithms and heuristics in wide use - to the tune of trillions of dollars, in fact, when you consider transportation and logistics, things like asic place & route, etc. These are all intractable perfect info problems that are so widespread and commercially important that they amplify the effect of even modest improvements.

(You said problems, not games...)


Indeed, there are problems that you will be hard pressed to solve even with perfect information. But that is only a question of computational power, or of the problem not admitting efficient approximation (not in APX or co-APX).

The thing is, an algorithm that can work with fewer samples and robustly tolerating mistakes in datasets (also known as imperfect information) will be vastly cheaper and easier to operate. Less tedious sample data collection and labelling.

Working with lacking and erroneous information (without known error value) is necessarily a crucial step towards AGI; as is extracting structure from such data.

This is the difference between an engineering problem and research problem.


Perhaps a unifying way of saying this is: it's a research problem to figure out how to get ML techniques to the point they outperform existing heuristics on "hard" problems. Doing so will result in engineering improvements to the specific systems that need approximate solutions to those problems.

I completely agree about the importance of imperfect information problems. In practice, many techniques handle some label noise, but not optimally. Even MNIST is much easier to solve if you remove the one incorrectly-labeled training example. (one! Which is barely noise. Though as a reassuring example from the classification domain, JFT is noisy and still results in better real world performance than just training on imagenet.)


> Perfect information problem solving is not interesting anymore.

I guess in the same way as lab chemistry isn't interesting anymore ? (Since it often happens in unrealistically clean equipment :-)

I think there is nothing preventing lab research from going on at the same time as industrialization of yesterday's results. Quite on the contrary: in the long run they often depend on each other.


There’s plenty of interesting work on poker bots.


Poker bots actually deal with a (simple) game with imperfect information. It is not the best test because short memory is sufficient to win at it.

The real challenge is to devise a general algorithm that will learn to be a good poker player in thousands of games, strategically, from just a bunch of games played. DeepStack AI required 10 million simulated games. Good human players outperform it at intermediate training stages.

And then the other part is figuring out actual rules of a harder game...


I think chess may actually be the worst lab. Decisions in chess are made with perfect knowledge of the current state and future possibilities. Most decisions are made without perfect knowledge.


For chess, the future possibilities are so vast, you can't call them "perfect knowledge" with a straight face.


This is not what the terminology "perfect knowledge" means. Perfect knowledge (more often called "perfect information") refers to games in which all parts of the game state are accessible to every other player. In theory, any player in the game has access to all information contained in every game state up to the present and can extrapolate possible forward states. Chess is a very good example of a game of perfect information, because the two players can readily observe the entire board and each other's moves.

A good example of a game of imperfect information is poker, because players have a private hand which is known only to them. Whereas all possible future states of a chess game can be narrowed down according to the current game state, the fundamental uncertainty of poker means there is a combinatorial explosion involved in predicting future states. There's also the element of chance in poker, which further muddies the waters.

Board games are often (but not always) games of perfect and complete information. Card games are typically games of imperfect and complete information. This latter term, "complete information", means that even if not all of the game state is public, the intrinsic rules and structure of the game are public. Both chess and poker are complete, because we know the rules, win conditions and incentives for all players.

This is all to say that games of perfect information are relatively easy for a computer to win, while games of imperfect information are harder. And of course, games of incomplete information can be much more difficult :)


A human might not be able to, but a computer can. Isn't the explicit reason research shifted to using Go the fact that you can't just number crunch your way through it?


AlphaGo Zero did precisely that. Most of its computations were done on a huge array of GPUs. The problem with Go is that look-ahead is more of a problem than in Chess, as Go has roughly between five and ten times as many possible moves at each point in the game. So Go was more of a challenge, and master-level play was only made possible by advances in computer hardware.


Chess was already easy for computers. That's why Arimaa came to be.


When you can solve chess, you can solve a whole class of decision-making problems

If this were true, there would be a vast demand for grandmasters in commerce, government, the military... and there just isn’t. Poker players suffer from similar delusions about how their game can be generalised to other domains.


> Poker players suffer from similar delusions about how their game can be generalised to other domains.

Oh that's so true

Poker players in real life would give up more often than not, whenever they didn't know enough about a situation or didn't have enough resources for a win with high probability.

And people can call your bluff even if you fold.


Those traits seem to me like a thing most people desperately need ... Everyone being confident in their assessment of everything seems like one of major problems of today's population.


I think batmansmk doesn't mean "when X is good at chess, X is automatically good at lots of other things", but "the traits that make you a good chess player (given enough training) also make you good at lots of other things (given enough training)".


I might suspect (but certainly cannot prove) that the traits that make a human good at playing chess are very different from the traits that make a machine good at playing chess, and as such I don't think we can assume that the machine skilled chess player will be good at lots of other things in a way analogous to the human skilled chess player.


And Gaius's point stands in the face of this argument as well: chess is seen as such a weak predictor that playing a game of chess or requesting an official Elo rating isn't used for hiring screening, for instance.

I suspect that chess as a metagame is just so far developed that being "good at chess" means your general ability is really overtrained for chess.


Second world chess champion Emanuel Lasker spent a couple years studying Go and by his own report was dejected by his progress. Maybe he would have eventually reached high levels, but I've always found this story fascinating.


True, but I'd phrase it the other way around. The traits that make you (a human) good at general problem solving are also the traits that make you a good chess player. I do suspect, though, that there are some Chess-specific traits which boost your Chess performance but don't help much with general intelligence. (Consider, for example, the fact that Bobby Fischer wasn't considered a genius outside of his chosen field.)


Tell me about it. The brightest minds are working on ads, and we have AI playing social games.

Can AI make the world better? It can, but it won't, since we are humans, and humans will weaponize technology every chance they get. Of course some positive uses will come, but the negative ones will be incredibly destructive.


Just because you haven't seen humongous publicity stunts involving practical uses of AI doesn't mean they aren't being deployed. My company uses similar methods to warn hospitals about patients with a high probability of imminent heart attacks and sepsis.

The practical uses of these technologies don't always make national news.

I'm sure you would also have scoffed at the "pointless, impractical, wasteful use of our brightest minds" to make the Flyer hang in the air for 30 yards at Kitty Hawk.


Let's start with defining "better"


>>20 years after IBM solved that problem

We solved nothing.

IBM Deep Blue doesn't exactly think like humans do.

Most of our algorithms really are 'better brute force'.

https://www.theatlantic.com/magazine/archive/2013/11/the-man...


Exactly. To my not-very-well-informed self, even AlphaGo Zero is just a more clever way to brute-force board games.

Side observers are taking joy in the riskier plays that it made -- reminded them of certain grandmasters I suppose -- but that still doesn't mean AGZ is close to any form of intelligence at all. Those "riskier moves" are probably just a way to more quickly reduce the problem space anyway.

It seriously reminds me more and more of religion, the AI area these days.


>Also it got better than humans in those games from scratch in about 4 hours whereas humans have had 2000 years to study them so you can forgive it some resource usage.

Most humans don't live 2000 years. And realistically don't spend that much of their time or computing power on studying chess. Surely a computer can be more focused at this and the 4h are impressive. But this comparison seems flawed to me.


You're right, though the distinction with the parent poster is that AlphaGo Zero had no input knowledge to learn from, unlike humans (who read books, listen to other players' wisdom, etc). It's a fairly well known phenomenon that e.g. current era chess players are far stronger than previous eras' players, and this probably has to do with the accumulation of knowledge over decades, or even hundreds of years. It's incredibly impressive for software to replicate that knowledge base so quickly.


Not so much from the accumulation of knowledge, because players can only study so many games. The difference is largely because there are more people today, they have more free time, and they can play against high-level opponents sooner.

Remember, people reach peak play in ~15 years, but they don't necessarily keep up with advances.

PS: You see this across a huge range of fields, from running and figure skating to music: people simply spend more time and resources getting better.


But software is starting from the same base. To claim it isn't would be to claim that the computers programmed themselves completely (which is simply not true).


Sure, there is some base there, and a fair bit of programming existed in the structure of the implementation. However, the heuristics themselves were not hand-crafted, and this is very significant. The software managed to reproduce and beat the previous best (both human and the previous iteration of itself), completely by playing against itself.

So, in this sense, it's kind of like taking a human, teaching them the exact rules of the game and showing them how to run calculations, and then telling them to sit in a room playing games against themselves. In my experience from chess, you'd be at a huge disadvantage if you started with this zero-knowledge handicap.


> In my experience from chess, you'd be at a huge disadvantage if you started with this zero-knowledge handicap.

One problem is that we can't play millions of games against ourselves in a few hours. We can play a few games, grow tired, and then need to go do something else. Come back the next day, repeat. It's a very slow process, and we have to worry about other things in life. How much of one's time and focus can be used on learning a game? You could spend 12 hours a day, if you had no other responsibilities, I guess. That might be counter productive, though. We just don't have the same capacity.

If you artificially limited AlphaGo to human capacity, then my money would be on the human being a superior player.


All software starts with a base of 4 billion years of evolution and thousands of years of social progress and so on. But AlphaZero doesn't require knowledge of Go on top of that.


> The chess version beat a leading conventional chess program 28 wins, 0 losses, and 72 draws.

In an unequal fight, and the results are still not published. I'm not claiming that AlphaZero wouldn't win, but that test was pure garbage.


The results were published - https://arxiv.org/abs/1712.01815

I agree AlphaZero had fancier hardware and so it wasn't really a fair fight.


Stockfish is not designed to scale to supercomputing clusters or TPUs, and AlphaZero wasn't designed to account for how long it takes to make a move, so a fair fight was hard to arrange.


No, these are not full results. There are just 10 example games published. Where is the rest?


How was it not equal?


There's discussion here: https://chess.stackexchange.com/questions/19366/hardware-use... AlphaZero's hardware was faster and Stockfish was a year-old version with non-optimal settings. It was still an impressive win, but it would be interesting to do it again with a more level playing field.


And didn’t they just do all of this? It’s not like 5 years have passed. Does he expect results like this every month?


> Also it got better than humans in those games from scratch in about 4 hours whereas humans have had 2000 years to study them so you can forgive it some resource usage.

Few would care. Your examiner doesn't give you extra marks on a given problem for finishing your homework quickly.


oh wow, it can play chess. can it efficiently stack shelves in warehouses yet?


"It" can reduce power consumption by 15%.

https://deepmind.com/blog/deepmind-ai-reduces-google-data-ce...

Just because alpha zero doesn't solve the problem you want it to doesn't mean that advancements aren't being made that matter to someone else. To ignore that seems disingenuous.


There is no human that has studied any of those games for 2000 years. So I think you mean 4 hours versus an average human study time of 40 years.


I'm sure the same could be said for early computer graphics before the GPU race. You don't need Moore's Law to make machine learning fast, you can also do it with hardware tailored to the task. Look at Google's TPUs for an example of this.

If you want an idea of where machine learning is in the scheme of things, the best thing to do is listen to the experts. _None_ of them have promised wild general intelligence any time soon. All of them have said "this is just the beginning, it's a long process." Science is incremental and machine learning is no different in that regard.

You'll continue to see incremental progress in the field, with occasional demonstrations and applications that make you go "wow". But most of the advances will be of interest to academics, not the general public. That in no way makes them less valuable.

The field of ML/AI produces useful technologies with many real applications. Funding for this basic science isn't going away. The media will eventually tire of the AI hype once the "wow" factor of these new technologies wears off. Maybe the goal posts will move again and suddenly all the current technology won't be called "AI" anymore, but it will still be funded and the science will still advance.

It's not the exciting prediction you were looking for I'm sure, but a boring realistic one.


> Funding for this basic science isn't going away.

What makes this 3rd/4th boom in AI different?

In the previous AI winters, funding for this science went from plentiful to scarce.

I'm skeptical, with respect of course, of your statement, because it doesn't have anything backing it up other than that the field produces useful technologies. Wouldn't that imply that the previous AI approaches which experienced an AI winter (expert systems, and whatever else) didn't produce useful enough technologies to keep their funding?

I'm currently in the camp that thinks an AI Winter III is coming.

> _None_ of them have promised wild general intelligence any time soon.

The post talks about Andrew Ng's wild expectations about other things, such as the radiologist tweet. While that's not wild general intelligence, what the main article points to, and what I'm also thinking, is the outrageous speculation. Another example is Tesla's self-driving: it doesn't seem to be there yet, and perhaps we're hitting the point of over-promising like we did in the past, and then an AI winter happens because we've found the limit.


The previous AI winters were funded by speculative investments (both public research and industry) with the expectation that this might result in profitable technologies. And this didn't happen - yes, "the other previous AI which experience AI Winter (expert system, and whatever else) didn't produce useful enough technologies to have funding", the technologies developed didn't work sufficiently well to have widespread adoption in the industry; there were some use cases but the conclusion was "useful in theory but not in practice".

The current difference is that the technologies are actually useful right now. It's not about promised or expected technologies of tomorrow, but about what we have already researched, about known capabilities that need implementation, adoption, and lots of development work to apply it in lots and lots of particular use cases. If the core research hits a dead end tomorrow and stops producing any meaningful progress for the next 10 or 20 years, the obvious applications of neural-networks-as-we're-teaching-them-in-2018 work sufficiently well and are useful enough to deploy them in all kinds of industrial applications, and the demand is sufficient to employ every current ML practitioner and student even in absence of basic research funding, so a slump is not plausible.


I've recently had a number of calls from recruiters about new startups in the UK in the AI space, some of them local and some of them extensions of US companies. Some of them were clearly less speculative (tracking shipping and footfall for hedge funds) while others were certainly more speculative sounding. The increase of the latter gives me the impression that there is a bit of speculation going on at the moment.


A lot of this is because there is a somewhat mis-informed (which we will be polite and not call 'gullible') class of investors out there, primarily in the VC world, that thinks that most AI is magic pixie dust and so 'we will use AI/DL' and 'we will do it on the blockchain' has become the most recent version of 'we will do it on the web' in terms of helping get funding. Most of these ventures will flame out in 6-12 months and the consequences of this are going to be the source of the upcoming AI winter OP was talking about.


Strangely enough he didn’t speak at all about waymo self driving cars that are already hauling passengers without a safety driver. Given that he needs to hide the facts that go against his narrative I don’t really think that what he is convinced of will become reality.


In a very confined area. He mentions similar issues with Tesla's coast-to-coast autopilot ride: The software is not general enough yet to handle it. That seems to be the case for Waymo as well.


And how is this a failure of AI? The most optimistic opinions on when we would see autonomous cars put them in the 2020s. Instead, we have had autonomous cars hauling people on the streets without any safety driver since 2017. And if everything goes according to their plan, they will launch a commercial service by the end of the year in several US cities. To me it seems a resounding success, not a failure.


> The most optimistic opinions on when we would see autonomous cars put them in the 2020s.

Sure, keep moving timelines. It's what makes you money in the area. I am sure when around mid-2019 hits, it will suddenly be "most experts agree that the first feasible self-driving cars will arrive circa 2025".

You guys are hilarious.


> BUT it cost petabytes and petabytes of flops, expensive enough that they released a total of ten or twenty Alpha Go Zero game to the world

Training is expensive but inference is cheap enough for Alpha Zero inspired bots to beat human professionals while running on consumer hardware. DeepMind could have released thousands of pro-level games if they wanted to and others have: http://zero.sjeng.org/


Bleh, no it isn't.

I am 100% in agreement with the author on the thesis: deep learning is overhyped and people project too much.

But the content of the post is in itself not enough to advocate for this position. It is guilty of the same sins: projection and following social noises.

The point about increasing compute power however, I found rather strong. New advances came at a high compute cost. Although it could be said that research often advances like that: new methods are found and then made efficient and (more) economical.

A much stronger rebuttal of the hype would have been based on the technical limitations of deep learning.


> A much stronger rebuttal of the hype would have been based on the technical limitations of deep learning.

I'm not even sure how you'd go about doing that. You could use information theory to debunk some of the more ludicrous claims, especially ones that involve creating "missing" information.

One of the things that disappoints me somewhat with the field, which I've arguably only scratched the surface of, is just how much of it is driven by headline results which fail to develop understanding. A lot of the theory seems to be retrofitted to explain the relatively narrow result improvement and seems only to develop the art of technical bullshitting.

There are obvious exceptions to this and they tend to be the papers that do advance the field. With a relatively shallow resnet it's possible to achieve 99.7% on MNIST and 93% on CIFAR10 on a last-gen mid-range GPU with almost no understanding of what is actually happening.
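
For anyone curious what a "shallow resnet" amounts to in practice, here is a minimal sketch of the kind of residual block involved (my own illustration in PyTorch with hypothetical layer sizes, not the exact model referred to above):

    import torch
    import torch.nn as nn

    # Minimal residual block: two 3x3 convs plus a skip connection.
    class ResidualBlock(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)

        def forward(self, x):
            out = torch.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            return torch.relu(out + x)  # the skip connection is the whole trick

Stack a handful of these with a classifier head and, as claimed above, you can get competitive numbers without ever asking why it works.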

There's also low-hanging fruit that seems to have been left on the tree. Take OpenAI's paper on reparameterizing weights so that you have a normalized direction vector and a scalar magnitude. This makes intuitive sense to anybody familiar with high-dimensional spaces, since nearly all of the volume of a hypersphere lies near the surface. That this works in practice is great news, but it leaves many questions unanswered.
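
A minimal sketch of that reparametrisation, w = g * v / ||v||, assuming a single linear layer in PyTorch (I believe torch.nn.utils.weight_norm wraps essentially the same idea):

    import torch
    import torch.nn as nn

    # Decouple the weight's direction (v) from its magnitude (g).
    class WeightNormLinear(nn.Module):
        def __init__(self, in_features, out_features):
            super().__init__()
            self.v = nn.Parameter(torch.randn(out_features, in_features))  # direction
            self.g = nn.Parameter(torch.ones(out_features))                # magnitude

        def forward(self, x):
            w = self.g.unsqueeze(1) * self.v / self.v.norm(dim=1, keepdim=True)
            return x @ w.t()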

I'm not even sure how many practitioners are thinking in high dimensional spaces or aware of their properties. It feels like we get to the universal approximation theorem and just accept that as evidence that they'll work well anywhere and then just follow whatever the currently recognised state of the art model is and adapt that to our purposes.


> A much stronger rebuttal of the hype would have been based on the technical limitations of deep learning.

Who's to say we won't improve this though? Right now, nets add a bunch of numbers and apply arbitrarily-picked limiting functions and arbitrarily-picked structures. Is it impossible that we find a way to train that is orders of magnitude more effective?


To me, it's a bit like the question "Who's to say we wont find a way to travel faster than the speed of light?", by which I mean that in theory, many things are possible, but in practice, you need evidence to consider things likely.

Currently, people are projecting and saying that we are going to see huge AI advances soon. On what basis are these claims made? Showing the fundamental limitations of deep learning would show that we have no idea how to get there. No idea how to get there yet, indeed, just as we have no idea how to do time travel yet.


Overhyped? There are cars driving around Arizona without safety drivers as I type this.

The end result of this advancement for our world is earth-shattering.

On the high compute cost: there is some truth to that, but we have also seen advances in silicon to support it. Look at WaveNet, which uses 16k passes through a DNN yet is offered at scale and at a competitive price; that kind of proves the point.


The brain most likely has much more than a petaflop of computing power and it takes at least a decade to train a human brain to achieve the grandmaster level on an advanced board game. In addition, as the other comment says, they learn from hundreds or thousands of years of knowledge that other humans have accumulated and still lose to AlphaZero with mere hours of training.

Current AIs have limitations but, at the tasks they are suited for, they can equal or exceed humans with years of experience. Computing power is not the key limit since it will be made cheaper over time. More importantly, new advances are still being made regularly by DeepMind, OpenAI, and other teams.

https://www.quora.com/Roughly-what-processing-power-does-the...

Unsupervised Predictive Memory in a Goal-Directed Agent

https://arxiv.org/abs/1803.10760


Sure, but have you heard about Moravec's paradox? And if so, don't you find it curious that over the 30 years of Moore's law exponential progress in computing almost nothing improved on that side of things, and we kept playing fancier games?


to save some clicks:

https://en.wikipedia.org/wiki/Moravec%27s_paradox

Moravec's paradox is the discovery by artificial intelligence and robotics researchers that, contrary to traditional assumptions, high-level reasoning requires very little computation, but low-level sensorimotor skills require enormous computational resources.


Yes, I am familiar with it.

What do you think of recent papers and demos by teams from Google Brain, OpenAI, and Pieter Abbeel's group on using simulations to help train physical robots? Recent advances are quite an improvement over those from the past.


I'm skeptical, and side with Rodney Brooks on this one. First, reinforcement learning is incredibly inefficient. And sure, humans and animals have forms of reinforcement learning, but my hunch is that it operates on an already incredibly semantically relevant representation and utilizes a forward model. That model is generated by unsupervised learning (which is far more data efficient). Actually I side with Yann LeCun on this one; see some of his recent talks. But Yann is not a robotics guy, so I don't think he fully appreciates the role of a forward model.

Now, using models for RL is the obvious choice, since trying to teach a robot a basic behavior with RL alone is just absurdly impractical. But the problem here is that when somebody builds that model (a 3D simulation), they put in a bunch of stuff they think is relevant to represent reality. And that is the same trap as labeling a dataset: we only put in the stuff which is symbolically relevant to us, omitting a bunch of low-level things we never even perceive.

This is a longer subject, and an HN comment is not enough to cover it, but there is also something about complexity. Reality is not just more complicated than simulation, it is complex, with all the consequences of that. Every attempt to put a human-filtered input between the AI and the world will inherently lose that complexity, and ultimately the AI will not be able to immunize itself to it.

This is not an easy subject and if you read my entire blog you may get the gist of it, but I have not yet succeeded in verbalizing it concisely to my satisfaction.


What, no progress six months after achieving a goal thought impossible even just a few years ago? Pack it up boys, it's all over but the crying.


I was thinking just that when reading the paragraphs about the Uber accident. There's absolutely nothing indicating that future progress is not possible, precisely because of how absurd it seems right now.


In retrospect it might seem that the Japanese were partially right in pursuing "high performance" computing with their fifth-generation project [1], but the AlphaZero results are impressive beyond the computing performance achieved. It was a necessary element, but not the only one.

[1] https://mobile.nytimes.com/1992/06/05/business/fifth-generat...


> petabytes and petabytes of flops

Why not petaflops of bytes then?


>> Makov Tree Search

You mean Monte Carlo Tree Search, which is not at all like Ma(r)kov chains. You're probably mixing it up with Markov decision processes though.

Before criticising something it's a good idea to have a solid understanding of it.


We very well might be in a deep-learning 'bubble' and the end of a cycle... but I don't think this time around it's really the end for a long while; more likely a pivot point.

The biggest minds everywhere are working on AI solutions, and there's also a lot going on in medicine/science to map brains; if we can merge neuroscience with computer science we might have more luck with AI in the future...

So we could have a drought for a year or two, but there will be more research and more breakthroughs. This won't be like the AI winters of the past where the field lay dormant for 10+ years, I don't think.


Moore's law (or at least, the diminishing one) is not relevant here because these are not single threaded programs. Google put 8x on their TPUv2 -> v3 upgrade; parallel matrix multiplies at reduced precision are a long way away from any theoretical limits, as I understand it.


Totally agree but why on earth down voted?

The first generation TPUs used 65536 very simple cores.

In the end you have only so many transistors you can fit, and there are options for how to arrange and use them.

You might support very complex instructions and data types and end up with four cores, or you might support only 8-bit ints and very, very simple instructions and use 65,536 cores.

In the end what matters is the joules to get something done.

We can clearly see that we have big improvements by using new processor architectures.


A different take by Google’s cofounder, Sergey Brin, in his most recent Founders’ Letter to investors:

“The new spring in artificial intelligence is the most significant development in computing in my lifetime.”

He listed many examples below the quote.

“understand images in Google Photos;

enable Waymo cars to recognize and distinguish objects safely;

significantly improve sound and camera quality in our hardware;

understand and produce speech for Google Home;

translate over 100 languages in Google Translate;

caption over a billion videos in 10 languages on YouTube;

improve the efficiency of our data centers;

help doctors diagnose diseases, such as diabetic retinopathy;

discover new planetary systems; ...”

https://abc.xyz/investor/founders-letters/2017/index.html

An example from another continent:

“To build the database, the hospital said it spent nearly two years to study more than 100,000 of its digital medical records spanning 12 years. The hospital also trained the AI tool using data from over 300 million medical records (link in Chinese) dating back to the 1990s from other hospitals in China. The tool has an accuracy rate of over 90% for diagnoses for more than 200 diseases, it said.“

https://qz.com/1244410/faced-with-a-doctor-shortage-a-chines...


Hi, author here:

Well first off: letters to investors are among the most biased pieces of writing in existence.

Second: I'm not saying connectionism did not succeed in many areas! I'm a connectionist by heart! I love connectionism! But that being said, there is a disconnect between expectations and reality. And it is huge. And it is particularly visible in autonomous driving. And it is not limited to the media or CEOs, but has made its way into top researchers. And that is a dangerous sign, which historically preceded a winter event...


I agree that self-driving has been overhyped over the past few years. The problem is harder than many people realize.

The difference between the current AI renaissance and the past pre-winter AI ecosystems is the level of economic gain realized by the technology.

The late 80s-early 90s AI winter, for example, resulted from the limitations of expert systems which were useful but only in niche markets and their development and maintenance costs were quite high relative to alternatives.

The current AI systems do something that alternatives, like Mechanical Turks, can only accomplish with much greater costs and may not even have the scale necessary for global massive services like Google Photos or Youtube autocaptioning.

The spread of computing infrastructure and connectivity into the hands of billions of global population is a key contributing factor.


> The difference between the current AI renaissance and the past pre-winter AI ecosystems is the level of economic gain realized by the technology

I would argue this is well discounted by the level of investment made against the future. I don't think the winter depends on the amount somebody makes today on AI, but rather on how much people expect to make in the future. If these don't match, there will be a winter. My take is that there is a huge bet against the future, and if DL ends up bringing only as much profit as it does today, interest will die very, very quickly.


Because there is a dearth of experts and a lack of deep technical knowledge among many business people, there are still a great many companies that have not yet started investing in deep learning or AI despite potential profits based on current technology. Non-tech sectors of the economy are probably underinvesting at the moment.

This is analogous to the way electricity took decades to realize productivity gains in the broad economy.

That said, the hype will dial down. I am just not sure the investment will decrease soon.


While I agree there is underinvestment in non-tech sectors, I don't see why that would change such that they start using deep learning. There are lots of profitable things in non-tech sectors that could be done with linear regression but aren't.


There are lots of things in the non-tech sector that can be automated with simple vanilla software but isn't. To use AI instead, you need to have 1) sophisticated devs in place, 2) a management that gets the value added, 3) lots of data in a usable format, 4) willingness to invest & experiment. Lots of non-tech businesses are lacking one if not all of these.


This. And at the end of the day, deep learning is just a more sophisticated version of linear regression. (To listen to some people talking, you'd think if a machine just curve-fits enough data-points, it'll suddenly wake up and become self-aware or something? The delusion is just unbelievable!)
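
To make that kinship concrete, here is a toy sketch of my own (not from the article): a "network" with no hidden layers and an identity activation, trained by gradient descent, is literally linear regression.

    import numpy as np

    # Toy data: y = X @ w_true + noise
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))
    y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

    # A "zero-hidden-layer network" trained with plain gradient descent
    w = np.zeros(3)
    for _ in range(2000):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= 0.1 * grad
    print(w)  # recovers roughly [2, -1, 0.5], same as ordinary least squares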


Just like it did for ecommerce? Expectations were wildly inflated, there was a bust, the market readjusted, and value was created.


Only after the dot-com bust killed off all the weaklings ... there was a brief "cold-snap" after 2001 ... then Web 2.0 happened.

So I guess we're waiting for something similar to happen with AI and then get AI 2.0?


> I agree that self-driving has been overhyped over the past few years. The problem is harder than many people realize.

The current road infrastructure (markings, signs) has been designed for humans. Once it has been modernized to better aid self-driving systems, we probably don't need "perfect" AI.


But the current signs designed for humans work well. They're machine readable (traffic sign detection is available from basically all manufacturers), can (usually) be understood without prior knowledge, and don't need much change over decades. I think there are few examples of messages designed for computers that are easy to understand independent of the system or manufacturer; ASCII-encoded text files are the only thing that comes to mind.


Hi, why does your analysis speak only about the companies that are not doing so well in self-driving, leaving out Waymo's success story? They have already been hauling passengers without a safety driver since last October, apparently without the slightest problem; otherwise we would have heard plenty in the news, as happened with the Tesla and Uber accidents. Isn't it a bit too convenient to leave out the facts that contradict your hypothesis?


Is this happening inside google campus or in a city?



Where are the Waymo cars running? Everywhere? Are they still veering into buses to avoid sandbags?


Phoenix, and they plan to start a commercial service by the end of the year in several US cities.


Making cars that drive safely on current, busy roads is a very difficult task. It is not surprising that the current systems do not do that (yet); it is surprising to me how well they still do. The fact that my phone understands my voice and my handwriting and does on-the-fly translation of menus and simple requests is a sign of major progress, too.

AI is overhyped and overfunded at the moment, which is not unusual for a hot technology (synthetic biology; dotcoms). Those things go in cycles, but the down cycles are seldom all out winters. During the slowdowns best technologies still get funding (less lavish, but enough to work on) and one-hit wonders die, both of which is good in the long run. My friends working in biology are doing mostly fine even though there are no longer "this is the century of synthetic biology" posters at every airport and in every toilet.


How can something be biased when it's listing facts?

Those are actual features that are available today to anyone, that were made possible by AI. Do you think it would be possible to type "pictures of me at the beach with my dog" without AI in such a short time frame? Or to have cars that drive themselves without a driver? These are concrete benefits of machine learning; I don't understand how that's biased.


How can something be biased when it's listing facts?

If there are 100 facts that indicate a coming AI winter, and Brin just talks up the 15 facts that indicate AI's unalloyed dominance, that's definitely biased.


First, what are said 100 facts? The article looks at fairly mundane metrics such as number of tweets or self-driving accidents...

Second, I'm not quite sure that's how it works. Like in mathematics, if your lemma is X, you can give a 100 examples of X being true, but I only need a single counter-example to break it.

In my opinion a single valid modern use-case of AI is enough to show that we're not in an AI winter. By definition an AI winter means that nothing substantial is coming out of AI for a long period of time, yet Brin listed that Google alone has had a dozen in the past few years.


> First, what are said 100 facts?

You cannot ask a generic question, then attack the answer based on absence of evidence for a specific example.


Can't speak for the other items, but these

>translate over 100 languages in Google Translate;

>caption over a billion videos in 10 languages on YouTube;

barely even work. Yeah, it's a difficult problem but it's not even close to being solved.


YouTube captioning in English works surprisingly well, the improvement over the last few years is huge. It still chokes on proper nouns but in general it mostly works.


I think it's a bit like self-driving cars in the sense that it's good enough to be impressive but not good enough to be actually usable everywhere. Of course self-driving is worse because people seldom die of bad captions.

Google's captioning works well when people speak clearly and in English. Google translate works well when you translate well written straightforward text into English. It's impressive but it's got a long way to go to reach human grade transcription and translation.

I think when evaluating these things people underestimate how long the tail of these problems is. It's always those pesky diminishing returns. I think it's true for many AI problems today, for instance it looks like current self-driving car tech manages to handle, say, 95% of situations just fine. Thing is, in order to be actually usable you want something that critical to reach something like 99.999% success rate and bridging these last few percent might prove very difficult, maybe even impossible with current tech.


What's important to remember, I think, is that we should not compare YouTube auto captions to human made captions, because auto captions were not created as a substitute for human made captions - if it wasn't for auto captioning, all these videos wouldn't get any captions at all. They may never be perfect, but they're not designed to be, they're creating new value on their own. And IMO they crossed the threshold of being usable, at least for English.


Mh, no it does not. It is just a source of hilarity apart from a few very specific cases (political speeches mostly, because of their slow pace, good English and pronunciation, I guess).

Every time I activate it I am in for a good laugh more than anything actually useful.


It works for general purpose videos. Transcripts of any kind appear to stop working whenever there's domain knowledge involved. That doesn't matter for most youtube videos but is crucial if you want to have a multi purpose translator/encoder.


A. Cooper had a nice example of this kind: a dancing bear. Sure, the fact that the bear dances is very amusing, but let it not distract us from the fact that it dances very, very badly.


Google Translate is way, way better than it used to be (at least German > English which I suppose is probably an easier task than many languages).


Google Translate of Goethe:

    Have now, ah! Philosophy,
    Law and medicine,
    And unfortunately also theology
    Thoroughly studied, with great effort.
    Here I am, I poor gate!
    And I'm as smart as before
Word for word it's not completely bad, but then it breaks when it has to translate 'Tor'. Google Translate is clueless because it is unable to derive that 'fool' is meant here.

It's unable to 'understand' that 'I poor gate' makes no sense at all.

Google Translate is the 'poor gate'.


On the other hand Deepl gives translations for news articles that are of such quality that it allows me to read international news as if it were local. Definitely useful.


DeepL is better, but it basically has the same problems. I understand both German and English, and I can easily detect where DeepL also shifts the meaning of sentences, sometimes even to the opposite. DeepL, like Google Translate, has no concept of 'is the meaning preserved?'.

You may think that you can now read German news, but in fact you would not know whether the sentence meaning has been preserved in the English translation. The words themselves might look as if the sentence makes sense, but the meaning is actually shifted: slight differences, but possibly also the complete opposite.

The translation also does not give you any indication where this might be and where the translation is based on weak training material or where there is some inference needed for a successful translation.


Not sure if I agree. If you have some knowledge of the language, it is still mostly easier to translate just the words you don't understand. Easy texts work; anything more complicated (e.g. science articles) not really.


" letters to investors are among the most biased pieces of writing in existence. "

Maybe true, but they are words about things which are either true or not true. It has nothing to do with where the words were shared. Saying they are in an investor letter and therefore not relevant seems very short-sighted.

But just looking at the last 12 months, it is folly to say we are moving toward an AI winter. Things are just flying.

Look at self-driving cars without safety drivers, or at something like Google Duplex, and there are so many other examples.


Of course Google (or any other company) isn't going to blatantly lie in a letter to investors (that kind of thing gets you sued), but it's pretty easy to spin words to sound more impressive than they actually are.

Using the list provided, one example

"caption over a billion videos in 10 languages on YouTube;" - This doesn't say how accurate the captions acutally are. In my experience youtube captioning even of english dialect isn't exactly great. For one example try turning on the captions on this https://www.youtube.com/watch?v=bQJrBSXSs6o

so it's true I'm sure to say they've captioned the videos AI based techniques, but that doesn't mean they're a perfected option.

Also (purely anecodtally) Google translate also isn't exactly perfect yet either...


... I don't think I understand that video even with my own ears. YouTube captioning has actually significantly improved from its previous hilarious state.


When I saw the Google demo of a CNN using video to split a single audio stream of two guys talking over each other, I became a believer.


Hey, a small piece of advice for the future: never build your belief entirely on a YouTube video of a demo. In fact, never build your belief based on a demo, period.

This is notorious with current technology: you can demonstrate anything. A few years ago Tesla demonstrated a driverless car. And what? Nothing. Absolutely nothing.

I'm willing to believe stuff I can test myself at home. If it works there, it likely actually works (though possibly needs more testing). But demo booths and youtube - never.


Your advice would be a lot more convincing if you had a youtube video of a demo to come with it. Just saying. :P


You can't test most of high end physics at home. I hope that doesn't mean you don't believe it!


A lab publishes an exciting result, what does the physics community do? Wait for replication. Or if you’re LHC, be very careful what you publish.


In theory, there are two separate experiments (ATLAS and CMS) on LHC because of this. In practice, they are probably not independent enough.


It's probably a good idea to ignore high-end physics results for a few years, though.

The BICEP2 fiasco is a good example why.


A rigorous evaluation with particular focus on where it doesn't work would be better.


Do you have a video of this?


https://youtu.be/ogfYd705cRs?t=6m54s

Edit: also, a blog post with more examples and a link to the related publication: https://ai.googleblog.com/2018/04/looking-to-listen-audio-vi...


Sorry, I went to bed!

The original google report was discussed here a few weeks ago:

https://news.ycombinator.com/item?id=16813766



> understand images in Google Photos

This is one of the areas I’m most enthusiastic about but … it’s still nowhere near the performance of untrained humans. Google has poured tons of resources into Photos and yet if I type “cat” into the search box I have to scroll past multiple pages of results to find the first picture which isn’t of my dog.

That raises an interesting question: Google has no way to report failures. Does anyone know why they aren’t collecting that training data?


They collect virtually everything you do on your phone. They probably notice that you scroll a long way after typing cat and so perhaps surmise the quality of search results was low.


Doesn’t that seem like a noisy signal since you’d have to disambiguate cases where someone was looking for a specific time/place and scrolling until they find it?

I’ve assumed that the reason is the same as why none of the voice assistants has an error reporting UI or even acknowledgement of low confidence levels: the marketing image is “the future is now” and this would detract from it.


> understand

what is this 'understand'?


Well, the way I see it: mostly, these are "improvements", huge ones, but still. They ride the current AI tech wave, take it and optimize apps with it.

Most of the things that people dream of and do marketing about need another leap forward, which we haven't seen yet (it'll come for sure).


Almost anything that has to do with image understanding is entirely AI. Good luck writing an algorithm to detect a bicycle in an image. This also includes disease diagnostic as most of those have to do with analyzing images for tumors and so on.

Also, while a lot of these can be seen as "improvements", in many cases, that improvement put it past the threshold of actually being usable or useful. Self-driving cars for example need to be at least a certain level before they can be deployed, and we would've never reached that without machine learning.


I agree, the effects can be very impressive. I meant that what is achievable is quite clear now and that we need a major innovation/step for the next leap.


>caption over a billion videos in 10 languages on YouTube;

Utterly useless. And I don't think it is improving.


This is less useless than you think. Captioning video could allow for video to become searchable as easily as text is now searchable. This could lead to far better search results for video and a leap forward in the way people produce and consume video content.


I think he is stating that the quality of the transcription is poor.


You don't need amazing transcription to search a video. A video about X probably repeats X multiple times, and you only really need to detect it properly once.

As for the users, sure, the transcription may not be perfect, but I'm sure if you were deaf and had no other way of watching a video, you would be just fine with the current quality of the transcription.


Often you need exactly that. Because it's the unique words the machine will get wrong. If you look for machine learning tutorials/presentations that mention a certain algorithm, the name of it must be correctly transcribed. At the moment, it appears to me that 95%+ of words work but exactly the ones that define a video often don't. But then again getting those right is hard, there's not much training data to base it on.


They mean useless in the end result. Of course having perfect captions could potentially allow indexable videos, but the case is that the captions suck. They're so bad in fact that it's a common meme on Youtube comments for people to say "Go to timestamp and turn on subtitles" so people can laugh at whatever garbled interpretation the speech recognition made.


Have you used/tried them recently? The improvement relative to 5 years ago is major.

At least in English, they are now good enough that I can read without listening to the audio and understand almost everything said. (There are still a few mistakes here and there but they often don’t matter.)


Yes, I've had to turn them off permanently. I felt I could often follow a video better with no sound than with the subtitles.

I tried to help a couple of channels with subtitling, and the starting point was just sooo far from the finished product. I would guess I left 10% of the auto-translation intact. Maybe it would have been 5% five years ago; when things are this bad, a 100% improvement is hard to notice.

It is super cool how easy it is to edit and improve the subtitles for any channel that allows it.


>understand almost everything said...

If by 'almost everything', you mean stuff that a non native English speaker could have understood anyway, then yes.


I'd say the current Youtube autocaptioning system is at an advanced nonnative level (or a drunk native one :)) and it would take years of intensive studying or living in an English-speaking country to reach it.

The vast majority of English learners are not able to caption most Youtube videos as well as the current AI can.

You underestimate the amount of time required to learn another language and the expertise of a native speaker. (Have you tried learning another language to the level you can watch TV in it?)

Almost all native speakers are basically grandmasters of their mother tongue. The training time for a 15-year-old native speaker could be approx. 10 hours * 365 days * 15 years = 54,750 hours, more than the time many professional pianists have spent on practice.


Not true. The problem with Google captioning and translate is that unlike a weak speaker it makes critical mistakes completely misunderstanding the point.

A weak speaker may use a cognate, idiom borrowed from their native tongue or a similar wrong word more often. The translation app produces completely illegible word salad instead.


I was talking exclusively about auto-captioning, which has >95% accuracy for reasonably clear audio. Automatic translation still has a long way to go, I agree.


To be honest, as the other child comment said, I too have noticed they have gotten way better in the last 5 years. Also, the words of which it isn't 100% sure are in a slightly more transparent gray than the other words, which kind of helps.


I disagree, even with the high error rate, it provides a lot of context. Also, a lot of comedy.


I find the auto captions pretty useful.


I'm a scientist from a field outside ML who knows that ML can contribute to science. But I'm also really sad to see false claims in papers. For example, a good scientist can read an ML paper, see claims of 99% accuracy, and then probe further to figure out what the claims really mean. I do that a lot, and I find that accuracy inflation and careless mismanagement of data mar most "sexy" ML papers. To me, that's what's going to lead to a new AI winter.


I'm in the same situation and it's really worrying.

Deep learning is the method of choice for a number of concrete problems in vision, nlp, and some related disciplines. This is a great success story and worthy of attention. Another AI winter will just make it harder to secure funding for something that may well be a good solution to some problems.


You hear Facebook telling the public and governments all the time how it "automatically blocks 99% of the terrorist content" with AI.

Nobody thought to ask: "How do you know all of that content is terrorist content? Does anyone check every video afterwards to ensure that all the blocked content was indeed terrorist content?" (assuming they even have an exact definition for it).


Also, how do they know how much terrorist content they aren't blocking (the 1%), since they by definition haven't found it yet?


I'm 100% convinced it can block 99% of all terrorist content that hasn't been effectively SEOed to get around their filters because that's just memorizing attributes of the training set data. Unfortunately, the world isn't a stationary system like these ML models (usually) require. I still get spam in my gmail account, nowhere near as much as I do elsewhere, but I still get it.


> Does anyone check every video afterwards to ensure that all the blocked content was indeed terrorist content?

They might not, but they could sample them to be statistically confident?


Yes, they have a test set. They don’t check every video, but they do check every video in a sample that is of statistically significant size.


FYI This post is about deep learning. It could be the case that neural networks stop getting so much hype soon, but the biggest driver of the current "AI" (ugh I hate the term) boom is the fact that everything happens on computers now, and that isn't changing any time soon.

We log everything and are even starting to automate decisions. Statistics, machine learning, and econometrics are booming fields. To talk about two topics dear to my heart, we're getting way better at modeling uncertainty (bayesianism is cool now, and resampling-esque procedures aged really well with a few decades of cheaper compute) and we're better at not only talking about what causes what (causal inference), but what causes what when (heterogeneous treatment effect estimation, e.g. giving you aspirin right now does something different from giving me aspirin now). We're learning to learn those things super efficiently (contextual bandits and active learning). The current data science boom goes far far far far beyond deep learning, and most of the field is doing great. Maybe those bits will even get better faster if deep learning stops hogging the glory. More likely, we'll learn to combine these things in cool ways (as is happening now).
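
As one tiny illustration of the "resampling-esque procedures" mentioned above (a toy example of my own), a bootstrap confidence interval that would have been painful decades ago is now a few lines and a fraction of a second:

    import numpy as np

    # Bootstrap 95% CI for a mean: trivial with cheap modern compute.
    rng = np.random.default_rng(0)
    data = rng.exponential(scale=2.0, size=500)  # stand-in dataset
    boot_means = [rng.choice(data, size=data.size, replace=True).mean()
                  for _ in range(10_000)]
    lo, hi = np.percentile(boot_means, [2.5, 97.5])
    print(f"mean={data.mean():.2f}, 95% bootstrap CI=({lo:.2f}, {hi:.2f})")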


Honestly as much as it is slightly irritating to see deep learning hogging all the glory, there's a lot of money being sloshed around and quite a bit of it is spilling over to non-deep learning too. Which is great. An AI winter may be coming, though I think it's at minimum several years off, since big enterprises are just getting started with the most hyped things. If the hype doesn't return on its promises enough for sustained investment (that's a rather big if since the low hanging fruit aren't yet all picked) then the companies and funding will eventually recede, maybe even trigger another winter, but just as it takes a while to ramp up, it will also take a while to course correct. In the meantime all the related areas get better funding and attention (and chance to positively contribute to secure further investment) that they'd otherwise not have since we'd still be stuck in the low funding model from the last winter.


I think the problem is the definition of AI. It appears most in the field define it as a superset of ML, encompassing all kinds of statistical methods and data analysis. For the general public, AI is a synonym for deep learning. When large companies speak about AI they always mean deep learning, never just a regression (probably also because many don't see a regression as intelligent). So AI in the public's perception could face a winter but much of the domain of machine learning would be unaffected.


>For the general public, AI is a synonym for deep learning

I'd contend for the general public, AI is a synonym for machines like: HAL; The Terminator; Star Trek's "Data"; the robots in the film "AI"; and so on.

We're nowhere remotely in the vicinity of that, and no-one even has any plausible ideas about how to start.

A random person outside of tech probably doesn't even know what deep learning is. They might have heard of it somewhere in passing.


Bayesian can be seen as a subset of deep learning or hell a superset.

AI is a superset, machine learning is a subset of AI, and most funding is in deep learning. Once deep learning hits its limit, I believe there will be an AI winter.

Maybe there will be hype around statistics (fingers crossed), which will lead to Bayesian methods and such.


>Bayesian can be seen as a subset of deep learning or hell a superset.

eh-hem

DIE, HERETIC!

eh-hem

Ok, with that out of my system, no, Bayesian methods are definitely not a subset of deep learning, in any way. Hierarchical Bayes could be labeled "deep Bayesian methods" if we're marketing jerks, but Bayesian methods mostly do not involve neural networks with >3 hidden layers. It's just a different paradigm of statistics.


My mentor was very, very adamant that Bayesian networks and hierarchical models are deep learning.

He sees the latent layer in a hierarchical model as the hidden layer; the Bayesian version just has stricter restrictions/assumptions on the network, whereas deep learning is dumber and less assuming. A few of my professors think that PGMs (probabilistic graphical models) are a superset of deep learning/neural networks.

This is where my thinking comes from.

IIRC, a paper has shown that gradient descent seems to behave like MCMC (blog with the paper link inside that led me to this conclusion: http://www.inference.vc/everything-that-works-works-because-...).

But I am not an expert in neural networks, nor do I know the topic well enough to say such a thing myself; I was deferring to the opinions of someone better than me. So I'll keep this in mind and hopefully one day have the time to do more research into this topic.

Thank you.


I think your link, and your mentor, are somewhat fundamentalist about their Bayesianism.


How can Bayesian stuff be seen as a subset or superset of deep learning?


I guess the point that digitalzombie is trying to make is that most of what we call AI or ML or even deep learning is simply an extension of statistics on computers.

Things like the German tank problem or the problem of hardening airplanes during WW2 have that very AI-esque feel to them, where you use data to build a model, then let incoming data change the model as needed.

Also, the whole business of "decision making" is either Bayesian or frequentist in nature. Most of these algorithms and most of this math existed long before the current boom.

It's just that the raw computing power and resources you have today make it possible to deal with large amounts of data to stress-test your models.


Forget about self-driving cars - the real killer application of deep learning is mass surveillance. There are big customers for that (advertising, policing, political technology - we had better get used to the term), and it's the only technique that can get the job done.

I sometimes think there really was no AI winter, as we got other technologies that implemented the ideas: SQL databases can be seen as an application of many ideas from classical AI - for example, SQL is a declarative language for defining relations among tables; you can have rules in the form of stored procedures; and it actually was a big break (paradigm shift is the term) in how you deal with data - the database engine has to do some real behind-the-scenes optimization work in order to get a workable representation of the data definition (which certainly borders on classical AI in complexity).

These boring CRUD applications are light-years ahead of how data was handled back in the beginning.


The BBC recently requested information about the use of facial recognition from UK police forces. Those that use facial recognition reported false positive rates of >95%. That led some to abandon the systems, others just use it as one form of pre-screening. Mass surveillance with facial recognition is nowhere near levels where it can be used unsupervised. And that's even before people actively try to deceive it.

For advertising, I'm also not sure if there's been a lot of progress. Maybe it's because I opted out too much but I have the feeling that ad targeting hasn't become more intelligent, rather the opposite. It's been a long time that I've been surprised at the accuracy of a model tracking me. Sure, targeted ads for political purposes can work very well but are nothing new and don't need any deep learning nor any other "new" technologies.

Where I really see progress is data visualisation. Often dismissed it can be surprisingly hard to get right and tools around that (esp for enterprise use) have developed a lot over recent years. And that's what companies need. No one's looking for a black-box algorithm to replace marketing, they just want to make some sense of their data and understand what's going on.


Aha, yeah I saw this in the news - a pretty classic case of people horribly misunderstanding statistics and/or misrepresenting the facts. Let's say one person out of 60 million has Sauron's ring. I use my DeepMagicNet to classify everyone and get 100 positive results. Only one is the ringbearer, so 99% of my alerts are wrong. Best abandon ship.
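
For anyone who wants the base-rate effect in numbers (hypothetical figures, not the actual UK ones): even a classifier with 99.9% specificity produces mostly false alarms when targets are rare.

    # Hypothetical: 50,000 faces scanned, 10 genuine persons of interest.
    crowd, suspects = 50_000, 10
    specificity, sensitivity = 0.999, 1.0  # assumed for illustration
    false_alarms = (crowd - suspects) * (1 - specificity)  # ~50
    true_hits = suspects * sensitivity                     # 10
    print(true_hits / (true_hits + false_alarms))          # ~0.17: most alerts are wrong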


I do think of myself as someone who can read statistics. They didn't call it a false positive rate; that's my interpretation. I don't have the article here, but they said that the system gave out ~250 alerts, of which ~230 turned out to be incorrect. It didn't specify at all how big the database of potential suspects was. The number of scanned faces was ~50-100k each time (a stadium). Nevertheless, the 230/250 is the issue here, simply because it destroys trust in the system. If the operator already knows that an alert has less than a 10% chance of being genuine, will they really follow up on every alert?


> Those that use facial recognition reported false positive rates of >95%.... Mass surveillance with facial recognition is nowhere near levels where it can be used unsupervised.

Is this deep neural networks with the latest technologies?

While yes, deep learning isn't going to solve everything, we'll probably see significant changes in the products available as the technology discovered over the past few years makes its way into the real world.

Most scanners that do OCR and most forms of facial recognition aren't using deep neural networks with transfer learning, YET.

This is not to say that discoveries will keep coming forever, though - winter is probably coming :)


A 95% false positive rate is extremely good for surveillance, as the cost of a false positive is low (wasted effort by the police). That means for every 20 people the police investigate, one is a target.


They didn't mention arrests but I think the database used was much larger. Probably all persons of interest. The Met (London Police) has been using experts in face recognition for years and their hit rate is much higher. Those people can also identify a high number of suspects on random CCTV while the software was used in stadiums where people tend to remain at one place for most of the time.

So 95% false positive can be better than nothing but it's nowhere near what trained policemen can do.


I disagree. Even if there are a number of big customers for mass surveillance, self driving cars fundamentally changes the platform of our economy for everyone.


That sounds like a US-centric PoV. For a large portion of (most?) Europeans, self-driving won't change much in their lives, definitely not anything major.


> Deepmind hasn't shown anything breathtaking since their Alpha Go zero.

Didn't this just happen? Maybe my timescales are off, but I've been thinking about AI and Go since the late 90s, and plenty of real work was happening before then.

Outside a handful of specialists, I'd expect another 8-10 years before the current state of the art is generally understood, much less effectively applied elsewhere.


I had the same response. AlphaZero was published like 5 months ago. Saying they've reached the end of the line because they haven't topped AlphaZero in six months is lame.


It also marked the end of a major multi-year project. With Deepmind moving that team to focus on other problems I wouldn't expect immediate results.


Also why does every single result has to be breathtaking? Here's a quick example, at IO they announced that their work on Android improved battery life by up to 30%. That's pretty damn impressive.


> Also why does every single result has to be breathtaking?

If you build the hype like say Andrew Ng it better be. Also if you consume more money per month than all the CS departments of a mid sized country, it better be.


In terms of hype you may be right, but it doesn't mean that if something doesn't live up to the hype of Andrew Ng or Elon Musk it won't still be pretty good.

For instance: even if Elon Musk doesn't colonize Mars but instead just builds the BFR, that would still be amazing; even if the BFR is never built but Falcon 9 becomes fully reusable, that would be great; even if Falcon 9 is never fully reusable, the fact that it cut the cost of launching to space is still pretty good.

Even if we never achieve any great breakthroughs with AGI, the fact that we have started to use transfer learning to diagnose human diseases is pretty amazing; the fact that a Japanese guy used TensorFlow on a Raspberry Pi to categorize real cucumbers by shape is amazing.

All of this stuff won't go away; people will not say "hey, let's just forget about this deep learning thing and put it in some dusty shelf, it's useless for now". Maybe it will take 20 or 50 more years, maybe it's a slow thaw, but how could this be a winter?


Exactly! Thank you. There's a delusion that every result needs to be Nobel-worthy, but the Nobel-prize-worthy discoveries are all founded upon the boring stuff we don't hear about.


Honestly I think the raspberry pi indicates the (short-term) future of AI. Most of the “easy” problems have been solved (image classification, game playing), but the hard ones like NLP are orders of magnitude more complex and therefore elusive.

I’m happy to leave the hard problems for the PhDs and the big tech researchers. Go nuts, folks.

In the meantime, the applications for small-scale, pre-trained neural networks seem limitless. Manufacturing, agriculture, retail, pretty much any industry could make use of portable neural networks.


I feel exactly the same way. Just wait 2-3 years before someone launches an embedded TPU and the sky will be the limit.


Because it's the only time we see it in action. The speech recognition of my Amazon Echo is still subpar (and it feels like it's getting worse each week) and ad targeting also hasn't really improved. Of all those claims that came with deep learning, Go was the only time where you really saw a result. I'm not sure which version of Android will bring the improved battery life (and which manufacturers) but I wouldn't be surprised if the 30% were a bit optimistic.

I get that a lot of services we use on a daily basis make use of deep learning to accomplish tasks. But I don't really see what has fundamentally changed over the past 5 years in the way I use services. Siri was introduced 7 years ago and while we have clearly made progress in voice recognition, it's nowhere close to what many had hoped.


Warning 23 year old CS grad angst ridden post:

I'm very sick of the AI hype train. I took a PR class for my last year of college, and they couldn't help but mention it. LG Smart TV ads mention it, Microsoft commercials, my 60 year old tech illiterate Dad. Do any end users really know what it's about? Probably not, nor should that matter, but it's very triggering to see something that was once a big part of CS turned into a marketable buzzword.

I get triggered when I can't even skim through the news without hearing Elon Musk and Stephen Hawking ignorantly claim AI could potentially take over humanity. People believe them because of their credentials, when professors who actually teach AI will say otherwise. I'll admit, I've never taken any course in the subject myself. An instructor I've had who teaches the course argues it doesn't even exist; it's merely a sequence of given instructions, much like any other computer program. But hey, people love conspiracies, so let their imagination run wild.

AI is today what Big Data was about 4 years ago. I do not look highly on any programmer that jumps bandwagons, especially for marketability. Not only is it impure in intention, it's foolish when there are 1000 idiots just like them over-saturating the market. Stick with what you love, even if it's esoteric. Then you won't have to worry about your career value.


You can finish a CS undergrad without taking any AI course? Or just haven't taken one yet? It's very helpful to go through even a tiny bit of AI: A Modern Approach to cut through a lot of the hype. What annoys me is that when people say "Machine Learning" these days they almost invariably mean deep learning, ignoring all the rest of AI.

> I can't even skim through the news without hearing Elon Musk and Stephen Hawking ignorantly claim AI could potentially take over humanity.

Have you considered that their claims may not in fact be ignorant, just the reporting around them? For some details perhaps you would start with this primer from a decade ago, section 4 & 5 if you're in a hurry: http://intelligence.org/files/AIPosNegFactor.pdf

Or if you want a professor's opinion, from one of the co-authors to the previously mentioned AI:AMA check out some of the linked pointers on his home page: http://people.eecs.berkeley.edu/~russell/


I know right, the school I went to wasn't exactly the best.

I skimmed through sections 4 & 5; the part about optimization processes was difficult to understand.

When I was in elementary school, I remember pitying the mentally disabled children, knowing their financial struggles were all but destined, so I connected with the g-factor definition. I really think general intelligence is more of an awareness across all clusters, whether social or cognitive. I've met tons of great students in math courses who simply cannot converse with the general public. I've also met tons of people on the streets of my city who would have a difficult time understanding high school algebra.

As for Section 5, I do think the rise of AI will remain completely within our grasp. I really should take a course on the subject before I sound like the people I'm criticizing for ignorance, but from a general perspective, I cannot see it getting outside of our control. As Eliezer said, we can make predictions, but only time will clear the fog.


> What annoys me is that when people say "Machine Learning" these days they almost invariably mean deep learning, ignoring all the rest of AI.

But that's not people's fault. Companies only say AI if they mean deep learning. I've yet to hear a company advertising AI if they accomplished it with a linear regression. Maybe experts should stop talking about AI and use specific terms instead (Deep Learning in the case of this article).


I hate it when people say deep learning instead of neural networks.


Your post has two different points, and I think they should be separated. Yes, there's an AI hype train, and it's pretty tiring. I'm starting to wonder when the first AI-enhanced shoes are going to come out. Or an AI enhanced blockchain, right?

That said, the rest of the rant about AI is less solid. Sure, AI today is fairly boring and run-of-the-mill data processing / optimisation stuff (I sort of know, though I only have an MSc in the topic). Much of the promise of near-future AI is that we can go from human-designed, or shall we say human-bootstrapped, AI to the self-bootstrapping kind. The fact that AlphaGo pretty much does this (in a very limited capacity), and accomplished something which classical programming and game "AI" couldn't, should show us that we're pretty close to this type of AI being highly effective. How exactly the future unfolds from there is anyone's guess, but outright calling it ignorant is... pretty ignorant, IMHO.


> I'm starting to wonder when the first AI-enhanced shoes are going to come out.

Wait no more:

The first connected cycling insole with artificial intelligence: https://www.digitsole.com/store/connected-insoles-cycling-ru...

The World's First Intelligent Sneaker: https://www.kickstarter.com/projects/141658446/digitsole-sma...


I never claimed AI was boring or run of the mill, just that it's not in my current interest. It's when I hear Hawking make claims like this that I call ignorance.

"Computers can, in theory, emulate human intelligence, and exceed it,' he said. 'Success in creating effective AI, could be the biggest event in the history of our civilization. Or the worst. We just don't know. So we cannot know if we will be infinitely helped by AI, or ignored by it and side-lined, or conceivably destroyed by it.”

I feel like I'm in a high school stoner circle, yeeesh!


So, Hawking's proposition is that AI can surpass human intelligence. Is this the part you're disagreeing with? It's one thing to disagree, and it's another to call it ignorant. I for one agree that it can / will surpass human intelligence; it's just a question of when.


Unless natural disaster and/or war resets humanity, it seems inevitable. One only needs to concede that there is (probably) nothing special about the biological nature of the brain.


That's my reasoning as well, regarding the biological part. On the one hand, it seems to be an incredibly energy-efficient solution, but it has several drawbacks as well (in a lighthearted fashion: lack of scalability and high-bandwidth networking, several billion years of legacy code, etc).

On the other hand, it's unclear whether computers will continue to get faster in the fashion that they are today for long enough. If not, then huge efficiency improvements would be necessary on the software side, though those do seem to be happening as well.


I think it's completely possible AI can surpass our intelligence. Whether it's outside of our awareness or control, not likely.


I'm in same boat, except now I have to deal with both this and blockchain.

It's pretty clear that everyone pushing this is being dishonest. Sometimes I wonder if they are intentionally being dishonest or if they just don't know what they don't know. Very few people are doing useful, true machine learning, and the applications are very specific, each with its own set of quirks. It's just to make some quick money and get out.

What somehow never gets said amid all this hype is that you need to hire good software devs to build a great solution to a problem. All these buzzword-driven ideas tend to confuse a bunch of people and die out after wasting everyone's money. In my region I keep seeing business suits pushing "solve X with Y" ideas with no justification and no engineers behind them.

Yesterday I found out I have to sit through yet another meeting to explain why "blockchain" does not translate into a full solution to a superficial, poorly thought-out problem.

The only reason I've had a voice in my region with all this noise is because I've made things people see that actually work.


I work for a consultancy and I was brought in to talk about ‘our blockchain strategy’, because I was one of the few people at the company who actually read the white papers and had been invested in it. This was towards the end of the last bull run. I think they expected me to tell them to go all in on it, but I essentially said it was a bunch of bullshit hype and a solution in search of a problem and then I didn’t get invited to any more meetings. A few weeks later they announce we’re going to start selling block chain solutions right as the crypto market crashes.

Meanwhile, in the real world, they can’t even figure out basics like containers and CI/CD, which we’re actually dealing with in our actual contract.


> Very few people are doing useful true machine learning, and the applications are very specific with its own set of quirks.

I worked on a nudity detector in 2017. Deep learning works, and is useful. Although you are right, it's very specific and quirky.

I found "How HBO's Silicon Valley built Not Hotdog" article very interesting, because it's basically the same problem. They found MobileNet better than SqueezeNet and ELU better than ReLU. You know what? We found SqueezeNet better than MobileNet and ReLU better than ELU for our problem and data. Who know why.

https://medium.com/@timanglade/how-hbos-silicon-valley-built...
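
That "try both and see which wins on your data" step is cheap these days. A rough sketch of swapping backbones for a small binary classifier with torchvision (nothing here is from either project; the training loop and data loading are omitted):

    import torch.nn as nn
    from torchvision import models

    def build_binary_classifier(backbone: str = "mobilenet_v2"):
        """Return an ImageNet-pretrained backbone with a fresh 2-class head."""
        # (newer torchvision versions prefer the weights= argument over pretrained=True)
        if backbone == "mobilenet_v2":
            model = models.mobilenet_v2(pretrained=True)
            model.classifier[1] = nn.Linear(model.last_channel, 2)
        elif backbone == "squeezenet1_1":
            model = models.squeezenet1_1(pretrained=True)
            model.classifier[1] = nn.Conv2d(512, 2, kernel_size=1)
            model.num_classes = 2
        else:
            raise ValueError(backbone)
        return model

    # Fine-tune both on the same data and keep whichever validates better.
    candidates = {name: build_binary_classifier(name)
                  for name in ("mobilenet_v2", "squeezenet1_1")}

The only way to settle MobileNet vs SqueezeNet (or ReLU vs ELU) for a given dataset really is to run the comparison.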


But this is normal with any new technology -- there used to be drive-in movie theaters, and people once imagined putting miniature nuclear reactors in everyday appliances; that doesn't mean there's a car winter coming or a nuclear winter coming (pun intended).

It's absolutely human nature to dream up crazy new ways of using things; most of those ways probably don't work out in the end.


The history of humanity is full of examples of people using technology to prevail over other groups of people. Applications of AI and ML will be no exception: computer vision, game theory, autonomous systems, material design, espionage, cryptography... You name it.

Supraintelligent AI is not required to cause severe problems.


I'm sure they have, but none of those fields have been exploited for marketability, at least nowhere near this degree.


Not sure what you mean, but for the most part AI and ML is being integrated into systems where there's too much data to be processed manually... like satellite/aerial imagery, tapped internet backbones, security cameras, etc.


AI taking over is the biggest threat facing humanity, but I don't think Hawking ever claimed it was imminent; it's likely 1000+ years away.

It should be obvious why a superior intelligence is something dangerous.


"it's likely 1000+ years away."

That seems like a pretty high number when you consider the exponential rate of technological advancement.

1000 years in the future is probably going to be completely unrecognizable to us given the current rate of change in society/tech.


If the response to AI existential threats was "it's too far in the future" then you'd have a point. But usually it's just "AI will obey its programming" or "AI will never be as smart as us".


True exponentials are very rare. What we are used to calling exponential is often at most quadratic, and even then the trend usually holds only over some limited time window.

I wouldn't dare to make prediction about mankind 1000 years from now based on a relatively short time window of technological growth. There are many huge obstacles in the way.


Even the present would be mostly unrecognizable to someone who lived 1000 years ago.

1000 years from now will certainly be even stranger to us.


The day we create a superior intelligence will be the greatest day of humanity. It will be fantastic if there's something beyond humanity. A logical, evolutionary conclusion to us.

All our Darwinian ancestors never had the capacity for intellectual fear of their superior successors...

Hopefully we as a species can get beyond our fear.


That's at least a logical belief, but it's still a threat, because we don't know how it will behave and we'll be helpless to control it.


It would be great if people feared climate change the same way they fear AI takeover.


Climate change is not an existential threat - primitive humans survived for 200+ kyrs when temps were 10°C cooler and thrived during warmer temps.


They weren’t reliant on agriculture to sustain their civilisation.

Sure, the human race is unlikely to die out unless we get to mass-exinction levels of climate change, but our civilisation is a lot more fragile.


If agriculture is the biggest worry, it’s no big deal. Not only will more land enter temperate zones, but I don’t doubt we’ll be able to breed heat resistant varieties fast enough to keep up with 0.05C annual changes. That’s nothing.


The biggest threat facing humanity is 1000+ years away?


AI is already taking over humanity. The news you read, the products you buy, the advice you take, the friends you meet are all partially or fully supported by machine-learned algorithms.

Personally, I do not doubt AI-based methods are changing our language, our communication patterns and our transport infrastructure.

The problem with the statement "AI will take over humanity" is actually in:

- What exactly is AI? There are many definitions. Most researchers adopt the weakest forms, whereas the general public adopts the strongest form.

- What exactly is 'take-over'? Does this mean: in control? Like a dictator is in control over a country? Or: adopting us as slaves? As a gradual change, when does it 'take-over'? At 50%? Does this need to be a conscious action by an AI actor, or would an evolutionary transition suffice?

- What exactly is humanity? I would go for the definition: "the quality or state of being human", but most people probably read in it: "the human race". In the former case, technology is a part of the quality of being human. In Heideggerian fashion, we become the technology and the technology becomes us. Technology, and AI as part of it, has been taking over humanity since we started permanently adjusting our environments.


> People believe them because of their credentials, when professors who actually teach AI will say otherwise

While some experts like Andrew Ng are sceptical of AI risk, there are lots of others like Stuart Russell who are concerned.

Here is a big list of quotes from AI experts concerned about AI risk: http://slatestarcodex.com/2015/05/22/ai-researchers-on-ai-ri...


I mean at least you don't have to worry about Stephen Hawking bothering you any more...


>I get triggered

You should be more considerate than to be throwing around this term.


Author's reasons:

1. Hype dies down (which is really good! It means the chance of a burst is actually lower!)

2. "Doesn't scale" is a false claim. DL methods have scaled MUCH better than any other ML algorithms in recent history (scaling an SVM is no small task). Scaling DL methods is much easier compared to traditional ML algorithms, since the work can be naturally distributed and aggregated.

3. Partially true. But self-driving is a sophisticated area in itself; DL is only one part of it, so it can't really take full credit for its potential future success or full blame for its ultimate downfall.

4. Gary Marcus isn't an established figure in DL research.

An AI winter will ultimately come, but only because people will become more informed about DL's strengths and limits, and thus smarter at telling what is BS and what is not. AGI is likely not going to happen with DL alone, but that in no way means it is a winter. DL has revolutionized the paradigm of machine learning itself; the shift is now complete, it will stay for a very, very long time, and its successor is likely to build upon it rather than subvert it completely.


Author here: I'm using deep learning daily so I have a bit of an idea on what I'm talking about.

1) Not my point. Hype is doing very well, but the narrative is beginning to crack, which is actually indicative of a coming burst...

2) DL does not scale very well. It does scale better than other ML algorithms, but only because those did not scale at all. If you want to know what scales very well, look at CFD (computational fluid dynamics). DL is nowhere near that ease of scaling.

3) Self-driving is the poster child of the current "AI revolution", and it is where by far the most money is allocated. So if that falls, the rest of DL does not matter.

4) Not that this matters, does it?


CFD is good at using big machines "efficiently", but the cost of DNS scales as the cube of the Reynolds number which will never be tractable for most engineering problems. Apart from niche basic research on the edge of tractability, all the effort goes into modeling (RANS, DES, wall, etc.) to deliver statistically calibrated estimates of functionals of interest at feasible cost. Those methods actually don't "scale as well" (though the state of research is ahead of commercial software), but also don't need to because they can solve the problem in less time with less hardware. This situation is actually pretty similar to your DL analogy where more hardware provides diminishing returns for solving the actual problem.


The scaling argument in the article doesn't make any sense. There are rhetorical queries like "does this model with 1000x as many parameters work 1000x as well?" but what it means to scale or perform are not clearly or consistently defined - let alone defined in a way that would make your point about the utility of the advances.

OpenAI's graph shows new architectures being used with more parameters because people are innovating on architecture and scale at the same time. Arguing that old methods "failed to scale" is like arguing that processor development was a failure because Intel had to develop a 486 instead of making a 386 work with more transistors (or more something).

And what does CFD have to do with anything, except maybe an odd attempt to argue from authority? Can you formalize from CFD a notion of "scaling well" that anyone else agrees is useful for measuring AI research?


CFD was merely used as an example of something that does scale well. I'm not sure it was the best example, since CFD isn't very common. But basically you have a volume mesh and each cell iterates on the Navier-Stokes equations. So if you have N processor cores, you break the mesh into N pieces, each of which gets processed in parallel. Doubling the number of cores allows you to process double the amount in the same time, minus communication losses (each section of the mesh needs to communicate the results on its boundary to its neighbors).
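
To make the halo-exchange point concrete, here's a toy 1-D version of that decomposition (plain NumPy, simulating the ranks serially instead of with MPI): each chunk does its local stencil update, and only one boundary value per side has to be shared with a neighbour, which is why the work parallelizes almost perfectly.

    import numpy as np

    def heat_step(u, alpha=0.1):
        """One explicit diffusion update on an array with one ghost cell per side."""
        new = u.copy()
        new[1:-1] = u[1:-1] + alpha * (u[:-2] - 2 * u[1:-1] + u[2:])
        return new

    n, ranks = 64, 4
    full = np.sin(np.linspace(0, np.pi, n))             # whole "mesh" on one machine
    chunks = [c.copy() for c in np.split(full, ranks)]  # the same mesh split across ranks

    for _ in range(100):
        full = heat_step(np.pad(full, 1))[1:-1]         # reference single-domain solve

        # Halo exchange: each rank only needs its neighbours' edge cells.
        padded = []
        for i, c in enumerate(chunks):
            left = chunks[i - 1][-1] if i > 0 else 0.0
            right = chunks[i + 1][0] if i < ranks - 1 else 0.0
            padded.append(np.concatenate(([left], c, [right])))
        chunks = [heat_step(p)[1:-1] for p in padded]

    print(np.allclose(np.concatenate(chunks), full))    # True: splitting changes nothing

Each rank communicates a fixed two values per step no matter how big its chunk is; that is the property being called "scales well" here.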

I don't fully understand the graph, but it looks like his point is that AlphaGo Zero uses 1e5 times as many resources as AlexNet, but does not produce anywhere near 1e5 times better results. We saw that with CFD, 1e5 more cores resulted in 1e5 times better results (= it scales). The assertion is that DL's results are much less than 1e5 times better, hence it does not scale.

Basically the argument is:

1. CFD produces N times better results given N times more resources [this is implied, requires a knowledge of CFD]. That is, f(a*x) = a * f(x), or equivalently f(a*x) = b * a * f(x) with b = 1.

2. Empirically, we see that DL has used 1e5 more resources, but is not producing 1e5 times better results. [No quantitative analysis of how much better the results are is given]

3. Since DL has f(a * x) = b * a * f(x), where b < 1, DL does not scale. [Presumably b << 1 but the article did not give any specific results]

This isn't a very rigorous argument and the article left out half the argument, but it is suggestive.
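
A quick back-of-the-envelope helper for that b, with purely illustrative numbers (the article gives none):

    def scaling_efficiency(resource_ratio, improvement_ratio):
        """b in f(a*x) = b * a * f(x): 1.0 means perfect scaling, << 1 means poor."""
        return improvement_ratio / resource_ratio

    print(scaling_efficiency(1e5, 1e5))  # CFD-style ideal scaling -> 1.0
    print(scaling_efficiency(1e5, 10))   # hypothetical "10x better with 1e5x compute" -> 1e-4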


Thanks for that, that is essentially my point. Agreed it is not very rigorous, but it gets the idea across. By scalable we'd typically mean "you throw more GPUs at it and it works better by some measure". Deep learning does that only in extremely specific domains, e.g. games and self-play as in AlphaGo. For the majority of other applications it is architecture-bound or data-bound. You can't just throw in more layers and more basic DL primitives and expect better results. You need more data, and more PhD students to tweak the architecture. That is not scalable.


More compute -> more precision is just one field's definition of scalable... Saying that DNNs can't get better just by adding GPUs is like complaining that an apple isn't very orange.

To generalize notions of scaling, you need to look at the economics of consumed resources and generated utility, and you haven't begun to make the argument that data acquisition and PhD student time hasn't created ROI, or that ROI on those activities hasn't grown over time.

Data acquisition and labeling is getting cheaper all the time for many applications. Plus, new architectures give ways to do transfer learning or encode domain bias that let you specialize a model with less new data. There is substantial progress and already good returns on these types of scalability which (unlike returns on more GPUs) influence ML economics.


OK, the definition of scalable is crucial here and it causes lots of trouble (this is also response to several other posts so forgive me if I don't address your points exactly).

Let me try once again: an algorithm is scalable if it can process bigger instances by adding more compute power.

E.g. I take a small perceptron and train it on a Pentium 100, then take a perceptron with 10x the parameters on a Core i7 and get better output by some monotonic function of the increase in instance size (it is typically a sublinear function, but that is OK as long as it is not logarithmic).

DL does not have that property. It requires modifying the algorithm, modifying the task at hand, and so on. And it is not that it requires some tiny tweak; it requires quite a bit of tweaking. I mean, if you need a scientific paper to build a bigger instance of your algorithm, that algorithm is not scalable.

What many people here are talking about is whether an instance of the algorithm can be created (by a great human effort) in a very specific domain to saturate a given large compute resource. And yes, in that sense deep learning can show some success in very limited domains. Domains where there happens to be a boatload of data, particularly labeled data.

But you see there is a subtle difference here, similar in some sense to difference between Amdahl's law and Gustafson's law (though not literal).

The way many people (including investors) understand deep learning is this: you build a model A, show it a bunch of pictures, and it understands something from them. Then you buy 10x more GPUs, build a model B that is 10x bigger, show it those same pictures, and it understands 10x more from them. Look, I and many people here understand this is totally naive. But believe me, I have talked to many people with big $ who have exactly that level of understanding.
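
One way to pin the definition down empirically: keep the algorithm fixed, grow only the model size (standing in for "more compute"), and see what the extra capacity actually buys. A toy sketch with scikit-learn (the widths and dataset are arbitrary, nothing from the article):

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for width in (8, 80, 800):  # "10x more parameters", twice over
        clf = MLPClassifier(hidden_layer_sizes=(width,), max_iter=2000, random_state=0)
        clf.fit(X_tr, y_tr)
        print(f"width={width:4d}  test accuracy={clf.score(X_te, y_te):.3f}")

Under the definition above, "scalable" would mean that last number keeps improving meaningfully as the width (and the hardware to train it) grows; in practice the curve tends to flatten quickly unless you also change the data or the architecture.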


I appreciate the engagement in making this argument more concrete. I understand that you are talking about returns on compute power.

However, your last paragraph about how investors view deep learning does not describe anyone in the community of academics, practitioners and investors that I know. People understand that the limiting inputs to improved performance are data, followed closely by PhD labor. Compute power is relevant mainly because it shortens the feedback loop on that PhD labor, making it more efficient.

Folks investing in AI believe the returns are worth it due to the potential to scale deployment, not (primarily) training. They may be wrong, but this is a straw man definition of scalability that doesn't contribute to that thesis.


You’re arguing around the point here.

Almost all research domains live on a log curve; a little bit gets you a lot to start with, but eventually you exhaust the easy solutions and a lot of work gets you very little improvement.

You’re arguing we haven’t reached the plateau at the top yet, but you’ve offered no meaningful evidence that is the case.

There are real world indicators that we are reaching diminishing returns for investment in compute and research now.

The ‘winter’ becomes a thing when it becomes apparent to investors that their financial bets are based off nothing more concrete than opinions like yours, when they don’t work out.

Are we there yet? Not sure, myself, I think we can get some more wins from machine generated architectures... but I can’t see any indication that the ‘winter’ isn’t coming sooner or later.

Investment is massively outstripping returns right now... we’ll just have to see if that calms down gradually, or pops suddenly.

History does not have a good story to tell about responsible investors behaving in a reasonable manner and avoiding crashes.


Thanks for taking the time to render the more specific argument! I still don't think this is suggestive in a way that should influence readers. Here are some ways in which a naive "10x resources != 10x improvement" argument can err:

- Improvement is hard to define consistently. Sometimes, improving classification accuracy by 0.5% means reducing error by 20%, and enables economic applications that have 100x the value or frequency of use (see the quick arithmetic sketch after this comment).

- Resources used in training can be amortized over billions of times the same model is reused (much more cheaply). So even achieving an epsilon improvement in the expected utility of each inference can justify a massive increase in training cost.

- Some other notions of "better results" or "less expensive" include amount of training data required, social fairness of results, memory required or power used during inference, and so on. And there are major advances in current research on each of these better formalized axes!

That last bit is what is so frustrating in reading an article like this. The author is sweeping aside with vague arguments a great deal of work that has been written and justified to a much much higher standard of rigor (not just the VCs we all like to snark about). Readers should beware of trusting a summary like this without engaging directly with the source material.
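
The quick arithmetic sketch promised in the first bullet, e.g. going from 97.5% to 98.0% accuracy:

    acc_before, acc_after = 0.975, 0.980          # +0.5 points of accuracy...
    err_before, err_after = 1 - acc_before, 1 - acc_after
    print(f"relative error reduction: {(err_before - err_after) / err_before:.0%}")  # ...is 20% less error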


I certainly encourage everybody to consult the source material! Man, this is a blog, opinion by default not perfect.

But when I hear the keyword "major advances" I'm highly suspicious. I have already seen so many such "major advances" that never went beyond a self-citing clique.


As a very concrete "major advance" consider Google Translate's tiny language models [1] that can beam to your phone, live in a few megabytes, and translate photographed text for you with low power usage. This was done with incredibly expensive centralized training, but checks every meaningful box for "scalable" AI.

[1] https://ai.googleblog.com/2015/07/how-google-translate-squee...


This is a seriously flawed depiction of CFD.


2) Why do you think DL doesn't scale? I am curious. It can easily leverage thousands of GPUs, training on 300 million images (https://ai.googleblog.com/2017/07/revisiting-unreasonable-ef...). No other method comes close to leveraging that amount of computational power. I don't really know about CFD, but at least in ML land, dealing with ML problems, DL is very scalable, second maybe only to random-forest-style algorithms, which effectively share nothing.

3) It does matter. In fact, the most valuable startups around DL are CV-based startups, though they are mainly located in China.


What do you think of advances like those in major DeepMind papers? They seem to represent significant shifts in capabilities once matured.

Here's a recent example:

Unsupervised Predictive Memory in a Goal-Directed Agent

https://news.ycombinator.com/item?id=17177442


This paper is amazing, and exactly what I was thinking of posting in response to that part of the article. The amount of research Deepmind is putting out is astonishing, and even if you are paying attention it's hard to keep up with it all. Here are just a few papers I've been looking at from the last few months, maybe none of them are advancements on the level of AlphaGo & Zero but they still show significant progress in a wide variety of areas.

https://arxiv.org/abs/1802.10542

https://arxiv.org/abs/1802.07740

https://www.nature.com/articles/s41586-018-0102-6

https://arxiv.org/abs/1804.09401

https://arxiv.org/abs/1805.06370

https://arxiv.org/abs/1802.03006

https://arxiv.org/abs/1804.08617

https://arxiv.org/abs/1802.01561

And there are many others besides these, not to mention all the significant research being done by everyone else who isn't at DeepMind. The author's idea that interest in and development of these topics is dying down, or that DeepMind is running out of meaningful research to do, just seems uninformed.


"Author here: I'm using deep learning daily so I have a bit of an idea on what I'm talking about."

It's very weak to appeal to authority. The only real argument I can find against DL/ML/AI at the moment is the continuing appeal to authority by PhDs who have zero engineering knowledge, zero business sense, and zero understanding of risk assessment.


I’m not sure how the hype wagon started but I for one am glad it’s about to pop.

I am working on (founded) a startup, and while we have AI on the roadmap about a year out, it isn't something that's central to our product. (We already use some ML techniques, but I wouldn't confidently boast it's the same thing as AI.)

Cue an informal lunch with a VC guy who takes a look, says we’re cool and tells us just to plaster the word AI in more places - he was sure we could raise a stupendous sum of cash doing that.

As an AI enthusiast I was bothered by this. We have everyone and their mother hyping AI into areas it’s not even relevant in, let alone effective at.

A toning down would be healthy. We could then focus on developing the right technology slowly and without all the lofty expectations to live up to.


> ...just to plaster the word AI in more places...

I bet that's how the new "AI-assisted" Intellisense in Visual Studio got greenlit:

https://blogs.msdn.microsoft.com/visualstudio/2018/05/07/int...

If an AI-infested text editor isn't a sure sign that the bubble is going to pop soon then I don't know ;)


I think part of this ridiculousness may just be that the common lexicon doesn't have enough words for AI. "AI-assisted Intellisense" at the moment seems to boil down to a marginally novel way of ordering autocomplete suggestions, and like...

It's not wrong? It's neat, it's potentially useful, and it's powered by something under the umbrella of "AI". The problem is that that umbrella is gigantic, and covers everything from the AI system providing routes for the AI system in a self-driving truck on AI-provided schedules for a hypothetical mostly-automated shipping business, to a script I whipped up in ten minutes to teach something 2+2 and literally nothing else.

So we get to the nonsense position where there isn't a better way to describe a minor improvement to what is essentially the ordering of a drop-down list except by comparison to the former example.


AI winter is not on its way. We constantly get new breakthroughs and there's no end in sight. For example, in the last year a number of improvements in GANs were introduced. This is really huge, since GANs are able to learn a dataset's structure without explicit labels, and this is a large bottleneck in applying ML more widely.

IMO, we are far away from AGI, but even current technologies applied widely will lead to many interesting things.
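
To make the "no explicit labels" point concrete, here is about the smallest GAN that still does the GAN thing: both networks only ever see unlabeled samples (a 1-D Gaussian here), yet the generator ends up reproducing the data distribution. A rough PyTorch sketch, not tuned for anything:

    import torch
    import torch.nn as nn

    G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))               # generator
    D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid()) # discriminator
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    bce = nn.BCELoss()

    for step in range(3000):
        real = torch.randn(64, 1) * 0.5 + 3.0     # the unlabeled "dataset": N(3, 0.5)
        noise = torch.randn(64, 8)
        fake = G(noise)

        # Discriminator learns real -> 1, fake -> 0 (the only "label" is real vs generated).
        opt_d.zero_grad()
        d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
        d_loss.backward()
        opt_d.step()

        # Generator learns to fool the discriminator.
        opt_g.zero_grad()
        g_loss = bce(D(G(noise)), torch.ones(64, 1))
        g_loss.backward()
        opt_g.step()

    samples = G(torch.randn(1000, 8))
    print(samples.mean().item(), samples.std().item())  # mean should drift toward ~3.0; the spread is usually rougher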


I sure agree there are many interesting things going on, there is no question about that. But most of them are toy problems focused on some restricted domain, while a huge bag of equally interesting real-world problems is sitting untouched. And let me tell you, all those VCs that put in probably way north of $10B are not looking forward to more NIPS papers or yet another style-transfer algorithm.


>Also most of them are toy problems focused in some restricted domains, while a huge bag of equally interesting real world problems is sitting untouched.

It always starts with toy problems. Recognizing pictures from imagenet was also a toy problem back then.


"Deepmind hasn't shown anything breathtaking since their Alpha Go zero"

... what about when the Google assistant near perfectly mimicked a human making a restaurant reservation .... the voice work was done at DeepMind.

All the problems in AI haven't been solved yet? Well no, of course not. Limitations exist and our solutions need to be evolved.

I think perhaps the biggest constraint is requiring huge amounts of training data to solve problem X. Humans simply don't need that, which must be some indication that what we're doing isn't quite right.


what about when the Google assistant near perfectly mimicked a human making a restaurant reservation

Any sufficiently advanced technology is indistinguishable from a rigged demo


The Google Duplex demo was definitely very impressive. The only caveat I can think of is that they didn't say how often it works. That might have been the one success out of hundreds of calls.


> what about when the Google assistant near perfectly mimicked a human making a restaurant reservation

That seemed pretty staged.


> Humans simply don't need that, which must be some indication that what we're doing isn't quite right.

Not really. DNNs are much simpler and need to be much more specialized towards specific tasks than the human brain. They're more like a nematode that was optimized by evolution for millions of years to tell cats and dogs apart because it preferentially infects dogs. It's the only thing it does. It does it via some shortcuts and it won't be able to learn chess without a radical redesign.

Not to mention that primate brains take some time to get bootstrapped. A toddler has to be fed visual input for many months before they can be left unsupervised for reasonable amounts of time. And have you seen what those "optical illusion" things do? It's a miracle that those humans can self-navigate at all considering those failure modes!


Disclaimer: I am a lay technical person and don't know much about AI.

I find this article somewhat condescending. I look at all the current development as stepping stones to progress, not an overnight success that does everything flawlessly. I imagine the future might be some combination of different solutions, and what the author proposes may or may not play a part in it.


I agree. Overhype is annoying, but it happens with every technological advance. So does the inevitable backlash jump from "some people claim too much" to "this whole field is a bubble with no real gains". Both are cliched viewpoints that give a false sense of being "in the know" without helping anyone navigate change effectively.


it's not a stepping stone, if you look closely it's a dead end


I don't see how systematically accurate image classifiers and facial recognition systems built on deep learning is a 'dead end'. Products are products. If deep learning has led to actual profits in actual companies, it's not a dead end. As to whether this leads to AGI is a completely different question.


The point is that the profits may not be as grand as the current level of hype may indicate

Edit: additionally it could be a dead end because the hype tends to narrow the directions we explore with ML. If everyone is obsessing about DL, we could be infuriatingly ignoring other research directions right under our noses.


What do you see exactly when you look closely and how do you know it is dead end?


An AI "winter" is a long period in which [edit: funding is cut because...] researchers are in disbelief about having a path to real intelligence. I think that is not the case at this time, because we have (or approaching) adequate tools to rationally dismiss that disbelief. The current AI "spring" has brought back the belief that connectionism may by the ultimate tool to explain the human brain. I mean you can't deny that DL models of vision look eerily like the early stages in visual processing in the brain (which is a very large part of it). Even if DL researchers lose their path in search for "true AI", the neuroscientists can keep probing the blueprint to find new clues to its intelligence. Even AI companies are starting to create plausible models that link to biology. So at this time, it's unlikely that progress will be abandoned any time soon.

E.g. https://arxiv.org/abs/1610.00161 https://arxiv.org/abs/1706.04698 https://www.ncbi.nlm.nih.gov/pubmed/28095195


No, AI winter was when the AI people oversold the tech, then failed to deliver, and lost their funding. This is well documented in histories of the field.


I think the scientific pessimism preceded the funding cuts:

https://en.wikipedia.org/wiki/AI_winter


The brain has a number of functional parts that we don't understand all that well. Research on the brain hits a wall every now and then, but you never hear the phrase "Neuroscience Winter".

We're starting to train models that match biological brain behavior, at least in some crude functional/structural sense. Maybe this is just a string of coincidences, but my guess is that discoveries of analogous biological/model components will continue to happen, and we'll be able to learn more about AI and the brain by linking related phenomena in vitro and in silico.

A few more recent examples:

https://deepmind.com/blog/grid-cells/

https://www.nature.com/articles/nature04485


Neuroscience has been in constant winter, mainly because the methods are too crude and small-scale: observing 100 neurons out of billions makes it impossible to tell the whole story. The DeepMind papers are interesting, but they are just a start imho. It may be that they are focusing on a mere coincidence. Nevertheless it is exciting to see progress in that direction; that's why I don't think DL research is going to lose steam soon. It would require a major show-stopper discovery (like the Minsky-Papert critique of perceptrons). Tesla crashes are no such thing.

It seems CS people are sick and tired of the hype, but i feel neuroscientists are now warming up to it.


I'd like to see neuroscience take more of a fundamental role in grounding/situating deep learning approaches. VGG is often mentioned as being roughly analogous to the visual cortex, but it differs in important ways. There are all kinds of "why" questions there. E.g., why does deep learning work better with pooling and activations like ReLU, while biological models rely on inhibitory mechanisms? Theory on why certain activations work better than others in DL is a little weak imho. Right now ML practitioners just throw a lot of parameter combinations at the wall and see what sticks. That's fine, but it's not really indicative of a robust understanding of model behavior.
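
The "throw combinations at the wall" workflow really is only a few lines, which is part of why it wins over theory in practice. A sketch with an arbitrary small search space and synthetic data (nothing here is anyone's actual setup):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

    grid = GridSearchCV(
        MLPClassifier(max_iter=2000, random_state=0),
        param_grid={
            "activation": ["relu", "tanh", "logistic"],   # which nonlinearity "sticks"
            "hidden_layer_sizes": [(32,), (64,), (64, 64)],
        },
        cv=3,
    )
    grid.fit(X, y)
    print(grid.best_params_, round(grid.best_score_, 3))

Whatever wins here tells you nothing about why it won, which is exactly the gap a more neuroscience-grounded theory might fill.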


I'm actually of the opposite opinion: let the two fields evolve by their own Darwinian processes, as this will yield more interesting results. DL itself has created its own scientific questions and puzzles that may lead to important discoveries (which may transfer to neuroscience), e.g. "why does batch normalization / dropout work?"


That’s not what the AI Winter was. It was when anything that used the term “AI” (including academic research) was unfundable.


From what I’ve read of the last AI winters, funding was more centralised, & projects were larger and fewer. Hardware was scarcer.

These days anyone with a few dollars to spend on compute time and some free software can do machine learning. I can’t see that going away.


Author here: seriously, I'm on the front page for the second day in a row!?

The sheer viral popularity of this post, which really was just a bunch of relatively loose thoughts, indicates that there is something in the air regarding an AI winter. Maybe people are really sick of all that hype pumping...

Just a note: I'm a bit overwhelmed so I can't address all the criticism. One thing I would like to state, however, is that I'm actually a fan of connectionism. I think we are doing it naively though, and instead of focusing on the right problems we inflate a hype bubble. There are applications where DL really shines and there is no question about that. But in the case of autonomy and robotics we have not even defined the problems well enough, not to mention solving anything. Unfortunately, those are the areas where most of the bets/expectations sit, and therefore I'm worried about the winter.


Do you think there's a balance to be struck between connectionism and computationalism?


The fact you use the word connectionism makes you alright in my book :)


The argument is that self-driving won't work because Uber and Tesla had well-publicized crashes. But I don't see how this tells us anything about other, apparently more cautious companies like Waymo. There seem to be significant differences in technology.

More generally, machine learning is a broad area and there's no reason to believe that different applications of it will all succeed or all fail for similar reasons. It seems more likely there will be more winners along with many failed attempts.


The argument is that self-driving won't work because Uber and Tesla had well-publicized crashes. But I don't see how this tells us anything about other, apparently more cautious companies like Waymo. There seem to be significant differences in technology.

Yes. I've been saying this for a while. Waymo's approach is about 80% geometry, 20% AI. Profile the terrain, and only drive where it's flat. The AI part is for trying to identify other road users and guess what they will do. When in doubt, assume worst case and stay far away from them.

I was amazed that anyone would try self-driving without profiling the road. Everybody in the DARPA Grand Challenge had to do that, including us, because it was off-road driving and you were not guaranteed a flat road. The Google/Waymo people understood this. Some of the others just tried dumping the raw sensor data into a deep learning system and getting out a steering wheel angle. Not good.
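
The geometry part really is that unglamorous: bin the range returns into ground cells and only call a cell drivable if it's flat enough. A toy sketch (synthetic points and made-up thresholds; real systems obviously do far more):

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic lidar returns: mostly flat ground, plus a raised obstacle in one corner.
    pts = rng.uniform(0, 20, size=(5000, 2))            # x, y in metres
    z = rng.normal(0, 0.02, size=5000)                  # flat road with 2 cm of noise
    z[(pts[:, 0] > 15) & (pts[:, 1] > 15)] += 0.25      # a curb-sized bump

    cell = 1.0          # 1 m grid cells
    max_spread = 0.10   # call a cell drivable if heights vary by < 10 cm

    ix = (pts[:, 0] // cell).astype(int)
    iy = (pts[:, 1] // cell).astype(int)
    drivable = np.ones((20, 20), dtype=bool)
    for cx in range(20):
        for cy in range(20):
            heights = z[(ix == cx) & (iy == cy)]
            if heights.size and heights.max() - heights.min() > max_spread:
                drivable[cx, cy] = False

    print(drivable.sum(), "of 400 cells look flat enough to drive on")

No learning anywhere in that loop, which is the point: the learned part sits on top, for classifying whatever is occupying the non-flat cells.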


A lot of companies fear the ship's leaving without them. They try to rush ahead without thinking and then crash. That's what it feels like every time a car company or another tech company says they're going to build the next self-driving car. I don't know why, but I always feel like Waymo is already 10,000 miles ahead.


Seriously. Frankly, based on Uber's culture I would have been surprised if they didn't kill at least one person with their self-driving efforts. It's a total non-data point. The fact that Uber got as far as they did without killing anyone is strong evidence that the problem is tractable.

As for Tesla - Tesla isn't even trying to make proper self-driving cars. Tesla's goal has always been assisted driving. However you feel about that, it's really not relevant to the success or failure of self-driving cars.

OP can't possibly have been ignorant of the fact that Waymo is the clear leader here with a substantial head start, and a proven record (and an actual fleet of self driving cars now on the road), and yet he chose not to mention it. That really undermines his credibility for me - he seems clearly more interested in making his point than in accurately engaging with reality.


Honestly, I think this is a good thing for both AI researchers and AI practitioners. One man's AI winter is another man's stable platform.

While the number of world-shattering discoveries using DL may be on the decline (ImageNet, Playing Atari, Artistic Style Transfer, CycleGAN, DeepFakes, Pix2Pix etc), now both AI researchers and practitioners can work in relative peace to fix the problem of the last 10%, which is where Deep Learning has usually sucked. 90% accuracy is great for demos and papers, but not even close to useful in real life (as the Uber fiasco is showing).

As an AI practitioner, it was difficult to simply keep up with the latest game-changing paper (I have friends who call 2017 the Year of the GAN!), only to later discover new shortcomings of each. Of course, you may say, why bother keeping up? And the answer is simply that when we are investing time to build something that will be in use 5-10 years from now, we want to ensure the foundation is built upon the latest research, and the way most papers talk about their results makes you believe they are best suited for all use cases, which is rarely the case. But when the foundation itself keeps moving so fast, there is no stability to build upon at all.

That and what jarym said is perfectly true as well.

The revolution is done; now it's time for the evolution of these core ideas into actual value generation, and I for one am glad about that.


AI winter? Hardly. Current methods have only been applied to a very tiny fraction of problems that they can help solve. And, this trend will only accelerate until computing resources become too expensive.

As long as there is ROI, AI projects will continue to be financed, top thinkers around the world will be paid to do more research, and engineers will implement the most recent techniques into their products and services to stay competitive. This is a classic feedback system that results in exponential progress.


This seems overly negative. Take just the opening argument: companies were saying that a fully self-driving car was very close, but "this narrative begins to crack."

Yet here they are self driving https://www.youtube.com/watch?v=QqRMTWqhwzM&feature=youtu.be and you should be able to hail one as a cab this year https://www.theregister.co.uk/2018/05/09/self_driving_taxis_...


You sure about that? https://www.youtube.com/watch?v=8IqpUK5teGM

Tesla self-driving cars have crashed too. Arrogant people like Elon Musk are giving a bad name to the hardworking AI developers who are actually trying to make self-driving cars fault-proof.


That vid of the Uber crash - just because one company has a crap product doesn't mean they are all bad. Waymo is about the only one that seems about ready to go and I think partly because they don't just count on deep learning.


Like I said, Uber and Tesla have both had reported crashes. If two companies of that status, with their funding abilities, are potentially dangerous, I don't doubt that any smaller company would be too.


The video is PR.

So far, no independent journalist or reviewer was allowed to test the system with the exception of a few tightly controlled rides.

Don't take it too seriously.


I thought there would be more of a backlash / winter onset when people realize that Alexa is so annoying to deal with (you basically have to learn a set of commands) because AI isn't that clever yet. Also when people realize that autocorrect took a dive at handling edits once Google put a neural net in charge. (No! Stop deleting random words and squishing spaces during edits.)

In other words, I figured it would be the annoyance at things that "should be easy by now" that would get Joe CEO thinking, "Hm. Maybe this isn't such a good investment," once measurements are made and reliable algorithmic results attract and keep more users than narrowly trained, finicky AIs.

I don't want there to be an AI winter, and it won't be as bad as before. There are a lot of applications for limited-scope image recognition and other tasks that we couldn't do before. Unfortunately, I do agree with the post that winter is on its way.


The OP is obviously not keeping up with the field and has a lot to learn about the scientific approach. He basically uses the count of tweets from Andrew Ng and crashes from risk-taking companies as indicators of an "AI winter". He should have tried looking into metrics such as the number of papers, the number of people getting into the field, the number of dollars in VC money, the number of commercial products using DL/RL, etc. But you see, that's a lot of work, and your conclusion might not align with whatever funky title you had in mind. Being an armchair opinion guy throwing out link-bait titles is much easier.


I'll happily read your next post where you include all of those. In fact, the amount of VC money spent in this field would only support my claim. And the number of papers is irrelevant: there were thousands of papers about Hopfield networks in the 90s, and where are they all now? You see, all the things you point out are the surface. What really matters is that self-driving cars crash and kill people, and no one has any idea how to fix that.


I think the most important question is what 'winter' really means in this context. New concepts in AI tend to follow the hype cycle, so the disillusionment will certainly come. One issue is that the general public sees the amazing things Tesla or Google do with deep learning and extrapolates, thinking we're on the brink of creating artificial general intelligence. The disappointment will be even bigger if DL fails to deliver on promises like self-driving cars.

Of course the situation now is different than 30 years ago because AI has proved to be effective in many areas so the research won't just stop. The way I understand this 'AI winter' is that deep learning might be the current local maximum of AI techniques and will soon reach the dead end where tweaking neural networks won't lead to any real progress.


AI winter is not "on its way". There is AI hype and anti-AI hype, and then there is actual practice. This article is anti-AI hype, just as bad as its opposite. In practice there are tons of useful applications. We haven't even begun to apply ML and DL to all the problems laying around us, some of which are quite accessible and impactful.

The hype cycle will pass with time, when we learn to align our expectations with reality.


> AI winter is not "on its way". There is AI hype and anti-AI hype, and then there is actual practice. This article is anti-AI hype, just as bad as its opposite. In practice there are tons of useful applications.

I don't think the article was saying that AI isn't useful, but just that deep learning specifically is not an AI panacea, and that the current hype around AI is on its way out. The hype dying down and the associated buzzwords starting to repel money instead of attract it is all that's meant by AI winter, I believe, not that we'll run out of places where the techniques would be useful.


Give us a list then, because what I’m seeing is shoehorning “AI” into everything but not with significant results.


It's not my job to supplement the press or to do paper reading for you. If you're really interested in AI and don't just read the clickbait press, then open Arxiv and look there.

http://www.arxiv-sanity.com/

If, on the other hand, your opinion that AI is in a winter has been already decided without reading the latest scientific papers, then there's nothing I can say to you that will change your mind.


I think you will see the "hype" cycle last for a very, very long time.

It is a very different thing when the computer can start doing things that only humans could do.

Look at all the drama after Google did the Duplex demo.

We have barely even got started with self driving cars.

But ultimately it is about money and the profits that can be made from AI/ML are just huge. So that will fuel the hype.

But AI also changes the calculus of companies competing. Before, a product was sold and deteriorated over time.

Now a product is sold and gets better over time, increasing moats and making it much more difficult to compete. So AI becomes a far more valuable thing than perhaps anything we have had in the past.


I think we will have an AI winter once we see the true limitations standing between us and a level 5, fully autonomous self-driving car. The other thing we will see happen is the deflation of the AdTech bubble. Once both of these events occur, that should start the AI winter.


I agree. The AdTech sphere is keeping the current hype alive more than anything else. There's some obvious imbalances in AdTech that should lead it to a damning end soon enough.


AI and machine learning are tools. Like any other tool, they're perfect for some problems and don't work well for others. Pick the right tool for the problem that you are working on. Don't follow the hype and don't use AI/ML just for the sake of using it.



I think a lot of GOFAI approaches ought to be revisited to see whether they benefit from the new perceptual and decision capabilities of Deep Learning systems. Alex Graves's papers are particularly good at this.

Things like this reinforcement learner for theorem proving are pretty exciting possibilities. https://arxiv.org/pdf/1805.07563v1.pdf


I can recommend his new book, "The Book of Why", very highly. Even though I am very familiar with Bayes nets, I discovered that a lot of progress has been made on that side of AI.


There's a lot of good stuff coming from research in AI these days. Still, I think the author's right.

As with the onset of the previous AI winter a generation ago, the problem is this: Once a problem gets solved (be it OCR or Bayesian recommendation engines or speech recognition or autocomplete or whatever) it stops being AI and starts being software.

As for self-driving cars: I recently took a highway trip in my Tesla Model S. I love adaptive cruise control and steering assistance: they reduce driver workload greatly. But, even in the lab-like environment of summertime limited access highways, driverless cars are not close. Autosteer once misread the lane markings and started to steer the car into the side of a class 8 truck. For me to sit in the back seat and let the car do all the work, that kind of thing must happen never.

Courtesy is an issue. I like to exit truck blind spots very soon after I enter them, for example. Autosteer isn't yet capable of shifting slightly to the left or right so a driver ahead can see the car in a mirror. Maybe when everything is autonomous that won't be an issue. But how do we get there?

Construction zones are problems too: lane markings are confusing and sometimes just plain wrong, and the margin for error is much less. Maybe the Mobileye rig in my car can detect orange barrels, but it certainly doesn't detect orange temporary speed limit signs.

This author is right. AI is hype-prone. The fruits of AI generally function as they were designed, though, once people stop overselling them.


While I basically agree, really it ought to be called "AI autumn is well on its way", since I'm not sure we're into actual winter (i.e. dramatic reduction in $$ available for research) quite yet. But, probably soon.


Author here, yeah, it is the autumn. But I guess not many people would recognize the meaning, winter on the other hand is not ambiguous...


True.


"it is striking that the system spent long seconds trying to decide what exactly is sees in front (whether that be a pedestrian, bike, vehicle or whatever else) rather than making the only logical decision in these circumstances, which was to make sure not to hit it."

That is striking. It always sort of bothered me that AI is really a big conglomeration of many different concepts. What people are working on is deep learning for machines, but we think that means "replicating human skill/behavior". It's not. Machines will be good at what they are good at, and humans good at what they're good at. It's an uphill battle if your expectation is for a machine that processes like a human, because the human brain does not process things like computer architectures do.

Now, if some aspiring scientist wanted to skip all that and really try to replicate (in a machine) how the human brain does things, I think such a person would be starting from a very different perspective than even modern AI computing.


That's why Augmented Intelligence is a better term. It doesn't conjure up visions of Skynet or HAL 9000 run amok. Nor does it promise utopian singularity right around the corner.

It just means better tools to increase human capacity. But it's not nearly as good at getting headlines in the media.


We call it deep learning, but it is deep pattern matching. Extremely useful, but don't expect it to result in thinking machines.


Are our brains magic? If they aren't then surely they must be doing something that we can reproduce. We've built so many things that we considered "thinking machines" in the recent past (realistic speech synthesis, image recognition and captioning, human-level translation, elaborate recommender systems, robust question answering) on "deep pattern recognition".


Brains are not magic, and will be reproduced eventually, but DNNs are a fundamentally weaker architecture and won't be enough. Neural nets can solve some problems that brains can solve easily and lots of other ML methods couldn't solve, which is great. But the space of problems that brains can solve and neural nets can't is still rich, and will remain so until better methods are developed.


I believe we can make thinking machines. And "deep pattern matching" will be part of it. But it will need to fit in another architecture, most likely a lot like (deep) reinforcement learning. Probably with strong focus on modeling and predicting the environment. But also with some kind of goal modeling.

See also for this not-magic: https://psyarxiv.com/387h9


The discussion on radiology is extremely sloppy.

Andrew Ng claimed human level performance on one radiology task (pneumonia). This claim seems to hold up pretty well as far as I can tell. Then the person criticizing him on twitter posts results on a completely different set of tasks which are just baseline results in order to launch a competition. These results are already close to human level performance, and after the competition it's very possible they will exceed human level performance.

Yes it's true that doing well at only Pneumonia doesn't mean that the nets are ready to replace radiologists. However, it does mean that we now have reason to think that all of the other tasks can be conquered in a reasonably short time frame such that someone going into the field should at least consider how AI is going to shape the field going forward.
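
For anyone unfamiliar with how these "beats the radiologist" comparisons are usually made: you compute the model's ROC curve and check whether the human reader's single sensitivity/specificity operating point falls below it. A hedged sketch with entirely made-up numbers (synthetic labels; this reflects the general method, not the actual study):

    import numpy as np
    from sklearn.metrics import roc_auc_score, roc_curve

    rng = np.random.default_rng(0)

    # Synthetic test set: 1 = pneumonia, 0 = not; model scores are just noisy labels.
    y_true = rng.integers(0, 2, size=1000)
    scores = y_true + rng.normal(0, 0.8, size=1000)

    auc = roc_auc_score(y_true, scores)
    fpr, tpr, _ = roc_curve(y_true, scores)

    # A (made-up) human reader operating point: 75% sensitivity at 90% specificity.
    human_sens, human_spec = 0.75, 0.90
    model_sens_at_human_spec = np.interp(1 - human_spec, fpr, tpr)

    print(f"model AUC: {auc:.3f}")
    print(f"model sensitivity at {human_spec:.0%} specificity: {model_sens_at_human_spec:.2f} "
          f"vs human {human_sens:.2f}")

The honest version of the comparison also needs confidence intervals and multiple readers, which is exactly where a lot of the sloppiness creeps in.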


Well, the breathless hype around deep learning (with and without reinforcement learning) is bound to subside sooner or later, and attendance at staid academic conferences like NIPS will sooner or later revert to a smaller group of academics and intellectuals who are truly interested in the subject over the long term.[a] That much is certain.

But we're still in the early stages of a gigantic wave of investment over the next decade or two, as organizations of all sizes find ways to use deep learning in a growing number of applications. Most small businesses, large corporations, nonprofits, and governments are not using deep learning for anything yet.

[a] https://twitter.com/lxbrun/status/908712249379966977


Well, now that the cat's out of the bag in regards to AI/ML, we can all get in on the ground floor of the next hype wave - quantum computing!


IMHO quantum computing is as overhyped as cold fusion and shares some of its properties. Until "quantum supremacy" arrives, or something else shows a real speedup, we won't hear much from it.


Cold Fusion was outright scientific misconduct. I'm not optimistic about QC working as intended, but I think the hope around it is honest.


Stopped reading after the first half. The evidence for the idea that deep learning is failing is that DeepMind hasn't produced anything revolutionary since AlphaGo Zero, which was published not even a year ago? And that performance doesn't scale linearly with the number of parameters? And speculation about why LeCun made a certain career decision? Not very convincing.


You're forgetting that Andrew Ng is tweeting 30% less this year! Isn't that enough to convince even the staunchest critic?


Only tangentially related to the article, but it's always struck me as a little unethical that Demis Hassabis' name goes on every paper written by DeepMind. No one produces that much research output.


No, but wait! We're just on the verge of replacing doctors! ;-)

There's still a lot of space for the improvement of "curve-fitting" AI in the workplace. The potential of existing tech is far from being thoroughly exploited right now. I believe the next big improvements will come more from better integration in the workplace (or road system) than new scientific advances, so that might seem less sexy. But I also believe this will be a sufficient impetus to drive the field forward for the years to come.


I would not call it the “AI winter”. If you look at what people have called AI over time, the definition and the approaches have evolved (sometimes drastically) over time.

Instead of being stuck on the fact that deep learning and the current methods seem to have hit a limit, I'm actually excited that this opens the door to experimenting with other approaches that may or may not build on top of what we call AI today.


Yeah, the problem is that deep learning sucked up a bunch of money - essentially took a loan against the future in the form of VC investments. And if that loan does not get paid back, for the next few years you may not be able to afford to explore all that other stuff.


Technically, VCs are not loaning you money; they're betting on you. It's true that maybe they'll be more reluctant to place big bets, but as with everything in life, VC optimism is cyclical.


Sure, it is not technically a loan. But it carries the same sentiment change when it blows up. People get extremely cautious, to the point of skipping some really good ideas. And not just the VCs that made the bets, but everyone else too. Fear spreads just as effectively as hype.


Perhaps it'd be more correct to call it a "Strong AI Winter". We're no closer to "aware" machines. We've simply gotten very good at automating tasks that were once difficult to automate.


A friend who's more optimistic about Strong AI once said that the ML going on today will probably serve the purpose of driving the peripheral sense organs of a future AI. Although it stretches a bit what's possible today, I could see that. I would call this a win if it ends up happening, although I still believe we're hundreds of years away from Strong AI.


I'm inclined to agree with your friend.

This ability of DL to convert streams of raw noisy data into labeled objects seems like exactly what's needed to solve an intelligent agent's perceptual grounding problem, where an agent that's new to the world must bootstrap its perception systems, converting raw sensory input into meaningful objects with physical dynamics. Only then can the agent reason about objects and better understand them by physical interaction and exploration. This is one of the areas where symbolic AI failed hardest, but DL does best.

With some engineering, it's easy to imagine how active learning could use DL to ground robot senses - much like an infant human explores the world for the first year of life, adding new labels and understanding their dynamics as it goes.

I suspect the potential for DL's many uses will continue to grow and surprise us for at least another decade. If we've learned anything from the past decade of DL, it's that probabilistic AI is surprisingly capable.


There never was a Strong AI Summer to begin with.


This reminds me of a recent Twitter thread [1] from Zachary Lipton (new machine learning faculty at CMU) arguing that radiologists have a more complex job than we, as machine learning enthusiasts, think.

[1] https://mobile.twitter.com/zacharylipton/status/999395902996...


I think all talk about computer intelligence and learning is bullshit. If I'm right, then AI is probably the most /dangerous/ field in computer science because it sounds just likely enough that it lures in great minds, just like a sitcom startup idea[0].

[0] http://paulgraham.com/startupideas.html


You could actually make a reasonable argument for the opposite of a winter, that we are heading into an unprecedented AI boom.

The article's main argument for a winter is that deep learning is becoming played out. But this misses the once-in-history event of computer hardware reaching approximate parity with, and overtaking, the computing power of the human brain. I remember writing about that for my university entrance exam 35 years ago and have been following things a bit since, and the time is roughly now. You can make a reasonable argument that the computational equivalent of the brain is about 100 TFLOPS, which was hard to access or simply unavailable in the past, but you can now rent a 180 TFLOP TPU from Google for $6.50/hr. While the current algorithms may be limited, there are probably going to be loads of bright people trying new stuff on the newly powerful hardware, perhaps including the author's PVM, and some of that will likely get interesting results.
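Taking the comment's own figures at face value (the 100 TFLOPS brain estimate and the $6.50/hr TPU price come from the comment, not from verified benchmarks), the back-of-the-envelope arithmetic looks like this:

    # Rough cost sketch using the numbers quoted above; both inputs are estimates.
    brain_tflops = 100.0          # claimed brain-equivalent compute
    tpu_tflops = 180.0            # quoted Cloud TPU throughput
    tpu_price_per_hour = 6.50     # quoted rental price, USD

    cost_per_brain_hour = tpu_price_per_hour * brain_tflops / tpu_tflops
    print(f"~${cost_per_brain_hour:.2f} per 'brain-equivalent' hour of raw compute")
    print(f"~${cost_per_brain_hour * 24 * 365:,.0f} per year, running continuously")

That works out to roughly $3.60 an hour, which is the point being made: raw compute on that scale is no longer the bottleneck, even if the algorithms are.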


Deep learning may not be the complete answer to AGI, but it's moving down the right path. Computers, though, are still years or decades away from approaching human brain power and efficiency, so my take is that the current AI hype is 10 years too early - a good time to get in.


> but it’s moving down the right path

Time will tell. I think DL is amazing, but it is not the right path towards solving problems such as autonomy. I think if you enter this field today, you should definitely take a look at methods other than DL. I actually spent a few years reading neuroscience. It was painful, and I certainly can't say I learned how the brain works, but I'm pretty certain it has nothing to do with DL.


A lot of people said the same thing in the '90s about "Neural Networks". Senior colleagues of mine have vivid memories of consultants coming in then, saying much the same thing: "in 10 years this will completely revolutionize your industry (manufacturing)."

When I first started working in 2004, "data mining" was the big thing and it was going to solve all our problems. Nowadays I'm hearing the same thing again about "Machine Learning".

It's pretty natural to be skeptical when people make big promises and it ends up being a lot of hot air.


Great essay, but I think the "Deep learning (does not) scale" section is missing an important point.

There are many ways to think about scale.

If you think about a learned skill then that skill actually scales extremely well to other machines and thus to other industries that might benefit from the same skill.

The primary problem with technology is that society doesn't implement it as fast as it gets developed, so you will have these natural bottlenecks where society can't actually absorb the benefits fast enough.

In other words, Deep Learning scales as long as society can absorb it and apply it.


Has anyone done something genuinely useful with ML/AI/whatever outside of advertising or stock trading? I am genuinely curious whether it has really been applied to real commercial applications.


Improvements in search, translation, image recognition and categorization, voice recognition, and text to speech off the top of my head. I'm sure there are a lot more.


Yeah, but those are all pretty terrible for the actual end consumer. They might be cool technologies, but at the end of the day, I am a user that hates dealing with them. IVRs are a terrible experience. Image recognition is iffy at best. Text to speech is terrible. In 10 years maybe they will have it hashed out... just like 10 years ago, or 20 years ago.


I'd have to disagree. While IVR sucks (most current implementations don't use ML by the way), image recognition and categorization is at or better than human levels in most cases now. Cutting edge TTS is now nearly indistinguishable from a human. Just check out some samples [1]. And while translation still sucks, ML based translation is still far better than previous approaches.

[1] https://www.theverge.com/2018/3/27/17167200/google-ai-speech...


What are you talking about? Google Assistant's text to speech is entirely ML driven and sounds great; WaveNet by DeepMind is almost indistinguishable from human speech.


Have you spent real money on something because it used Deep Learning?

I haven't.

Maybe indirectly - ads have become more targeted, but we can't be sure how much. It might just be a standard small optimization.


A lot of problems that involve risk or uncertainty, and that have measurable inputs. Things like market segmentation, customer retention and conversion, identifying potential bad debts and fraud, targeting products, identification of product features, etc.

Not as sexy as the headlines, and less likely to involve DNN, but still profitable.


Sure, the thing is overhyped, but the problem is that we cannot be sure about the next big thing. The advances are slow, but then a giant step forward happens all of a sudden.

Everyone's jaw dropped when they saw the first self-driving car video or when AlphaGo started to win. This was totally unthinkable 10 years ago.

Some guy may come up with a computer model that incorporates together intentionality, some short term/long term memory, and some reasoning, who knows?


AI is favorable for big companies to better scale their services. It seems that Facebook has also faced AI scaling drawbacks, and they are developing their own AI hardware for it: https://www.theverge.com/2018/4/18/17254236/facebook-designi...


I think AI has a lot to offer industry right now in areas where you don't need good worst-case performance (e.g., information retrieval, optimization, biology). The big problems in terms of application start appearing when you try to remove humans from the loop completely. That's not even close to possible yet, but that doesn't mean the economic utility of even current AI is close to being maximized.


I know this is about the state of Deep Learning, but I'd like to point out:

While autonomous driving systems aren't perfect, statistically they are much better at driving than humans. Tesla's autonomous system has had, what, 3 or 4 fatal incidents? Out of the thousands of cars on the road that's less than 0.001%.

There will always be a margin of error in systems engineered by man, just hopefully moving forward fewer and fewer fatal ones.


Depending upon your statistical sources, US traffic fatalities are around 1.25-1.50 per 100 million miles. [1] All forms of real-world autonomous driving, across all manufacturers worldwide, are still somewhere below 200 million miles, conservatively estimated. [2] [3] Between the Tesla and Uber fatalities, by these rough back-of-the-envelope numbers (sketched out after the references below), autonomous driving of various grades still has roughly 2X the human fatality rate. Maybe 1X if you squint at the numbers hard enough, but likely not orders of magnitude lower. I don't anticipate rapid legislative and insurance liability protections for autonomous systems until we see orders-of-magnitude differences on a per-100-million-miles-driven basis, and that will take time.

Waymo racks up about 10,000 miles per day across about 600 vehicles spread over about 25 cities. [4] Roughly 3.6 million miles per year if they stay level, but they're anticipated to rapidly add more vehicles to their fleet. In the US alone, about 3.22 trillion miles were driven in 2016. [5] I don't know what a statistically valid sample size would be based on that (I get nonsensical results below 2000 miles, so I'm doing something stupid), though. If Waymo puts two orders of magnitude more cars out there, they'll still "only" rack up about 365 million miles per year, and not all the miles on the same version of software.

[1] https://en.wikipedia.org/wiki/Transportation_safety_in_the_U...

[2] https://www.theverge.com/2016/5/24/11761098/tesla-autopilot-...

[3] https://www.theverge.com/2017/5/10/15609844/waymo-google-sel...

[4] https://medium.com/waymo/waymo-reaches-5-million-self-driven...

[5] https://www.npr.org/sections/thetwo-way/2017/02/21/516512439...
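A back-of-the-envelope version of the comparison above, using one plausible reading of the figures in this comment (roughly 4 fatalities across ~150 million autonomous and driver-assisted miles); the inputs are rough estimates, not audited data:

    # Rough sketch of the fatality-rate comparison; inputs are estimates from the comment.
    human_rate = 1.35               # US fatalities per 100M miles, midpoint of 1.25-1.50
    av_fatalities = 4               # Tesla + Uber incidents, roughly
    av_miles_millions = 150.0       # estimated autonomous/assisted miles driven

    av_rate = av_fatalities / (av_miles_millions / 100.0)   # per 100M miles
    print(f"Autonomous (all grades): ~{av_rate:.2f} fatalities per 100M miles")
    print(f"Human drivers:           ~{human_rate:.2f} fatalities per 100M miles")
    print(f"Ratio:                   ~{av_rate / human_rate:.1f}x")

Small changes to the assumed mileage or fatality count swing the ratio between roughly 1X and 3X, which is why the conclusion above is hedged.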


Woah, I was prepared to be all gung-ho for this post, given that I've suspected the winter was going to be here for quite a while now. But strangely, this post actually had the opposite effect on me. The winter will probably come one day, but is this all the evidence the poster can find? Andrew Ng tweeting less and a claim that DNNs don't scale, based on flimsy data, is not at all convincing to me.


Is this AI Winter 2.0? I was hopeful that logic programming would have developed more and spread to a larger audience at this point.


As a beginner in the deep learning space, I am a bit baffled by the claim that "you need a lot of computational power". Good models learn fast, so if a potential model looks promising on a local machine, one can do the training on gcloud for $100 on high-end machines. Where am I wrong in this line of thinking?


No, you are absolutely right. And modern transfer learning improves this even more in many domains.
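A minimal sketch of the kind of transfer learning being referred to, assuming a PyTorch/torchvision setup; the backbone, class count, and learning rate are illustrative choices, not recommendations:

    # Reuse an ImageNet-pretrained backbone and train only a small new head,
    # which keeps the compute (and the cloud bill) modest.
    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for param in model.parameters():
        param.requires_grad = False                          # freeze the pretrained backbone

    num_classes = 5                                          # set to your own task
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # new trainable head

    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()
    # Training loop elided: only model.fc receives gradient updates.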


Thank god. We're definitely not ready, and perhaps could never be ready, for true general-purpose AI.


This is something I have always wondered about AI and its promises. Sometimes the last 1% is the hardest, or can even be impossible. Self-driving cars, in particular, are a good case. We can solve 99% of the use cases, but achieving fully autonomous vehicles might be just out of reach.


But they're getting more and more data every year, right? All those almost-millions of Teslas running around could provide enough video input for the training data.

Besides "Good software takes 10 years", according to Joel Spolsky. As I see it, we're, what 5 year into ML.


5 years, ha! Wow, the rebranding has worked well.



> Nvidia car could not drive literally ten miles without a disengagement.

From the same source as the author cites, that's because their test runs are typically 5 miles and resuming manual control at the end of a test counts as a disengagement.


Deep learning was a noticeable improvement over previous neural models, sure. But deep learning is not the entire field of AI and ML. There has been more stuff going on, like Neural Turing Machines and differentiable neural computers.


We are beginning to see some sweet differentiable embeddings of discrete things like stacks and context-free grammars. This is where deep learning gets really fun, because it is learning to program.



For me, Google is attacking on 2 main fronts: 1. quantum computing, 2. machine learning/AI.

If they are able to combine the two - a big if, though - the cost analysis for AI will change quite dramatically.


Number of tweets as reliable data points? Very dubious. Simple explanation: They are busy working, so less time to tweet.

Maybe they're working on something so cool, that the AI winter may not even come. Sure, there's a lot of marketing-speak around AI at the moment.

But this wave of AI seems a lot stronger, with better fundamentals, than 20 years ago. At the very least, we have the hardware to actually RUN NNs cost-effectively now, as opposed to grinding your system to a halt back then.

Before AlphaGo, it wasn't even clear when a computer could beat a top professional at go, let alone crush humans at the game - lower-bound guesses were 50 years.



Low-hanging fruit is scarce now. With 3 orders of magnitude difference in power (megawatts versus a few watts), clearly this is not the right way to reach the treetop.


/shrug people need time to research;

Anyway, I also don't get what the issue is with the radiology model. It is already that good?! This is impressive. One model is close to well-trained experts.

Just today I had a small idea for a new product, based on what Google was showing with the capability to distinguish two people talking in parallel.

At the last Google I/O I was impressed because, in comparison to previous years, ML created better and more impressive products.

I listened for years to keynotes about big data and was never impressed. Now I hear about ML and I'm getting more and more impressed.


Google I/O is a developer conference with an emphasis on marketing Google's own products and tools. We have to take the news from it with a grain of salt.


If only there were some technology that might enable us to discern patterns so that we could better predict fluctuations in demand for AI software.


Truly, I agree.

I've long been interested in learning about AI and deep learning, but to this day haven't done much that truly excites me within the field. It feels more or less impossible to make anything significant without Google-scale databases and Google-scale computers. AI really does make it easier for the few to jump far ahead, leaving everyone else behind.

I also agree that a lot of the news around AI is just hype.

Honestly, I've yet to see anything practical come out of AI.

But hey, if something eventually does, I'm all for it.


Speaking of Google-scale databases, I've always felt the unhyped hero of deep learning was the datasets themselves. I'd be happy to see one less book about deep learning and one more book on how you can gather and build a quality dataset yourself. Granted, I think gathering and preparing the data is a broader and more difficult task in many ways.


> be Google


I'm kinda curious now where those giant datasets will come from, now that there's a big push for privacy, with things like GDPR preventing some random researcher from just buying data off whatever data-mining corp is most relevant to their AI's purpose.


Many traditional businesses (e.g. banks, insurance) collect customer data internally, and don’t share it with anyone. That’s not likely to change much. For many of these businesses, the customer is other businesses, and privacy rules aren’t even applicable.

You won’t hear about these projects outside industry specific publications, if at all.


I’m kinda curious now which researchers have been buying PII from data mining corps.


Cambridge Analytica comes to mind.


Bet your house on it if it's "well on its way".


Yeah, this is a dumb article. Number of tweets by Andrew Ng? Really? All those articles denying the reality of the revolution brought by AI have an emotional basis, but I don't understand what it is. Are they feeling threatened? Or is it an undergrad/early-20s thing, like a complete lack of understanding of the dynamics coupled with abnormally strong opinions?


This reinforces the need to benchmark any 'human expert equivalent' project against the wattage of the human brain.


How much of this can we pin on IBM's overhype of Watson?


Yawn. Contrarianism is easy and this article offers little. The real-world application you're speaking of has a comically small amount of data (a few million miles?). You hear about a handful of accidents that still average out to better-than-human performance, and suddenly the sky is falling.

When machine learning stops successfully solving new problems daily, then maybe a thread like this will be warranted.


Without being an expert, just from reading articles, it seems to me that some people wish for an AI winter. It makes them feel better somehow.


Oh, I thought "AI winter" would refer to a state of ruin after AI had come into existence and destroyed everything, analogous to nuclear winter.


AI Winter is a very well-known term in the industry referring to a general lack of funding of AI research, after the last time AI was overhyped.


I guess it could be used for anything that experiences a low level of interest, then.


A real Winter is a lack of warmth. An AI winter is a lack of ______


If we would stop calling this stuff "AI" it would make all our lives a lot easier, but people can't resist.

When computers first came on the scene, a lot of people had a very poor conception of what it was the human mind did, computationally. So when computers turned out to be good at challenging "intellectual" tasks for humans like chess and calculus, many were duped into thinking that computers were somehow on a similar level to human brains and "AI" was just around the corner. The reality was that one of the most important tasks the human brain performs (contextualization, categorization, and abstraction) was taken for granted. We've since discovered that task to be enormously computationally difficult, and one of the key roadblocks towards "true AI" development.

Now, of course, we're at it again. We have the computational muscle to make inference engines that work nothing like the human brain good at tasks that are difficult to program explicitly (such as image and speech recognition) and we've built other tools that leverage huge data sets to produce answers that seem very human or intelligent (using bayesian methods, for example). We look at this tool and too many say "Is this AI?" No, it might be related to AI, but it's just a tool. Meanwhile, because of all the AI hype people overpromise on neural networks / "deep learning" projects and people get lazy about programming. Why bother sitting down for 15 minutes to figure out the right SQL queries and post processing when you can just throw your raw data at a neural network and call it the future?

One of the consistently terrible aspects of software development as a field is that it continues to look for shortcuts and continues to shirk the basic responsibilities of building anything (e.g. being mindful of industry best practices, understanding the dangers and risks of various technologies and systems and being diligent in mitigating them, etc.) Instead the field consistently and perversely ignores all of the hard-won lessons of its history. Consistently ignores and shirks its responsibilities (in terms of ethics, public safety, etc.) And consistently looks for the short cut and the silver bullet that will allow them to shirk even the small vestiges of responsibility they labor under currently. There's a great phrase on AI that goes: "machine learning is money laundering for bias", which points to just one facet among so many of what's wrong with "AI" as it's practiced today. We see "AI" used to sell snake oil. We see "AI" used to avoid responsibility for the ethical implications inherent in many software projects. We see "AI" integrated into life critical systems (like self-driving cars) without putting in the effort to ensure it's robust or protect against its failures, with the result being loss of life.

AI is just the latest excuse by software developers to avoid responsibility and rigor while cashing checks in the meantime. At some point this is going to become obvious and there is going to be a backlash. Responsible developers should be out in front driving for accountability and responsibility now instead of waiting until a hostile public forces it to happen.


I've always understood the claim that deep learning scales to be a claim about deployment and use of trained models, not about training. The whole point is that you can invest (substantial) resources upfront to train a sufficiently good model, but then the results of that initial investment can be used with very small marginal costs.
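To put the train-once/deploy-cheaply point in code (a sketch only; the checkpoint path and input shape are placeholders, not part of any real project):

    # Training is the expensive, one-time part. Serving a trained model is a cheap forward pass.
    import torch
    from torchvision import models

    model = models.resnet18()                      # same architecture used during training
    # model.load_state_dict(torch.load("trained_model.pt"))  # hypothetical checkpoint from the training run
    model.eval()

    with torch.no_grad():                          # inference only: no gradients, low marginal cost
        batch = torch.randn(1, 3, 224, 224)        # one image-sized input
        prediction = model(batch).argmax(dim=1)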

OP's argument on this front seems disingenuous to me.

His focus on Uber and Tesla (while not even mentioning Waymo) is also truly strange. Uber's practices and culture have historically been so toxic that their failures here are truly irrelevant, and Tesla isn't even in the business of making actual self-driving cars.

I'm the first to argue that right now AI is overhyped, but this is just sensationalist garbage from the other end of the spectrum.


Hi, it appears that "sensationalist garbage" triggered quite a bit of a discussion. This is typically indicative that the topic is "sensitive". Perhaps because many people feel the winter coming as well. Maybe, maybe not, time will tell.

And FYI, Tesla is in the business of making self-driving cars. If you read the article, you might learn that Tesla is actually the first company to sell that option to customers. You can go to their website right now and check that out.

Uber, like it or not, is one of the big players in this game. I agree they may have a somewhat toxic culture, but I guarantee you there are plenty of really smart people there who know exactly the state of the art. And their failure is therefore indicative of that state of the art.

I also omitted Cruise Automation and a bunch of other companies, perhaps because they have more responsible backup drivers who have so far avoided fatal crashes. But I analyze the California DMV disengagement reports in another post if you care to look. And by no means are any of these cars safe for deployment yet.


> Hi, it appears that "sensationalist garbage" triggered quite a bit of a discussion.

Yes. Sensationalist.

> I also omitted Cruise automation and a bunch of other companies, perhaps because they have more responsible backup drivers that so far avoided fatal crashes.

So your explicit reason for omitting Waymo, as I understand it, is that it didn't support your argument?


> Yes. Sensationalist.

Yes, perhaps. But I'm entitled to my opinion just as you are entitled to yours. And time will tell who was right.

> So your explicit reason for omitting Waymo, as I understand it, is that it didn't support your argument?

You see, when you make any argument, you always omit the infinite number of things that don't support it and focus on the few things that do. The fact that something does not support my argument, does not mean it contradicts it.

You might also note that this is not a scientific paper, but an opinion. Yes, nothing more than an opinion. May I be wrong? Sure. And yet this opinion appears to be shared by quite a few people, and makes a bunch of other people feel insecure. Perhaps there is something to it? We will see.

But in the worst case it will make some people think a bit and make an argument either for or against it. I may learn today a good argument against it, that will make me think about it more and perhaps I will change my opinion, or I'll be able to defend it.

So far you have not provided such an argument, but I wholeheartedly encourage you to do so.


This is a list of your phrases in this comment that I find, in my opinion, condescending.

> And time will tell who was right.

> You see, when you make any argument

> You might also note that this is not a scientific paper, but an opinion. Yes, nothing more than an opinion.

> And yet this opinion appears to be shared by quite a few people, and makes a bunch of other people feel insecure. Perhaps there is something to it? We will see.

> So far you have not provided such an argument

I immediately identified this same tone in your paper. In your argumentation, you quite aggressively hinted that people who don't share your views are not very intelligent. You also have a tendency to present your statements as prophetic, which appeared multiple times both in the paper and in this comment.

These observations put me on guard about your arguments, which I found mostly weak and sometimes made in bad faith. I flagged as such the Twitter argument, which analyses the frequency of A. Ng's tweets and denounces his "outrageous claims", with an example where the AI score is overall only 0.025 less accurate than a practitioner.

I also thought that you used a different (your own) definition of scaling than most and used it to make an argument, which was therefore unconvincing (but the parent said that already).

Overall, to me, this was not a very pleasant read, and I dislike the fact that you attack the hype around machine learning by enjoying the polarization that comes with anti-hype articles such as yours. I also don't think that making people feel insecure is such a great indicator that what you're saying is relevant or prophetic.

I hope this helps your prophecies: https://www.physics.ohio-state.edu/~kagan/AS1138/Lectures/Go... ;)


> You see, when you make any argument, you always omit the infinite number of things that don't support it and focus on the few things that do.

No. When I make an argument, I try to omit the infinite number of things I think are unlikely to be important, and focus on the few things that I think are most important whether they support my position or not.

Everyone's fallible, and I do my share of focusing too much on points that support my position over more important counterpoints, but I see that as a failing, not as the reasonable thing to do.


One of the sillier articles on HN in a while. Waymo has cars, as I type this, driving around Arizona without safety drivers.

People were freaked out by the Google demo of Duplex a couple of weeks ago as it was just too human sounding.

I can give so many other examples. One is foundational: the voice used with Google Duplex is generated by a DNN at 16k samples per second in real time, and they are able to offer it at a competitive price.

That was done by creating the TPU 3.0 silicon. The old way of piecing speech together was NOT compute intensive, whereas doing it with a DNN requires proprietary hardware to be able to offer it at a price competitive with the old way.

But what else can be done when you can run 16k samples per second through a DNN in real time? Things have barely even gotten started, and they are flying right now. All you have to do is open your eyes.

DNN - Deep Neural Network.


It's the same story again, like the exaggeration of IoT's influence 5 years ago. The whole thing is hyped up to raise money from investors and attract customers instead of actually building a superior product.


It's 99% marketing, and places like HN and Reddit eat it up and try to hype it up even more. When you confront these characters about the basis on which they claim AI will solve whatever problem or evolve to whichever point, they only reply, "it'll only keep getting better (given time, data, resources, brains, etc.)"

It's a buzzword people brainlessly use to fetishize technological progress without understanding the inherent limitations of the technology or the actual practicality and real-life results outside of crafted demos or specific problem domains (for example, AlphaGo beating a grandmaster has almost no bearing on a problem like speech cognition).

It's turned me off a lot from reading about advances in the field, because I know, as with a lot of science press releases, that most of it is empty air that won't really have a bearing on the actual software I use (I've watched the past two Google I/Os, where pretty much every presentation mentions AI, but the Android experience still remains relatively stale).


Deep Recession ‘18


Winter is coming.


[flagged]


Overweight people drive fine. Being a military test pilot is a completely different thing.


[flagged]


The difference between blockchain advocates and the weavers from the tale "The Emperor's New Clothes" is that the weavers knew they were bullshitting the king. Sadly, blockchains will not revolutionize mankind.


As the saying goes, never attribute to malice what can be adequately explained by extreme stupidity - and by the first-worlders' googols of prosperity, sustained by silent injustices and extreme malice that is literally not recognized, and certainly not acted upon, by like all of them.


That's not entirely correct. People repeatedly like to take hard yes/no stances on blockchain because they seem to confuse the issue with cryptocurrencies. Cryptocurrencies are quite obviously overhyped and self-destructing. But blockchain is actually being implemented in many effective ways around the world - just in areas that you wouldn't generally hear about. Blockchain has its place, just not at the current moment.


I don't know of any problem where blockchains are the best solution - it's possible there are some, I just don't know of them.

I'm just talking about the very inflated social/political expectations, the "blockchain will change everything" mantra: I don't see this happening.


How can we know if a solution is effective if we have not studied it thoroughly enough? There might be use cases for blockchain that have yet to be probed. It's certainly being taken seriously in the finance industry, especially in terms of financial records-keeping and accounting for governance by regulatory bodies.


> There might be use cases for blockchain that have yet to be probed.

This is like saying "you can't prove god does not exist". It is true but pointless.

> It's certainly being taken seriously in the finance industry, especially in terms of financial records-keeping and accounting for governance by regulatory bodies.

Precisely the tasks where there are so many solutions that are both more efficient and time-proven! With so much hype around the idea, the industry has to take a peek, but sooner or later they will face reality: the king is naked.


What a relief.


The inconvenient but amazing truth about deep learning is that, unlike neural networks, the brain does not learn complex patterns. It can see new complex patterns and objects instantly without learning them. Besides, there are not enough neurons in the brain to learn every pattern we encounter in life. Not even close.

The brain does not model the world. It learns to see it.


This post is very uninformed.

"It can see new complex patterns and objects instantly without learning them."

Except, it doesn't. It is clearly false. When animals grow up in an environment without certain patterns, they will be unable to see these patterns (or complex combinations of these) at a later stage. We see complex patterns as combinations of patterns we have seen before and semantically encode them as such. This is very similar to how neural networks work at the last fully connected layers.

"Besides, there are not enough neurons in the brain to learn every pattern we encounter in life."

There is a lot of self-similarity in our environment. Compression algorithms (and NN auto-encoders) are able to leverage this self-similarity to encode information in a very small number of data-points / neurons.
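A toy illustration of that compression point, assuming a PyTorch-style autoencoder with arbitrary layer sizes:

    # A wide input is squeezed through a narrow bottleneck and reconstructed;
    # the self-similarity/structure in the data is what lets the small code suffice.
    import torch
    import torch.nn as nn

    autoencoder = nn.Sequential(
        nn.Linear(784, 32), nn.ReLU(),      # encoder: 784-dim input -> 32-dim code
        nn.Linear(32, 784), nn.Sigmoid(),   # decoder: reconstruct the input
    )

    x = torch.rand(16, 784)                                       # batch of flattened images
    reconstruction_loss = nn.functional.mse_loss(autoencoder(x), x)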

"The brain does not model the world. It learns to see it."

Except, it doesn't. Your brain continually makes abstractions of the world. When you 'see' the world you see a (lossy) compressed version of it, compressed towards utility. Similar to how MP3 compression works: the information gain of higher frequencies is low, so your brain can safely filter these out.


We learn to see patterns, but we see through physical and cultural action patterns that are simply present, not learned.

It’s like a river flowing... yes, the water molecules each “discover” their path, but the path of the river is a property of the landscape. It is not learned.


OK. See you around.


You’re close.

“learns to act in it”

We don’t even see the world; we see a hyperdimensional action space. Anything we can’t relate analogically back to some embodied action will be literally invisible to us.



