Simple questions to help reduce AI hype (smerity.com)
208 points by apathy on July 9, 2016 | 33 comments



Stephen makes a lot of good points. And the list of questions he says people should ask is also useful. I have a few to add:

* What algorithm(s) are you using? Nobody should be able to brag publicly about their AI without simply naming the algorithms.

* What is your objective function? That is, what are you trying to optimize; what value are you trying to maximize or minimize? Is it classification error? Is it measuring similarities on unstructured and unlabeled data through reconstruction? (A minimal sketch of both appears after this list.)

* Do you already have the data that you can train on to minimize that error? If not, do you have a realistic plan to gather it? Have you thought about how you'll store it and make it accessible to your algorithms?
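
To make the objective-function question concrete, here's a minimal sketch in plain NumPy (purely illustrative, not tied to any particular framework) of the two objectives mentioned above: classification error via cross-entropy on labeled data, and reconstruction error on unlabeled data:

    import numpy as np

    def cross_entropy(probs, labels):
        """Average negative log-likelihood of the true labels under the
        predicted class probabilities (a standard classification objective)."""
        return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

    def reconstruction_error(x, x_reconstructed):
        """Mean squared error between inputs and their reconstructions, as
        used by autoencoder-style unsupervised models."""
        return np.mean((x - x_reconstructed) ** 2)

    # Toy usage: three examples, two classes.
    probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
    labels = np.array([0, 1, 0])
    print(cross_entropy(probs, labels))

    x = np.random.rand(3, 5)
    print(reconstruction_error(x, x + 0.01))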

We have a series of questions we suggest that people ask themselves when approaching a machine learning or deep learning problem:

http://deeplearning4j.org/questions.html

I talk to a lot of startups making claims about their "AI", and I can't stop them from jumping on the bandwagon, but not all have the ability to build a machine-learning system and gather the data it needs.

Every good thing gets hyped, but that doesn't make it less good. It just means readers have more work to do.... Like Francois Chollet, I happen to believe that AI is even bigger than the hype, but in ways that the hype can't imagine yet.

https://twitter.com/fchollet/status/751483046436548608


Great additions. I can't believe I missed "what algorithm(s) are you using" - it didn't even occur to me that people could skip naming them yet still be taken seriously, though that was (obviously) the case with the company I referred to in the article, given they pitched AI when they had none.

I agree with you and Francois about the future potential enabled by AI. I do think we're at a stage like the early days of the internet, where we can see it enables many things but we're not very accurate at predicting more than a year or two into the future.

Hype is just noise distracting us from the important contributions and issues we should be concerned about. I'm more worried about the immediate future of ML algorithms being misused, resulting in modern-day redlining[1] and so on, than I am about any SkyNet prophecy.

[1]: https://en.wikipedia.org/wiki/Redlining


I agree with you about redlining. I hadn't really thought about that... A recent scandal involving a predictive analytics firm called Northpointe shows how stupidly people use algorithms, and how much they are abused when there is no transparency: https://www.propublica.org/article/machine-bias-risk-assessm...


The scandal is that ProPublica wrote a story which directly contradicts their own statistical analysis. Their analysis was unable to show statistically significant bias (and had other flaws that might be genuine mistakes - no multiple comparison correction, possible misinterpretation of the model).

https://www.chrisstucchio.com/blog/2016/propublica_is_lying....

In fact, their R script suggests that the secret algorithm is probably reducing racism relative to whatever secret algorithm human judges would use.


Isn't that explicitly what OpenAI was formed to prevent?


The OpenAI team would be best placed to comment, but I've not seen them note any specific intent[1] to research topics related to fairness, accountability and transparency in ML. I use those words specifically as FAT ML 2015 was part of ICML 2015[2]. While a great start, that area of research doesn't get anywhere near the attention it should given the gravity of the potential findings.

These issues already exist. Given that ML systems are being used to filter resumes, decide whether someone should get a loan (and how much their interest rate should be), and find matches on dating services, the fact that ML systems may inadvertently feature racial or gender bias is hugely disturbing. This is likely to only get worse over time as more and more systems feature ML components.

As a simple example, word vectors are used by just about every deep learning system, and recent research has found gender and racial stereotypes strongly embedded in them[3]. There's a very confusing debate as to what this implies for the data, the models, and predictions[4]... :S

father:doctor :: mother:nurse, man:programmer :: woman:homemaker

black_male>assaulted, white_male>entitled_to

I wish I was making those examples up but they're straight out of their analysis. They also only focus on gender stereotypes but you can imagine how many similar issues might be hidden away just beneath the surface.
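
For anyone unfamiliar with how such analogies are extracted, here's a minimal sketch of the standard vector-arithmetic query ("a is to b as c is to ?"). The tiny hand-made vectors below are purely illustrative stand-ins; the stereotyped analogies above come from embeddings trained on large real-world corpora, not from anything this small:

    import numpy as np

    # Analogy query: nearest vocabulary word to vec(b) - vec(a) + vec(c),
    # measured by cosine similarity, excluding the query words themselves.
    vocab = {
        "man":   np.array([1.0, 0.0, 0.2]),
        "woman": np.array([0.0, 1.0, 0.2]),
        "king":  np.array([1.0, 0.1, 0.9]),
        "queen": np.array([0.1, 1.0, 0.9]),
    }

    def analogy(a, b, c):
        target = vocab[b] - vocab[a] + vocab[c]
        def cosine(u, v):
            return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
        candidates = {w: v for w, v in vocab.items() if w not in (a, b, c)}
        return max(candidates, key=lambda w: cosine(candidates[w], target))

    print(analogy("man", "king", "woman"))  # -> "queen" with these toy vectors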

Even if OpenAI indicated they were explicitly interested in that direction, which afaik they haven't, it's still an area that should be of interest to the field broadly. OpenAI is still a small team and I'm certain they'd appreciate the help, especially as they already have a lot on their plate :)

A disaster scenario for me is to have ML systems help reinforce the negative aspects of our society, conveniently hidden in a black box which can never be properly inspected.

[1]: I'm primarily running off what I've seen in the recent past and their technical goals that they recently published at https://openai.com/blog/openai-technical-goals/

[2]: http://www.fatml.org/

[3]: "Quantifying and Reducing Stereotypes in Word Embeddings" - https://arxiv.org/abs/1606.06121

[4]: https://twitter.com/jackclarksf/status/746039805595762688


According to this paper https://arxiv.org/pdf/1606.06565.pdf, OpenAI (John Schulman and Paul Christiano) explicitly indicate interest in researching transparency and strongly support work on fairness.


Thanks for the response, I had misinterpreted what you meant by redlining. I thought you were referring to the unequal distribution of the technology among various populations (access) as opposed to the reinforcement of existing redlines (misuse).


That's where the right to privacy comes in. The input should not be there to begin with.


> What algorithm(s) are you using? Nobody should be able to brag publicly about their AI without simply naming the algorithms.

What if the algorithms are original, derived from, say, new theorems and proofs in applied math, so far have no names, and really are not related to well-known AI algorithms at all?


Whatever they are, they're either similar to algorithms that some human has used previously, or they employ mathematical ideas that others would have named and understood. So you could explain it by saying it's like X but with Y modifications.


Right. But in practice here I have to expect, from the rest of what the author wrote, that what he meant was one of supervised learning, unsupervised learning, machine learning, convolutional neural networks, natural language processing, computer vision, etc.


There is a simple antidote for hype. It doesn't require questions, and it will mute it completely. Just ignore the hyped word and see if there is anything there. The author does this throughout the article, but the business plan argument is the best example:

> asking if the business plan would work with free human labor replacing the automated component

Just ignore the word AI, and measure the value of the business or the research or the software for what it is. That's it.

But the author still resorts to the word AI as a common reference. This is what feeds hype, because we introduce it with every mention.

Hype isn't about substance. It's about the energy and excitement infused into these words that tends to build up. That's why we call them buzz words. People who know nothing technical about the subject begin to react to them, but they will not react to a different word.

Salesmen will always proceed to monetize this attention and excitement with things people can click or buy or say yes to. The market is driven by hype. Whether something actually happens in the field will have nothing to do with hype, but most money in a capitalist market is still driven by it. So although hype may be distracting for those doing real science or real entrepreneurship or real work, it's your best shot for promoting whatever it is you're selling, be it to consumers, to investors, or for funding.


> The real world is not a dataset.

I've seen this cut both ways. Particularly in neural nets for video, which I'm working on, datasets such as HMDB51 are extremely difficult, to the point where they may not represent the difficulty of the real world application. Many vision tasks in the real world can actually be constrained by camera angles, assumptions based on experience about what kinds of objects are in the frame, lighting considerations, or other things that make the real-world task easier than the dataset.

> AI won't save a broken business plan.

Best point in the article and the thing I worry most about day to day. Even if your AI is crazy good, if you don't have a market and people willing to pay for your product then it doesn't matter.


> Many vision tasks in the real world can actually be constrained by camera angles
Great example - because it can be extended to show that even human brains have great difficulty with that. Just turn the scene upside down and see how the brain has to go into overdrive and still misses a lot of stuff. There are quite a few optical illusion photos based on that, for example the face that seems to smile - until you flip it.


> There are quite a few optical illusion photos based on that, for example the face that seems to smile - until you flip it.

Ah, the Thatcher effect! [0][1] Although IMO this is a bad example because (er, AFAIK) we don't have the same sort of hardcoded facial-feature-recognizer systems in computer-vision neural nets as we do in boring old hew-mons. The Thatcher effect works on people with prosopagnosia [2] ('face blindness', people who have to work out who you are by your clothing, gait, voice and body language). These, taken together, proved that the known full-face-detector in humans, the fusiform gyrus, is aided by other systems that recognize features of a face (eyes, mouth) independently of that face's vertical orientation.

[0] https://en.wikipedia.org/wiki/Thatcher_effect

[1] http://static.independent.co.uk/s3fs-public/styles/story_med... < here's the trope-namer, straight outta the 80s. The effect is much more pronounced in this image than in the wiki one; it's way more obvious that the mouth/eyes are 'properly oriented' despite being upside down

[2] https://en.wikipedia.org/wiki/Prosopagnosia


The brain is amazing in that if you, for example, feed it images that have been flipped by inversion glasses, it will learn on its own to either do the proper correcting transformation or compensate for the distortions further up.

Check out neural prostheses to see how clusters of neurons in the brain are just excellent at learning whatever signals you throw at them on the fly.


The brain does not really have difficulty with camera angles. Almost every single optical illusion / visual trick / visual quirk is actually an example of how incredibly clever and well-optimized the brain is.

The fact is the brain takes convenient shortcuts whenever possible. The ability to detect smiles is indeed conducive to social behavior in humans. But the ability to distinguish smiles on upside down faces is not particularly useful for anything in real life. It's just an interesting quirk. The brain just recognizes the smile, and shortcuts any additional (useless) post-attentive processing.


One question that's not explicitly asked on this list: How much preparation does the training data need?

Knowledge representation is still one of the main bottlenecks for AI/ML in general; a deep learning network is not going to prepare its own food by itself.


I haven't experienced hype more warranted than this.

Yes, the underlying "magic" is decades-old algorithms, often relatively simple concepts like LMS or nearest neighbors. The ML isn't the exciting part, it's the application. The pieces are finally in place: simple acquisition of training/test data (big data), powerful computing for the masses, larger crossover of ML into the computing/robotics fields. There is much hype but also much on the cusp of changing huge parts of human life.
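
To underline how simple some of that underlying "magic" is, here's a minimal nearest-neighbor sketch (a toy illustration, nothing more):

    import numpy as np

    def predict_1nn(train_x, train_y, query):
        """1-nearest-neighbor: return the label of the closest training
        point under Euclidean distance."""
        distances = np.linalg.norm(train_x - query, axis=1)
        return train_y[np.argmin(distances)]

    # Toy usage: two clusters in 2D.
    train_x = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
    train_y = np.array([0, 0, 1, 1])
    print(predict_1nn(train_x, train_y, np.array([4.8, 5.1])))  # -> 1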


We'll see whether this is true when ML has actually changed huge parts of human life. So far it hasn't changed much, to be honest.


The recent advances in speech and image recognition alone have had a tremendous impact on many areas of day-to-day life.


This is generally overblown. What has mainly happened so far is that already-existing products have been incrementally improved by deep learning - speech recognition, spam detection, search ranking, etc. all worked reasonably well before deep learning came along; deep learning just improved them.

There aren't actually that many massively impactful image recognition products out there yet (there are certainly products with the potential to be impactful, e.g. medical diagnostics, but it is too early to call).


> Achieving human-level performance on any real-world task is an exceedingly difficult endeavor when moving away from a few simple and well-defined tasks.

It's an exceedingly difficult endeavor to try to get all humans to produce "human-level performance" while doing simple and well-defined tasks. Evidence = Mechanical Turk.


An interesting thing to note there is that research shows that untrained individuals have significant variance in task performance which decreases as they get better at the task. Mechanical Turk might be an extreme case, since all tasks are done by untrained individuals.

Whilst I agree with the quote that AI has difficulties when moving away from well-defined tasks, so do we. Most expert performance is restricted to narrow task domains, does not transfer well to other domains, and takes a long time to acquire. In a sense, that's not too dissimilar to AI.

I do disagree with the "simple". Games such as chess are the stereotypical example of tasks AI performs well at. Their rulesets are well defined, but the games are incredibly complex. It is amazing that AI outperforms even grandmasters in this area.


> "How much does the accuracy fall as the input story gets longer?"

As a beginner in text classification, I find the contrary: if the sentence is very short, the result is more likely to be suboptimal, e.g. classifying short tweets vs. a detailed NYT article.


I intended these questions as broad examples though some are specific to a given AI system or architecture. You're quite right in that it depends on the task and the model.

I was thinking specifically about neural machine translation (NMT) with the encoder-decoder architecture. The encoder-decoder architecture converts a sentence in language X into a fixed-size vector representation, then aims to "expand" that vector into the equivalent sentence in language Y. As the vector size is fixed, short sentences can be adequately represented but longer sentences end up being lossily compressed. This realization illustrated a fundamental limitation of the encoder-decoder architecture and motivated the use of attention mechanisms.
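
To make the bottleneck concrete, here's a minimal toy sketch of the encoder half of such a model. The sizes and weights are arbitrary, and this is not the actual NMT architecture from the papers (which uses LSTMs/GRUs and, later, attention); it just shows that the sentence gets squeezed into a single fixed-size vector regardless of length:

    import numpy as np

    # A plain RNN folds a sentence of any length into one fixed-size vector.
    hidden_size, embed_size = 8, 4
    rng = np.random.default_rng(0)
    W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
    W_x = rng.normal(scale=0.1, size=(hidden_size, embed_size))

    def encode(word_embeddings):
        """Return a single hidden_size vector, regardless of sentence length."""
        h = np.zeros(hidden_size)
        for x in word_embeddings:           # one step per word
            h = np.tanh(W_h @ h + W_x @ x)  # fixed-size state is the bottleneck
        return h

    short = rng.normal(size=(3, embed_size))   # 3-word sentence
    long = rng.normal(size=(40, embed_size))   # 40-word sentence
    print(encode(short).shape, encode(long).shape)  # both (8,) -- same capacity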

There's a great figure (Figure 1) in Nvidia's Introduction to Neural Machine Translation[1] that shows the dramatic drop in accuracy with respect to the sentence's length.

[1]: https://devblogs.nvidia.com/parallelforall/introduction-neur...


Quoted from the blog post:

> Achieving human level performance on any real world task is an exceedingly difficult endeavour when moving away from a few simple and well defined tasks.
This may be related or off topic:

I have developed a small messaging application in which incoming messages are categorised by humans (instead of AI) to achieve fewer false positives.

https://www.formalapp.com

I admit this app's solution is not technically complex.

And I understand that building an AI-based solution is complex and needs more expertise.

But in this use case, relying on human intelligence works - or am I missing something?



AI is 60 years old and has gone through hype cycles before. Neural networks are nearly as old as AI but work better these days.


Short reaction: No!

In a little more detail, the author is talking about cases of "an amazing system" but, really, has in mind a quite narrow view of such systems and how they work and can be, and have been, built. Really, the author is looking at less than, say, 5% of the great examples of how to build "an amazing system".

In more detail:

The article has its definition of AI:

> In this article, I won't quibble over definitions, simply taking the broadest term as used in the media: artificial intelligence is whenever a system appears more intelligent than we expect it to be.

Okay, from my usual understanding of AI, say, the peer-reviewed papers I published in AI, this is a relatively broad definition, but while considering this article, I'll accept that definition.

But, oops, right away in the sentence before we have

> If there's any promise I can make about the field of AI,

So, the author is saying that AI is a field. From the author's definition of AI, that is a very broad field indeed! Maybe the most broad of all?

Still I will try to consider the article:

So, my next concern is

> Ask what cases the system will fail on

> "Revealing the failure cases of your models is a way of avoiding over-hype"

> Kevin Murphy at ICML 2015

Now what counts as AI is starting to look quite narrow, narrow enough to have a "model" and "failure cases".

So, oops: Consider a simple computer program that plays the little matchstick game Nim: There are two players. The game starts with three piles of matchsticks. For a move a player picks a pile and removes from it as many matchsticks as they want but at least one. The players alternate moves. The player who picks up the last matchstick loses.

Okay, there is a simple algorithm for perfect play. Then if both players play perfect games, who wins depends just on who moves first. If only one player, A, is playing a perfect game but, in that instance of the game, should lose given perfect play by the other player, B, then player A will still win as soon as player B makes a mistake.

How to play a perfect game is in

Courant and Robbins, What Is Mathematics?
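
For concreteness, here's a minimal sketch of that perfect-play rule as I'd compactly render it (my own rendering, not Courant and Robbins' presentation): for misere Nim, where taking the last matchstick loses, play ordinary Nim, making the xor of the pile sizes zero, except in the endgame, where you leave an odd number of one-stick piles:

    from functools import reduce
    from operator import xor

    def misere_nim_move(piles):
        """Return (pile_index, new_count) for a perfect move in misere Nim
        (the player who takes the last matchstick loses), or None if every
        move loses against perfect play."""
        big = [i for i, p in enumerate(piles) if p >= 2]
        ones = sum(1 for p in piles if p == 1)

        if not big:
            # Endgame: only piles of size 0 or 1 remain. We can only take a
            # whole 1-pile; we are winning iff the number of 1-piles is even.
            if ones % 2 == 0 and ones > 0:
                return (piles.index(1), 0)
            return None  # losing position (or no move available)

        if len(big) == 1:
            # Exactly one pile has >= 2 sticks: shrink it to 0 or 1 so that
            # an odd number of 1-piles is left for the opponent.
            target = 1 if ones % 2 == 0 else 0
            return (big[0], target)

        # Two or more big piles: play exactly as in normal Nim, i.e. make
        # the nim-sum (xor of pile sizes) zero.
        nim_sum = reduce(xor, piles)
        if nim_sum == 0:
            return None  # every move loses against perfect play
        for i, p in enumerate(piles):
            if p ^ nim_sum < p:
                return (i, p ^ nim_sum)

    print(misere_nim_move([3, 4, 5]))  # a perfect opening move from piles 3-4-5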

So back to the last quote, there are no real "failure cases" and, really, no "model". Instead, there's some applied math.

But, I'll keep reading:

> No AI system yet developed will magically fit every use case.

Well, the little algorithm for playing Nim fits "every use case".

> If a researcher tells you that a model got state of the art results out of the box, they're either exceedingly lucky or giving you a fairy tale version of reality.

No. As in a program to play perfect Nim, there is another explanation: They have a perfect solution for the problem.

Again, by this far into the article, the author seems to have in mind a limited view of what AI is.

> Even if their model is arbitrarily flexible, there are still fundamental limits placed on us by information theory.

Not for a program written using the Courant and Robbins algorithm. Information theory has essentially nothing to do with it.

Continuing:

> "How much training data is required?"

For the Nim program, none.

> "Can this work unsupervised (= without labeling the examples)?"

Well, since there are no examples, apparently so.

> "Can the system predict out of vocabulary names?"

WHAT? I doubt I understand the question. But with the Nim program, maybe so.

The author seems to be assuming that the program is about natural language understanding. Not all computer programs are.

> "How stable is the model's performance over time?"

With the Nim program, perfectly stable.

> Very few datasets are entirely representative of what the task looks like in the real world.

"Dataset"? The Nim program doesn't have one.

Again, it appears that the author is thinking of a narrow view of AI, much more narrow than the definition the author gave.

> Any claim of advanced research without publication is suspect at best

> "If you do research in isolation, the quality goes down. Which is why military research sucks."

> Yann LeCun at ICML 2015

Uh, sorry, but I've done all my original research "in isolation" -- in a small, quiet room, sometimes on my back looking at the ceiling. For my Ph.D. dissertation in applied math, the first of the research was done on an airplane ride from NY to TN. The rest of the research was done independently in my first summer of graduate school.

> The rate of change in the field of AI is such that anyone on the sidelines is at best keeping up.

No, not really: I call my research applied mathematics, but it meets the author's definition of AI. But I pay essentially no serious attention to the field of AI (e.g., the work of people who claim to be in AI) at all. And the number of researchers in the field of AI, computer science, and information technology entrepreneurship that have the mathematical prerequisites to read my research is tiny. So, I'm not paying attention to the field of AI, and AI researchers are paying no attention to my work, some of which is published and, in one case, for one important problem, really, a significantly large set of important problems, totally knocks the socks off anything in AI. That case? Sure, some applied math, right, complete with theorems and proofs. In more detail, the work is some new results in mathematical statistics.

> It is certainly not impossible for an entity to come out of nowhere with an amazing system but it is far less likely.

Naw! Essentially the opposite is true: If you want "an amazing system", usually stay far away from AI -- just totally ignore the field. Instead, pursue applied math. E.g., molecular spectroscopy for identifying chemical molecules is terrific stuff, but it has essentially no "training data". Instead, the work is based on some quantum mechanics and group theory.

Again, the author is thinking in a way quite narrow compared with the author's definition of AI and compared with what is done to build "an amazing system".

> It also means they haven't been put through the standard evaluations that academia would have placed on them.

Well, as I've seen in some of my peer-reviewed publications, my work in applied mathematics is more advanced and with standards commonly higher, e.g., theorems and proofs, than the academic review process is able to follow.

Besides, any work I publish means I've given away the intellectual property, and now mostly I don't want to do that.

> AI won't save a broken business plan. An easy upper bound is asking if the business plan would work with free human labor replacing the automated component.

Going back decades, for many cases of "an amazing system", the idea that human labor could do the work at all is essentially absurd. Here the author is being narrow again.

> Achieving human level performance on any real world task is an exceedingly difficult endeavor when moving away from a few simple and well defined tasks.

No: Totally knocking the socks off human-level performance is very common in applications of applied mathematics to complicated real problems, e.g., critical mass calculations for nuclear weapons; target detection via phased-array passive sonar, with Wiener filtering, or maybe Kalman filtering, to track the target; differential game theory for saying how a missile should catch a fighter plane; deterministic optimal control theory for, say, least-time control to get an airplane to a given altitude; matching anti-ballistic missiles to incoming warheads; and a huge range of applications of deterministic optimization, e.g., linear programming and integer linear programming, going way back to, say, even vacuum tube computers.

> All of this is to say that if the business plan doesn't work with free humans, AI won't save it.

Well, for my startup, there's no way humans could do the crucial, core applied mathematics, but my software based on my original applied math works just fine!

Again the author is thinking about cases of "an amazing system" much more narrow than the author's definition of AI and much, much more narrow than good work in applied math going back at least 50 years.


I have not read the entire thing, but skimming I found references to math and an "amazing system", so, are you Stephen Wolfram?


No.



