Stephen makes a lot of good points. And the list of questions he says people should ask is also useful. I have a few to add:
* What algorithm(s) are you using? Nobody should be able to brag publicly about their AI without simply naming the algorithms.
* What is your objective function? That is, what are you trying to optimize; what value are you trying to maximize or minimize? Is it classification error? Is it measuring similarities on unstructured and unlabeled data through reconstruction? (See the sketch after this list.)
* Do you already have the data that you can train on to minimize that error? If not, do you have a realistic plan to gather it? Have you thought about how you'll store it and make it accessible to your algorithms?
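To make the objective-function question concrete, here is a minimal sketch (toy numbers of mine, purely illustrative, not tied to any particular framework) contrasting a classification loss with a reconstruction loss:

```python
import numpy as np

# Toy predicted class probabilities and true labels, purely for illustration.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
labels = np.array([0, 1, 1])

# Objective 1: classification error, here measured as average cross-entropy
# (lower is better); this is what a supervised classifier minimizes.
cross_entropy = -np.mean(np.log(probs[np.arange(len(labels)), labels]))

# Objective 2: reconstruction error on unlabeled data, as an autoencoder
# would minimize; x_hat stands in for the model's reconstruction of x.
x = np.random.rand(3, 5)
x_hat = x + 0.01 * np.random.randn(*x.shape)
reconstruction_mse = np.mean((x - x_hat) ** 2)

print(f"cross-entropy: {cross_entropy:.3f}, reconstruction MSE: {reconstruction_mse:.5f}")
```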
We have a series of questions we suggest that people ask themselves when approaching a machine-learning or deep-learning problem:
http://deeplearning4j.org/questions.html
I talk to a lot of startups making claims about their "AI", and I can't stop them from jumping on the bandwagon, but not all have the ability to build a machine-learning system and gather the data it needs.
Every good thing gets hyped, but that doesn't make it less good. It just means readers have more work to do... Like Francois Chollet, I happen to believe that AI is even bigger than the hype, but in ways that the hype can't imagine yet.
https://twitter.com/fchollet/status/751483046436548608
Great additions. I can't believe I missed "what algorithm(s) are you using?" - it didn't even occur to me that people could skip naming their algorithms yet still be taken seriously, though that was (obviously) the case with the company I referred to in the article, given they pitched AI when they had none.
I agree with you and Francois about the future potential enabled by AI. I do think we're at a stage like the early days of the internet: we can see it enables many things, but we're not very accurate at predicting more than a year or two into the future.
Hype is just noise distracting us from the important contributions and issues we should be concerned over. I'm more worried about the immediate future of ML algorithms being misused, resulting in modern day redlining[1] and so on, than I am about any SkyNet prophecy.
I agree with you about redlining. I hadn't really thought about that... A recent scandal involving a predictive analytics firm called Northpointe shows how stupidly people use algorithms, and how much they are abused when there is no transparency: https://www.propublica.org/article/machine-bias-risk-assessm...
The scandal is that ProPublica wrote a story which directly contradicts their own statistical analysis. Their analysis was unable to show statistically significant bias (and had other flaws that might be genuine mistakes - no multiple-comparisons correction, possible misinterpretation of the model).
The OpenAI team would be best placed to comment, but I've not seen them note any specific intent[1] to research topics related to fairness, accountability, and transparency in ML. I use those words specifically as FAT ML 2015 was part of ICML 2015[2]. While a great start, that area of research doesn't get anywhere near the attention it should, given the gravity of the potential findings.
These issues are already in existence. Given ML systems are being used to filter resumes, decide whether someone should get a loan (and how much their interest rate should be), and find matches on dating services, the fact that ML systems may inadvertently feature racial or gender bias is hugely disturbing. This is likely to only get worse over time as more and more systems feature ML components.
As a simple example, word vectors are used by just about every deep learning system, and recent research has found gender and racial stereotypes strongly embedded in them[3]. There's a very confusing debate as to what this implies for the data, the models, and the predictions[4]... :S
I wish I were making those examples up, but they're straight out of their analysis. They also focus only on gender stereotypes, but you can imagine how many similar issues might be hidden away just beneath the surface.
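For anyone who wants to poke at this kind of bias themselves, here's a minimal sketch of the analogy probing such papers describe, using gensim with a pretrained word2vec-format file; the file path and the specific word pairs are placeholders of mine, not examples quoted from the research.

```python
from gensim.models import KeyedVectors

# Placeholder path: any pretrained word2vec-format embedding file will do.
vectors = KeyedVectors.load_word2vec_format("embeddings.bin", binary=True)

# "man is to doctor as woman is to ?" -- analogy via vector arithmetic.
# With biased embeddings, completions like these often skew toward
# stereotyped occupations.
print(vectors.most_similar(positive=["woman", "doctor"], negative=["man"], topn=5))
```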
Even if OpenAI indicated they were explicitly interested in that direction, which afaik they haven't, it's still an area that should be of interest to the field broadly. OpenAI is still a small team and I'm certain they'd appreciate the help, especially as they already have a lot on their plate :)
A disaster scenario for me is to have ML systems help reinforce the negative aspects of our society, conveniently hidden in a black box which can never be properly inspected.
According to this paper https://arxiv.org/pdf/1606.06565.pdf, OpenAI (John Schulman and Paul Christiano) explicitly indicate interest in researching transparency and strongly support work on fairness.
Thanks for the response, I had misinterpreted what you meant by redlining. I thought you were referring to the unequal distribution of the technology among various populations (access) as opposed to the reinforcement of existing redlines (misuse).
> What algorithm(s) are you using? Nobody should be able to brag publicly about their AI without simply naming the algorithms.
What if the algorithms are original from, say, new theorems and proofs in applied math, so far have no names, and, really, are not related to well-known AI algorithms at all?
Whatever they are, they're either similar to algorithms that some human has used previously, or they employ mathematical ideas that others would have named and understood. So you could explain it by saying it's like X but with Y modifications.
Right. But in practice here I have to expect, from the rest of what the author wrote, that what he meant was one of supervised learning, unsupervised learning, machine learning, convolutional neural networks, natural language processing, computer vision, etc.
There is a simple antidote for hype. It doesn't require questions, and it will mute it completely. Just ignore the hyped word and see if there is anything there. The author does this throughout the article, but the business plan argument is the best example:
> asking if the business plan would work with free human labor replacing the automated component
Just ignore the word AI, and measure the value of the business or the research or the software for what it is. That's it.
But the author still resorts to the word AI as a common reference. This is what feeds hype, because we introduce it with every mention.
Hype isn't about substance. It's about the energy and excitement infused into these words that tends to build up. That's why we call them buzz words. People who know nothing technical about the subject begin to react to them, but they will not react to a different word.
Salesmen will always proceed to monetize this attention and excitement with things people can click or buy or say yes to. The market is driven by hype. Whether something actually happens in the field will have nothing to do with hype, but most money in a capitalist market is still driven by it. So although hype may be distracting for those doing real science or real entrepreneurship or real work, it's your best shot for promoting whatever it is you're selling, be it to consumers, to investors, or for funding.
I've seen this cut both ways. Particularly in neural nets for video, which I'm working on, datasets such as HMDB51 are extremely difficult, to the point where they may not represent the difficulty of the real-world application. Many vision tasks in the real world can actually be constrained by camera angles, assumptions based on experience about what kinds of objects are in the frame, lighting considerations, or other things that make the real-world task actually easier than the dataset.
> AI won't save a broken business plan.
Best point in the article and the thing I worry most about day to day. Even if your AI is crazy good, if you don't have a market and people willing to pay for your product then it doesn't matter.
> Many vision tasks in the real world can actually be constrained by camera angles
Great example - because it can be extended to show that even human brains have great difficulty with that. Just turn the scene upside down and see how the brain has to go into overdrive and still misses a lot of stuff. There are quite a few optical-illusion photos based on that, for example the face that seems to smile - until you flip it.
> There are quite a few optical-illusion photos based on that, for example the face that seems to smile - until you flip it.
Ah, the Thatcher effect! [0][1] Although IMO this is a bad example, because (er, AFAIK) we don't have the same sort of hardcoded facial-feature-recognizer systems in computer-vision neural nets as we do in boring old hew-mons. The Thatcher effect works on people with prosopagnosia [2] ('face blindness', people who have to work out who you are by your clothing, gait, voice and body language). These findings, taken together, showed that the known full-face detector in humans, the fusiform gyrus, is aided by other systems that recognize features of a face (eyes, mouth) independently of that face's vertical orientation.
[1] http://static.independent.co.uk/s3fs-public/styles/story_med... <- here's the trope-namer, straight outta the 80s. The effect is much more pronounced in this image than in the wiki one; it's way more obvious that the mouth/eyes are 'properly oriented' despite being upside down.
The brain is amazing in that if you, for example, feed it images that have been flipped by inversion glasses, it will learn on its own either to do the proper correcting transformation or to compensate for the distortions further up.
Check out neural prostheses to see how clusters of neurons in the brain are just excellent at learning whatever signals you throw at them on the fly.
The brain does not really have difficulty with camera angles. Almost every single optical illusion / visual trick / visual quirk is actually an example of how incredibly clever and well-optimized the brain is.
The fact is the brain takes convenient shortcuts whenever possible. The ability to detect smiles is indeed conducive to social behavior in humans. But the ability to distinguish smiles on upside-down faces is not particularly useful for anything in real life. It's just an interesting quirk. The brain just recognizes the smile and skips any additional (useless) post-attentive processing.
Yes, the underlying "magic" is decades-old algorithms, often relatively simple concepts like LMS or nearest neighbors. The ML isn't the exciting part; it's the application. The pieces are finally in place: simple acquisition of training/test data (big data), powerful computing for the masses, larger crossover of ML into the computing/robotics fields. There is much hype, but also much on the cusp of changing huge parts of human life.
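As a reminder of just how simple some of that decades-old "magic" is, here's a complete nearest-neighbor classifier in a few lines of numpy; the data is a toy example of mine, purely illustrative.

```python
import numpy as np

def nearest_neighbor_predict(train_x, train_y, query):
    """Classify `query` with the label of its closest training point (1-NN)."""
    distances = np.linalg.norm(train_x - query, axis=1)
    return train_y[np.argmin(distances)]

# Toy data: two 2-D clusters labeled 0 and 1.
train_x = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_y = np.array([0, 0, 1, 1])
print(nearest_neighbor_predict(train_x, train_y, np.array([0.95, 0.9])))  # -> 1
```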
This is generally overblown. What has mainly happened so far is that already-existing products have been incrementally improved by deep learning - speech recognition, spam detection, search ranking, etc. all worked reasonably well before deep learning came along; deep learning just improved them.
There aren't actually that many massively impactful image-recognition products out there yet (there are certainly products with the potential to be impactful, e.g. medical diagnostics, but it is too early to call).
> Achieving human-level performance on any real-world task is an exceedingly difficult endeavor when moving away from a few simple and well-defined tasks.
It's an exceedingly difficult endeavor to try to get all humans to produce "human-level performance" while doing simple and well-defined tasks. Evidence = Mechanical Turk.
An interesting thing to note there is that research shows that untrained individuals have significant variance in task performance which decreases as they get better at the task. Mechanical Turk might be an extreme case, since all tasks are done by untrained individuals.
Whilst I agree with the quote that AI has difficulties when moving away from well-defined tasks, so do we. Most expert performance is confined to narrow task domains, does not transfer well to other domains, and takes a long time to acquire. In a sense, that's not too dissimilar to AI.
I do disagree with the "simple". Games such as chess are the stereotypical example of things AI performs well at. Their rulesets are well defined, but the games themselves are incredibly complex. It is amazing that AI outperforms even grandmasters in this area.
> "How much does the accuracy fall as the input story gets longer?"
As a beginner in text classification, I find the contrary: if the sentence is very short, the result is more likely to be suboptimal, e.g. classifying short tweets vs. a detailed NYT article.
I intended these questions as broad examples, though some are specific to a given AI system or architecture. You're quite right in that it depends on the task and the model.
I was thinking specifically about neural machine translation (NMT) with the encoder-decoder architecture. The encoder-decoder architecture converts a sentence in language X into a fixed-size vector representation, then aims to "expand" that vector into the equivalent sentence in language Y. As the vector size is fixed, short sentences can be adequately represented, but longer sentences end up being lossily compressed. This realization illustrated a fundamental limitation of the encoder-decoder architecture and motivated the use of attention mechanisms.
There's a great figure (Figure 1) in Nvidia's Introduction to Neural Machine Translation[1] that shows the dramatic drop in accuracy with respect to the sentence's length.
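To make the bottleneck concrete, here's a minimal sketch of a toy encoder (random untrained weights and made-up dimensions, not any particular NMT system): however long the input sentence, the output is the same fixed-size vector, which is exactly why long sentences get compressed lossily without attention.

```python
import numpy as np

HIDDEN = 256   # size of the fixed "thought vector"; chosen arbitrarily here
EMB = 64       # per-token embedding size, also arbitrary

rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(HIDDEN, EMB + HIDDEN))  # untrained recurrent weights

def toy_encoder(token_embeddings):
    """Stand-in for an RNN encoder: fold a whole sentence into one vector."""
    state = np.zeros(HIDDEN)
    for emb in token_embeddings:
        state = np.tanh(W @ np.concatenate([emb, state]))
    return state

short_sentence = rng.normal(size=(5, EMB))   # 5 tokens
long_sentence = rng.normal(size=(80, EMB))   # 80 tokens
print(toy_encoder(short_sentence).shape, toy_encoder(long_sentence).shape)  # (256,) (256,)
```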
> Achieving human-level performance on any real-world task is an exceedingly difficult endeavor when moving away from a few simple and well-defined tasks.
Maybe related, or off topic:
I have developed a small messaging application in which the incoming messages are categorised by humans (instead of AI) to achieve fewer false positives.
https://www.formalapp.com
I admit this app's solution is not technically complex. And I understand that building an AI-based solution is complex and needs more expertise. But in this use case relying on human intelligence works - or am I missing something?
In a little more detail, the author is talking about cases of "an amazing system" but, really, has in mind a quite narrow view of such systems and how they work and can be, and have been, built. Really, the author is looking at less than, say, 5% of the great examples of how to build "an amazing system".
In more detail:
The article has its definition of AI:
> In this article, I won't quibble over definitions, simply taking the broadest term as used in the media: artificial intelligence is whenever a system appears more intelligent than we expect it to be.
Okay, from my usual understanding of AI, say, the peer-reviewed papers I published in AI, this is a relatively broad definition, but while considering this article, I'll accept that definition.
But, oops, right away in the sentence before we have
> If there's any promise I can make about the field of AI,
So, the author is saying that AI is a field. From the author's definition of AI, that is a very broad field indeed! Maybe the broadest of all?
Still I will try to consider the article. So, my next concern is:
> Ask what cases the system will fail on
> "Revealing the failure cases of your models is a way of avoiding over-hype"
> Kevin Murphy at ICML 2015
Now what counts as AI is starting to look quite narrow, narrow enough to have a "model" and "failure cases".
So, oops: Consider a simple computer program that plays the little matchstick game Nim: There are two players. The game starts with three piles of matchsticks. For a move, a player picks a pile and removes from it as many matchsticks as they want, but at least one. The players alternate moves. The player who picks up the last matchstick loses.
Okay, there is a simple algorithm for perfect play. Then, if both players play perfect games, who wins depends just on who moves first. If only one player, A, is playing a perfect game, and in that instance of the game A would lose against perfect play by the other player, B, then A will still win as soon as B makes a mistake.
How to play a perfect game is in Courant and Robbins, What Is Mathematics?
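For the curious, the perfect-play rule fits in a few lines; this is my own sketch of the standard nim-sum (XOR) strategy, adjusted for the misère rule described above, where whoever takes the last matchstick loses.

```python
def perfect_nim_move(piles):
    """Return (pile_index, new_pile_size) for a perfect move in misere Nim,
    where the player who takes the last matchstick loses. Assumes the game
    is not over, i.e. at least one pile is non-empty."""
    nim_sum = 0
    for p in piles:
        nim_sum ^= p
    big = [i for i, p in enumerate(piles) if p >= 2]
    ones = sum(1 for p in piles if p == 1)

    if len(big) == 1:
        # Endgame: shrink the single large pile to 0 or 1 so that the
        # opponent is left facing an odd number of one-stick piles.
        i = big[0]
        return i, (1 if ones % 2 == 0 else 0)

    if len(big) >= 2 and nim_sum != 0:
        # Winning position: the usual Nim move, make the nim-sum zero.
        for i, p in enumerate(piles):
            if p ^ nim_sum < p:
                return i, p ^ nim_sum

    # Either every pile has at most one stick (so taking a whole pile is the
    # only legal move) or the position is already lost: make any legal move.
    i = max(range(len(piles)), key=lambda j: piles[j])
    return i, piles[i] - 1


# Example: piles of 3, 4 and 5 matchsticks.
print(perfect_nim_move([3, 4, 5]))  # (0, 1): take two sticks from the first pile
```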
So back to the last quote: there are no real "failure cases" and, really, no "model". Instead, there's some applied math.
But, I'll keep reading:
> No AI system yet developed will magically fit every use case.
Well, the little algorithm for playing Nim fits "every use case".
> If a researcher tells you that a model got state of the art results out of the box, they're either exceedingly lucky or giving you a fairy tale version of reality.
No. As in a program to play perfect Nim, there is another explanation: they have a perfect solution for the problem.
Again, by this far into the article, the author seems to have in mind a limited view of what AI is.
> Even if their model is arbitrarily flexible, there are still fundamental limits placed on us by information theory.
Not for a program written using the Courant and Robbins algorithm. Information theory has essentially nothing to do with it.
Continuing:
> "How much training data is required?"
For the Nim program, none.
> "Can this work unsupervised (= without
labeling the examples)?"
Well, since there are no examples,
apparently so.
> "Can the system predict out of
vocabulary names?"
WHAT? I doubt I understand the question.
But with the Nim program, maybe so.
The author seems to be assuming that the program is about natural language understanding. Not all computer programs are.
> "How stable is the model's performance over time?"
With the Nim program, perfectly stable.
> Very few datasets are entirely representative of what the task looks like in the real world.
"Dataset"? The Nim program doesn't have one.
Again, it appears that the author is thinking of a narrow view of AI, much more narrow than the definition the author gave.
> Any claim of advanced research without publication is suspect at best
> "If you do research in isolation, the quality goes down. Which is why military research sucks."
> Yann LeCun at ICML 2015
Uh, sorry, but I've done all my original research "in isolation" -- in a small, quiet room, sometimes on my back looking at the ceiling. For my Ph.D. dissertation in applied math, the first of the research was done on an airplane ride from NY to TN. The rest of the research was done independently in my first summer in graduate school.
> The rate of change in the field of AI is such that anyone on the sidelines is at best keeping up.
No, not really: I call my research applied mathematics, but it meets the author's definition of AI. But I pay essentially no serious attention to the field of AI (e.g., the work of people who claim to be in AI) at all. And the number of researchers in the fields of AI, computer science, and information technology entrepreneurship who have the mathematical prerequisites to read my research is tiny. So, I'm not paying attention to the field of AI, and AI researchers are paying no attention to my work, some of which is published and, in one case, for one important problem -- really, a significantly large set of important problems -- totally knocks the socks off anything in AI. That case? Sure, some applied math, right, complete with theorems and proofs. In more detail, the work is some new results in mathematical statistics.
> It is certainly not impossible for an entity to come out of nowhere with an amazing system but it is far less likely.
Naw! Essentially the opposite is true: if you want "an amazing system", usually stay far away from AI -- just totally ignore the field. Instead, pursue applied math. E.g., molecular spectroscopy for identifying chemical molecules is terrific stuff, but it has essentially no "training data". Instead, the work is based on some quantum mechanics and group theory.
Again, the author is thinking in a way quite narrow compared with the author's definition of AI and compared with what is done to build "an amazing system".
> It also means they haven't been put through the standard evaluations that academia would have placed on them.
Well, as I've seen in some of my peer-reviewed publications, my work in applied mathematics is more advanced, and with standards commonly higher (e.g., theorems and proofs), than the academic review process is able to follow. Besides, any work I publish means I've given away the intellectual property, and now mostly I don't want to do that.
> AI won't save a broken business plan. An easy upper bound is asking if the business plan would work with free human labor replacing the automated component.
Going back decades, for many cases of "an amazing system", the idea that human labor could do the work at all is essentially absurd. Here the author is being narrow again.
> Achieving human level performance on any real world task is an exceedingly difficult endeavor when moving away from a few simple and well defined tasks.
No: Totally knocking the socks off human-level performance is very common in applications of applied mathematics to complicated real problems, e.g., critical-mass calculations for nuclear weapons; target detection via phased-array passive sonar, also with Wiener filtering to track the target, maybe with Kalman filtering; differential game theory for saying how a missile should catch a fighter plane; deterministic optimal control theory for, say, least-time control to get an airplane to a given altitude; matching anti-ballistic missiles to incoming warheads; and a huge range of applications of deterministic optimization, e.g., linear programming and integer linear programming, going way back to, say, even vacuum-tube computers.
> All of this is to say that if the business plan doesn't work with free humans, AI won't save it.
Well, for my startup, there's no way humans could do the crucial, core applied mathematics, but my software based on my original applied math works just fine!
Again the author is thinking about cases of "an amazing system" much more narrow than the author's definition of AI and much, much more narrow than good work in applied math going back at least 50 years.