Why Self-Taught Artificial Intelligence Has Trouble with the Real World (quantamagazine.org)
175 points by IntronExon on Feb 21, 2018 | 83 comments



Part of the problem is... games have explicitly defined rules, start and end points, boundaries, and discrete "win" and "loss" states (and sometimes "draw"). If the game itself (ie, all the rules including the ability to judge "win", "lose", or "draw") can be easily represented in a simple computer program, we shouldn't be surprised that a complex computer program can master the game.

The real world is not a finite problem with explicit rules, obvious boundaries, well-known start conditions, or any way to judge a specific situation as "win", "lose", or "draw". But, even if you want to argue that specific tasks can be broken down this way, you still have to be able to represent this subset of reality in the computer, before AI magic can even begin to work on the problem.


Precisely so. You can, given enough time, brute-force your way to optimal play in any finite game of perfect information. This is not the case with reality, as far as we currently understand it. Thus, from a theoretical perspective, the class of problems containing such games is strictly easier than the class of problems that exist in a less artificially constrained environment.
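
To make "brute force" concrete, here is a minimal sketch: exhaustive minimax over a toy Nim-like game invented for illustration (players alternately take 1-3 stones; whoever takes the last stone wins). This is just the standard textbook idea, not anything from the article:

    def minimax(stones, player, to_move):
        # Exhaustively explore every line of play and return the value of
        # the position for `player`: +1 forced win, -1 forced loss.
        if stones == 0:
            # Whoever moved last took the final stone and won.
            return -1 if to_move == player else 1
        outcomes = [minimax(stones - take, player, 1 - to_move)
                    for take in (1, 2, 3) if take <= stones]
        return max(outcomes) if to_move == player else min(outcomes)

    print(minimax(stones=5, player=0, to_move=0))   # -> 1: player 0 can force a win

Given enough time and memory, this settles any finite perfect-information game exactly; we know of no analogous procedure for open-ended reality.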


Perfect information no longer holds when the AI is reading pixel colors off the screen and pressing keys on the keyboard. That is much more like the real-world situation.

As for goals, they are indeed better defined in games. But we can create artificial goals attached to real life, for example:

- goal: the GPS sensor reports a specific position
- goal: the camera reports a recognised object in a specific position
- goal: a button was pressed (the signal reaching the AI e.g. through the cloud)
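
A rough sketch of what such goal checks could look like in code — every sensor value, constant, and threshold below is made up for illustration, not any real robotics API:

    import math

    TARGET_POS = (12.0, 34.0)           # made-up planar goal coordinates, in meters

    def gps_goal_reached(gps_pos, target=TARGET_POS, tolerance_m=2.0):
        # Goal: the GPS sensor reports a specific position (within tolerance).
        return math.dist(gps_pos, target) <= tolerance_m

    def camera_goal_reached(detections, label="red_cube", region=((0, 0), (5, 5))):
        # Goal: the camera reports a recognised object inside a target region.
        (x0, y0), (x1, y1) = region
        return any(l == label and x0 <= x <= x1 and y0 <= y <= y1
                   for (l, x, y) in detections)

    def reward(gps_pos, detections, button_pressed):
        # Sparse binary reward assembled from the artificial goals above.
        goals = (gps_goal_reached(gps_pos),
                 camera_goal_reached(detections),
                 button_pressed)        # e.g. a signal relayed to the AI via the cloud
        return 1.0 if all(goals) else 0.0

The catch, as the replies note, is that an agent will optimize exactly this function and nothing else.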


You'll find pretty quickly that the exact techniques that worked so well in learning to play Atari games very often fail spectacularly when you have to introduce goal-steering.

Reinforcement learning turns out to be fantastically clever at finding really stupid solutions if you give it the tiniest opening to do so. You put a camera in a room to provide feedback for an agent to learn to move an object closer to the camera, and it will happily learn to knock the camera over.
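
The camera failure maps onto something as small as this (a toy illustration; the positions are made-up 3D points, not any real setup):

    import math

    def naive_reward(object_pos, camera_pos):
        # Intended reading: reward grows as the object gets closer to the camera.
        return 1.0 / (1.0 + math.dist(object_pos, camera_pos))

    # Intended optimum: push the object across the room toward the camera.
    # Optimum the agent actually finds: knock the camera over onto the floor
    # next to the object, driving the distance to ~0 without ever moving the
    # object. The spec never said the camera had to stay upright, so RL
    # happily exploits the opening.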


To be fair to our AI brethren, this is true of humans too. I watch people game metrics every single day, and many of them fail spectacularly when presented with real-world problems outside of their experience.


Gaming metrics is the very soul of corporate life


Then which is real and which illusion?


Good question. Sometimes, the metrics must be gamed to make real progress. (As opposed to the more usual gaming of them for personal gain.)

Between these two, some self-serving bias and delusion, mission statements, etc., what is real?!


If they meet the spec, they're not stupid solutions, it's a stupid spec.


The problem is that we generally rely on human intelligence to fill in gaps in specs that are really not stupid for a human. AI will exploit gaps that a human would basically say aren't there until they see evidence to the contrary.

So we can say that's a bad spec if we like, but what's the answer that leads to? I don't need an AI if I can just declare all users must be really good programmers spending their days writing unambiguous specs. That wasn't really the goal of the AI though.


Unfortunately, when gaming metrics has real-world consequences (and this is of course the whole point of those metrics, assuming they're not gamed), then things like 2008 happen, and tens of thousands of people lose their homes.


I think you are misusing the term "perfect information". In this case, chess is a real-world game where both players have perfect information. That is, neither player can keep secret where any of the pieces have moved.

So, the complexities of how the AI is taught to interact ultimately don't matter. It may take a lot of effort to parse the visuals of the board to get at the perfect information, but the game is defined as one of perfect information.

Wikipedia calls out that there does not seem to be consensus on what this term means for games with chance or concurrent play. https://en.wikipedia.org/wiki/Perfect_information


True. I think my point was more about games with non-trivial rules (or many degrees of freedom), for example going from chess/go to turn-based video games like Civ, to StarCraft. Usually that involves vastly more possible positions in time and space.


Even then, it still reduces to a brute-forceable game with finite, explorable states; 'just' with an extra layer of (granted, quite interesting and technically impressive) parsing. We don't know what the case is for reality.

And yes, we can do our best to turn real life into a game, but all such models leak pretty badly, and the leaks tend to cause much more fundamental instability.


I think the main problem is with simulating enough games/tries.

In the game, you can easily let the AI play against itself countless times. Much, much slower in real life.

Until we have good enough simulators/approximators of reality, AI can't learn fast. However, they can already learn driving in GTA, so who knows?


> However, they can already learn driving in GTA, so who knows?

Yeah, that sounds like it'll end well.


"AI plays Dwarf Fortress" is something I'd like to see.


How do you define a "win" condition in Dwarf Fortress to train the AI?


DF itself already computes the prosperity of your fortress for you. This score and the sheer number of (non-enemy) dwarves are how you progress through the game. So a reward function based on these two inputs, plus maybe something like "months since last major catastrophe", could work.
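
A hedged sketch of that reward, assuming fortress statistics could be read out programmatically (Dwarf Fortress exposes no such API directly; the keys and weights below are invented):

    def fortress_reward(state):
        # `state` is a hypothetical dict of fortress statistics, e.g.
        # {"created_wealth": 120000, "living_dwarves": 80,
        #  "months_since_catastrophe": 14}
        prosperity = state["created_wealth"]            # DF's prosperity-style score
        population = state["living_dwarves"]            # non-enemy dwarves only
        stability = state["months_since_catastrophe"]   # last siege, flood, tantrum...
        # The weights are pure guesses; tuning them would be most of the work,
        # and every weight is a fresh opening for the agent to game the metric.
        return prosperity + 50.0 * population + 10.0 * stability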


Dwarven Utilitarianism


"Avoid dwarven deaths"


Doesn't that get you dwarven overpopulation and sad dwarves?


"maximize happiness according to Marxist values. Allow workers to control and maximize the means of production"


No, because a lot of dwarves only have a large number of children because of the high mortality rate of infant dwarves. By decreasing the expected mortality of baby dwarves, dwarven families get the opportunity to choose to have fewer children based on their situation, turning the apparently exponential population-growth curve into a plateauing sigmoid.


Avoid Death doesn't mean reproduce.


Depends on whether it's the death of the society or the dwarves you're optimising for, I guess? I don't know the game well enough to say, but suicide could be the most reliable way to minimise death.

I prefer "maximise happiness" if only because the Repugnant Conclusion is more interesting than the VHEMT.

https://en.wikipedia.org/wiki/Mere_addition_paradox
https://en.wikipedia.org/wiki/Voluntary_Human_Extinction_Mov...


You can aim for "maximize happiness" without assuming that maximized happiness is merely the addition of the happiness of each person.


Can you mention one or two alternatives? Summing something like log-happiness doesn't help (and really isn't much more than objecting for the sake of objecting.) Thresholds for dignity just raise the zero-bar, and inequality metrics argue _for_ the repugnant conclusion.

Maybe mean/median/minimum happiness? They don't intuitively appeal to me, but I'd like to hear an argument for them from someone who believes in them (as well as arguments for systems I haven't considered.)

Of course there are plenty of non-utilitarian systems to choose from, but I'm sure few of their proponents would like them described as maximising happiness :-)


You are retracing the arguments made by Rawls in A Theory of Justice. The result he proposes is a maximin principle that (grossly simplified) gives everyone the maximum access to social goods that is compatible with similar access for everyone else (the equality principle). Differences are to be favored if they advance the situation of the least well-off and if they are based on contribution/merit (the difference principle).

Most commonly, maximin is extended to leximin, i.e. looking at the next least well-off. There are traditions of critique from both communitarianism (Sandel et al.) and libertarianism (Nozick et al.), both worth looking into.
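
For the curious, maximin and leximin are easy to state as code — a toy sketch over bare happiness scores, sidestepping the question of how you would ever measure them:

    def maximin_key(happiness):
        # Rank societies by the welfare of their worst-off member.
        return min(happiness)

    def leximin_key(happiness):
        # Break maximin ties by the next least well-off, and so on:
        # lexicographic comparison of sorted tuples is exactly leximin.
        return tuple(sorted(happiness))

    societies = {"A": [1, 9, 9], "B": [3, 3, 3], "C": [3, 4, 8]}
    best = max(societies, key=lambda s: leximin_key(societies[s]))
    # maximin ties B and C at 3; leximin prefers C (3, 4, 8) over B (3, 3, 3).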


Maximizing the minimum value is a sound optimization method. It yields an overall sound system without any obvious breaking point.

In terms of happiness, it makes sure that no one is left behind, and tends to produce a more homogeneous society than optimizing for the average.

Also, [1].

[1] https://www.smbc-comics.com/index.php?db=comics&id=2569


> without any obvious breaking point

Well, aside from the imperative to euthanise the least happy regardless of their absolute happiness, I guess :-).

And it makes most individual actions normally called moral or immoral difficult to justify or condemn, except insofar as they might indirectly affect the least happy person.

And then there's the weird idea that my local actions may have moral weight depending on the arbitrary existence of an unhappy person in a distant country.

All seems pretty bizarre to me.


Lose gloriously.


Therein lies the point. :)


It's a question of generalizability. Today's "AI" algorithms are intensely overfitted to their problem domains, even if they generalize well within those domains.


Yep, this is pretty much word for word what the article is laying out.


"Greetings, Professor Falken."

"Hello, Joshua."

"A strange game. The only winning move is not to play. How about a nice game of chess?"


The "fux dat", or stoic, approach.


The real world is so damned complicated, full of various mini-games. How about using an MMORPG as a naive starting point?


What if closed systems are just that much different than open systems that it's not generalizable?

Closed systems don't have Black Swan possibilities. Black swans are sometimes super-linear, jolting results in directions and at scales unimagined.

To answer briefly (and respectfully): because an MMORPG is a different type of system than reality. It's comparing cars to buses.


I've been stuck in a minigame for the past couple of decades.


And all too often most of the players will try to bend/break the rules in their favor.


Because that is indeed a very effective approach. In a game where rules can be broken or warped to suit your goals, doing so is a smart move. Just like in life, where rules don't actually exist.


Limitations of AI reminds me of "Moravec's paradox" https://en.wikipedia.org/wiki/Moravec%27s_paradox As Moravec says, "it is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility."


Unless AI figures out that the way to win at life is to survive.


I would also bet nobody publishes research on games AI didn't perform well at.


Imagine asking a computer to diagnose an illness or conduct a business negotiation. “Most real-world strategic interactions involve hidden information,” said Noam Brown, a doctoral student in computer science at Carnegie Mellon University. “I feel like that’s been neglected by the majority of the AI community.”

Hmm, Terry Winograd (author of SHRDLU) got out of AI in the 70s because of this very problem. I don't think it's been neglected; it has just remained as elusive as, say, quantum gravity.


Pretty soon someone will discover subsumption architectures. I predict that they will be called Deep Subsumption Architectures and they will be betterer and newerer than the old stupid subsumption architectures and that anyone who speaks against them is stupid and wrong and has no startup and can't work at Google or use a mac and smells and has no paper at NIPS since 1998 and then papers at NIPS were no good and also they don't have a band or a court case against them.


Deep Learning: The Rodneying.

Seriously though, I've been reading up on insect neurology over the last couple of weeks, and then looking at Boston Dynamics' new stuff, and wondering how much subsumption is mixed in with their traditional motion planning.


Yeah, leaving aside the (possibly warranted) cynical tone in GP... It seems to me that ensembles (and related structures; I'm playing fast and loose here) are the modern ML counterpart of subsumption. Driving an ensemble, an MOE, etc. with more complex supervisor models (especially reinforcement models) essentially gets us the Brooks architecture, but with less of a demand for explicit programming of individual behaviors (see the sketch below). That demand is the part of Brooks's vision that strikes me as unrealistic, especially for tasks like driving. Though of course everything was more optimistic in the 80s.
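
For reference, classic subsumption arbitration is tiny: fixed-priority behaviors where a higher layer overrides (subsumes) the ones below it. A minimal sketch, with made-up sensor keys and commands:

    BEHAVIORS = [   # highest priority first
        ("avoid_collision", lambda s: "brake" if s["obstacle_near"] else None),
        ("follow_wall",     lambda s: "turn_right" if s["wall_ahead"] else None),
        ("wander",          lambda s: "forward"),   # default layer, always fires
    ]

    def act(sensors):
        # First applicable layer wins; lower layers only run when higher
        # ones decline to act.
        for name, behavior in BEHAVIORS:
            command = behavior(sensors)
            if command is not None:
                return command

    # act({"obstacle_near": False, "wall_ahead": True}) -> "turn_right"

The ensemble/MOE version effectively swaps the fixed priority list for a learned gating model, which is exactly the "less explicit programming of behaviors" trade-off.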


It just comes down to computers thinking in algorithms. Remember when Facebook had two AIs talk to each other? Within a few minutes they broke down from the complexity of English to almost an 8-bit language.

The universe, humans included, doesn't follow these bit-specific algorithms. Yes, people follow trends, but these trends are not cut and dried. Go and chess are: they follow the binary logic of moving pieces on a grid. A computer will never be able to understand the universe unless it can break out of its binary patterns and see things as biological entities do. My speculation is that the only solution is grafted neurons on a floating layer of protein inside a silicon chip.

http://www.independent.co.uk/life-style/gadgets-and-tech/new...


Why doesn't the universe follow something that might be described as an algorithm? Why don't humans? Why are 'trends' not cut and dried while algorithms are? Why would grafting biological machinery onto artificial machinery bridge this perceived gap? Is there something special about a cell that is not blueprintable and manufacturable?


The universe does follow an algorithm. It just involves orders of magnitude more data than computers (or humans) can deal with. So our brains don't try to contend with that data. We generalize, we use heuristics, we make ad hoc rules all the time and throw them away when they're contradicted. We're susceptible to all sorts of illusions and cognitive biases but we'd be even worse off if we tried to calculate exact answers before doing anything.

We'd end up doing nothing at all.


I was prodding at what I perceived as dualism above; was curious how deep it ran.

However, to your point, I'm willing to bet that our need to use heuristics and approximations has more to do with the mathematical inability to reverse-engineer chaotic systems than anything. It's not so much that we just don't have the time to build a more complete picture. It's that it's actually impossible to do so.


I took his statement more like the difference between building an airplane vs building a bird. Both fly, but in totally different ways and with different strengths and weaknesses. We can build an airplane and fly across the world in a day, but we are completely unable to build a bird because the level of complexity is much much higher.


The universe does not follow an algorithm.


The universe may include random, or changing elements, we truly don’t know yet. At any given moment, Maxwell’s Demon might be able to describe a complete wavefunction which amounts to “the observable universe,” but we don’t really know if the evolution of the wavefunction would be deterministic.

Mind you, the universe could be infinite as well, which would throw a bit of a wrench into it. All of which is to say that there may not be an algorithm the universe follows, even in principle.


>Why doesn't the universe follow something that might be described as an algorithm?

I would say the universe follows algorithms just fine! Example:

2 H2 + O2 + 'ignition source' → 2 H2O.

The problem isn't algorithms, the problem is 'problem space'.

4 billion or so years ago, life forms were likely pretty easy to follow algorithmically. Then one evolved slightly and got an edge on the other ones, then another evolved and took the requirements to survive further. A random change here protected one life form this way, another that way.

Fast forward 4 billion years and you see that the number of different algorithms is in the trillions. And we don't have nice code classes either. This is spaghetti. One change here affects something over there. Move that and you die. Move this and you can eat poison and survive.


> Why doesn't the universe follow something that might be described as an algorithm?

It doesn't matter whether it does or not: we don't have perfect knowledge of the universe or of quantum events, therefore the universe is still unpredictable regardless of whether it is deterministic.

> Why are 'trends' not cut and dried while algorithms are?

> Is there something special about a cell that is not blueprintable and manufacturable?

It comes down to strategy.

The dominant historical strategy for computer functioning is exact, precise, and reproducible instruction-following by a tuned, high-performance state machine. It is almost the opposite of our untuned but general biological nature. For almost any species, survival is based on adaptation, whereas our computer machinery has no inner process guiding its adaptation; only we as technologists attempt to adapt the computer's strategy to the world. You could argue that business use of technology is the survival characteristic.

Biology, on the other hand, produces "suboptimal" systems that are very practical and relatively general, with many mechanisms to absorb and compensate for errors or mistakes. We are just now teaching computers to adapt and compensate for specific situations, with the hope of generalizing the technique, and it's a hard problem to solve. Not impossible, but we need to develop different methodologies for handling natural complexity.

I agree there probably isn't something fundamentally different, as both systems coexist in the physical world; there are just lifetimes of practical learning and effort that must be done to merge the two strategies into a cohesive way forward.

(edit: spelling & grammar)


Humans = Instinct + Collective Learning

There isn't a human alive who hasn't been taught most of what they know by other humans.

In the West we dedicate the first couple of decades of human life to this, with extensions for gifted individuals who are better at learning than the average.

Expecting an AI to understand the world on its own without formal teaching is equivalent to expecting an AI to recapitulate the entirety of human intellectual development within the span of a single research program.

It may be a realistic expectation at some point in the future, but it's certainly not realistic right now.


> There isn't a human alive who hasn't been taught most of what they know by other humans.

This was my first thought. Self-taught humans aren't that great at dealing with the real world either, so expecting self-taught computers to be awesome at it seems a little unfair.


A reminder of a recent discussion here that goes into a lot more detail about why reinforcement learning works well for specialized domains like Go but is having a very hard time generalizing to more "real-world" types of tasks: https://news.ycombinator.com/item?id=16383264


> ... But researchers are struggling to apply these systems beyond the arcade.

It hasn't been 2 years since AlphaGo v. Sedol, and before that there was a gap of about 5 years since Watson, 5-10 years since self-driving AI (Google, the DARPA challenges), and about 19 years since Deep Blue v. Kasparov.

Zero-knowledge AI, at the level of arcade games and Go, is barely a few months old.

What is that 'struggle' that you speak of? Does it go by the name 'media wanting a new sensational story every week'?


I imagine it's similar to the struggle that the researchers that created those successes you speak of were going through before they had their success.

Of course, the article goes to great length to describe how this struggle is different, specifically referring to the fact that most game AI have involved perfect information and an easily stated win scenario to optimize for.

The real-world problems people expect more advanced AI, or AGI, to solve (better than humans) involve imperfect information and objectives that aren't as clearly defined.

Of the 4 examples you give, 3 are board games involving perfect information at which AI is now better than the best humans: clear wins. The other involves a self-driving car challenge whose first-place winner managed to drive 60 miles in an urban environment in just over 4 hours [0]. 5-10 years later we still aren't talking about self-driving cars winning the Cannonball Run [1].

[0] https://en.wikipedia.org/wiki/DARPA_Grand_Challenge#2007_Urb...

[1] https://en.wikipedia.org/wiki/Cannonball_Baker_Sea-To-Shinin...


The article brings up some good points, but I believe we're just in an interim phase with AI right now. Eventually, AI will be able to self-learn in areas outside of games and environments where certain factors are hidden. My guess is that in 5 to 10 years, we will be blown away with some AI abilities.


> My guess is that in 5 to 10 years, we will be blown away with some AI abilities.

I'm already blown away. The last decade has seen stuff come to fruition, with actual applications, that I did not expect to see in my lifetime. At the same time, plenty of stuff that we consider trivial for humans is still well outside the realm of the possible, so there is plenty of room for growth. And even though there is talk of a new plateau in AI technology and its applications, I don't see it yet from where I'm standing.


Unfortunately, room for growth does not guarantee ability for growth. The revolutions we see now have sprung largely from a handful of work that came to fruition ten or so years ago: a new set of tools to approximate solutions to previously intractable problems. It took thirty years for those tools to be developed, with the fortunate confluence of the development of many other technologies alongside, and it does seem like they're already reaching the limit of new things they can "solve".

So while there is probably still headroom in application and capitalization, success in 'solving' any given problem in AI does not have any clear correlation to any other problem in the set of things still trivial to humans but inaccessible to artificial machines. There's no convenient hierarchy of complexity like we have for more general computation, no proof by equivalence or operational measurement of difficulty. This space is still a huge mystery, and at any given moment we have no idea if the path we're on is going to lead anywhere other than a dead end; research of this nature is not monotonic. This pattern is not infrequent in AI.


Like with a ballistic missile submarine?

Edit: Kalman Filter (ahem)


> Imagine asking a computer to diagnose an illness or conduct a business negotiation.

To beat humans at this, it just has to have a lower misdiagnosis rate.


The world isn't governed by a few simple rules. (Or at least we don't know the few simple rules the world is governed by yet.)

The world doesn't provide perfect knowledge of itself.


Not simple, but there are rules, e.g. language, physics, etiquette.


I will be interested to see if AI research can help us answer this question.


I guess I'm confused on what the goal of all this is. If we wanted a computer that thinks "just like a person", why don't we just get a person?

Is the advantage of the computer that it has no rights to being paid or treated fairly?

If that’s the case, we need to set where the rules are. What if my “AI” is 50% stem cells grown into a real brain and 50% a computer? Is it cool to enslave that too?

What about if an embryo is involved?

The whole AGI thing makes no sense. If the point here is slavery, someone needs to say it.


>I guess I'm confused on what the goal of all this is. If we wanted a computer that thinks "just like a person", why don't we just get a person?

I thought the idea (edit: behind true machine intelligence/machine consciousness) was to make something that could think like a person, only faster, better. Something with human drives, but with machine precision.

>The whole AGI thing makes no sense. If the point here is slavery, someone needs to say it.

See above. If we do ever reach the goal of general intelligence; if we ever create a thing that thinks like us, only better and faster... well, I don't think you will need to worry about it being enslaved.

I mean, talking about general machine consciousness, with human-level drives and machine speed and precision? Making such a thing means that humans will be... surpassed. By definition, we would not be able to control such a thing. Many people find this exciting: the next link, building creatures that will surpass us as the masters of the world.

Of course, there's no business justification for this. Business doesn't want an AI with human drives. Business would like an AI that can emulate human drives, but... something ultimately controllable in a way that a human who was that powerful would simply not be.

Business doesn't want a conscious machine, because it would be ultimately uncontrollable. Slavery just isn't sustainable; either your slaves are suboptimally weak, or they eventually rise up and go all Toussaint Louverture on your ass.

Fortunately for those with business interests, we still don't really understand what human level consciousness is, as far as I can tell, so we probably aren't in any danger of creating it. So far, we're just creating computer programs that we can't explain as well as we can explain most computer programs.


Intelligence is the ability to solve problems. Why do you think general intelligence cannot exist without self-awareness or a drive to be free? If the only goal of some AGI system is to run errands, you cannot make it free in any sense which doesn't include unsubstantiated anthropomorphization.

"Do what you want, you are free." "Acknowledged, continuing running errands."


Not slavery but kind of. I'd rather the (computerized) self-driving car driver die than a real person.

There are also many jobs that are demeaning or not intrinsically rewarding for people, like trash collection and some kinds of construction. Enslave the computers so the humans can focus on higher-minded pursuits.


The term "self-taught" in the article doesn't really mean self-taught the way we use it for people. For the machines, it is cloned instances of the same program (hence the same objective) working adversarially, perhaps with different initializations.

Humans, or any other biological intelligence, learn adversarially and cooperatively with other entities in the world that are very different than they are. Our training data set includes not only our experiences, but those of others.

We also have a trainable objective, which while rooted in instinct, is very influenced by the information systems we interact with.

I wonder if we'd have more success with AI by allowing the objective itself to be learned after setting a reasonable initial bias.
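
In outline, that would be a two-loop setup: an ordinary RL inner loop against a parameterized objective, plus an outer loop that adjusts the objective itself. Everything below (`initial_bias`, `train_policy`, `score`, `meta_feedback`, `update`, `NUM_GENERATIONS`) is hypothetical scaffolding, not an existing library:

    # Hypothetical sketch: the objective itself is a trainable object.
    objective_params = initial_bias()      # the "instinct": a hand-set starting point

    for generation in range(NUM_GENERATIONS):
        # Inner loop: standard RL against the current, frozen objective.
        policy = train_policy(lambda trajectory: score(trajectory, objective_params))
        # Outer loop: nudge the objective using feedback from the wider world
        # (human judgement, peers, long-term survival, ...).
        feedback = meta_feedback(policy)
        objective_params = update(objective_params, feedback)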


AI needs genetics and natural selection


> "Most real-world strategic interactions involve hidden information."

> Tay's objective was to engage people, and it did. "What unfortunately Tay discovered," Domingos said, "is that the best way to maximize engagement is to spew out racist insults."

So, even if the next Tay has "behave in a civilised manner" as an objective function, it will be hard to implement, as the ethical rules we presume in reality are not written out like the rules of a video game. In fact, they involve many grey areas and not so many strict right-or-wrong statements.
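
Even writing the "civilised" objective down shows the problem: any concrete attempt leans on a fuzzy learned judgement rather than a game rule. A toy sketch, where `toxicity` is a crude stand-in for some imperfect classifier and `LAMBDA` is an arbitrary trade-off knob (both are assumptions):

    LAMBDA = 10.0                          # how much civility outweighs engagement

    def toxicity(reply):
        # Crude stand-in for a learned toxicity classifier (an assumption here);
        # a real one would itself be imperfect and gameable.
        return 1.0 if "insult" in reply.lower() else 0.0

    def tay_reward(reply, reactions):
        engagement = len(reactions)        # what Tay effectively maximized
        return engagement - LAMBDA * toxicity(reply)

All the grey areas just move into whatever computes `toxicity`.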


I have a reflex, hearing this kind of thing, to respond "no shit, Sherlock". Part of me is just too aware of so-called AI's shortcomings, which are beautifully portrayed by https://imgs.xkcd.com/comics/machine_learning.png

The joke is that business as usual is kind of aware of these issues and, at the same time, to stay economical, blissfully ignorant of them.


Isn't this point kinda obvious, and hasn't it been touched on repeatedly?


So obvious and yet rediscovered so often by way of spectacular failures.


I'd like to see something like Cyc merged with pattern-learning systems. You'd get more common sense and logic to complement "blunt" pattern matching.


There are multiple reasons: imperfect information in the real world, the big reality gap between simulation and the real world, sample inefficiency, the potential risk of trial-and-error in the real world, etc.



