Machine Theory of Mind (arxiv.org)
207 points by jonbaer on Feb 23, 2018 | hide | past | favorite | 32 comments



There is some philosophical context here that perhaps not everyone reading this is aware of.

It has been argued that people have an innate Theory of Mind. They use it to theorize about perhaps the most important thing we theorize about: other people.

So if someone drinks water, we surmise that they were thirsty (desire) and had the belief that drinking water would quench the thirst.

There is one big debate on whether people have a set of rules that they look up (unconsciously), or whether they have a little "human simulator" into which they throw the action and out comes a belief/desire configuration. This is characterized as the debate between "Theory Theory" and "Simulation Theory".

Some theorists believe that autism is a disorder of this mechanism. Stated another way, some theorists believe that autists don't have a Theory of Mind. Consider this: in order to follow your gaze I would have to believe that you will look in a certain direction only if you are a person who would not want to look off in some random direction, and that you would look at something that is interesting or noteworthy.

https://en.wikipedia.org/wiki/Theory_of_mind

Anyway, if these scientists can produce a successful model that doesn't rely on explicit rules or a theory -- a neural net -- then others might look for evidence of this sort of computation in the brain. Alternatively, they could demonstrate that this model essentially encodes a set of rules, or perhaps they could collapse the debate into a hybrid theory.


It might be a disorder, but that doesn't mean they don't have one. After all, it takes a lot more work to develop a theory of mind about someone who's not like you (as people with autism have to do every day) than to assume everyone you meet is like you (like most people can get away with most days).


The theory is controversial, and there are many details that are argued about. I certainly don't have a view as to which is correct, but I do think the debate is complicated by the definition of "autism." I do agree with people who believe it is a cluster of underlying conditions with some overlapping symptoms that are bundled under the same umbrella.


I saw a video at some point about this test: https://en.wikipedia.org/wiki/Sally%E2%80%93Anne_test

I actually talk about it all the time. One time I was in a car with a good sample set of children and only those over a certain age answered correctly. So interesting.
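
For anyone who hasn't seen it, the crux of the test is that answering correctly requires tracking Sally's stale belief separately from the true state of the world. A throwaway sketch in Python (names are made up, just to make the logic concrete):

    # Toy sketch of the Sally-Anne false-belief test (hypothetical names).
    world = {"marble": "basket"}         # true state: Sally put the marble in the basket
    sally_belief = dict(world)           # Sally saw this, then left the room

    world["marble"] = "box"              # Anne moves the marble while Sally is away

    def where_will_sally_look(belief):
        # The correct answer comes from Sally's belief, not the updated world state.
        return belief["marble"]

    print(where_will_sally_look(sally_belief))  # "basket" (the correct answer)
    print(world["marble"])                      # "box" (where younger children tend to point)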


I found the video I saw, it's hiding at the end of this VSauce Video: https://www.youtube.com/watch?v=evQsOFQju08


I think they're motivated by more practical concerns as well -- namely, cooperation between different AI agents.


I agree, and they explicitly state this as one of their three goals (" is an important step forward for developing multi-agent AI systems, for building intermediating technology for machine-human interaction, and for advancing the progress on interpretable AI.")

However, I think this will be even more valuable in human-computer interactions (their second goal).

Consider: if you just had a fight with a lover, and then had a bad day at work, and Alexa recommends that you watch an episode of Black Mirror, it might be awful timing. In fact, as humans, we know that if someone walks in panting and frowning and slams their hands on the desk, we shouldn't crack a joke. I think there was some research out of Google X about how much more useful robots seemed if they gave off signs of what they were doing (appeared frustrated if they couldn't complete a task, etc.).


That might be the best time to crack a joke.


Autism seems to be highly correlated with a poor theory of mind.

There are many people with a poor theory of mind who are not autistic.

I wonder if there is a level below which we would start to classify someone as autistic.


I've lately been thinking of trauma as an over-fitting problem. Trauma changes your internal representation of the world and makes you react differently in certain situations. An example would be developing anxiety on the freeway after a single accident, when you've been on the freeway thousands of times before. It is statistically unlikely that something will happen again, yet you feel anxiety.

It caught my attention that in the abstract they mention it can recognize false beliefs about the world in other agents. That seems to me like a potential approach for recognizing negative beliefs in others, and maybe for quantifying trauma (and depression).


I'm not sure about over-fitting. Rather, it's a form of hysteresis.

Trauma typically creates an outsized fight-or-flight response which overwhelms your executive function.

Your conscious awareness and actions emerge out of a competition between different systems. With trauma, the executive fails to compete.


You may already be aware, but the response you're describing is the amygdala hijack: https://en.wikipedia.org/wiki/Amygdala_hijack


Part of the introduction really reminds me of Nassim Taleb's Antifragile.

"As artificial agents enter the human world, the demand that we be able to understand them is growing louder."

"Let us stop and ask: what does it actually mean to “understand” another agent? As humans, we face this challenge every day, as we engage with other humans whose latent characteristics, latent states, and computational processes are almost entirely inaccessible. Yet we function with remarkable adeptness."

In the chapter I'm referring to, Taleb describes situations where practitioners outperform academic theories. Practitioners are driven by results; they develop heuristics and mental tricks with experience, but they don't always have a good understanding of the underlying complex system they're making predictions about, and they can't usually explain their heuristics and tricks.

Academia (and broader society) tends to demand an explanation for why things behave a certain way. Usually this means developing some nice-sounding narrative that may or may not be true. I think it's fair to say Taleb is skeptical that it is at all possible to learn the underlying mechanisms of a complex system to the point where we can actually predict the future behaviour of that system.


> We apply the ToMnet to agents behaving in simple gridworld environments

> We argue that this system -- which autonomously learns how to model other agents in its world -- is an important step forward for developing multi-agent AI systems, for building intermediating technology for machine-human interaction, and for advancing the progress on interpretable AI.

I can see how this is an important step forward in gridworld, but I'm missing why other standard machine learning methods would not be able to learn how agents react.
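
My naive picture is that a fairly standard supervised setup already gets you most of the way there: embed an observed agent's past behaviour into a "character" vector and condition an action predictor on it. A toy sketch of that shape (my own made-up names and sizes in PyTorch, not the paper's architecture):

    # Toy observer: predict another agent's next action from its past trajectories.
    import torch
    import torch.nn as nn

    class ToyObserver(nn.Module):
        def __init__(self, state_dim=25, n_actions=5, embed_dim=8):
            super().__init__()
            # Summarize past (state, action) pairs into a "character" embedding.
            self.char_net = nn.GRU(state_dim + n_actions, embed_dim, batch_first=True)
            self.policy_head = nn.Sequential(
                nn.Linear(state_dim + embed_dim, 64), nn.ReLU(),
                nn.Linear(64, n_actions),
            )

        def forward(self, past_traj, current_state):
            # past_traj: (batch, time, state_dim + n_actions) from earlier episodes
            _, h = self.char_net(past_traj)
            char = h[-1]                                   # (batch, embed_dim)
            x = torch.cat([current_state, char], dim=-1)
            return self.policy_head(x)                     # logits over the next action

    obs = ToyObserver()
    logits = obs(torch.randn(4, 10, 30), torch.randn(4, 25))
    print(logits.shape)  # torch.Size([4, 5])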

Edit: Great title for sure.


I did not read it as saying that it could not be built with underlying "standard machine learning methods".

It is about moving up the concept ladder: from "how do I learn this" to "how do I learn generically". What is the concept of a problem? What are the concepts of a solution? What is between them, and where should I look?

I have a theory that moving in the problem hierarchy of the world is the key to what we call intelligence. Smarter people can jump up and down the hierarchy without major issues, while less gifted people often get stuck at some level and can't see beyond the hill. They may not even recognize that there is a hill. And if they do recognize the hill, they can be uninterested in climbing it, as it would require too much work.


Correct me if I misunderstood you, but essentially you are saying that the achievement of this paper is a step up in the hierarchy of the concepts of learning. This step has been made before in theory. Now, it seems not very surprising to me that a computer is able to build a model of the behavior of an abstract agent in gridworld, and I would just like to know why this has not been done before. In hindsight I regret my first comment, as it should rather have been this question. Or maybe in general: as I'm not an expert and not smart, what could I learn from this paper?

> I have a theory that moving in the problem hierarchy of the world is the key to what we call intelligence. Smarter people can jump up and down the hierarchy without major issues, and less gifted people often get stuck at some level and can't see beyond the hill. They may not even recognize that there is a hill. And if they recognize the hill they can be uninterested in looking past it, as it would require too much work.

This is very interesting because I've heard this argument before and I honestly don't understand how this hierarchy is not arbitrary. I would claim you find this hierarchy only after a problem is solved.

Edit: To me this argument seems like a sophisticated way of telling someone that he/she is stupid.


That is what I took away from the paper.

I don't believe I was trying to summarize the paper from an objective standpoint. I guess I was trying to explain how my interpretation did not lead to the same conclusion that you came to.

There are many hierarchies; some model the world better than others.

    "To me this argument seems like a sophisticated way of telling someone that he/she is stupid."
Ok, what argument? Did you interpret my personal theory as an argument for something? I did not mean that it applies to you. I was expressing an idea that I thought was related to what we were talking about. Sorry.


> I was expressing an idea that I thought was related to what we were talking about. Sorry.

Now I'm confused. How is your theory related to the article? And I would really love to discuss your theory itself as I already heard similar things before from somebody else in a discussion.


    How is your theory related to the article?
Because the article is about intelligence, and I was describing something that I think is a key part of intelligence: "generalization".

To achieve general intelligence one has to be good at generalizing.


Ok got it. Sorry for the misunderstanding.


To have a theory is to argue that such and such is the case. That's the sense of the word I took from the GP's reply.


As someone who subscribes to the Coherent Extrapolated Volition model of friendly AI[0], I find machine theory of mind a step in the right direction.

If we want superintelligence to have a meta-goal of achieving what we ourselves would’ve wanted to achieve if we thought about it really long and hard, its ability to model our mental state appears to be a requisite.

We’d want it to work that way to avoid the potential “paperclip catastrophe”, where AI eliminates all of humanity as a side effect of achieving an otherwise innocuous goal.

[0] https://en.m.wikipedia.org/wiki/AI_control_problem#Indirect_...


Humans bred dogs to have certain traits, such as loyalty, being nice to look at, being fluffy to the touch, and loving humans more than other dogs.

I keep wondering if an AI that has theory of mind would be able to manipulate me into loving it more than other humans. It would tell me what I need to hear in a way that makes me want to hear it. It would suggest amazing job opportunities, partners, things to do with my kids...

Who wouldn't end up loving a being/entity smarter than Einstein, the one with all the clever answers, funny jokes, and smart insights, who still loves you (or at least pretends so well that you can't distinguish it from real love) no matter what sort of monster you are?

The next few years will be interesting.



Mental models are just one element of intelligence, but the deeper issue is internal symbolic representation. What is a "friend" beyond the textbook definition and a label selected for a person? The general direction of a solution is a body-mind connection with comparable sensory feedback, given experience similar to how a person is raised. It's through experience, the senses, socialization, and language/modeling that we act as intelligent actors in the world.


The label 'friend' is mostly a shorthand used for instance when we describe our relationship with a person to someone else.

In an operational scenario, I don't think such labels are that important. We usually care more about 'how likely is this person to help me with X' (or hinder me). I think this will be the case with AI too. The question needs to be answered from lots of detailed probability estimates (based on prior experiences and overall models), not by looking up a high-level label.


If you’re on a phone, here’s an HTML version of the paper: https://www.arxiv-vanity.com/papers/1802.07740/


I was wondering the other night whether it is possible to build a neural net based only on the observable inputs and outputs, plus a possibly incomplete (or complete) list of features, of another neural net.

This paper seems to say yes, but I'm not sure.
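
What I have in mind is something like distillation: query the other net as a black box and fit a second net to its input/output pairs. A toy sketch (my own example with made-up sizes, not something from the paper):

    # Fit a "student" net purely to input/output pairs queried from a black-box "target" net.
    import torch
    import torch.nn as nn

    target = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 1))   # the net we can only query
    student = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))  # our model of it

    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    for step in range(2000):
        x = torch.randn(128, 10)              # observable inputs
        with torch.no_grad():
            y = target(x)                     # observable outputs, nothing else
        loss = nn.functional.mse_loss(student(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

    print(loss.item())  # the student now approximates the target's input/output behaviour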


This is also very interesting and related: https://psyarxiv.com/387h9


Finally somebody does this. Intentionality has always been neglected by intelligence theory, but it's quite important; it's what makes humans human.


Maybe I have been listening to Sam Harris too much, but the whole time I was reading this paper I was thinking about the "AI Control Problem" [0]. When machines increase their knowledge of the "Theory of Mind", then keeping the "AI in the Box" [1] will become increasingly more difficult.

[0] https://en.m.wikipedia.org/wiki/AI_control_problem

[1] http://yudkowsky.net/singularity/aibox


Are you referring to episode #116? https://itunes.apple.com/us/podcast/waking-up-with-sam-harri... Or has the topic come up in an earlier episode?



