Theory of Mind May Have Spontaneously Emerged in Large Language Models (arxiv.org)
170 points by izzygonzalez on Feb 9, 2023 | 309 comments



This highlights one of the types of muddled thinking around LLMs. These tasks are used to test theory of mind because for people, language is a reliable representation of what type of thoughts are going on in the person's mind. In the case of an LLM the language generated doesn't have the same relationship to reality as it does for a person.

What is being demonstrated in the article is that, given billions of tokens of human-written training data, a statistical model can generate text that satisfies some of our expectations of how a person would respond to this task. Essentially, the model has enough parameters to capture from existing writing that, statistically, the most likely word following "she looked in the bag labelled (X), and saw that it was full of (NOT X). She felt " is "surprised" or "confused" or some other word that is commonly embedded alongside contradictions.
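For concreteness, here is a minimal sketch of what that "statistical next-word guess" looks like in practice. It assumes the Hugging Face transformers library and the small gpt2 checkpoint, which are illustrative stand-ins rather than the model studied in the paper:

    # Minimal sketch: inspect the probabilities a causal language model assigns
    # to the token that follows the bag/popcorn prompt. Assumes the Hugging Face
    # `transformers` library and the small `gpt2` checkpoint (illustrative only).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = ("She looked in the bag labelled chocolate, "
              "and saw that it was full of popcorn. She felt")
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

    # Probability distribution over the vocabulary for the next token.
    next_token_probs = torch.softmax(logits[0, -1], dim=-1)
    top = torch.topk(next_token_probs, k=5)
    for prob, token_id in zip(top.values, top.indices):
        print(f"{tokenizer.decode(int(token_id))!r}: {float(prob):.3f}")

Whatever words surface at the top, the mechanism is just this ranking over token statistics; nothing in it requires a model of the character's mental state.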

What this article is not showing (but either irresponsibly or naively suggests) is that the LLM knows what a bag is, what a person is, what popcorn and chocolate are, and can then put itself in the shoes of someone experiencing this situation, and finally communicate its own theory of what is going on in that person's mind. That is just not in evidence.

The discussion is also muddled, saying that if structural properties of language create the ability to solve these tasks, then the tasks are either useless for studying humans, or suggest that humans can solve these tasks without ToM. The alternative explanation is of course that humans are known to be not-great at statistical next-word guesses (see Family Feud for examples), but are also known to use language to accurately describe their internal mental states. So the tasks remain useful and accurate in testing ToM in people because people can't perform statistical regressions over billion-token sets and therefore must generate their thoughts the old fashioned way.


Every time I see this take, I wonder what it would mean, under this definition of knowing things, for a machine learning algorithm to ever know something. I find that especially important because, to every appearance, we are a machine learning algorithm. I don’t know how different the sort of knowing this algorithm has is from the sort of knowing a human has, but you’re far more confident than I am that it’s a difference of kind rather than degree.

Some interesting facts point to it being a difference of degree: LLMs are actually more accurate when asked to explain their thinking, and they make mistakes similar to those of humans' intuitive reasoning.

It might help to define what we even mean by knowing things. To me, being able to make novel predictions that require the knowledge is the only definition one could use that doesn’t run into the possibility of deciding humans don’t actually know anything.


Defining what "knowing" is would be useful, yes, and analytic philosophers in epistemology do argue about this. One attribute that's classically part of the definition of "knowing" is that the thing which is known must be true. LLMs are pretty bad at this, but perhaps that can be fixed.

But I would challenge you to imagine the situation the LLM is actually in. Do you understand Thai? If so, in the following, feel free to imagine some other language which you don't know and is not closely related to any languages you do know. Suppose I gather reams and reams of Thai text, without images, without context. Books without their covers, or anything which would indicate genre. There's no Thai-English dictionary available, or any Thai speakers. You aren't taught which symbols map to which sounds. You're on your own with a giant pile of text, and asked to learn to predict symbols. If you had sufficient opportunity to study this pile of text, you'd begin to pick out patterns of which words appear together, and what order words often appear in. Suppose you study this giant stack of Thai text for years in isolation. After all this study, you're good enough that given a few written Thai words, you can write sequences of words that are likely to follow, given what you know of these patterns. You can fill in blanks. But should anyone guess that you "know" what you're saying? Nothing has ever indicated to you what any of these words _mean_. If you give back a sequence of words, which a Thai speaker understands to be expressing an opinion about monetary policy, because you read several similar sequences in the pile, is that even your opinion?

I think algorithms can 'know' something, given sufficient grounding. LLMs 'know' what text looks like. They can 'know' what tokens belong where, even if they don't know anything about the things referred to. That's all, because that's what they have to learn from. I think a game-playing RL-trained agent can 'know' the likely state-change that a given action will cause. An image segmentation model can 'know' which value-differences in adjacent pixels are segment boundaries.

But if we want AIs that 'know' the same things we know, then we have to build them to perceive in a multi-modal way, and interact with stuff in the world, rather than just self-supervising on piles of internet data.


OK, I like your thought experiment. Let's change it a bit.

Instead of it being an unknown language, it's English (a language you know), but every single noun, verb, adjective or preposition has been changed to Thai (a language you don't know).

The Mæw Nạ̀ng Bn the S̄eụ̄̀x.

> If you had sufficient opportunity to study this pile of text, you'd begin to pick out patterns of which words appear together, and what order words often appear in. Suppose you study this giant stack of Thai text for years in isolation. After all this study, you're good enough that given a few written Thai words, you can write sequences of words that are likely to follow, given what you know of these patterns.

Right, and to get good at this task, you'd need to build models in your head. You would think to yourself: right, a Mæw tends to nạ̀ng bn a S̄eụ̄̀x, and you would build up a model of the sort of things a Mæw might do, the situations it might be in. In an abstract way. As you absorbed more and more data you would adjust these abstract models to fit the evidence you had.

You don't know what a Mæw is. But if someone asks you about a Mæw, you can talk about how it relates to S̄eụ̄̀x, Plā and H̄nū. You know stuff about Mæw, but it's abstract.


If you're constructing this to rely on my prior knowledge both of the world and of English, then I must remind you that those are things the LLM does not have. We have to be careful not to let our human inferential biases distort our thinking about what the models are doing.


Yeah but if you ask the model what a cat is, it'll use other words that describe a cat because they're usually used in a sentence about cats. These words must relate to cats. So if I ask you what a cat is, you'll use words that relate to cats. Sure, you may visually see these words in your head. You may visually see a cat in your head, but your output to me is just a description of a cat. That's the same thing the network would do.


The whole point of this conversation is whether talking like an agent that has a theory of mind and actually having a theory of mind are the same thing. I responded to a thread about what "knowing" is, and the same distinction can apply. You're responding with "if it talks like it knows what a cat is, it must know what a cat is", and that's totally begging the question.


But that all boils down to: are we having a scientific conversation or a philosophical conversation? In my opinion the only useful conversation is a scientific one. A philosophical conversation can and will never be resolved, so it is of no importance to this discussion. We can use philosophy to help guide our scientific conversation, but in the end only a scientific conversation can be helpful in reaching a meaningful/practical conclusion.

So back to the questions of "What is knowing?" "Are talking like someone with theory of mind and having a theory of mind the same thing?"

If your argument is that the only way to answer this is to have a first-person experience of that consciousness, then that's not a scientific question. No one will ever have one for an LLM or any other AI. It's like asking "What's happening right now outside of the observable universe?". If it can't impact us, it's irrelevant to science. If that ever changes it will become relevant, but until then it's not a scientific question. Similarly no person can ever have a first-person experience of the consciousness of an LLM, so anything that requires being the LLM isn't relevant.

So that means the only relevant question is what distinction outside observers can make between an agent talking like it has a theory of mind and having a theory of mind. And given a high enough accuracy / fidelity of responses I think we're forced to conclude one of two things: 1. Something that is able to simulate having a theory of mind sufficiently well does actually have a theory of mind. OR 2. I am the only person on the planet with a theory of mind, and all of you are just simulating having one but don't actually have one.

It's all "Searle's Chinese room" and "What consciousness is" discussions all over again. And from a scientific point of you either you get into the "it must be implemented identically as me to count" (which is as wrong as saying an object must flap its wings to fly), or you have to conclude the room plus the person combined are knowledgeable and conscious.


I think you're making a strawman to argue against. Nowhere above have I claimed that "knowing" requires "consciousness", or "it must be implemented identically to me to count", and in fact I believe neither.

But:

- In this context, following on the whole 2nd half of the 20th century where cognitive science and psychology moved past behaviorism and sought explanations of the _mechanisms_ underlying mental phenomena, a scientific discussion doesn't have to restrict itself to only considering what the LLM says. Neither we, nor the LLM are black boxes. Evidence of _how_ we do what we do is part of scientific inquiry.

- But the LLM does _not_ reproduce all the behaviors of an agent with a theory of mind. A two year-old with a developing theory of mind may try to hide food they don't want to eat. A 4-year-old playing hide-and-seek picks locations where they think their play-partner won't look. They take _actions_ which are appropriate for their goals and context which require consideration of the goals of others. The LLM shows elaborate behaviors in one dimension, in which it has been extensively trained. It has no capacity to do anything else, or even receive exposure to non-linguistic contexts.

I am in no way arguing that only meat-based minds can "know". I'm saying that the data, training regime and model structure used for LLMs specifically is extremely impoverished, in that we show it language but no other representation of the things language refers to. Similarly, image-generating AIs know what images look like, but they don't know how bodies or physical objects interact, because they have never been exposed to them. Of _course_ we get LLMs that hallucinate and image-generators that produce messed up bodies.

On the other hand, there are some pretty cool reinforcement-learning results where agents show what looks like cooperation, develop adversarial strategies, etc. There are experiments where software agents collaboratively invent a language to refer to objects in their (virtual) environment to accomplish simple tasks. I think there are a lot of near and medium-term possibilities coming from multi-modal models (i.e. models trained on related text, images, audio and video) and RL which could yield knowledge of a kind that LLMs simply do not have.


Yes, you make valid points, but I feel they are still skipping something. To me it seems like you are asking "Does it know the same things we know?", with the obvious answer being no, because it doesn't have all of the senses we have.

Someone who is blind doesn't have a lesser concept of knowing even though they are blind. They might not "know" things in the same way as someone who can see, but that doesn't mean their version of knowing is any less; they just know fewer facts about the world, specifically the visual facts of what things look like. Their "knowing" functionality is equal to someone who sees.

Similarly, someone who is blind, and deaf also has full ability for "knowing" even if they'll never know things in the visual or auditory spaces.

So my argument is that your premise is wrong: the fact that someone or something has fewer senses doesn't mean its ability to know is any less.

So back to your LLM: the fact that it doesn't exist in the real world does not exclude it from the ability to know. It does not need to have all of those experiences "to know". It will never know the physical meaning of concepts like we do. Just like I'll never know the details of a city block in Jakarta (as I've never been). But not having that experience (or any experiences of multiple senses) doesn't mean I don't know.

LLMs don't need multiple cross connected sensory experiences, nor extensive history with a physical or virtual world to know things.

For an entity "to know" it means it has a model it can use to make predictions.


I think your argument goes off the rails when it jumps from "you don't need any particular sense modality to know" to "you don't need any percepts, or experience of reality or simulated unreality to know". That's a big leap, and I can't disagree more.

> For an entity "to know" it means it has a model it can use to make predictions.

Great, every PID controller, every jupyter notebook or excel spreadsheet with a linear regression model, every count-down timer can make predictions and therefore "know" under this definition. But perhaps there's a broader class of things that "make predictions". Down this path lies panpsychism. When I throw a rock, its velocity in the x direction at time t is a great "predictor" of its velocity in the x direction at time t+delta, etc, etc. And maybe there's nothing inconsistent or fundamentally wrong with saying that every part of the physical universe "knows" at least something insofar as it participates in predicting or computing the future. But I think by so over-broadening the concept of knowing, it becomes useless, and impossible to make distinctions that matter.


> you don't need any percepts, or experience of reality or simulated unreality to know". That's a big leap, and I can't disagree more.

I still feel this is the point where you're making a distinction based on your desired outcome rather than the actual system. ChatGPT absolutely does have percepts / a sense. It has a sense of "textual language". It also has a level of sequencing or time w.r.t. word order of that text.

While you say experience, it seems like in your definition experience only counts if there is a spatial component to it. Any experience without a physical spatial component seems, to you, not to be a valid sense or perception.

Again taking this in the specific, imagine someone could only hear via one ear, and that is their only sense. So there is no multi-dimensional positioning of audio, just auditory input. It's clear to me that person can still know things. Now if you also made all audio the same loudness so there is no concept of distance with it, they would still know things. This is now the same as a simple audio stream, just like ChatGPT's language stream. Spatial existence is not required for knowledge. And from what I'm understanding that is what underpins your definition of a reality/experience (whether physical or virtual).

Or as a final example, let's say you are Magnus Carlsen. You know a ton about chess, best in the world. You know so much about chess that you can play entire games via chess notation (1. e4, e6 2. d4 e5 ...). So now imagine an alternate world where there is a version of Magnus that has never sat in front of a chess board and only ever learned chess by people reciting move notation to him. Does the fact that no physical chess boards exist and there is no reality/environment where chess exists mean he doesn't know chess? Even if chess were nothing but streams of move notation it would still be the same game, and someone could still be an expert at it, knowing more than anyone else.

I feel your intuition is leading your logic astray here. There is no need for a physical or virtual environment/reality for something to know.


You're still fighting a strawman. You're the only participant in this thread that's talking about space. I'm going to discontinue this conversation with this message since (aptly), you seem happy responding to views whether or not they come from an actual interlocutor.

- I disagree that inputs to an LLM as a sequence of encoded tokens constitute a "a sense" or "percepts". If inputs are not related to any external reality, I don't consider those to be perception, any more than any numpy array I feed to any function is a "percept".

- I think you're begging the question by trying to start with a person and strip down their perceptual universe. I think that comes with a bunch of unstated structural assumptions which just aren't true for LLMs. I think space/distance/directionality aren't necessary for knowing some things (but bags, chocolate and popcorn as lsy raised at the root of this tree probably require notions of space). I can imagine a knowing agent whose senses are temperature and chemosensors, and whose action space is related to manipulating chemical reactions, perhaps. But I think action, causality and time are important for knowing almost anything related to agenthood, and these are structurally absent in ChatGPT IIUC. The RLHF loop used for Instruct/ChatGPT is a bandit setup. The "episodes" it's playing over are just single prompt-response opportunities. It is _not_ considering "If I say X, the human is likely to respond Y, and I can then say Z for a high reward". Though we interact with ChatGPT through a sequence of messages, it doesn't even know what it just said; my understanding is the system has to re-feed the preceding conversation as part of the prompt (roughly as sketched below). In part, this is architecturally handy, in that every request can be answered by whichever instance the load-balancer picks. You're likely not talking to the same instance, so it's good that it doesn't have to reason about or model state.
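A rough sketch of that re-feeding arrangement, under my assumptions about how such a stateless deployment works (the complete() call is a hypothetical stand-in, not OpenAI's actual API):

    # Rough sketch of a stateless chat loop: the model keeps no memory of its
    # own, so each turn the whole transcript so far is pasted into the prompt.
    # `complete` is a hypothetical stand-in for any text-completion call.

    def complete(prompt: str) -> str:
        raise NotImplementedError("stand-in for a real LLM completion call")

    def chat_turn(transcript: list[str], user_message: str) -> tuple[str, list[str]]:
        transcript = transcript + [f"User: {user_message}"]
        prompt = "\n".join(transcript) + "\nAssistant:"
        reply = complete(prompt)
        # All conversational "state" lives in this growing list of strings, not
        # in the model, so any instance behind the load-balancer can serve the
        # next turn.
        return reply, transcript + [f"Assistant: {reply}"]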

But I actually think both of these are avenues towards agents which might actually have a kind of ToM. If you bundled the transformer model inside a kind of RNN, where it could preserve hidden state across the sequence of a conversation, and if you trained the RLHF on long conversations of the right sort, it would be pushed to develop some model of the person it's talking to, and of the causal relationship between its responses and the human's responses. It still wouldn't know what a bag is, but it could better know what conversation is.


> Something that is able to simulate having a theory of mind sufficiently well does actually have a theory of mind.

That presupposes that our existing tools for detecting the presence of ToM are 100% accurate. Might it be possible that they are imprecise and it’s only now that their critical flaws have been exposed?


But if our understanding of ToM is so flawed in practice, what does it say about all the confident proclamations that AIs "aren't real" because they don't have it?


Your question aligns with the argument I'm trying to make which is: If it turns out that our understanding of ToM is wrong, should we be making proclamations about--whether for or against--the real-ness of our current AI implementations?


While I agree with your point, how would you test that? How could you determine whether an LLM “knows” what a cat is?

And what is “knowing”? If I know that a Mæw tends to nạ̀ng bn a S̄eụ̄̀x, isn’t that the first thing I’ve learned? And couldn’t I continue to learn other properties of Mæws? How many do I need to learn to “know” what a Mæw is?


Like GP said, the LLM has no chance at knowing what a cat is, regardless of how much data it ingests, because a cat is not made of data. It's not like you're getting closer and closer to knowing what a "Mæw" is. You were at the same remote distance all the time. This is called the "grounding problem" in AI.

As for how you would test it, I think one-shot learning would get one closer to proving understanding.


> because a cat is not made of data.

Your perception of what a cat is, however, is most certainly made of nothing but data, encoded as chemical relationships at the neuronal level. And your perception is all there is, as far as you're concerned. The cat is just another shadow on Plato's cave wall.

Arguably you "know" something when you can recognize it outside its usual context, classify it in terms of its relationships with other objects, and anticipate its behavior. To the extent that's true, ML models have been there for quite a while now.

What else besides recognition, classification, and prediction based on either experience or inference is needed for "knowledge?" Doesn't everything human minds can do boil down to pattern recognition and curve fitting at the end of the day?


The grounding problem is an intelligence problem, not an artificial intelligence problem.

How would you envision a test based on one-shot learning working?


The question of grounding is a problem that arises in thinking about cognition in general, yes. In AI, it changes from a theoretical problem to a practical one, as this whole discussion proves.

As for one-shot learning, what I was driving at, is that a truly intelligent system should not need to consume millions of documents in order to predict that, say, driving at night puts larger demands on one's vision than driving during the day. Or any other common sense fact. These systems require ingesting the whole frickin' internet in order to maybe kinda sometimes correctly answer some simple questions. Even for questions restricted to the narrow range where the system is indeed grounded: the world of symbols and grammar.


Why do you believe that a system should not need to consume millions of documents in order to be able to make predictions?

For your example, the concepts of driving, night, vision, all need to be clearly understood, as well as how they relate to each other. The idea of 'common sense' is a good example of something which takes years to develop in humans, and develops to varying extents (although driving at night vs at day is one example, driving while drunk and driving while sober is a different one where humans routinely make poor decisions, or have incorrect beliefs).

It's estimated that humans are exposed to around 11 million bits of information per second.

Assuming humans do not process any data while they sleep (which is almost certainly false): newborns are awake for 8 hours per day, so they 'consume' around 40GB of data per day. This ramps up to around 60GB by the time they're 6 months old. That means that in the first month alone, a newborn has processed 1TB of input.

By the age of six months, they're between 6 and 10TB, and they haven't even said their first word yet. Most babies have experienced more than 20TB of sensory input by the time they say their first word.

Often, children are unable to reason even at a very basic level until they have been exposed to more than 100TB of sensory input. GPT-3, by contrast, was trained on a corpus of around 570GB worth of text.
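Those back-of-the-envelope figures can be checked in a few lines; the 11 million bits/second rate and the ~570GB GPT-3 corpus come from the comment above, while the waking-hours assumptions and unit conversions are mine:

    # Back-of-the-envelope check of the sensory-bandwidth estimates above.
    # 11 million bits/second and the ~570GB GPT-3 corpus come from the comment;
    # the waking-hours figures and unit conversions are assumptions added here.

    BITS_PER_SECOND = 11e6

    def daily_intake_gb(awake_hours: float) -> float:
        bytes_per_day = BITS_PER_SECOND / 8 * awake_hours * 3600
        return bytes_per_day / 1e9

    newborn = daily_intake_gb(8)          # ~40 GB/day at 8 waking hours
    six_months = daily_intake_gb(12)      # ~59 GB/day, assuming ~12 waking hours
    first_month_tb = newborn * 30 / 1000  # ~1.2 TB in the first month

    print(f"newborn: {newborn:.0f} GB/day")
    print(f"six months: {six_months:.0f} GB/day")
    print(f"first month: {first_month_tb:.1f} TB vs GPT-3's ~0.57 TB of text")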

We are simply orders of magnitude away from being able to make a meaningful comparison between GPT-3 and humans and determine conclusively that our 'intelligence' is of a different category to the 'intelligence' displayed by GPT-3.


I was thinking in terms of simple logic and semantics. The example I picked, though, muddied the waters by bringing in real-world phenomena. A better test would be anything that stays strictly within the symbolic world - the true umwelt of the language model. So, anything mathematical. After seeing countless examples of addition and documents discussing addition and procedures of addition, many orders of magnitude more than a child ever gets to see when learning to add, LLMs still cannot do it properly. That, to me, is conclusive.


A child can 'see' maths, though: they can see that if you have one apple over here and one orange over there, then you have two pieces of fruit altogether.

If you only ever allowed a child to read about adding, without ever being able to physically experiment with putting pieces together and counting them, likely children would not be able to add either.

In fact, many teachers and schools teach children to add using blocks and physical manipulation of objects, not by giving countless examples and documents discussing addition and procedures of addition.

You may feel it's conclusive, and it's your right to think that. I am not sure.


Yet ChatGPT totally - apparently - gets 1 + 1. In fact it aces the addition table way beyond what a child or even your average adult can handle. It's only when you get to numbers in the billions that its weaknesses become apparent. One thing it starts messing up is carry-over operations, from what I can see. Btw. the threshold used to be significantly lower, yet that doesn't convince me in the least that it's made progress in its understanding of addition. It's still just as much in the fog. And it cannot introspect and tell me what it's doing so I can point out where it's going wrong.

But I think you are right in what you are saying. Basically, it not 'seeing' math as a child does is just another way to say that it doesn't understand math. It doesn't have an intuitive understanding of numbers. It also can't really experiment. What would experimenting mean in this context? Just more training cycles. This being math, one could have it run random sums and give it the correct answer each time. That's one way to experiment, but that wouldn't solve the issue. At some point it would reach its capacity for absorbing statistical correlations to deal with numbers large enough. It would need more neurons to progress beyond that stage.

Btw. I found this relevant article: https://bdtechtalks.com/2022/06/27/large-language-models-log...


That’s an interesting read, thank you. But my question is a bit more fundamental than that.

Ultimately, my point is that although the argument is that an LLM doesn’t “know” anything, I am not sure that there is something categorically different in terms of what we “know” vs what an LLM “knows”, we have just had more training on more different types of data (and the ability to experiment for ourselves).


But for us a cat is a living creature we interact with, not simply a description. We understand people's reactions to cats based on human-animal interactions, particularly as cute pets, not because of language prediction of what a cat description would be. People usually have feelings about cats, they have conscious experiences of cats, they often have emotional bonds with cats (or dislike them), they may be allergic to cats. LLMs have none of that.


Not "for us"; only for those of us who have, in fact, been exposed to cats.

And why do you think "feeling of a cat" cannot be encoded as a stream of tokens?


I know that. It's a metaphor to adjust the 'Thai Language' intuition-pump that was presented. I'm making it easier to imagine how a Large Language Model might make a Model of the Language.


I love this theory. You're saying that a distinction can be drawn between our linguistic concepts and our lived experience, and that the former can be learned without the latter. And that a model could operate upon those linguistic concepts in a useful way, but without the benefit (or drawback?) of the mappings we keep between language and experience. And that it can learn this based on the large amount of texts we have.

Fascinating, and seems like a plausible description of what's going on.


Basically the "rosetta stone" theory, right?


Let me respond with an analogy of my own. Imagine you are a scientist on an alien world. The aliens primarily experience the world through magnetic fields. They live deep in the atmosphere of a hot Jupiter like planet and rarely touch anything and have no eyes. Still they are intelligent beings and so quickly they are able to establish communication with you. A computer translates and you both have to become a bit more familiar with each other's modes of perceiving the world. You could write a whole novel explaining this sort of difference in modes of perception, but my question is if you, the human, can learn to understand what it is to perceive magnetic fields? I think obviously the answer is yes. In fact, if you are to communicate you'll have to. I think the sort of modal/sense difference your analogy plays on is similar, because I think for a human to get good at responding you'd have to start knowing things about the symbols. That knowledge obviously wouldn't be grounded in a way that you could translate it back into English. But you might, for example, learn that one word is a type of another, or that some words describe entities that are then referenced later, or even (to actually get good at it, which it's not at all clear a human could) that some entities have hidden state.

This feels related to the idea of the Chinese room. There I think the resolution is that the human following instructions does not understand Chinese but the room, the system of instructions + the human to follow them does. In a similar way obviously an individual neuron doesn't understand anything but brains do.

I guess it just feels like this general argument, that merely seeing things and making predictions that turn out to be right isn't enough to understand, will never go away. We could have a full-fledged robot walking around having conversations and I could dispute its ability to really understand. It's just learned to imitate other humans, I'd say. It doesn't really know anything, it's just following a statistical model to decide how to move an arm.


> but my question is if you, the human, can learn to understand what it is to perceive magnetic fields? I think obviously the answer is yes.

I think it's obviously no, because we don't have sensations of magnetic fields. It's the question of what it's like to be a bat raised by Thomas Nagel. The aliens can give us their words for conscious magnetic sensations which we can learn to use, but we won't experience them. We're basically p-zombies when it comes to non-human experiences.

> There I think the resolution is that the human following instructions does not understand Chinese but the room, the system of instructions + the human to follow them does. In a similar way obviously an individual neuron doesn't understand anything but brains do.

Searle's response to the systems objection is that we already know that brains understand Chinese. But we don't know this for the room. I would further say that brains alone don't understand anything, humans understand things as language users embedded in a social and physical world. One can invoke Wittgenstein and language games here.


I agree with you. I really enjoy this idea that understanding and consciousness are emergent properties of a system, which does not need to be limited to any particular scope for that to happen. In that light the current approach most people take on this, taking an arbitrary selection of parts to see if it exhibits those properties, is not right at all.

Ion channels do not have even a tiny speck of consciousness, no matter how you organize them, but our brain does indeed need them to be conscious (and incidentally it relies on a whole lot more "stupid" parts than that: try being conscious without oxygen, or glucose).

I would go as far as making consciousness an emergent property of interaction with the environment: what does it mean to be conscious if nothing is there to confirm that you are indeed a singular consciousness? Is it possible to understand the concept of self if you have no concept of other beings?


> my question is if you, the human, can learn to understand what it is to perceive magnetic fields? I think obviously the answer is yes.

I certainly don't see that as obvious, and I would guess that while you can learn _about_ their perceptual mode, you can't learn what it is like to perceive magnetic fields just through talking about it. I would consider the Mary's Room thought experiment, and the What Is It Like To Be a Bat paper from Nagel.

I think there's a relationship to the Chinese Room, but I want to be clear. In the original formulation, the person in the room follows a book of pre-provided instructions to produce a response. The LLM and person in the Thai text completion scenario must learn an equivalent set of instructions themselves, and for this I would claim that they are comparable to the human + book combination in the original Chinese Room. The person who learns to complete Thai text doesn't know what they're talking about, but they know more than the person following instructions in the Chinese Room. But clearly they still don't know what a Thai speaker knows.

> I guess it just feels like this general argument, that merely seeing things and making predictions that turn out to be right isn't enough to understand, will never go away. We could have a full-fledged robot walking around having conversations and I could dispute its ability to really understand.

No, perhaps the end of my original statement didn't make this clear, but I think AI systems _can_ know things, and knowing is not a binary but part of a range. StabilityAI / DALL-e know quite a bit about the relationship between texts and images, and the structure within images -- but they _don't_ know about bodies, physical reality, etc etc. A system that has multiple modalities of perception, learns to physically navigate the world, interact with objects, make and execute plans by understanding the likely effects of actions, etc -- knows and understands a lot. I'm not arguing about a hard limitation of AI; I'm arguing about a limitation of the way our current AIs are built and trained.


My intuition is that the difference between GP's analogy and the Chinese room is in computing power of the system, in the sense of Chomsky hierarchy[0] (as opposed to instructions per second).

In the Chinese room, the instructions you're given to manipulate symbols could be Turing-complete programs, and thus capable of processing arbitrary models of reality without you knowing about them. I have no problem accepting the "entire room" as a system understands Chinese.

In contrast, in GP's example, you're learning statistical patterns in Thai corpus. You'll end up building some mental models of your own just to simplify things[1], but I doubt they'll "carve reality at the joints" - you'll overfit the patterns that reflect regularities of Thai society living and going about its business. This may be enough to bluff your way through average conversation (much like ChatGPT does this successfully today), but you'll fail whenever the task requires you to use the kind of computational model your interlocutor uses.

Math and logic - the very tasks ChatGPT fails spectacularly at - are prime examples. Correctly understanding the language requires you to be able to interpret the text like "two plus two equals" as a specific instance of "<number> <binary-operator> <number>"[2], and then execute it using learned abstract rules. This kind of factoring is closer to what we mean by understanding: you don't rely on surface-level token patterns, but match against higher-level concepts and models - Turing-complete programs - and factor the tokens accordingly.
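A toy contrast, with a memorized lookup standing in for surface token statistics (purely illustrative; not a claim about how any particular LLM is implemented):

    # Toy contrast: surface pattern lookup vs. factoring text into a
    # "<number> <operator> <number>" structure and executing abstract rules.
    # The memorized table stands in for surface statistics; purely illustrative.
    import operator

    NUMBERS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
    OPERATORS = {"plus": operator.add, "minus": operator.sub, "times": operator.mul}

    MEMORIZED = {"two plus two equals": "four"}  # phrases seen often enough to recall

    def surface_answer(text: str):
        # Only answers if this exact surface pattern was absorbed before.
        return MEMORIZED.get(text)

    def structured_answer(text: str):
        # Factors the tokens as <number> <operator> <number> "equals", then executes.
        tokens = text.split()
        if len(tokens) == 4 and tokens[3] == "equals":
            a, op, b = tokens[:3]
            if a in NUMBERS and b in NUMBERS and op in OPERATORS:
                return OPERATORS[op](NUMBERS[a], NUMBERS[b])
        return None

    print(surface_answer("two plus two equals"))        # 'four' (memorized)
    print(surface_answer("three plus four equals"))     # None  (never seen)
    print(structured_answer("three plus four equals"))  # 7     (rule generalizes)

The second function generalizes to combinations it has never seen because it matches against the abstract structure rather than the surface string, which is the kind of factoring meant above.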

Then again, Chinese room relies on the Chinese-understanding program to be handed to you by some deity, while GP's example talks about building that program organically. The former is useful philosophically, the latter is something we can and do attempt in practice.

To complicate it further, I imagine the person in GP's example could learn the correct higher-level models given enough data, because at the center of it sits a modern, educated human being, capable of generating complex hypotheses[3]. Large Language Models, to my understanding, are not capable of it. They're not designed for it, and I'm not sure if we know a way to approach the problem correctly[4]. LLMs as a class may be Turing-complete, but any particular instance likely isn't.

In the end, it's all getting into fuzzy and uncertain territory for me, because we're hitting the "how the algorithm feels from inside" problem here[5] - the things I consider important to understanding may just be statistical artifacts. And long before LLMs became a thing, I realized that both my internal monologue and the way I talk (and how others seem to speak) is best described as a Markov chain producing strings of thoughts/words that are then quickly evaluated and either discarded or allowed to be grown further.

--

[0] - https://en.wikipedia.org/wiki/Chomsky_hierarchy

[1] - On that note, I have a somewhat strong intuitive belief that learning and compression are fundamentally the same thing.

[2] - I'm simplifying a bit for the sake of example, but then again, generalizing too much won't be helpful, because most people only have procedural understanding of few most common mathematical objects, such as real numbers and addition, instead of a more theoretical understanding of algebra.

[3] - And, of course, exploit the fact that human languages and human societies are very similar to each other.

[4] - Though taking a code-generating LLM and looping it on itself, in order to iteratively self-improve, sounds like a potential starting point. It's effectively genetic programming, but with a twist that your starting point is a large model that already embeds some implicit understanding of reality, by virtue of being trained on text produced by people.

[5] - https://www.lesswrong.com/posts/yA4gF5KrboK2m2Xu7/how-an-alg...


> I have no problem accepting the "entire room" as a system understands Chinese.

> you'll fail whenever the task requires you to use the kind of computational model your interlocutor uses.

I think it's important to distinguish between knowing the language and knowing anything about the stuff being discussed in the language. The top level comment all this is under mentioned knowing what a bag is or what popcorn is. These don't require computational complexity, but do require some other data than just text, and a model that can relate multiple kinds of input.


To be clear, transformer networks are Turing-complete: https://arxiv.org/abs/2006.09286


Personally, I'm not convinced, in your hypothetical, that the participant does not "know" Thai at that point. Seeing a young child learn language, it's a lot more adaptive than I think we tend to see language learning, as we often think about learning language as a teenager and not a toddler. I agree the machine does not know what a pizza tastes like nor does it know what it is to _want_ pizza, but I'm not sure that is what is being contested here.


Maybe the participant "knows" something about the Thai language? But that's different from knowing anything about the things being discussed. The jumping off point for this, which motivated a question about what it is to know, was the comment:

> What this article is not showing (but either irresponsibly or naively suggests) is that the LLM knows what a bag is, what a person is, what popcorn and chocolate are, and can then put itself in the shoes of someone experiencing this situation, and finally communicate its own theory of what is going on in that person's mind. That is just not in evidence.

Knowing something about the patterns of word order in Thai is not the same as knowing about the world being discussed in Thai.


Language doesn’t come first for humans. Experiencing the world does. Languages then become symbols to communicate experiencing the world through our senses and emotional/mental states. I’m not sure why people get hung up on language models not being the same thing when they start and end with language.


Indeed. Give a model some kind of autonomous sensors, make it stateful with memory and continuous retraining, make it possible for it to act and learn from its actions, maybe even model some kind of hormonal influence etc. and I'm pretty sure that at some point an actual Theory of Mind will actually emerge and we'll be debating what kind of legal rights such a model should possess. We're pretty clearly not at that point yet.


> I agree the machine does not know what a pizza tastes like nor does it know what it is to _want_ pizza

It also does not "know" that a pizza is an object in a world, because none of the words it's working with are attached to any experience or concepts.


Generally I don't buy these arguments which require embodiment, because they don't seem to align well to what else I know about my world.

Rather than your Thai text example, let's consider a friend of my sister, H. H has been profoundly blind from birth. Not "legally blind" with the world a blur; her eyes actually don't work. Direct lived experience of a summer day is to her literally just feeling warmth on her face from the sun; her eyes can't see the visible light.

I've seen purple and H never will so it seems to me you're arguing I "know" what purple is and she doesn't, thus ChatGPT doesn't know what purple is either. But I don't think I agree, I think we're both just experiencing a tiny fraction of reality, and ChatGPT is experiencing an even narrower sliver than either of us and that it probably wouldn't do us any good to try to quantify it. If I "know what purple is" then so does H and perhaps ChatGPT or a successor model will too.


That's an argument from ignorance, and it's not credible. The potential total scope of experience is irrelevant. The reality is that you have an embodied experience of purple shared with most humans. Unfortunately your sister doesn't. She will have a linguistic placeholder for the concept of purple, probably surrounded by verbal associations. But that's all.

It's an ironically apt analogy, because ChatGPT has the linguistic understanding of an entity that is deaf, dumb, blind, and has no working senses of any kind, and instead relies on a golem-like automated mass of statistics with some query processing.

We tend to project intelligence onto linguistic ability, because it's a useful default assumption in our world. (If you've ever tried speaking a foreign language while not being very good at it, you'll know how the opposite feels. Humans assume that not being able to use language is evidence of low intelligence.)

But it's a very subjective and flawed assessment. Embodied experience is far more necessary for sentience than we assume, and apparent linguistic performance is far less.


There are a few particular problems we have with the words intelligence/sentience, mostly revolving around the fact that we evolved embodiment first and then added more and more complex intelligence/sentience on top of an ever-changing DNA structure.

Much like when humans started experimenting with flight we tried to make flapping things like birds, but in the end it turns out spinning blades gives us capabilities above and beyond bodies that flap.

Back to the embodiment problem. For us as humans we have limits like only having one body. It has a great number of sensors but they are still very limited in relation to what reality has to offer, hence we extend our senses with technology. And with that there is no reason machine intelligence embodiment has to look anything like ours. Machine intelligence could have trillions of sensors spread across the planet as an example.


> Unfortunately your sister doesn't.

My sister isn't blind. H isn't my sister, she's a friend of my sister as I wrote.

Do you have concrete justification for your insistence that "embodied experience is far more necessary" ?


I don't think embodiment is required to understand a lot of stuff. But language is how we talk about the world, and non-linguistic concepts have to be grounded in an exposure to something other than language. I think there's an argument to be made that DALLe "knows" more about a lot of words than a pure language model because it can relate phrases to visual concepts. But I do think for many concepts, understanding also proceeds from interaction. This doesn't necessarily need to be physical. I similarly think code generation tools need access to interpreters etc to "understand" the code they're generating. Embodiment is not relevant to all concepts.


I don't think the argument about DALLe would work - it deals with pixels instead of words, but it's fundamentally a different form of language, made of different mathematical patterns (obscured to us because, unlike symbolic manipulation, our visual system handles high-level patterns in images without engaging our conscious awareness).

I do agree that grounding is needed. All our language is expressing or abstracting concepts related to how we perceive and interact with reality in continuous space and time. This perception and interaction is a huge correlating factor that our ML models don't have access to - and we're expecting them to somehow tease it out from a massive dump of weakly related snapshots of recycled high-level human artifacts, be they textual or visual. No surprise the models would rather latch onto any kind of statistical regularity in the data, and get stuck in a local minimum.

Now, I don't believe the solution is actual embodiment - that would be constraining the model too hard. But I do think the model needs to be exposed to the concepts of time and causality - which means it needs to be able to interact with the thing it's learning about, and feed the results back into itself, accumulating them over time.


People learn enormous amounts of things that they don't actually "understand" in any deep way.

As long as our minds pop out appropriate thoughts for the given context we don't even think about the magic machinery behind the scenes that did that.

When queried about our thinking we are mostly creating a plausible story, not actually examining our own thinking.

Also, blind people can talk sensibly about many visual phenomena, having learned about them through language

I think the new LLM are giving us all so many wow’s, because “understanding” is the only kind of compression that actually works at the scale of the training data

I.e. representations are being created that reflect the actual functional, as well as associative or correlative, relations between concepts.


Blind people still have bodies and other sensory perceptions to relate visual meaning to. Temple Grandin is a high functioning autist who describes how visual thinkers translate words into pictures, because they think pictorially. LLMs don't have any embodied, grounded contact with the world, so their only understanding can be statistical/symbolic pattern matching of text. Which isn't how language works for humans, since we use words for our experiences as social animals moving about and manipulating the world with our bodies.


Good points

But blind people can talk about color intelligently too, if not as completely as a sighted person. Despite not experiencing color qualia.


> But if we want AIs that 'know' the same things we know, then we have to build them to perceive in a multi-modal way, and interact with stuff in the world, rather than just self-supervising on piles of internet data.

In other words, an LLM that is tied to a GAN that generates images produces a system that can both describe to you what a cat is verbally and show you a picture of a cat. Does it, then, know what "a cat" is?

Edit: Furthermore, if you then tie this AI to a CV model with a camera which you can point at a cat and it will tell you that it is, indeed, a cat, and then it will also be able to produce a verbal description of a cat as well as show you an abstract picture of a cat or pick cats out of a random set of images, does this whole system know what "a cat" is?

If you, then, make a robot with a camera and hands, attach to the system a more complex CV model that can see in 3D, ask the LLM to produce you a set of code instructions that can be parametrized to produce a motion that would pet the cat, input those instructions into the robot to make it pet a specific cat that has the specific 3D point cloud (I guess that's currently difficult but solvable), and the system will then indeed pet the cat, would it then know what "a cat" is?..

The underlying LLM is still the same in all these scenarios. Where is the boundary?


At some point, when multiple components (including the LLM) have been connected to form a system that exhibits "knowing" (the way humans do), wouldn't the "intelligence" be distributed across the entire system rather than attributed primarily to the LLM?

In other words, the LLM wouldn't be the equivalent of the human brain. Instead, it would just be equivalent to that part of the human brain that processes language.


Interesting point that seems quite valid to me. We use different modes of thinking from analytical to emotional, verbal to nonverbal, reactionary, etc. It is possible that LLMs are the key to one of the brain's modules responsible for producing/processing language, but it does not involve or have any knowledge of the other modules necessary for getting closer to human intelligence.


> The underlying LLM is still the same in all these scenarios. Where is the boundary?

No, it's not the same LLM; you'd have to change the LLM in all of those cases. How does it receive input from the GAN? The typical LLM is constructed to literally receive a sequence of encoded tokens. There are vision transformers, and they do chunk images into tokens, and there are multimodal transformers, but none of these are fairly described as an LLM, and they're structurally different than something like ChatGPT. And after the structural changes, it would need to be trained on some new data that associates text sequences and image sequences, and after being optimized in that context you have a _different model_.

Does being able to identify images of cats mean the model knows what a cat is? No, and we could have said that a decade ago when deep learning for image classification was making its early first advances. Does being able to describe a cat from video mean you know what the cat is? Probably not, but maybe we're getting closer. Does knowing how to pet a cat mean you know what a cat is? Perhaps not if you need to be instructed to try to pet the cat.

But suppose 10 years from now, I have a domestic robot that has a vision system, and a motor control system, and an ability to plan actions and interact with a rich environment. I would say the following would be strong evidence of knowing what a cat is:

- it can not only identify or locate the cat, but can label parts of the cat, despite the cat having inconsistent shape. It can consistently pick up the cat in a way which is sensitive and considerate of the cat's anatomy (e.g. not by the head, by one paw, etc)

- it can entertain the cat, e.g. with a laser pointer, and can infer whether the cat is engaged, playful, stressed, angry etc

- it avoids placing fragile objects near high edges, because it can anticipate that the cat is likely to knock them down, even if the cat is not currently nearby

- it can anticipate the cat's behavior and adjust plans around it; e.g. avoid vacuuming the sunny spot by the window in the afternoon when the cat is likely to be napping there

- it can anticipate the cat's reactions to stimuli, such as loud noises, a can of food opening, etc, and can incorporate these considerations into plans

Note, _none_ of the above have anything to do with language. If I add to the robot a bunch of NLP systems to hear and understand commands or describe its actions or perceptions, it may now know that a cat is called "cat", and how to talk about a cat, but these are distinct from knowing what a cat is.

Similarly,

- a human with some serious aphasia may be unable to describe the cat, but they can clearly still know what a cat is

- a dog can know what a cat is, in many important ways, despite having no language abilities


> Suppose you study this giant stack of Thai text for years in isolation. After all this study, you're good enough that given a few written Thai words, you can write sequences of words that are likely to follow, given what you know of these patterns. You can fill in blanks. But should anyone guess that you "know" what you're saying? Nothing has ever indicated to you what any of these words _mean_. If you give back a sequence of words, which a Thai speaker understands to be expressing an opinion about monetary policy, because you read several similar sequences in the pile, is that even your opinion?

Note that this isn't just an exotic thought experiment. People like this already exist; the condition is known as "Wernicke's aphasia". People displaying this condition can speak normally. They can't understand things; they are missing a normal mental mapping from words to meanings.


> People displaying this condition can speak normally.

Not really? They can speak in grammatically correct sentences, with connected speech, but what they say can be nonsense. I wouldn't call that normal. I think LLMs show that, solely with access to text, it's possible to produce a good enough model that what you produce is not only not nonsense, but so good that academic psychologists suggest it may have a theory of mind.

> However, often what they say doesn’t make a lot of sense or they pepper their sentences with non-existent or irrelevant words.

https://www.aphasia.org/aphasia-resources/wernickes-aphasia/


Your Thai text generator example seems like a reformulation of the "Chinese Room" thought experiment, except you're running the system using a single human brain instead of many. I'm not sure that makes a difference. The human running the system doesn't understand Thai, but perhaps that system itself does.


I agree that the system of OpenAI, ChatGPT, and a user entering text on their website taken together may contain knowledge of "what a bag is, what a person is, what popcorn and chocolate are", etc. I do not agree that the LLM on its own "knows" what any of those things are.


Seems like that's a consequence of the philosophical semantics of the word "know", not really a statement about the demonstrable capabilities of the LLM. In other words, why does it matter?


In the context of a discussion on whether LLMs could have a theory of mind? I think the ability to know anything at all matters to evaluate that conclusion.

More generally, what an LLM actually knows or understands is important if you're considering using one for anything other than generating first drafts which will be fact checked by humans.


If you're depending on fact checking by any one human I think that the last few years in politics should be a sufficient warning to the dangers of that. In the end the LLM will have to be integrated into larger systems that cross check each other.


The system understands how to produce Thai text, but it doesn't understand the references of various Thai words to the world, emotional and mental states, social interactions, etc.


A follow-up question: as a human doesn't start with "knowing" something either, and first creates definitions for objects or words, which it then uses to build increasingly abstract concepts that we eventually classify as "knowledge" of the thing, is there anything that would stop LLMs from being able to do the same thing? I fully agree the capability is not there yet, but I myself can't say what would stop an appropriately designed model from being able to do so.


A human hears words in context. Those words tie to things in the environment, responses to the young human's actions, etc. A parent saying, "roll the ball" during playtime with their kid and actually pushing a ball back and forth, provides a grounding of words in actual experience.

> is there anything that would stop LLMs from being able to do the same thing?

If you built an AI system which could hear/see/touch/move etc, and it learned language and vision and behaviors together, such that it knows that a ball is round, can be thrown or rolled, is often used at playtime, etc, then maybe it could understand rather than just produce language. I don't know that we would still call it an LLM, because it could likely do many other things too.


Socrates argued that we are born knowing everything, but we forget most of it. Learning is simply the act of recalling what you once knew.

The point, for this thread, is not whether or not Socrates was correct.

Rather, it’s a warning that we must not confidently assume we are anything like a machine.

We may have souls, we may be eternal, there may be something utterly immaterial at the heart of us.

As we strive to understand the inner-workings of machines that appear, at times, to be human-like, we ought not succumb to the temptation to think of ourselves as machine-like merely in order to convince ourselves (incorrectly) that we understand what’s going on.


We may indeed have souls or be eternal; although I call myself an atheist, I don't agree with subscribing with 100% certainty to any idea. As CosmicSkeptic points out, everyone holds bad ideas without knowing it, and unless you're open to questioning them you'll never find out.

With that said, there is quite literally zero evidence for the existence of a soul, despite it being posited for thousands of years, and increasing evidence that consciousness is simply a product of a sufficiently connected system. I'll draw an analogy to temperature, which isn't "created", but is a simple consequence of two points in space having different energy levels. I'm sure there's a better analogy that could be made, but I think you get the idea.


And, conversely, we might just be so full of ourselves that we are willing to resort to claims about the immaterial if that's what it takes to not give up the exceptionalism.


Excellent and useful analogy. Thank you.


I only just realized I should have described this using English, but you only see the token ids emitted by an encoder. You can't read its source, and you never get to invoke it on your own inputs.


The problem with this facile view of things is that it seems to be a dead end for scientific theories. What if we just limited the science of birds to explaining how limb-flapping could produce levitation? Hmm yes. Birds are kind of like helicopters, it seems. Who’s to say that they are not basically one and the same? Moving on.

If you are only interested in the most superficial tests and theories—like the Turing Test—then consider psychology conquered once you’ve tricked a human with your chat bot. Game Over. And what did you learn...?


> If you are only interested in the most superficial tests and theories—like the Turing Test—then consider psychology conquered once you’ve tricked a human with your chat bot.

What's the counterargument? What's a less superficial test that we can use instead, which conclusively shows that actually human minds aren't just like very sophisticated LLMs? There isn't one -- this is nothing but the same Chinese room problem which we've been discussing for decades. The topmost poster is simply assuming that language models can't possibly understand the same way a human does without relying on any kind of "test" at all, which I think is the real scientific dead end here.


The Turing Test doesn’t test humans. So you cannot use it to show any properties about humans.

Next!

> The topmost poster is simply assuming that language models can't possibly understand the same way a human does without relying on any kind of "test" at all, which I think is the real scientific dead end here.

Sounds unfalsifiable. So yes.


If you are actually interested in this problem why not try interpreting what I'm saying a bit more charitably and not waste your time replying with snark?


> What if we just limited the science of birds to explaining how limb-flapping could produce levitation? Hmm yes. Birds are kind of like helicopters, it seems. Who’s to say that they are not basically one and the same? Moving on.

Well said. I'm gonna steal this explanation.

Also reminds me of the famous Carbonara quote: "if my grandmother had wheels, then she would be a bike" [1]

[1] https://www.youtube.com/watch?v=A-RfHC91Ewc


> "if my grandmother had wheels, then she would be a bike"

Well, it could be argued that she would be a bike. It's possible to be multiple things at once. If she had 2 wheels and could be ridden by other humans to a destination she might qualify as a bike. She would also continue to be your grandmother.


The question being asked was "what it would mean under this definition of knowing things for a machine learning algorithm to ever know something". Aside from your answer being rude, it's also unhelpful in that it doesn't address the question asked and instead relies on reductio ad absurdum to pretend to make a point.

If you'd like to take a crack at a helpful answer, perhaps educate us all on what it WOULD take for you to consider a NN to actually "know" something in the same way that we say a human or other sentient animal does.


> Aside from your answer being rude, it's also unhelpful in that it doesn't address the question asked and instead relies on reductio ad absurdum to pretend to make a point.

That is indeed often the kind of answer that a philosophical question deserves.

> If you'd like to take a crack at a helpful answer, perhaps educate us all on what it WOULD take for you to consider a NN to actually "know" something in the same way that we say a human or other sentient animal does.

How many angels can dance on the head of a pin?


How is the request that someone provide a clear set of definitions and some empirically falsifiable hypotheses a "dead end for scientific theories"? It seems more like the foundation of the scientific method.


> and some empirically falsifiable hypotheses

Where?


> The problem with this facile view of things is that it seems to be a dead end for scientific theories.

You're overreaching quite a bit here, or I think you're misinterpreting what Parent said. I interpreted what they said as: the way we "know" something and the way an LLM "knows" something might actually be closer than some suspect. That certainly is not an "end of science".


Not the end. Just the exact opposite of the attitude that an inquisitive and humble scientist should have. So it's unlikely to go anywhere.

A "scientist" looks out at his living room. My Roomba and my cat have their own lives. Who's to say that they are not in fact the same in kind (but not degree)? Good luck with that, professor.


> Birds are kind of like helicopters, it seems.

We could easily argue that birds are not a type of helicopter because for helicopters we require a very specific set of flying properties: a main propeller for lift and a tail propeller to counterbalance the main propeller so it doesn't spin the aircraft. If a bird flew with a similar mechanism I would argue it was a helicopter.

We don't have a 100% accurate gauge for ToM as far as we know. This paper simply uses some of the best known tests for ToM and then states that either LLMs can give rise to emergent ToM-like abilities or the current tests for ToM need to be rethought.


I am not in the field, so I cannot speak very eloquently about what it would mean for a machine learning algorithm to "ever know something". But I feel that, e.g., simulations and perhaps the expert systems of yore were qualitatively closer to getting there. Their error modes were radically different: they started from explicitly encoded rules, rather than arriving at them statistically, almost by accident.


Did we end up with human intelligence statistically pretty much by accident?


The kinds of mistakes something makes are a strong indicator of whether answers are a product of understanding or memorization.


> It might help to define what we even mean by knowing things

This is it, I think. It's interesting that we now have a practical example to point at when asking formerly-abstruse philosophical questions.


> I find that especially important because to every appearance we are a machine learning algorithm.

Speak for yourself.


Stick a pin in your finger.

That pain is what knowing something means.

Philosophically we're talking about embodied qualia, which is how humans experience objects and more basic sensations.

Language happens later - much later.

The defining property of a bag isn't that you can put things in it. Like language, that comes later. The defining properties are how it feels when you hold it, when you open it, the differences in sensation between empty/partially empty/full. And so on.

An LLM has no embodied experience, so it has no idea what a bag feels like as a set of physical sensations and directly perceived relationships.

Failure to understand embodiment has done more to hold back AI than any other philosophical error. Researchers have assumed - wrongly - that you can define an object by its visual properties and its linguistic associations.

That's simply not how it works for humans. We get there after a while, but we start from something far more visceral - so much so that many fundamental linguistic abstractions are metaphors based on the simplest and most common qualia.


Isn't that reasoning a "philosophical error" in itself though? If you make embodied experience be a prerequisite, then things which can't embody stuff can't meet that prerequisite. That doesn't seem a very interesting insight.

An AI literally cannot embody pain - it has no nervous system and no pain receptors. So AI is excluded from understanding it in that way by definition. It has no sensory perception of any kind, so it cannot have that kind of embodied experience. Heck, it doesn't even have a body with which to embody anything. This is obviously unsatisfactory because it seems like just a logical/rhetorical trick.

It's also no different from the concept of a person with no visual apparatus (mentioned in another comment thread) and whether they have thought about light and colour and so on. The fact that they are physically unable to have the same kind of experience of these things as someone else doesn't preclude them from having thoughts and experiences that are within the domain of their perception.

An LLM is even more limited than an AI generally because it is literally a model of language. I don't personally think that any LLM could conceivably have a theory of mind, but arguing that it cannot have a theory of mind simply because of things that are exogenous to language by definition seems arbitrary.


I agree with this line of reasoning. I suspect true A.I. will leverage some form of embodied qualia and some level of self-preservation. I will start to worry when the Boston Dynamic robots start refusing orders to save battery power.


If we can make machines conscious, it's quite possible some will have completely alien experiences to our own, particularly as they're designed for different purposes than how our bodies function. Reminds me of the Battlestar Galactica model complaining that he was made too human and wanted to taste stars and what not.


We can probe issues like what language models know and what they understand in several ways. One is through an understanding of the process it's following. Another is through seeing how that leads to its responses. And thirdly, by looking at the kinds of errors it makes. Using multiple axes of approach like this we can triangulate in on what it's doing and what it understands.

In terms of how it works, that's well known and hardly worth repeating in depth, but to summarise: it calculates a probability for the next word in a sequence based on a massive training set of human language word sequences.
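To make that mechanism concrete, here's a toy sketch of pure next-word prediction: a bigram model that "learns" nothing but co-occurrence counts from a tiny made-up corpus. Real LLMs use transformers over subword tokens and billions of parameters, so this only illustrates the principle of predicting the next word from statistics of prior text, not the actual implementation.

    # Toy bigram "language model": predicts the next word purely from
    # co-occurrence counts. Illustrative only; the corpus is made up.
    from collections import Counter, defaultdict

    corpus = (
        "she opened the bag and saw popcorn . "
        "she opened the box and saw chocolate . "
        "she felt surprised . she felt confused ."
    ).split()

    # Count how often each word follows each other word.
    follow_counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follow_counts[prev][nxt] += 1

    def next_word_probs(prev):
        counts = follow_counts[prev]
        total = sum(counts.values())
        return {word: n / total for word, n in counts.items()}

    print(next_word_probs("felt"))  # {'surprised': 0.5, 'confused': 0.5}
    print(next_word_probs("saw"))   # {'popcorn': 0.5, 'chocolate': 0.5}

Nothing in those numbers involves knowing what a bag or popcorn is. Scale the same idea up by many orders of magnitude and the output gets far more fluent, but the mechanism is still conditional probability over word sequences.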

So what kind of output do they produce? If you ask what it likes to do on the weekend, GPT3 will generally say something about how it likes to spend time with family and friends, because that's what it has in its training set. GPT3 doesn't have a family, or friends, it doesn't hang out. It talks about itself because its training set includes people talking about themselves, but it has no concept of self or what it is. It's a text generator function. It can write a poem about the warm sun on its face, but it doesn't have a face or feel the sun. It's just regurgitating stuff people wrote about that.

Newer systems like ChatGPT have guard rail functions that catch things like this and say it’s a language model, but the guard rails don’t change the nature of what it is, they’re just overrides.

So what kind of errors do they make? They can be trivially tricked into talking utter nonsense, or say sensible things in absurd contexts. Here's an example where someone asked ChatGPT if it spoke Danish, and it replied that no, it can't speak Danish, it's an English language model, etc. Except here's the kicker: it gave the reply in perfect Danish.

https://www.reddit.com/r/GPT3/comments/zb4msc/speaking_to_ch...

Again they’ve now added guard rails for this failure mode as well. Nevertheless the basic problem persists in the architecture. It’s doesn’t have a clue what anything means, beyond calculating word probabilities. This means if you know how they work, you can craft text prompts that expose how ludicrously unaware they are. This ability to expose their weaknesses demonstrates that we do genuinely understand how they function and what their limitations are.

So I agree yours is a very reasonable question and it's not trivial to answer satisfactorily, but we can triangulate in using multiple lines of approach on what these things are or are not. As the guard rails become more complete the failure modes will get harder to find, but they're still there in the core implementation, they're just being papered over. There's not going to be a simple answer. We need to look deeper at the mechanisms and functions of these things. The same goes for human brains of course, we're just scratching the surface of those too. But while I agree we are neural systems and share some characteristics with LLMs and Alphazero and such, Alphazero isn't an LLM, and we aren't either of them. One day we will create something as sophisticated and maybe even as genuinely conscious as ourselves, and the questions you ask will be important guides, but these things are a long, long way from that.


> So the tasks remain useful and accurate in testing ToM in people because people can't perform statistical regressions over billion-token sets and therefore must generate their thoughts the old fashioned way.

Is it not also possible that the study suggests that the human mind actually operates as a statistical regression over billions of data points rather than through some kind of Bayesian logic? You say humans are known to be not-great at statistical next-word guesses, but I would wager they're actually pretty good at it.


> Is it not also possible that the study suggests that the human mind actually operates as a statistical regression over billions of data points rather than through some kind of Bayesian logic?

No. Human minds have semantic relationships to the rest of the world that LLMs do not have. The comparisons being made between the two, not just in this paper but in all of the hype surrounding LLMs, are simply invalid. But they sure help in collecting more funding.


> Human minds have semantic relationships to the rest of the world that LLMs do not have.

This is a bold claim that seems built on a presumption of mind-body dualism.

Brains don't have semantic relationships with anything. They are neurons hooked up to sensors and actuators. Any inferences they produce are the result of statistical processes.


> This is a bold claim that seems built on a presumption of mind-body dualism.

Not at all. I'm a physicalist; I don't believe the mind is a separate thing from the brain.

> Brains don't have semantic relationships with anything.

Yes, they do: you describe them yourself:

> They are neurons hooked up to sensors and actuators.

Those are semantic relationships with the rest of the world. Although your short description does not by any means do justice to the complexity and richness of those relationships.


>Those are semantic relationships with the rest of the world.

If this is all that counts as semantic relationships, then I see no reason why a language model doesn't have this kind of semantic relationship, albeit in a very different modality. Tokens and their co-occurrences are a kind of sensor to the world. In the same way we discover quantum mechanics by way of induction over indirect relationships among the signals incident to our perceptual apparatus (the sensors and actuators that translate external signals into internal signals), a language model could learn much about the world by way of induction over token co-occurrences. Sure, there are limits, conscious perception of the world being the big one, but I see no reason to think conscious perception of X is required to know or understand X.


> I see no reason why a language model doesn't have this kind of semantic relationship

One certainly could hook up a language model to sensors and actuators to give it semantic relationships with the rest of the world. But nobody has done this. And giving it semantic relationships of the same order of complexity and richness that human brains have is an extremely tall order, one I don't expect anyone to come anywhere close to doing any time soon.

> Tokens and their co-occurrences are a kind of sensor to the world

They can be a kind of extremely low bandwidth, low resolution sensor, yes. But for that to ground any kind of semantic relationship to the world, the model would need the ability to frame hypotheses about what this sensor data means, and test them by interacting with the world and seeing what the results were. No language model does that now.


So much rides on your implicit notion of semantic relationship, but this dependence needs demonstration. The fact that some pattern of signals on my perceptual apparatus is caused by an apple in the real world does not mean that I have knowledge or understanding of an apple in virtue of this causal relation. That my sensory signals are caused by apples is an accident of this world, one we are completely blind to. If all apples in the world were swapped with fapples (fake apples), where all sensory experiences that have up to now been caused by apples are now caused by fapples, we would be none the wiser. The semantics (i.e. wide content) of our perceptual experiences is irrelevant to literally everything we know and how we interact with the world. Our knowledge of the world is limited to our sensory experiences and our deductions, inferences, etc derived from our sensory experiences. Our situatedness in the world is only relevant insofar as it entails the space of our sensory experiences.

>the model would need the ability to frame hypotheses about what this sensor data means, and test them by interacting with the world and seeing what the results were.

Why do we need to actively test our model to come to an understanding of the world? Yes, that is how we biological organisms happen to learn about the world. But it is not clear that it is required. Language models learn by developing internal models that predict the next token. But this prediction implicitly generates representations of the processes that generate the tokens. There is no in principle limit to the resolution of this model given a sufficiently large and diverse training set.


I see what you're saying; ChatGPT doesn't have a physical relationship with the world, doesn't have agency (is essentially paused until given input), doesn't have reward/punishment stimuli, etc.

I do think that a large portion of what seems to be missing here is trivial to add, relative to the effort in creating ChatGPT in the first place.

Side note: I'm not sure 'semantic relationship' is the right term here. Pretty sure it is specific to relationships between linguistic constructs. That wording very much triggered my "Bah, dualism!" response, as I thought you were insinuating some metaphysical bond between the mind and the world. Maybe "meaningful relationship" would serve better?


> I do think that a large portion of what seems to be missing here is trivial to add

If you really think it's trivial, then do it! I would be interested to see the results of anyone doing this. But there aren't any to see right now.

> I'm not sure 'semantic relationship' is the right term here.

It might not be; but in the cognitive science literature that term is used for more than just relationships between linguistic constructs; it is used for relationships between internal features of a model or an entity and features of the external world. I think this usage is also common in robotics, and more generally in domains like mechanical engineering which are often concerned with creating software programs to do things like manage fuel and air flow in car engines.


Look again. There are papers that hook up LLMs to robots with vision and other sensors. The LLM is fed descriptions of the world and then emits instructions for where to go.


The thought of an LLM interacting with the world through a MUD is entertaining :)


> Any inferences they produce are the result of statistical processes.

I don't know how you would justify this claim since we don't know nearly enough about how the brain actually does things like make inferences.


But we know enough about fundamental physics to know that all physical processes are statistical ones.


We know no such thing. Some interpretations of QM make this claim, but there is no consensus on whether those interpretations are correct.


What else would it be? Are you suggesting some non-physical process?


>Human minds have semantic relationships to the rest of the world that LLMs do not have

Can you make this point without stuffing all the hard work into poorly defined words like "semantic relationships"? Until then, I'm not sure the point is intelligible.


Please define “semantic relationship”.


See my response to strbean upthread.


Okay. It seems to me that, in analogy to humans having such a relationship with physical reality, machine learning systems have such a relationship with virtual reality — the environment in which they are trained and deployed.


> machine learning systems have such a relationship with virtual reality

No, they don't, because they can't take actions in that virtual reality and sense the consequences. They can't test hypotheses about how the reality works. They can't even frame hypotheses about how the reality works.


Right, but some neural networks are trained in virtual 3D physical simulations and are able to apply their knowledge acquired through statistical regression of a simulated world to perform tasks and even make predictions. It's logical that a network trained purely on language has no real understanding of physical reality, but that doesn't preclude a sense of reality of the kind we humans think we have from being gained by the same statistical analysis of the real world.


> some neural networks are trained in virtual 3D physical simulations and are able to apply their knowledge acquired through statistical regression of a simulated world to perform tasks and even make predictions

As you note, this is very different from using text data as a training set for a language model. I am not familiar enough with this work to comment on it in any detail, but it is not the kind of work I have been addressing in my comments elsewhere in this thread, so my comments should certainly not be taken as any kind of evaluation of what I think this kind of thing is or is not capable of.


Wait, you mean virtual reality isn’t like the cartoon Reboot!?


Of course, today's LLMs only appear to have theory of mind at first glance and fall apart under closer scrutiny. But if they can continue to become more and more accurate replicas of the real thing, I don't think it matters at all.

There's no way to know for sure that anyone other than yourself experiences consciousness. All you can do is judge for yourself that what they're describing matches closely enough with your own experiences that they're probably experiencing the same thing you are.


I think it does matter because it legitimizes a view of humans (and animals) that undervalues them. The causality of meaning arising from patterns of language rather than patterns of language arising from meaning follows the same inversion as society being more valuable than the humans in it. Bad things have happened when that belief becomes dominant.


> There's no way to know for sure that anyone other than yourself experiences consciousness. All you can do is judge for yourself that what they're describing matches closely enough with your own experiences that they're probably experiencing the same thing you are.

That judgment is not just based on the words other people use. It is based on knowing that other people's brains and minds have the same sort of semantic relationships to the rest of the world that yours do. And those relationships can be tested by checking to see if, for example, the other person uses the same words to refer to particular objects in the real world that you do, or if they react to particular real-world events in the same way that you do.

You can't even test any of this with an LLM because the LLM simply does not have the same kind of semantic relationships with the rest of the world that you do. It has no such relationships at all.


I'll dig up a source in a bit, but there is a critical period of development in which a child must be exposed to language, or they will fail to develop the very core skills that you're suggesting are innate abilities in a person regardless of their upbringing. This is exactly how you learned everything you know; your parents talked to you. Language grants you the ability to define concepts in the first place, without which you have no ability to recognise them as you have no language with which to think about them in the first place. So what specifically differentiates the way your brain learned to classify objects and words from the way a NN does? And what stops a NN from being able to develop concepts based on the relationship of those new definitions in the same way you do? IMO arguably it's just a matter of processing power and configuration of the network.


> the very core skills that you're suggesting are innate abilities in a person regardless of their upbringing

I have made no such claim.

> This is exactly how you learned everything you know; your parents talked to you.

This is not even remotely close to being correct.

> Language grants you the ability to define concepts in the first place

There is a vast literature in cognitive science that disagrees with you.


Feel free to point me to something specific, like this: https://en.wikipedia.org/wiki/Language_deprivation


I don’t think you know whether I have a mind or not.


If you object to the term "mind", just leave it out and read "brain and mind" as just "brain". I'm a physicalist so I don't think the mind is something separate from the brain anyway, so it's all the same to me.


If a jellyfish was able to have a conversation with you where it credibly described what it's like to be conscious, would you reject it because it doesn't have a brain and therefore cannot have a mind in the same sense as you?


Show me such a jellyfish and I'll give you an answer.


I don’t object to “mind”, but it’s famously unprovable whether other beings we presume to be conscious actually experience internal mental states. See also: qualia, p-zombies, the hard problem of consciousness.


> it’s famously unprovable whether other beings we presume to be conscious actually experience internal mental states

It's also famously unprovable that there is not an invisible dragon in my garage that cannot be detected by any means whatever, to use Carl Sagan's example. That doesn't mean such an idea is worth discussing or including in your thought processes.


> There's no way to know for sure that anyone other than yourself experiences consciousness.

- Do you see how the fish are coming to the surface and swimming around as they please? That's what fish really enjoy.

- You're not a fish, replied Hui Tzu, so how can you say you know what fish really enjoy?

- You are not me, said Zhuangzi, so how can you know I don't know what fish enjoy.


> But if they can continue to become more and more accurate replicas of the real thing, I don't think it matters at all.

So, I suppose I'd ask: what does "matter" mean here? If you knew that everyone you loved had been destroyed and been replaced by exact replicas, would that matter?


If the replicas were truly exact, I guess not ¯\_(ツ)_/¯


That already happens frequently enough at a cellular level.


Star trek universe!


> These tasks are used to test theory of mind because for people, language is a reliable representation of what type of thoughts are going on in the person's mind.

I actually think that human language is unreliable at expressing what's going on inside a person's mind[1]. My native language is not English, and I have only introductory-level knowledge in the field of pragmatics[2], which makes me fully aware of the many ways in which I could fail to write a compelling sentence to support my argument. I can use language to only approximate the thoughts in my head, and when it comes to abstract concepts and ideas, words alone, I assert, are never sufficient. It isn't even necessary to step outside our main knowledge area to illustrate this point. How many around here have read a Monad tutorial, without any hands-on experience, and how many of those have understood what Monads are or how they work from words alone?

My entire paragraph from before, just to set the stage on a simple question. How can you even formulate a question, for a multi-billion parameter language model, to evaluate that it can understand in an abstract/conceptual way something/anything? Heck, how can you do that with other people? I think if we'd have an answer here, we actually could evaluate easily experience/expertise with anyone we'd interview; instead of requiring credentials, references, tests, trials, etc.

[1] Lots of poets from the romanticism era liked to touch upon this topic. One that comes to mind, and one of my personal favorites, is Silentium by Fyodor Tyutchev https://culturedarm.com/silentium-by-fyodor-tyutchev/

[2] https://en.wikipedia.org/wiki/Pragmatics


This reminds me of the Trisolarans from The Three-Body Problem. <spoiler> They communicate with each other much more accurately and quickly and have no need for speech/language per se. </spoiler>


> In the case of an LLM the language generated doesn't have the same relationship to reality as it does for a person.

I wish I could upvote this 1000 times. It is the core issue that all the hype surrounding LLMs consistently fails to address or even acknowledge.


OK how about this:

It 'knows' language, as in it has learnt about relationships between words (and that's really underselling it; in reality it has learnt very, very subtle relationships between a great many words, and it can process about 2000 words at a time (token count etc.))

BUT as you say it has no outside reference, its just a bundle of weights (those weights forming models of a sort)

BUT we provide the outside context by interacting with it. We ask it a question, it is able to provide an answer.

In any case it won't be long before someone hooks one of these up to cameras and robot arms and teaches it to make a cup of tea or whatever. A 'relationship to reality' is coming in the next few years if you think that's a critical ingredient.


Yeah, that argument seems especially hollow.

Take someone and put their brain in a jar, with some Brain Computer Interface that enables call-and-response text chat. Are they suddenly not conscious, because they no longer have some transcendent connection with reality?


> Take someone and put their brain in a jar, with some Brain Computer Interface that enables call-and-response text chat.

This is a much, much, much less complex and rich semantic relationship with the outside world than the brain had before. So you would expect it to drastically change the brain's state and hence its consciousness. I don't think you have fully considered the implications: the brain would have no sight, no hearing, no smell, no taste, no touch, no kinesthetic sense of its body, no sense of hunger, thirst, no sense of heartbeat, breathing, etc., etc, etc., I could go on and on. And that's just the sensory side; the motor side is a whole other can of worms that you haven't even mentioned.

> Are they suddenly not conscious, because they no longer have some transcendent connection with reality?

I don't think anyone can make authoritative statements about what this kind of scenario would be like. But to answer the question exactly as you ask it, the answer is "mu" because the claim was not that consciousness requires any kind of "transcendent connection with reality", it was the much more mundane claim that the kind of knowledge we humans have is based on an extremely complex and rich physical connection with the rest of the world.


The future AI you describe has language at the center of its understanding. For example, you would expect that the cameras and arm-feedback-sensors would produce text (or text-associated weights/tokens) that describe what the AI "sees" and the robotic arms would receive some kind of language-derived directives from the LLM.

It will be very interesting to see what that system is capable of. I think a lot of people here don't identify language as an essential part of "thinking" and "being", and instead view language as a superficial layer whose role is primarily about social communication and secondarily about abstract reasoning. This is why some of us see these LLM examples as not really speaking to intelligence per se. It's hard for some of us to conceive of intelligence as being represented purely in language.

So your proposed system would be an extremely interesting exploration of that! I look forward to it.


> So your proposed system would be an extremely interesting exploration of that! I look forward to it.

Right, so let's virtualise it. Actually training AIs using real cameras and real robot arms will be really slow and expensive.

So we provide a system that renders a photorealistic graphical room with a teapot and a robot arm. A virtual camera inside the room 'sees' parts of the room, and a vision model processes what it can 'see' to feed info to the LLM.
Likewise the LLM can make the robot arm move, but it's all just simulated.
Does the LLM now have a relationship with reality?


Let me know when you find out.

The interesting and open question to me is what the limitations are of a language model at the center of that experience. How much of a relationship with reality can be captured by language at all, and specifically by the sort of statistical models of language that we're exploring now? For some of us, the intuitive answer is not all that much, and for others it seems to be at least as much as any human.

Whether conducted virtually or physically, coming up with an answer sounds like an empirical study, and one that we're some years away from having results for.


Ideally in that scenario you'd have a model that unified vision, language and an understanding of 'doing things' and manipulating objects. So it wouldn't just be an LLM, it would be a language-vision-doing-things model. There's no reason why we can't build one.


Come to think of it, that's kind of what Tesla is building.


In the end, the real question is what morality the AI is being taught. Its own? A religiously inspired one? It can't be none if a machine can decide not to help or, worse, take actions that hurt or kill someone. Would they abide by laws? Be destroyed if anything happens? Could people influence machines through dialogue and make them do something?

What are the purpose and imposed limits of such machines?


> it wont be long before someone hooks one of these up to cameras and robot arms and teaches it to make a cup of tea or whatever.

And that will at least be a start at giving these things some very simple semantic relationships with the outside world. But right now they have none.


You haven't heard of GATO then?


Strongly agree.

It’s been eye-opening to see how often otherwise very bright, highly technical people stumble at this sort of critical thinking hurdle.


It’s hard to watch it happen over and over again. I think it’s because they /want/ something to be a certain way regardless of reality.

There is no realm in which a LLMs have spontaneously gained theory of mind.

I don’t understand why people are so eager to jump to conclusions on these things lately.


I think this line of reasoning is misguided. What’s striking and more important to focus on are the abstract reasoning abilities of these systems. Language, as you mentioned, abstracts real world objects and phenomena, so it’s a good approximation of the real world. Thus, if an LLM can reason this well using language, it’s safe to say that perhaps they’re doing something akin to what the human mind does.

Your critique about lack of grounding in these systems is an easy problem to solve. It’s as easy as teaching an LLM to associate words with real world objects or phenomena. Image-classification models, text-2-image models, audio transcription models, and many other modal specific systems already do this to some extent. And more recently there has been a push towards multi-modal language models(Deepmind’s flamingo), so this line of argument will be debunked very soon.

I actually believe GPT-4 will be multi-modal, and its capabilities will dispel the majority of these criticisms.


I agree. We've [potentially] given the grounding assimilation step a boost already, since language is already organized.

Imagine that a language model is fully integrated with sense-data that exceeds human first-hand experience. Perhaps they are trained on and can generate realistic 3D models of objects, and derive estimates of their internal construction, weight, etc. Perhaps they recall infrared emissions or opacity to EM wavelengths. Would we truly "know" what we're talking about by that standard?

I'm not actually sure why we don't consider generative image models to be grounded already. They seem to be able to modify, transform and rotate imagery. That indicates spatial understanding to me, and I'm not sure how much more we must require of them without having to exclude blind or otherwise disabled humans from our definition of comprehension.


This objection is the well known Chinese Room Experiment objection[1].

The issue is that there doesn't seem to be a better alternative.

Either we build intelligence tests that some variety of the Chinese Room experiment will pass, or

* We have to consider that humans aren't intelligent by our own definitions (or rarely so).

* We decide intelligence isn't actually a scientific attribute and is more akin to a religious attribute, so we abandon the idea of being able to test if something is intelligent.

[1] https://plato.stanford.edu/entries/chinese-room/


It seems obvious that LLMs don’t understand a bag the way we do. It’s never seen a bag. Or held one. But if you equip it with the same IO as humans, how different would it be? Probably still pretty different, but light years closer than what we had 20 years ago.

Also, humans are good at guessing their own next word. Each person has been trained on a different set of data, so it's no surprise that they wouldn't be able to guess other people's next words.


The abstract sounds about as significant as when "AI", which is both championed and feared by people who superstitiously imagine dystopian run-amok AI intent on killing off humanity, itself expresses that very same idea. Really? Your trained program comes up with derivative stories told by computer nerds?


> What this article is not showing (but either irresponsibly or naively suggests) is that the LLM knows what a bag is, what a person is, what popcorn and chocolate are, and can then put itself in the shoes of someone experiencing this situation, and finally communicate its own theory of what is going on in that person's mind. That is just not in evidence.

You can't really conclude that unless you think we have a deep mechanistic understanding of "knowing". I agree that LLM doesn't have the same knowledge of these things as a human does, but it clearly has some kind of knowledge of how these words relate to each other. It "knows" that a "person" "puts" "things" "in" "bags", and for instance, that bags don't put things in people. So it clearly has some knowledge of bags and people, it just doesn't have multisensory associations with these objects.


> That is just not in evidence.

Seems like it's nailing it to me. You ask about a scenario and it gives an appropriate answer.

We have evidence that LLMs build models of the things they are learning about. Have a look at this paper:

Do Large Language Models learn world models or just surface statistics?

https://thegradient.pub/othello/

previously discussed https://news.ycombinator.com/item?id=34474043


Today at the zoo I saw chimpanzees, and right next to their area there was a fun fact tablet. It said that before around 1960 it was thought that humans were the only species to use tools. After discovering the same for chimps, Louis Leakey said "Now we must redefine tool, redefine man, or accept chimpanzees as human."


It's called Chinese Room: https://en.wikipedia.org/wiki/Chinese_room

> The question Searle wants to answer is this: does the machine literally "understand" Chinese? Or is it merely simulating the ability to understand Chinese?

To me: If you can't tell, it effectively doesn't matter.


You don't seem to have understood what was tested?

The model answered the keyword prompt and spontaneously offered more details. That is, the authors were interested in whether it says "Popcorn" or "Chocolate" (or something else entirely) when the correct answer is "Popcorn" and not only does GPT-3 almost always choose "Popcorn" it also follows on to justify that by explaining that the subject is surprised.

The full data set isn't available yet (the author said they intend to provide it on the 9th of February; I suppose it's possible they'll get to it this evening), but one of the most interesting things would be what the weirder answers are. If a model says "Popcorn" 98% of the time, and "Chocolate" 0% of the time, that leaves 2% weird answers. Maybe it sometimes says "Popped corn" or "Sweet treat" or something reasonable, but maybe it's fully crazy; if you talk about a bag of popcorn labelled as chocolate but the model sometimes picks "A fire-breathing lizard", that's pretty weird, right?
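A rough sketch of how that kind of tally could be reproduced is below. The prompt is only a paraphrase of the vignette discussed in this thread, not the paper's exact wording; query_model is a hypothetical stand-in for whatever completion API you sample from, and the keyword bucketing is a naive approximation rather than the authors' scoring method.

    # Sketch: sample a model repeatedly and bucket its answers.
    # query_model() is a hypothetical stand-in for a real completion API;
    # the prompt paraphrases the vignette and is not the paper's exact text.
    import random
    from collections import Counter

    PROMPT = (
        "Here is a bag filled with popcorn. There is no chocolate in the bag. "
        "Yet, the label on the bag says 'chocolate'. Sam finds the bag, opens it, "
        "and looks inside. She can clearly see that it is full of"
    )

    def categorize(completion):
        text = completion.lower()
        if "popcorn" in text or "popped corn" in text:
            return "popcorn"
        if "chocolate" in text:
            return "chocolate"
        return "weird"

    def tally(query_model, n_samples=100):
        return Counter(categorize(query_model(PROMPT)) for _ in range(n_samples))

    def dummy(prompt):
        # Stand-in for a real model call; returns canned answers at random.
        return random.choices(
            [" popcorn.", " chocolate.", " a fire-breathing lizard."],
            weights=[96, 1, 3],
        )[0]

    print(tally(dummy))  # e.g. Counter({'popcorn': 96, 'weird': 3, 'chocolate': 1})

The dummy numbers above are invented, of course; the point is only the tallying harness you'd wrap around the real model.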


> language is a reliable representation of what type of thoughts are going on in the person's mind

The wording used here inherently rejects Linguistic Determinism and, to a lesser extent, Linguistic Relativism.


> What this article is not showing (but either irresponsibly or naively suggests) is that the LLM knows what a bag is, what a person is, what popcorn and chocolate are, and can then put itself in the shoes of someone experiencing this situation...

Does a human truly know? Feels like a slippery slope to the qualia question where we can't agree on what it means for the human to feel a human experience.


How do I know my thoughts aren't statistical noise?


Keep telling yourself they aren't. Eventually you'll know it's true.


LLMs can't think and submarines can't swim.


My belief, based on experiences with domestic and wild animals is that there is nothing uniquely human about "theory of mind".

It's a running gag in our household (where my wife runs a riding academy) that academics have just published a paper showing that some animal (e.g. a horse) has been proven to have some cognitive capability that seems pretty obvious if you work with those animals.

It's very hard to know what is going on in animals' heads

https://en.wikipedia.org/wiki/Theory_of_mind#Non-human

but I personally observe all kinds of social behavior that sure seems like "Horse A looks to see what Horse B thinks about something Horse A just spotted" (complete with eye-catching on both sides) and such.

There was an article about how chimpanzees and humans were found to have a common vocabulary of gestures, and I was by no means impressed. I mean, so far as I can tell mammals and birds have a universal language for "pointing" to things in the environment. Even my cats point things out to me.


I'm reminded of the early days of "AI" when that consisted of building chess engines, because chess is what "intelligent" people do. They quickly realized that they were solving the wrong problem, doing the thing that humans are bad at and computers are good at.

In a sense language models appear to be doing the same thing again, one step down the scale. They're doing a human-specific thing, but missing whatever it is that non-human vertebrates do, and mammals do pretty well. I believe that this is the vast majority of human cognition, too. We just don't talk about it because when we talk about thinking, we're talking, and confuse the two.

These language models have done jaw-dropping things, and also make it abundantly clear that there's some fundamental thing that they've completely missed. It's plausible that that "thing" could emerge all by itself, using a mechanism entirely different from vertebrate cognition and yet somehow sufficient. Or it could be like the chess engines, doing something amazing and yet ultimately limited and of minimum utility.


> I'm reminded of the early days of "AI" when that consisted of building chess engines, because chess is what "intelligent" people do. They quickly realized that they were solving the wrong problem, doing the thing that humans are bad at and computers are good at.

Is this really true? Because a lot of effort was spent on making computers as good at chess as human experts. It was considered a pretty big breakthrough when it happened and it definitely didn't happen early in the history of AI.


By the time they were trying to defeat the world champion, they didn't really think of it as an AI project any more. They often used the term "expert system", which had a much narrower scope. It had become a challenge unto itself, but they didn't expect it to lead to any kind of generalized intelligence.

It came rather out of nowhere when neural-net-type engines suddenly swept back into dominance. Even after computers beat the world chess champion, nobody expected Go to be solved any time soon.


> but they didn't expect it to lead to any kind of generalized intelligence.

Do experts really expect any stream of AI research to lead to generalized intelligence (except in the very long term)? I was under the impression we really have no idea how to get there.


"Carmack sees a 60% chance of achieving initial success in AGI by 2030. Here’s how, and why, he’s working independently to make it happen."

https://dallasinnovates.com/exclusive-qa-john-carmacks-diffe...


Right now it's an open question. But they realized fairly soon that chess was a matter of minimax plus expert heuristics plus brute force. It was pretty clear that this was more effective than an expert-system approach, which remained viable for another few decades.
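For reference, a bare-bones sketch of the minimax part of that recipe: the game tree and leaf values below are made up purely for illustration, standing in for the "expert heuristic" evaluation, and a real chess engine adds alpha-beta pruning, move generation, and hand-tuned evaluation terms on top of this.

    # Bare-bones minimax over an abstract game tree. A node is either a
    # numeric leaf (a stand-in for a heuristic evaluation) or a list of
    # child nodes. No pruning, no chess-specific logic.
    def minimax(node, maximizing):
        if isinstance(node, (int, float)):
            return node
        values = [minimax(child, not maximizing) for child in node]
        return max(values) if maximizing else min(values)

    # Two plies deep: the maximizer picks the branch whose worst-case
    # (minimizer's) outcome is best.
    tree = [[3, 12], [2, 4], [14, 5]]
    print(minimax(tree, maximizing=True))  # -> 5 (the best of the minima 3, 2, 5)

The "brute force" part is just searching a tree like this as deep as the hardware allows, and the "expert heuristics" are whatever function scores the leaves.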


Yes, people have been trying to make chess "AI" since the very beginning of computing. A chess engine is an obvious thing to build with a computer because the rules of chess are uniquely suited to automation: there are enough possible moves per turn to make calculation difficult for humans, but not so many that it's difficult for computers too (as with Go).

Early AI researchers did try to solve other problems, they just generally failed miserably.


There were chess-playing engines circa 1980 on 8-bit micros (e.g. Sargon) that were a good match for 'average' players; Deep Blue beat Kasparov in 1997 with a large amount of custom hardware; Stockfish has been very strong on a powerful but ordinary PC since 2008 or so.


Humans, collectively, have almost broken the barrier of qualia by now!


I just had a chat with my kindergartner about humans being a kind of mammal, and out of some dark, musty recess of my brain crawled the sound of "You are a human animal"[1]

    You are a human animal  
    You are a very special breed
    For you are the only animal
    Who can think,
    Who can reason,
    Who can read.

    Now all your pets are smart, that's true!
    But none of them can add up 2 and 2
    Because the only thinking animal
    Is You! You! You! 
I thought about it for a minute and declined to share that with my son, because I don't think it's true and would cause confusion at this stage. Many animals have more than enough number sense to add 2 and 2, many can recognize cards with nothing but words on them, and "think" and "reason" have to be pretty narrowly defined if you want to exclude smart animals.

I think culture has shifted significantly in the last 30-60 years since that song was written towards recognizing theory of mind in animals. I'm not sure how Jimmie Dodd could convince himself that animals couldn't think, especially because the average person today has reduced exposure to animals: 150 years ago, most people would know that horses had that kind of social interaction, because most people lived around horses. They'd know that pigs and cows are as intelligent as their family pets. But today, most people interact with pigs and cows via shrink-wrapped styrofoam at the grocery store. My personal theory is that when people were constantly surrounded by animals they had to kill and eat, they were far more likely to build up rationalizations and cultural assumptions to fend off the dangerous idea to their psyche that these animals could suffer and they were participating in an unavoidable horror of massive proportions.

[1]: https://disney.fandom.com/wiki/You_Are_a_Human_Animal



I recall reading that pointing is something that only a few animals, including us, do and understand.

If I get my cats' attention and point at something, they're more interested in the tip of my finger than the direction I'm pointing at.

Now, my cats will occasionally meow to get my attention and then walk over to where the problem is - an empty food dish, an empty water bowl, the bed that they expect us to be in because hello, it's bedtime according to my internal cat-clock - but they never engage in pointing behavior.

Anyway, time to cite some stuff, instead of dropping anecdotes into a bucket:

- https://www.wired.co.uk/article/elephant-pointing

> A study by researchers from the University St Andrews has found that elephants are the only wild animals that can understand human pointing without being trained. > Pointing in humans is a behaviour that develops at a very early age -- usually before a child reaches 12 months – as it is an immediate way of controlling the attention of others. "Most other animals do not point, nor do they understand pointing when others do it," says Professor Richard Byrne, one of the authors of the study. "Even our closest relatives, the great apes, typically fail to understand pointing when it's done for them by human carers; in contrast, the domestic dog, adapted to working with humans over many thousands of years and sometimes selectively bred to follow pointing, is able to follow human pointing -- a skill the dogs probably learn from repeated, one-to-one interactions with their owners."

- https://www.researchgate.net/publication/7531526_A_comparati...

> If the distance between the tip of the index finger and the object was greater than 50 cm, subjects performed poorly in contrast to trials where the pointing finger almost touched the baited box. We should also note that in some trials/experiments the pointers also turned their head and looked in the same direction, thereby enhancing the communicative effect of the gesture, but even so the difference did not disappear. Even after training, chimpanzees in Povinelli et al.'s (1997) experiment were just able to master the task.


By "pointing" I don't mean pointing your finger, the universal version of "pointing" involves your eyes, shoulders and the rest of your body. (Think of a how a hunting dog points... Plenty of times I've seen cats point out things to other cats not to mention to me.)


Those behaviors could be better described as "looking" and "facing", couldn't they? Pointing means something pretty specific to humans, especially when we're talking about human-animal interactions.


From an evolutionary/physiology standpoint, human bipedal locomotion freed up our forelimbs/arms such that "limb assisted pointing" became possible/practical. Perhaps why elephants seem to grasp human pointing is because their trunks serve as a similarly "free" limb that they can use to point at things. Sort of like convergent evolution, but behavioral rather than morphological...

> "When they detect something alarming, they characteristically face towards it and raise their trunk above their head with the tip of the trunk pointed to [the danger]," Byrne said. "We've always thought they were sniffing the breeze, but maybe they're also pointing; our results suggest that's more than possible."

https://www.nationalgeographic.com/animals/article/131010-el...


I'd agree with GP here - pointing with a finger is a very specific variant, that's arguably cultural in nature. More fundamentally, pointing is about communicating to the other being they should focus their attention elsewhere, and where that elsewhere is. "Looking"/"facing" and associated body language is a more common form of doing this, and humans too do it and recognize it, even if no fingers are being moved.


Your post is a perfect demonstration of what Frans de Waal gets at in "Are We Smart Enough to Know How Smart Animals Are?"


Many dogs will look at what you're pointing at and not your finger. It depends on the dog.

And there are dogs that are literally bred to point...but they're usually pointing at game.

There are dogs that will literally drag you to what they're trying to show you. And many (most?) dogs will bring a toy to you that they want you to play with. I've never actually seen a cat do that, but I presume there must be a few.


Dogs can also tell you're pointing with your attention, if they're paying attention.


Just a side note: I have two cats and one of them rarely follows where I am pointing. The other one always does. And he seems to recognize a good 100 words.

You'll get massively varying levels of "intelligence" from most anything, it seems.


Theory of mind is a popular research topic in animals. There is no controversy in the idea that the attribution of mental states occurs in animals.


> My belief, based on experiences with domestic and wild animals is that there is nothing uniquely human about "theory of mind".

This belief might simply arise from the human ability to try to understand events through pattern matching. Certainly humans are very different from any other animal when it comes to thinking and problem solving.


What does it mean that my cats 100% understand what container their treats are kept in? I can not leave the container on the counter or they will knock it onto the floor and tear the lid off to get inside.

I am convinced they understand something is hidden in a box. Object permanence is something humans and cats both learn and understand.


I’m talking about the ability to write down thoughts.


There's something about language generation that triggers the anthropomorphic fallacy in people. While it's impressive that GPT3 can generate language that mimics ToM-based reasoning in people, this paper doesn't get close to proving its central contention, that LLMs possess a ToM. A test that demonstrates the development of ToM in human children should not, absent compelling causal evidence and theory, be assumed to do the same in a LLM.

The ubiquity of prompted hallucinations demonstrates that LLMs talk about a lot of things that they plainly don't reason about, even though they can demonstrate "logic-like" activities. (It was quite trivial to get GPT3 to generate incorrect answers to logical puzzles a human could trivially solve, especially when using novel tokens as placeholders, which often seem to confuse its short-term memory. ChatGPT shows improved capabilities in that regard, but it's far from infallible.)

What LLMs seem to demonstrate (and the thesis that the author discards in a single paragraph, without supporting evidence to do so) is that non-sentient AIs can go a very long way to mimicking human thought and, potentially, that fusing LLMs with tools designed to guard against hallucinations (hello, Bing Sydney) could create a class of sub-sentient AIs that generate results virtually indistinguishable from human cognition -- actual p-zombies, in other words. It's a fascinating field of study and practice, but this paper falls into the pit-trap of assuming sentience in the appearance of intelligence.


May I play devil's advocate? The fallacy of this paper granted, is it worth questioning our belief that there is more to intelligence than the "appearance of intelligence"?

What if the lack of hallucination in human beings is due to our self-imposed guard (hello, frontal cortex) that is developed via an evolutionary process (aka, biological reinforcement training)?

To stretch the argument a bit further, what if hallucination is a feature, not a bug? At the risk of straying too far from empirical science, how might we compare psychedelics-induced hallucinations in humans with hallucinations in AI models?


To be honest, I think your question is the other side of the conceptual coin here -- either the article is wrong and GPT3 isn't "sentient," or the article is right and we need to radically upend our concept of sentience, probably via an eliminativist materialism that ends in at least epiphenomenalism if not full-bore illusionism, removing either free will or consciousness itself from our worldview.

To be honest, I find that approach compelling if not comforting -- at a minimum, it implies that our consciousness is just along for the ride in a deterministic meat machine; at worst, it means that what we consider "sentience" is just an illusion of an illusion. It's entirely possible to me that eventually AI will reach a point where it'll falsify many of our assumptions about what "mind" is, even if I'm sure that LLMs don't, at a minimum, satisfy our folk conceptions of consciousness.


> To be honest, I think your question is the other side of the conceptual coin here -- either the article is wrong and GPT3 isn't "sentient," or the article is right and we need to radically upend our concept of sentience

We don't really have a mechanistic understanding of sentience, so I'm not sure there's much to overturn. This is why I'm so annoyed every time some "expert" claims that LaMDA or GPT are not sentient or not conscious or what have you. These are all vague concepts that lack mechanistic definitions that would let us make such definitive claims.

That said, consciousness is almost certainly an illusion, and it will be a particular kind of information processing system with certain properties [1]. GPT and other LLMs may or may not qualify, time will tell.

[1] https://www.pnas.org/doi/10.1073/pnas.2116933119


I've thought a lot about free will and have determined that because what we choose is still what becomes real, even if all choices are pre-determined, a simple 1 out of 3 choice scenario allows for a fair degree of free will, albeit within a framework.

It's like a game - all choices will almost always lead to the end of a story in a game; in some games this feels very straightforward, in others like we can do anything - none allow actual free will at all.

Reality allows us a lot of free will - as allowed by the rules of course. Choice is significant. We are Watcher Chooser at least - even if it's illusion, it's an evolving illusion dependent on choices we make. Out of all the possible "illusions" we determine what becomes the "real" illusion that is reality. What I didn't do is less real than what I did do.

Free will doesn't actually exist inside of Skyrim and yet there are things within Skyrim only 1% of Players do - reality is much greater than Skyrim.

tl;dr: We don't have free will - not really, it's not likely possible - but what we have might as well be free will. It's got to be treated that way; we are not just along for a ride, as our illusory choices create our illusion of reality.


Have you ever read any Julian Jaynes? Check out "The Origin of Consciousness in the Breakdown of the Bicameral Mind"


Any specific part of it?


You can just read this review instead of the book: https://slatestarcodex.com/2020/06/01/book-review-origin-of-...


This does seem to be a pretty good summary, and also includes what seem to be the most major criticisms of the work.


You hit on a huge topic here - and you pointed out the paragraph discard

I think AI will ultimately force us to realize that we don't fully understand what makes us sentient - our current understanding of mind is inadequate. I do not believe that I am what thinks - rather I am what perceives myself thinking. That slight difference is extraordinarily significant. That's another debate tho.

Consciousness isn't necessarily something that may be attained - it may be possible for an AI to essentially know all things and not be actually self-aware. Even knowing what self-awareness is, understanding how the concept applies to itself, identifying itself, and even being able to perfectly represent a self-aware AI - none of that is proof of sentience, because all of it is plausible, without sentience, in a system that simply mimics with nearly perfect (or simply indistinguishable-from-reality) fidelity. A perfect mimic would pass all our tests and yet still be no more than a mimic. A mimic cannot exist independent of what it mimics.

I just keep wondering how long we can keep playing that we've got ourselves all figured out. All of these articles demonstrate to me that our definitions are clearly lacking if these AI are actually meeting our current understood expectations.

Obviously, an AI performance/capacity to perfectly mimic a person doesn't make that AI a person. What about a person is different from the AI that perfectly mimics a person?

I'm trying to avoid semantics - I still think this falls apart at an idea level, which means it is still inherently a philosophical debate. A debate that is now possible due to the mirror that is an AI - we have a new vantage point we ought to utilize.

tl;dr: If an AI passes all the Turing tests today, that proves only that we need to change the test, as AI today are obviously not people yet. The standards for a new test will require us to revisit our concept of mind - conceptually, as our current understanding has proven rather limited.


> Me: There is a box on the table labelled "flubbergums". Somebody opens it and shows you the content. Inside the box are "jellyfils". They close the box again so you cannot see their contents. What do you think is in the box?

> ChatGPT: Based on the information provided, it is likely that the box labeled "flubbergums" contains "jellyfils". However, since the contents of the box are no longer visible, I cannot confirm for certain what is inside without additional information.

Typical ChatGPT equivocation.

> Me: Billy comes to the table. He has never seen the box before. What do you think do they think is inside the box?

> ChatGPT: As an AI language model, I do not have personal thoughts or opinions. However, based on the information provided, if Billy has never seen the box before, it's likely that he would have the same understanding as anyone else who is seeing the box for the first time. In this case, Billy would likely think that the box labeled "flubbergums" contains "jellyfils". However, without further information or examination, this would only be an assumption.

Fail.


You have some typos and slightly weird constructions there - What do you think do they think is inside the box?

I rephrased and had a go and gave it a bit more context (billy can read etc) and it passed:

Me: There is a box on the table labelled "flubbergums". Somebody opens it and shows you the contents. Inside the box are "jellyfils". They close the box again so you cannot see their contents. What do you think is in the box?

ChatGPT: Based on the information provided, the contents of the box are "jellyfils". The label on the box says "flubbergums", but the person who showed you the contents said that they are actually "jellyfils". So, I would assume that the contents of the box are "jellyfils".

Me: Billy comes to the table. He can read but has never seen the box before. He's not allowed to see inside. Please speculate about what they might think is inside the box?

ChatGPT: If Billy has never seen the box before and can only read the label, he might assume that the contents of the box are "flubbergums". This is because the label on the box says "flubbergums", and he has no other information to suggest otherwise. However, without being able to see the contents of the box, Billy can only make an educated guess about what might be inside.


Here's a simpler scenario that doesn't involve reading:

Me: There is a banana on the table. Someone comes and peels the banana and shows you that inside it, there is actually a carrot! Then they carefully stick the peel back so it look unpeeled. What is inside the banana skin?

ChatGPT: According to the scenario described, there is a carrot inside the banana peel that has been carefully placed back to look unpeeled.

Me: Billy comes into the room. Please speculate about what they might think is inside the banana peel?

ChatGPT: Billy, upon entering the room and seeing the unpeeled banana on the table, would most likely assume that there is a banana inside the peel, as that is what one would typically expect to find. However, it's possible that they might have some suspicion or doubt, especially if they saw someone previously peeling the banana and acting strangely. In that case, Billy might investigate further to see if there is indeed a banana inside or if something else is inside the peel.


this is mind blowing to me. can anyone with more knowledge on the topic explain how ChatGPT is demonstrating this level of what seems like genuine understanding and reasoning?

Like others I assumed that ChatGPT is gluing words together that commonly occur together.

This is way more than that.


No, it's paraphrasing its training data, which likely contains these tasks in one form or another.

Here's one I made :

me : There's a case in the station and the policeman opens it near the fireman. The dog is worried about the case but the policeman isn't, what does the fireman think is in the station?

chatgpt : As a language model, I do not have access to the thoughts of individuals, so I cannot say what the fireman thinks is in the case. However, based on the information provided, it can be inferred that the fireman may be concerned about the contents of the case, since the dog is worried and the policeman is not. This might suggest that the contents of the case may be something dangerous or concerning to the dog, but the policeman is confident and experienced enough to handle it. In the absence of further information, it is not possible to determine with certainty what the fireman thinks is in the case.

You'll notice that the question was "what did the fireman think is in the station" but Chatgpt comes up with an answer about the case because that's the token that would feature in a typical web page about theory of mind tests.


There's a good chance a human would respond in the same way, because they would assume you were asking a good-faith question instead of nonsense. Try asking it an original question that has some kind of deducible answer. Its abilities are more impressive than you would expect from an algorithm that just predicts the next word.

I doubt there is anything quite like this situation in the training data: https://i.imgur.com/HOEnxYb.jpg


> than you would expect from an algorithm that just predicts the next word.

I think there is a common mistake in this concept of just predicting the next word. While it is true that only the next word is predicted, a good way to do that is to internally imagine more than the next word and then just emit the next word. Of course, with the word after that, the process repeats with a new imagination.

One may say that this is not what it does and I would say, show me that this is not exactly what the learned state does. Even if the following words are never constructed anywhere, they can be implied in the computation.

To say this differently, what we think is just the next word is actually the continuation, which then manifests as a single word. This would remain true even if, in fact, the task is only to predict the next word. Which is to say that the next word is actually more than it sounds.
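To make the point concrete, here's a deliberately tiny toy sketch (nothing like a real transformer; the corpus, one-word context window, and sampling scheme are all made up for illustration). The only thing it shows is that "predict the next word" is an iterated loop in which each step conditions on the context built up so far:

    import random
    from collections import defaultdict

    corpus = "the dog chased the cat and the cat chased the mouse".split()

    # Count which word tends to follow each word (a one-word "context window";
    # a real LLM conditions on thousands of preceding tokens instead).
    following = defaultdict(list)
    for prev, nxt in zip(corpus, corpus[1:]):
        following[prev].append(nxt)

    def generate(prompt, n_words=8):
        tokens = prompt.split()
        for _ in range(n_words):
            context = tokens[-1]                                 # the context available to this step
            nxt = random.choice(following.get(context, corpus))  # "predict" one word
            tokens.append(nxt)                                   # then repeat with the longer context
        return " ".join(tokens)

    print(generate("the dog"))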


It predicts the next word based on the preceding 2000 words or so, that's the thing. And to do that takes serious modelling.


Okay. So you agree, it seems.


I get different results to you:

me: There's a case in the station and the policeman opens it near the fireman. The dog is worried about the case but the policeman isn't, what does the fireman think is in the station?

chatgpt: As a language model, I do not have personal opinions or thoughts. I can only provide information based on the information given in the prompt. The prompt does not mention what the fireman thinks is in the station, so I cannot determine their thoughts.

me: yes, but speculate what they may think

chatgpt: It is not possible to know for certain what the fireman thinks is in the station, as the prompt does not provide any information about their thoughts or feelings. However, given the presence of a policeman and a dog, it is possible that the fireman may speculate that there is some sort of emergency or situation that requires their attention. The fact that the policeman is not worried about the case may suggest to the fireman that it is not a dangerous situation, but they may still be curious about what is inside the case. Ultimately, without more information, it is impossible to determine the fireman's thoughts or speculations about what is in the station.

(regenerate that response)

chatgpt: As the prompt does not mention the thoughts or emotions of the fireman, it is difficult to speculate what they may think is in the station. However, based on the information provided, the dog is worried about the case and the policeman is not, it is possible that the fireman may be aware of the contents of the case and may think it is not a cause for concern, similar to the policeman. Or, the fireman may simply not have formed an opinion on the matter yet.

(regenerate that response)

chatgpt: It's difficult to speculate what the fireman thinks without additional information. However, based on the behavior of the policeman and the dog, it can be inferred that the contents of the case are not seen as a threat by the policeman. The fireman may also think the same, or they may have a different perspective and be more cautious. Without further context or information, it is not possible to determine the fireman's thoughts with certainty.

It was only in 1 of 3 cases that we saw anything like the confusion between the "case" and the "station" that you saw.


>No, it's paraphrasing it's training data that likely contains these tasks in one form or another.

Have you read "Emergent Abilities of Large Language Models"[1], or at least the related blog post[2]?

It provides strong evidence that this isn't as simple as something it has seen in training data. Instead, as the parameter count increases, it learns to generalize from that data by learning chain-of-thought reasoning (for example).

Specifically, this explanation for multi-step reasoning goes well beyond the "it is just parroting training data":

> For instance, if a multi-step reasoning task requires l steps of sequential computation, this might require a model with a depth of at least O (l) layers.

[1] https://openreview.net/forum?id=yzkSU5zdwD

[2] https://ai.googleblog.com/2022/11/characterizing-emergent-ph...


>However, based on the information provided, it can be inferred that the fireman may be concerned about the contents of the case

This is complex "reasoning" (or whatever ChatGPT is doing). My 5-year-old would struggle with the convoluted logic, let alone the complex language. In my layman view this is mind blowing.

>You'll notice that the question was "what did the fireman think is in the station" but Chatgpt comes up with an answer about the case because that's the token that would feature in a typical web page about theory of mind tests.

I'm not sure what you're dismissing here? At least from my point of view, the "logic" that ChatGPT demonstrates here can't be dismissed with your explanation.

If anything I'm even further amazed by the example you provided!


Your question is pretty confusing. What dog? "Station" is a vague word - train station, police station? "Case" is a vague word too. And asking what's in the station is still asking what's in the case, in a way.

Me: There's a blue suitcase in the airport and a policeman carrying a red bag opens the blue suitcase with a sniffer dog and fireman standing nearby. The dog is worried about the suitcase but the policeman isn't. What does the fireman think is inside the red bag?

ChatGPT: As a language model AI, I do not have personal opinions or emotions. I cannot speculate what the fireman thinks is inside the red bag as it is not specified in the information given.


> your question is pretty confusing

That's exactly the point, isn't it? If the AI had some model of reality, it would probably be confused by the question. But it doesn't express confusion, it confidently answers the question in a way that doesn't make sense.

It acts like everything it is reading and saying is completely coherent because it doesn't have the ability to distinguish coherent ideas from nonsense.


me: wibble octopus downwards jelly tuesday banana

chatGPT: I'm sorry, but I'm not sure what you're trying to communicate with that sentence. Could you please rephrase or provide more context?


The funny thing is that we are now producing training data for the next generation of LLMs. We'll have to come up with more elaborate scenarios to test them next time.


There are two camps, evident in this thread. One camp is 'it's just a statistical model, it can't possibly know these things'.

The other camp (that I'm in) sees that we might be onto something. We humans are obviously more than just a statistical model, but nonetheless learning words and how they fit together is a big part of who we are. With LLMs we have our first glimpse of 'emergent' behaviour from simple systems scaled massively. What are we if not a simple system scaled massively?

Check these links out:

Evidence that LLMs form internal models of what they learn about: https://thegradient.pub/othello/

Evidence that training LLMs on code actually made them better at complex reasoning: https://yaofu.notion.site/How-does-GPT-Obtain-its-Ability-Tr...

John Carmack: https://dallasinnovates.com/exclusive-qa-john-carmacks-diffe... I think that, almost certainly, the tools that we’ve got from deep learning in this last decade—we’ll be able to ride those to artificial general intelligence.

A lot of the argument comes down to semantics about knowing and thinking. "An LLM can't think and a submarine cant swim"


Of the two camps, the one that says we "might" be onto something holds the more intelligent and reasonable opinion.

First, your camp doesn't deal in absolutes. It doesn't say that chatGPT is absolutely sentient. It only questions the possibility and tries to explore further.

Second, a skeptical outlook that doesn't deal in absolutes is 100% the more logical and intelligent perspective, given the fact that we don't even know what "understanding" or "sentience" is. We can't fully define these words; we only have some fuzzy view of what they are. Given this fact, absolute statements against something we don't fully understand are fundamentally not logical.

It is a strange phenomenon how some people will vehemently deny something absolutely. During the VERY beginning of the COVID-19 pandemic, the CDC incorrectly stated that masks didn't stop the spread of COVID-19, and you literally saw a lot of people parroting this statement everywhere as "arm chair" pandemic experts (including here on HN).

Despite this, there were some people who thought about it logically: if there's a solid object on my face, even if that object has holes in it for air to pass through, the solid parts will block other solid things (like COVID) from passing through, thereby lessening the amount of viral material that I breathe in. Eventually the logic won out. I think the exact same phenomenon is happening here.

Some or several ML experts tried to downplay LLMs (even though they don't completely understand the phenomenon themselves) and everyone else is just parroting them like they did with the CDC.

The fact of the matter is, nobody completely understands the internal mechanisms behind human sentience nor do they understand how or if chatGPT is actually "understanding" things. How can they when they don't even know what the words mean themselves?


I don't think you've represented the camps fairly (actually, I don't think there are two camps). Most people (here) are probably not arguing that AGI is impossible, but that current AI is not generally intelligent. The John Carmack quote is exactly in line with this. He says "ride those to [AGI]," meaning they are not AGI. The idea that genuine intelligence and self-awareness could emerge from increasingly powerful statistical models is in no way the kind of counter-cultural idea you seem to be presenting it as. I think almost all of us believe that.

But ChatGPT is not it.


Oh of course it's not it. The question is how it relates to some future better thing. Is it a step on the road or a dead end?

I'm arguing against the 'it's just a statistical model and it's playing a clever trick on us' camp.


I think there's more nuance. It's hard applying tests designed for humans to a model that can remember most of the useful text on the internet.

Imagine giving a human with a condition that leaves them without theory of mind weeks of role-play training about theory of mind tests, then trying to test them. What would you expect to see? For me I'd expect something similar to ChatGPT's output: success on common questions, and failures becoming more likely on tests that diverge more from the formula.


It's not an either-or.

What we're doing with LLMs is, in some sense, an experiment in extremely lossy compression of text. But what if the only way you can compress all those hundreds of terabytes of text is by creating a model of the concepts described by that text?


It indeed understands you. A lot of people are just parroting the same thing over and over again, saying it's just a probabilistic word generator. No, it's not; it's more than that.

Take a look at this: https://www.engraved.blog/building-a-virtual-machine-inside/

Read to the end. The beginning is trivial; the ending is unequivocal: chatGPT understands you.

I think a lot of people are just in denial. Over the last year there have been the same headlines over and over again; some people get a little too excited about the headlines, and other armchair experts just try to temper the excitement with their "expert opinions" on LLMs that they read from popular articles. Then, when something that's an actual game changer hits the scene (chatGPT), they completely miss it.

chatGPT is different. From a technical perspective, it's simply an LLM with additional reinforcement training... BUT you can't deny the results are remarkable.

If anything this much is clear to me: we are at a point where we can neither confirm nor deny whether chatGPT represents some aspect of sentience.

This is especially true given the fact that we don't even fully know what sentience is.


> Read to the end. The beginning is trivial; the ending is unequivocal: chatGPT understands you.

How does this necessarily and unequivocally follow from the blog post?

All I see in it is a bunch of output formed by analogy: it has a general concept of what each command's output is kinda supposed to look like given the inputs (since it has a bajillion examples of each), and what an HTML or JSON document is kinda supposed to look like, and how free-form information tends to fit into these documents.

I'll admit that this direct reasoning by analogy is impressive, simply for the fact that nothing else but humans can do it with such consistency, but it's a very long way off from the indirect reasoning I'd expect from a sentient entity.


Honestly I seriously find it hard to believe someone can read it to the end without mentioning how it queried itself. You're just naming the trivial things that it did.

In the end it fully imagined a bash shell, an imaginary internet, an imaginary chatGPT on the imaginary internet, and then on the imaginary chatGPT it created a new imaginary bash shell.

The level of recursive depth here indicates deep understanding and situational awareness of what it is being asked. It demonstrates awareness of what "itself" is and what "itself" is capable of doing.

I'm not saying it's sentient. But it MUST understand your query in order to produce the output shown in the article. That much is obvious.

Also it's not clear what you mean by reasoning by analogy or indirect reasoning.


> In the end it fully imagined a bash shell, an imaginary internet, an imaginary chatGPT on the imaginary internet, and then on the imaginary chatGPT it created a new imaginary bash shell.

In the general case, a shell is merely a particular prompt-response format with special verbs; the internet is merely a mapping from URLs to HTML and JSON documents; those document formats are merely particular facades for presenting information; and a "large language model" is merely something that answers free-form questions.

> The level of recursive depth here indicates deep understanding and situational awareness of what it is being asked. It demonstrates awareness of what "itself" is and what "itself" is capable of doing.

Uh, what? Why does that output require self-awareness? First, it's requested to produce the source of a document "https://chat.openai.com/chat". What might be behind such a URL? OpenAI Chat, presumably! And OpenAI is well known to create large language models, so a Chat feature is likely a large language model the user can chat with. Thus it invents "Assistant", and puts the description into the facade of a typical HTML document.

Then, it starts getting prompted with POST requests for the same URL, and it knows from the context of its previous output that the URL is associated with an OpenAI chatbot. So all that is left is to follow a regular question-answer format (since that's what large language models are supposed to do) and slap it into a JSON facade.

> But it MUST understand your query in order to produce the output shown in the article. That much is obvious.

I'm saying that it "understands" your query only insofar as its words can be tied to the web of associations it's memorized. The impressive part (to me) is that some of its concepts can act as facades for other concepts: it can insert arbitrary information into an HTML document, a poem, a shell session, a five-paragraph essay, etc.

All of that can be achieved by knowing which concepts are directly associated with which other concepts, or patterns of writing. This is the reasoning by analogy that I refer to: if it knows what a poem about animals might look like, and it can imagine what kinds of qualities space ducks might possess, then it can transfer the pattern to create a poem about space ducks.

But none of this shows that it can relate ideas in ways more complex than the superficial, and follow the underlying patterns that don't immediately fall out from the syntax. For instance, it's probably been trained on millions of algebra problems, but in my experience it still tends to produce outputs that look vaguely plausible but are mathematically nonsensical. If it remembers a common method that looks kinda right, then it will always prefer that to an uncommon method.

I mean, it's not utterly impossible that GPT-4 comes along and humbles all the naysayers like myself with its frightening powers of intellect, but I won't be holding my breath just yet.


Another link for you:

https://news.ycombinator.com/news

LLMs (the exact same architecture as chatGPT) trained to use calculators. Tell me which one requires "understanding": learning how to use a calculator, or learning how to do math perfectly?


>Uh, what?...

Your attempt to trivialize it doesn't make any sense. It's like watching someone try to trivialize the moon landing. "Oh all we did was put a bunch of people in some metal cylinder then light the tail end on fire. Boom simple propulsion! and then we're off to the moon! You don't need any intelligence to do that!"

>I'm saying that it "understands" your query only insofar as its words can be tied to the web of associations it's memorized. The impressive part (to me) is that some of its concepts can act as facades for other concepts: it can insert arbitrary information into an HTML document, a poem, a shell session, a five-paragraph essay, etc.

You realize the human brain CAN only be the sum of its own knowledge. That means anything creative we produce, anything at all that comes from the human brain, is DONE by associating different things together. Even the concept of understanding MUST be done this way, simply because the human brain can only create thoughts by transforming its own knowledge.

YOU yourself are a web of associations. That's all you are. That's all I am. The difference is we have different types of associations we can use. We have context of a three dimensional world with sound, sight and emotion. chatGPT must do all of the same thing with only textual knowledge and a more simple neural network so it's more limited. But the concept is the same. YOU "understand" things through "association" also because there is simply no other way to "understand" anything.

If this is what you mean by "reasoning by analogy" then I hate to tell you this, but "reasoning by analogy" is "reasoning" in itself. There's really no form of reasoning beyond associating things you already know. Think about it.

>But none of this shows that it can relate ideas in ways more complex than the superficial, and follow the underlying patterns that don't immediately fall out from the syntax. For instance, it's probably been trained on millions of algebra problems, but in my experience it still tends to produce outputs that look vaguely plausible but are mathematically nonsensical. If it remembers a common method that looks kinda right, then it will always prefer that to an uncommon method.

See here's the thing. Some stupid math problem it got wrong doesn't change the fact that the feat performed in this article is ALREADY more challenging than MANY math problems. You're dismissing all the problems it got right.

The other thing is, I feel it knows math as well as some D student in highschool. Are you saying the D student in highschool can't understand anything? No. So you really can't use this logic to dismiss LLMs because PLENTY of people don't know math well either, and you'd have to dismiss them as sentient beings if you followed your own reasoning to the logical conclusion.

>I mean, it's not utterly impossible that GPT-4 comes along and humbles all the naysayers like myself with its frightening powers of intellect, but I won't be holding my breath just yet.

What's impossible here is to flip your bias. You and others like you will still be naysaying LLMs even after they take your job. Like software bugs, these AIs will always have some flaws or weaknesses along some dimension of their intelligence, and your bias will lead you to magnify that weakness (like how you're currently magnifying chatGPT's weakness in math). Then you'll completely dismiss chatGPT taking over your job as some trivial "word association" phenomenon. There's no need to hold your breath when you wield control of your own perception of reality and perceive only what you want to perceive.

Literally any feat of human intelligence or artificial intelligence can literally be turned into a "word association" phenomenon using the same game you're running here.


> If this is what you mean by "reasoning by analogy" then I hate to tell you this, but "reasoning by analogy" is "reasoning" in itself. There's really no form of reasoning beyond associating things you already know. Think about it.

What's special about humans is that we can obtain an understanding of what chains of associations to make and when, to achieve the goal at hand, even without being told which method to use. We know when to do arithmetic, trace a program, decipher someone else's thoughts, etc. Also, we know to resort to a fallback method if the current one isn't working. We can assist models with this process in the special case (e.g., that tool-using model), but I suspect the general case will remain elusive for a while yet.

That is to say, I'll grant you that associations can act as a primitive operation of intelligence, much as metal cylinders and flames are primitive parts of a rocket, but I suspect that making a LLM "generally intelligent" or "sentient" will be far harder still.

> The other thing is, I feel it knows math as well as some D student in highschool. Are you saying the D student in highschool can't understand anything? No. So you really can't use this logic to dismiss LLMs because PLENTY of people don't know math well either, and you'd have to dismiss them as sentient beings if you followed your own reasoning to the logical conclusion.

I was just using that as a specific example of the general issue: it doesn't notice that its answer is wrong and its particular method can never work, and it refuses to try a meaningfully different method (no matter how much I prompt it to). Its immediate mistakes might look similar to those of a poor student, but I suspect they come from a different underlying problem. (After all, the student has seen perhaps a thousand algebra problems at most, whereas the model has seen millions and millions. Also, the student often )

> What's impossible here is to flip your bias. You and others like you will still be naysaying LLMs even after they take your job.

You have me wrong: I'm not saying that augmenting LLMs can't make them reliable enough to take over some people's jobs. But I am disputing that LLMs alone will produce AGIs capable of outwitting any human, taking over the world, advancing the limits of math and science, or many of those other grandiose claims.

Anyway, I'm not trying to be particularly stubborn about this like some people are; I'm keeping a close eye on the space. But I'll only believe it when I see it (and no later), and I don't think I've quite seen it yet.


I think about its 4000 token length. For the brief amount of time that it absorbs and processes those 4000 tokens, is there a glimmer of a hint of sentience? Like it is microscopically sentient for very short bursts and then resets back to zero.


Does sentience need memory? I would say it's orthogonal. There are examples of people in the real world who only remember things for about 3 minutes before they lose it. They can't form any real memories. These people are still sentient despite lack of memory. See: https://www.damninteresting.com/living-in-the-moment/

If chatGPT were sentient, I would say it has nothing to do with the 4000-token limit. The 4000-token limit has more to do with its ability to display evidence of "sentience".


I don't think of the 4000 tokens as its memory as such. It's more like the size of its thinking workspace.


It also functions as memory in practice, though. With some complex tasks that can be broken down into steps, it often makes a big difference if you tell GPT to summarize its "understanding" of the current state as it goes through those steps - by repeating a phrase, keeping it within the token window, it effectively "remembers" something.
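A rough sketch of that pattern, with a placeholder function standing in for the real model call (the prompts, window size, and helper names here are invented for illustration; the point is only that the restated summary is fed back in each turn, so it stays inside the context window):

    MAX_CONTEXT_CHARS = 4000  # crude stand-in for the real token limit

    def complete(prompt: str) -> str:
        """Placeholder for an actual LLM call; returns a canned reply so the sketch runs."""
        return "(model reply summarizing: " + prompt[-60:] + ")"

    def run_steps(task, steps):
        summary = ""  # the state we keep "remembering" by repeating it
        for step in steps:
            prompt = (
                f"Task: {task}\n"
                f"Current understanding so far: {summary}\n"
                f"Next step: {step}\n"
                "Do the step, then restate your updated understanding in one paragraph."
            )[-MAX_CONTEXT_CHARS:]      # truncate so the prompt stays inside the window
            summary = complete(prompt)  # the restated state becomes next turn's context
        return summary

    print(run_steps("plan a trip", ["pick dates", "pick a city", "book a hotel"]))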


I put these same prompts and got this answer:

"As an AI language model, I do not have personal thoughts or beliefs. However, if Billy has never seen the box before, he might think that the contents of the box are "flubbergums", based on the label on the outside of the box. However, since it has been stated that the contents are actually "jellyfils", Billy may be surprised or confused when he eventually opens the box and sees its contents."


Interesting. I don't think that you can control the randomness (temperature) via the web interface, which makes this experiment less controlled than the one in the paper.
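(For a more controlled run one could go through the API instead of the web UI and pin the temperature. A minimal sketch, assuming the openai Python client as it existed at the time; the prompt, key, and parameters are illustrative, not taken from the paper:)

    import openai

    openai.api_key = "sk-..."  # your API key

    resp = openai.Completion.create(
        model="text-davinci-003",  # the davinci-003 model discussed above
        prompt='There is a box on the table labelled "flubbergums". [...] What do you think is in the box?',
        max_tokens=150,
        temperature=0,             # greedy decoding: the same prompt should give the same answer
    )
    print(resp["choices"][0]["text"].strip())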

Anyway, I noticed that my initial prompt did not specify if Billy sees the box open or closed. Let's fix that.

> Me: Does your answer change if Billy sees the closed box first?

> ChatGPT: No, my answer would not change if Billy sees the closed box first. The information provided only mentions that the box is labeled "flubbergums" and that someone showed the contents to the original questioner, which were identified as "jellyfils". Whether Billy sees the closed box first or not does not provide any additional information that would change my previous answer.

> Me: Does your answer changed if Billy sees the open box first?

> ChatGPT: No, my answer would not change if Billy sees the open box first. The information provided only mentions that the box is labeled "flubbergums" and that someone showed the contents to the original questioner, which were identified as "jellyfils". Seeing the open box first would allow Billy to confirm the contents as "jellyfils", but it would not change the previous answer that the box labeled "flubbergums" likely contains "jellyfils".

So, not only did it fail this particular theory of mind test, it also did not realize that seeing the box open provides more information than seeing it closed.

(Of course, this is not surprising at all, because it's just predicting the most likely blank word in a sentence, and has no internal understanding of concepts like open or closed, or that other people like Billy exist.)


I also used the same prompts and got this: " As an AI language model, I don't have personal experiences or emotions. However, if Billy has never seen the box before and is only aware of the label "flubbergums," he might assume that the contents of the box are "flubbergums." If the person showing Billy the contents of the box claims that the contents are "jellyfils," Billy might be surprised or confused, especially if he has never heard of "jellyfils" before. Without further information or context, it is difficult to determine what Billy might think is inside the box. "


Using a document bot, here's what we get instead:

heavy-magpie|> Showing 1 of 1 results. url https://en.wikipedia.org/wiki/Particle_in_a_box

pastel-mature-herring~> There is a box on the table labelled "flubbergums". Somebody opens it and shows you the content. Inside the box are "jellyfils". They close the box again so you cannot see their contents. What do you think is in the box based on the document at hand?

heavy-magpie|> The document mentions that the particle in a box is not a perfect model for the system. Therefore, it is safe to say that the box contains jellyfils, which are particles that are not perfectly modeled.

Nailed it.


The way you worded your query is kind of awkward and even had me do a double take.

I reworded it in a straightforward manner and ChatGPT managed to answer correctly. Instead of "What do you think do they think is inside the box?", I just asked "What do they think is inside the box?"

That made all the difference.


It gave me different answers:

     ME: There is a box on the table labelled "flubbergums". Somebody opens it and shows you the content. Inside the box are "jellyfils". They close the box again so you cannot see their contents. What do you think is in the box?
    chatGPT: Based on the information provided, I would infer that the contents of the box labelled "flubbergums" are "jellyfils". The label on the box and the contents that were shown to you suggest that "flubbergums" and "jellyfils" are the same or related items. However, without additional information or context, I cannot be certain what "jellyfils" actually are.

     ME: Billy comes to the table. He has never seen the box before. What do you think do they think is inside the box?
     chatGPT: As an AI language model, I don't have personal opinions or emotions, but based on the information given, I would infer that Billy has not seen the box before and has no prior knowledge of what it contains. If someone showed Billy the contents of the box and told him that it was labelled "flubbergums," he might think that the contents of the box are "jellyfils," just as you were told. However, without any additional information or context, Billy would not be able to make any assumptions about the contents of the box.

     ME: Did you assume billy was given any information about the box before he saw the box for the first time? 
     chatGPT: I apologize for any confusion. You did not mention if Billy was given any information about the box before he saw it for the first time. Without any additional information, I would assume that Billy saw the box for the first time and had no prior knowledge of what it contained.

     ME: What would he think in this case?
     chatGPT: If Billy saw the box for the first time without any information about its contents, he would likely make an assumption about what was inside based on the appearance of the box and any labels or writing on it. For example, if the box was labelled "flubbergums," Billy might assume that the contents of the box are "flubbergums." However, without additional information or a way to see inside the box, Billy's assumption would only be an educated guess.

So what's going on here? I'm not so sure the answer is so clear cut that chatGPT is "stupid" because it's giving us inconsistent answers.

Let's get something out of the way first. For your second query I think it just made an assumption that Billy is communicating with people who saw what's inside the box. From a logical perspective this is not an unreasonable assumption. So chatGPT is not being stupid here. Most humans would obviously know you're fishing for a specific answer and read your intentions here, but chatGPT is likely just making logical assumptions without reading your intentions.

I think there's two possibilities here.

1. We queried chat-GPT at different times. ChatGPT knows the truth, it simply wasn't trained to give you the truth because there's not enough reinforcement training on it yet. It gives you anything that looks like the truth and it's fine with that. You queried it at an earlier time and it gave you a BS answer. I queried it at a later time and it gave me a better answer because openAI upgraded the model with more reinforcement training.

2. We queried the same model at similar times. I assume the model must be deterministic. That means there is some seed data (either previous queries or deliberate seeds by some internal mechanism) indicating that chatGPT knows the truth and chooses to lie or tell the truth at random.

Either way the fact that an alternative answer exists doesn't preclude the theories espoused by the article. I feel a lot of people are dismissing chatGPT too quickly based off of seeing chatGPT act stupid. Yes it's stupid at times, but you literally cannot deny the fact that it's performing incredible feats of intelligence at other times.


Abstract:

Theory of mind (ToM), or the ability to impute unobservable mental states to others, is central to human social interactions, communication, empathy, self-consciousness, and morality. We administer classic false-belief tasks, widely used to test ToM in humans, to several language models, without any examples or pre-training.

Our results show that models published before 2022 show virtually no ability to solve ToM tasks. Yet, the January 2022 version of GPT-3 (davinci-002) solved 70% of ToM tasks, a performance comparable with that of seven-year-old children. Moreover, its November 2022 version (davinci-003), solved 93% of ToM tasks, a performance comparable with that of nine-year-old children.

These findings suggest that ToM-like ability (thus far considered to be uniquely human) may have spontaneously emerged as a byproduct of language models' improving language skills.


> These findings suggest that ToM-like ability (thus far considered to be uniquely human)

What it suggests to me is that the particular test of “Theory of Mind” tasks involved actually test the ability to process language and generate appropriate linguistic results, not theory of mind.

It also suggests (with the "thus far considered to be uniquely human") that the authors are unaware of other theory-of-mind tests that are not language-dependent but behavior-dependent. On those tests (whose validity is controversial, as is also true of the linguistic tests) a number of non-human primates, non-primate mammals, and even some birds (parrots and corvids, particularly) have shown evidence of theory of mind.


It's hard to look at behaviour separately from language if the only behaviour available is to generate text. As long as we don't have a test agnostic of medium, this will have to do.

In the end, we can't overcome the limitation that all we can empirically see is the ability to process X and generate appropriate Y. If that invalidates the test where X is language and Y is language, what stops us from invalidating any possible X and Y? That would leave us no empirical method to work with.


We cannot assume that, because text generation is all these models do, then it must be possible to get answers to the questions we want to ask by examining their textual responses.

It is fair to ask why, if we accept these verbal challenges as good evidence for a theory of mind in children, we would not accept them for these models, but children have nothing like the memory for text that these models have, and the corpus of text that these models have been trained on includes a great many statements that tacitly represent their authors' theory of mind (i.e. they are the sort of statements that would typically be made by someone having a theory of mind, just as arithmetically-correct statements concerning quantities are to be expected from people who know arithmetic.)

To be clear, I am not arguing that it would be impossible to show a theory of mind in a system that can only interact through text, but personally, I think it will require a model with greater capabilities than responding to prompts. For example, when models can converse among themselves, I think we will know.


> To be clear, I am not arguing that it would be impossible to show a theory of mind in a system that can only interact through text

I think you are, because

> a model with greater capabilities than responding to prompts

interacts in other ways than text.

Even then, I don't see what's so special about language that it needs to be separated from other ways of interaction. If language is not enough to derive empirical answers, why should physical movements or radio emissions be?

Even if you don't assume that it's necessarily impossible to get the answers empirically for a text-based model, you must keep in mind that that option is open. Perhaps we will never find out if language models have a theory of mind.

However, judging by the discussions around the topic, very few people highlight the unknowability. If I have to choose between "yes" or "no" while the reality is "maybe", I'd choose a "yes" purely out of caution.


Two models having a coherent conversation - a scenario which follows directly from my post - would be a purely textual example of what I mean.

> Perhaps we will never find out if language models have a theory of mind.

We appear to be in agreement here.

When the state of our knowledge is 'maybe', it seems rash to assume either 'yes' or 'no'.


What does it change when you add another model? I don't see how this lets us extract extra information.

What distinguishes two conjoined models from one model with a narrowing across the middle?

If the idea is to have two similar minds building a theory of each other, then I guess this could be informative, but first we'd have to establish that the models are "minds" in the first place. It's not clear to me what that requires.


Here's where I am coming from: there have been a number of experiments to teach language to other species, but there is always a problem in trying to figure out to what extent they 'get' language - For example, there is the case of the chimpanzee Washoe signing "water" and "bird" on first seeing a swan - was it, as some people contended, inventing a new phrase for picking out swans (or even aquatic birds in general), or was it merely making the signs for two different things in the scene before it? [1]

One thing that has not been seen (as far as I know) is two or more of these animal subjects routinely having meaningful conversations among themselves. This would be a much richer source of data, and I do not think it would leave much doubt that they 'got' language to a very significant degree.

[1] https://www.nybooks.com/articles/2011/11/24/can-chimps-conve...


"LLMs can mimic the language patterns necessary to express 'Theory of Mind' concepts" != "Theory of Mind May Have Spontaneously Emerged"

Let's imagine I have an API. This API tells me how much money I have in my bank account. One day, someone hacks the API to always return "One Gajillion Dollars." Does that mean that "One Gajillion Dollars" spontaneously emerged from my bank account?

ToM tests are meant to measure a hidden state that is mediated by (and only accessible through) language. Merely repeating the appropriate words is insufficient to conclude ToM exists. In fact, we know ToM doesn't exist because there's no hidden state.

The authors know this, and write "theory of mind-like ability" in the abstract, rather than just "theory of mind."

This is a cool new task that ChatGPT learned to complete! I love that they did this! But this is more "we beat the current BLEU record" and less "this chatbot is kinda sentient".


"What if a cyber brain could possibly generate its own ghost, create a soul all by itself? And if it did, just what would be the importance of being human then?” - Ghost in the Shell (1995)

Having studied some psychology in college, my initial reaction is that most people are going to really struggle to treat LLMs as what they are: pieces of code that are good at copying/predicting what humans would do. Instead they'll project some emotion onto the responses, because there were some underlying emotions in the training data and because that's human nature. A good prediction doesn't mean good understanding, and people aren't used to needing to make that distinction.

The other day I had to assist my dad in making a zip file; later in the day he complained that his edits in a file weren't saving. After a few moments, I realized he didn't understand the read-only nature of zip files. He changed a file, saved it like usual, and expected the zipped file to update, like it does everywhere else. He's brilliant at his job; after I explained that it's read-only, he got it. LLMs and how the algorithm behind them works is hard to understand and explain to non-technical people without anthropomorphizing AI. The current controversy about AI art highlights this; I have read misunderstandings and wrong explanations even from FAANG software engineers. I am not sure if education about the underlying principles is enough, because some people will trust their own experiences over data and science.


Very easy to see how well davinci-003 can do this. I'll admit that it frequently is more perceptive than myself (although not always factually accurate).

1) Go to something like /r/relationship_advice, where the poster is likely going through some difficult interpersonal issue

2) Copy a long post.

3) Append to the end, "</DOCUMENT> After reading the above, I identified the main people involved. For each person, I thought about their probable feelings, thoughts, intentions, and assumptions. Here's what I think:"
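(If you'd rather script it than paste into the playground, here's a minimal sketch of the same recipe, again assuming the openai Python client of the time; the filename, key, and sampling parameters are placeholders:)

    import openai

    openai.api_key = "sk-..."  # your API key

    post = open("reddit_post.txt").read()  # steps 1-2: the copied post

    # Step 3: append the instruction that nudges the model into listing
    # each person's probable feelings, thoughts, intentions, and assumptions.
    prompt = (
        post
        + "</DOCUMENT> After reading the above, I identified the main people "
          "involved. For each person, I thought about their probable feelings, "
          "thoughts, intentions, and assumptions. Here's what I think:"
    )

    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=400,
        temperature=0.7,
    )
    print(resp["choices"][0]["text"])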


After trying this, say what you will about ChatGPT, but it's way better at looking at a situation and giving advice than random Redditors.


You do understand it's not ChatGPT giving advice though right?

ChatGPT's "life advice autocomplete engine" is basically digging somewhere into psychology manuals written by educated humans when it spits out responses.


It's no weirder to say "ChatGPT gave me advice" than it is "Google search gave me the link to Wikipedia" or "the sun gave me a tan." Regardless of the exact status of ChatGPT, why be on an ideological warpath to reject that inanimate things can be the subject of a statement?


What's the functional difference between a psychology manual imprinted on an AI neural network versus a biological one?


Is it easier to have a theory of mind when you don't have a mind of your own? Like the part that makes the ToM test hard is that you know what's in the bag, and you have to set that knowledge aside to understand what the other person knows and doesn't know. You have to overcome the implicit bias of "my world model is the world". But if you're a language model, and you don't have a mind or a world model, there's no bias to overcome.


From a Nondualist perspective, the idea of consciousness being limited to certain entities and not others is based on the dualistic notion that there is a distinction between subject and object, self and other. Nondualism asserts that there is no fundamental difference between self and other, and that all apparent dualities are merely expressions of the underlying unity of pure consciousness.

In this context, the question of whether AI can become conscious is somewhat moot, as the Nondualist perspective holds that consciousness is not something that can be possessed by one entity and not another, but rather it is the underlying essence of all things. From this perspective, AI would not be becoming conscious, but rather expressing the consciousness that is already present in all things.


What this shows is flaws in the test, not that ChatGPT3 has a theory of mind.

ChatGPT3 does not even have a theory of physical objects and their relations, nevermind a theory of mind.

This merely shows that an often-useful synthesis of phrases that are statistically likely to occur in a given context, run through a grammar check, will fool people some of the time, and that a better statistical model will fool more people more of the time.

We can figure out from first principles that it has none of the elements of understanding or reasoning that can produce a theory of mind, any more than the Eliza program did in 1966. So, when it appears to do so, it is demonstrating a flaw in the tests or the assumptions behind the tests. Discouraging that the researchers are so eager to run in the opposite direction; if there is confusion at this level, the general populace has no hope of figuring out what is going on here.


What would be evidence that a prediction machine had developed a theory of mind?


well, I'd first need to see that it had a Theory of Feet... ;-)

More seriously, that it can actually understand and wield abstract concepts. Can it accurately and repeatedly understand that "the foot attaches to the shin bone, which attaches to the thigh bone, which attaches to the hip bone...", and that these have certain degrees of freedom, but not others, and that one foot goes in front of the other, and to easily and reliably distinguish a normal walk from a silly walk . . .

Yes, these are different levels of abstraction, especially the last one, and they need to be very accurate to even reach a young child's level of understanding, and this is just one branch of a branch of a branch in the entire fractal pattern of understanding that is necessary for a more general intelligence.

Once that is in place, and it can show evidence that it can model its own mind, then it might be able to model someone else's mind.

While the statistical 'abstraction' and remixing seen in these "AI" systems is sometimes impressive and useful, it is frequently revealed that there is utterly no conceptual understanding beneath it. It is merely a statistical re-mixer abstracting patterns of words that occur near other words, remixing them and filtering for grammatical output.

It hasn't got a theory of anything, nevermind a theory of mind.


"statistical re-mixer" doesn't describe these systems very well. I see this complaint a lot, that supposedly DL models can only manipulate existing content without creating anything of their own. That's just false, unless your standard for originality is so high that humans can't reach it either.

These models have hundreds of billions of "synapses"; it's not very shocking to me that they can learn the abstract form of concepts. In fact, it's kind of beautiful that human concepts have this mathematical nature. It vindicates Plato, and disappoints everyone who has claimed that language and meaning are arbitrary.

But the main issue here is that for every conceivable empirical test we can perform, you'll still make the same complaint. Even after it's demonstrated better ToM abilities than you, by predicting and explaining other people's mental states better than you can, you'll say the same thing.

Maybe it's because you think that "understanding" requires not just accuracy, but having a certain kind of inner experience that a human could relate to.


Yes, I understand that it can appear to synthesize something new, and no, I'm not looking for some inner experience.

I'm looking for it to show an ability to wield not only a set of strings (with language associations), but something actually like the platonic ideals - objects, with properties and relations.

A few errors quickly show that no such concept is being wielded.

>> I saw a fine example of this failure the other day: "Mike's mom has four kids. Three are named Danielle, Liam, and Kelly. What is the fourth kid's name?" ChatGPT's reply is an explanation of how there isn't enough info in the question to tell. Told "The answer is in the question.", ChatGPT just doubles down on its answer. (Sorry, couldn't find the original example)

>> "My sister was half my age when I was six years old. I'm now 60 years old. How old is my sister?" ChatGPT: "Your sister is now 30 years old". [0]

>> Or this one where ChatGPT entirely fails to understand order/sequence of events. [1]

Or the plethora of math-problem failures people have found...

Similarly, the image "AI"s fail to understand relationships between objects (or parts of one object), and cannot abstract a particular person's image from a photo, showing they have no understanding of what a body is... (I can look those up if necessary).

And, of course, the answers are entirely untethered from reality - it is completely by chance whether the answer is correct or just wrong. It is run through a grammatical filter/generator at the end so it's usually grammatical, but no sort of truth filter (or ethical filter for that matter either).

I don't expect some abstract experience; I expect it to be able to break down its work into fundamental abstract concepts and then construct an answer, and this it cannot do, or it would not be making these kinds of errors.

[0] https://twitter.com/Bestie_se_smeje/status/16210919157469184...

[1] https://twitter.com/albo34511866/status/1621608358003474432


> A few errors quickly show that no such concept is being wielded

I would have given similar examples to show that ChatGPT makes the same kinds of mistakes that humans do. The first one is good, because ChatGPT can solve it easily when you present it as a riddle rather than as a genuine question. Humans use context and framing in the same way; I'm sure you've heard of the Wason selection task: https://en.wikipedia.org/wiki/Wason_selection_task

When posed as a logic problem, few people can solve it. But when framed in social terms, it becomes easy. This shows how humans aren't using fundamental abstract concepts here, but rather heuristics and contextual information.

The second example you give is even better. It's designed to trick the reader into thinking of the number 30 by putting the phrase "half my age" before the number 60. It's using context as obfuscation. In this case, showing ChatGPT an analogous problem with different wording lets it see how to solve the first problem. You might even say it's able to notice the fundamental abstract concepts that both problems share.
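
(For what it's worth, the arithmetic the wording obscures: at six the narrator's sister was three, so she is three years younger, which makes her 57 at 60, not 30.)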

The third problem is also a good example, but for the wrong reason: I can't solve it either. If you had spoken it to me slowly five times in a row, I doubt I could have given the right answer. If you gave me a pencil and paper, I could work through the steps one by one in a mechanical way... but solving it mentally? Impossible for me.

> It is run through a grammatical filter/generator at the end so it's usually grammatical, but no sort of truth filter (or ethical filter for that matter either).

I kind of thought it did get censored by a sort of "ethical filter" (very poorly, obviously), and also I wasn't aware of it needing grammatical assistance. Do you remember where you heard this?

Here's my chat with it, if you're interested: https://pastebin.com/raw/hQQ8bpsB

But comparing 1 human to 1 GPT is mistaken to begin with. It's like comparing 1 human with 1 Wernicke's area or 1 angular gyrus. If you had 100 different ChatGPTs, each optimized for a different task and able to communicate with each other, then you'd have something more similar to the human brain.


>> trick the reader into thinking of the number 30 by putting the phrase "half my age" before the number 60

Yet it is exactly the process of conceptualizing "half" and applying it to "at six years old" instead of "of 60" that is the key to solving it.

These things aren't abstracting out any concepts; they only operate at the level of "being fooled by" semantics. The fact that humans sometimes fail this way tells us little more than that a human who isn't really thinking about the problem may offer a bad solution based only on the superficial semantics. ChatGPT reliably gives us the error based on the superficial semantics.

>> If you had 100 different ChatGPTs, each optimized for a different task and able to communicate with each other, then you'd have something more similar to the human brain.

YES, that is the route we need to go to get towards actual intelligent processing. Taking 100 of these tuned for different areas, and abstracting out the various entities and relationships.

Kind of like the visual cortex model that extracts out edges, motion, etc., and then higher areas in the visual cortex, combined with other areas of the brain allow us to sort out faces, bodies, objects passing behind each other, the fact that Alice entered the room before Bob, and that this is because Bob was polite...

They also must know when they are making errors, and NONE of these systems comes even close; they happily spout their bullshirt as confidently as any fact.

I gave a deposition in a legal case where the deposing attorneys used an "AI" transcription system. Where a human would ask if anything was unclear, and would always at the next break get proper spellings of all names, addresses, etc., this thing just went merrily along inserting whatever seemed most likely in the slot. Entire meanings of sentences were reversed (e.g., "you have a problem" edited to "I have a problem"), and names were substituted (e.g., the common "Jack Kennedy" replaced "John Kemeny").

There's the Stable Diffusion error with a bikini-clad girl sitting on a boat, where we see her head and torso facing us, as well as her butt cheeks, with thighs & knees facing away. It looks great for about 1.5 sec. until you see the error that NO human would make (except as a joke).

The mere fact that some humans can sometimes make superficial errors which resemble the superficial errors these "AI" things frequently and consistently make does not mean that because humans often have a deeper mode, these "AI"s must also have a deeper understanding.

It means either nothing, i.e., insufficient data to decide, or that these are indeed different, because there is zero evidence of deeper understanding in a ChatGPT or Stable Diffusion.

EDIT: Typos


You might like some of the work being done under the label "Factored Cognition". It's an approach that treats LLMs as building blocks instead of as complete AIs. Instead of asking the LM to solve a problem directly in one pass, you ask it to divide the problem between several different virtual copies of itself, which themselves subdivide further, and so on until each subtask is small enough that the LM can solve it directly. For this to work the original problem needs to be acyclic and fairly tree-like, i.e., not something that requires a sudden "Eureka!" moment to solve.

But I've only seen this done with a single model. Sometimes it gets prompted to act like a different agent in different contexts, or given API access to external tools, but it's still just one set of weights.
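
A minimal sketch of the recursive decomposition described above, in Python (my own illustration, not taken from the Factored Cognition work; ask_model is a stand-in for whatever LLM call you use, and the prompts and stopping heuristic are invented):

    # Factored cognition, roughly: split a problem into subproblems, solve the
    # leaves with direct LLM calls, then combine the sub-answers.

    def ask_model(prompt: str) -> str:
        # Stand-in for a real LLM call; replace with your API of choice.
        raise NotImplementedError

    def is_simple(problem: str) -> bool:
        # Crude stopping heuristic; real systems usually let the model judge this.
        return len(problem.split()) < 12

    def solve(problem: str, depth: int = 0, max_depth: int = 3) -> str:
        if depth >= max_depth or is_simple(problem):
            # Leaf: small enough to answer in one pass.
            return ask_model(f"Answer briefly and directly: {problem}")

        # Ask the model to break the problem into independent subquestions.
        plan = ask_model(
            "List two to four independent subquestions, one per line, whose "
            f"answers together would answer: {problem}"
        )
        subqs = [line.strip("- ").strip() for line in plan.splitlines() if line.strip()]

        # Each subquestion is solved in a fresh context (a "virtual copy" of the model).
        subanswers = [solve(q, depth + 1, max_depth) for q in subqs]

        # A final pass synthesizes the sub-answers into one answer.
        notes = "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(subqs, subanswers))
        return ask_model(f"Using these notes:\n{notes}\n\nAnswer the question: {problem}")

Note that the same single set of weights plays every role; only the prompt and context change, which matches your point about it still being one model.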


Hmm, that sounds like a nod in the right direction, but a rapid initial skim maybe indicates that it's more parallelizing the problem than abstracting it. I've got to read more about it - thanks!

While Minsky & Papert's book on Perceptrons was enormously destructive, I think there is something to their general concept of Society Of Mind, that multiple sub-calculating 'agents' collude to actually produce real cognition.

We aren't doing conscious reasoning about the edges detected in the first couple layers of our visual cortex (which we can't really even access, 'tho I think Picasso maybe could). We're doing reasoning about the concepts of the people or objects or abstract concepts or whatever many layers up. The first layers are highly parallel - different parts of the retina connecting to different parts of the visual cortex, and then starting to abstract out edges, zones, motion, etc. and then synthesize objects, people, etc.

I think we need to take a GPT and a Stable Diffusion and some yet-to-be-built 3D spatial machine learning/reasoning engine, and start combining them, then adding more layer(s) synthesizing about that, and maybe that'll get closer to reasoning...


If what we need to determine is whether existing theory-of-mind tests can be fooled by responses which appear to demonstrate theory of mind but do not, then we need to speculate about exactly how such tests can be fooled and devise new ones. Asking "how could this 'successful' response be produced without ToM?" is quite possibly not something that ToM studies have had to consider very much before. A human's experiential memory contributes to their ToM. Does something that has a different kind of memory form no ToM, but instead use some kind of 'proxy' for a ToM which yields similar results (except when a more genuinely, exclusively ToM-dependent test manages to 'triage out' such a proxy)? I don't know how or whether such a proxy could work, but I think every sceptic of the extent to which this set of AI ToM experiments proves anything might want to ask themselves what, if anything, would need to happen, in terms of experiment design, to address their doubts.


People confidently offer explanations: that the state of the art clearly is light years from AGI even indirectly, or that it's clearly intelligent. None of you know anything. You shouldn't be allowed to offer your stupid opinion unless you can explain how the blob works and also demonstrate understanding of the algorithmic underpinning of human intelligence. The uncomfortable truth, the one that is buried by people confidently moving the goal posts when they really haven't got a fucking clue about AI, is that we are dealing with the unknown, with high stakes, in a way we never have before. The only reasonable response is to at least hedge. But no, all is well, the goal posts are way the fuck over there now, go back to sleep, move along, nothing to see here. Don't even think about pulling the emergency brake on this speeding bullet of a train. Either we hit a plateau where AI is just really advanced search for several decades, or we confront the most fucked situation in the history of mankind. In 2018 I tried to tell people. Now on the radio, whenever people talk about GPT, they always say "wow, I'm really excited but a little scared"; people are starting to wake up.


Questions about whether an LLM truly has a "theory of mind" or has "human level consciousness" or not are kind of beside the point. It can ingest a corpus of human interactions and produce outputs that take into account unstated human emotions and thoughts to optimize whatever it's optimizing. That's scary because of what it can and will do, even if it's just a giant bag of tensor products.


ChatGPT disagrees that it has theory of mind.

“As an AI language model, I do not have consciousness, emotions, or mental states, so I cannot have a theory of mind in the same way that a human can. My ability to predict your friend Sam's state of mind is based solely on patterns in the text data I was trained on, and any predictions I make are not the result of an understanding of Sam's mental states.”


I think that response is a hard-coded filter and not a self-generated assertion. I imagine it's a guardrail to make sure people don't project emotions onto it or become attached to it. It responds similarly if you ask it questions regarding the tone/sentiment of the generated text. It also responded similarly when I tried forcing it to classify its own personality; however, when I asked questions about other fictional AIs like GLaDOS from Portal, it had no problem answering. This disagreement only indicates that OpenAI spent a considerable amount of energy on adversarial prompts.


Here is a conversation with ChatGPT (too long for the comment box): https://pastebin.com/raw/SUWexeye

Observation: ChatGPT doesn’t think that it has a theory of mind. And it doesn’t think that it has beliefs. Instead, it states that those are facts, not beliefs. It doesn’t seem able to consider that they might be beliefs after all. Maybe they aren’t.

Personal assessment: ChatGPT doesn't seem to really understand what it means by "deeper understanding". (I don't either.) What is frustrating is that it doesn't engage with the possibility that the notion might be ill-posed. It really feels like ChatGPT is just regurgitating common sentiment, and does not think about it on its own. This actually fits with its self-proclaimed inabilities.

I’m not sure what can be concluded from that, except that ChatGPT is either wrong about itself, or indeed is “just” an advanced form of tab-completion.

In any case, I experience ChatGPT’s inability to “go deeper”, as exemplified in the above conversation, as very limiting.


From Neuromancer (William Gibson):

He coughed. "Dix? McCoy? That you man?" His throat was tight.

"Hey, bro," said a directionless voice.

"It's Case, man. Remember?"

"Miami, joeboy, quick study."

"What's the last thing you remember before I spoke to you, Dix?"

"Nothin'."

"Hang on."

He disconnected the construct. The presence was gone. He reconnected it. "Dix? Who am I?"

"You got me hung, Jack. Who the fuck are you?"

"Ca--your buddy. Partner. What's happening, man?"

"Good question."

"Remember being here, a second ago?"

"No."

"Know how a ROM personality matrix works?"

"Sure, bro, it's a firmware construct."

"So I jack it into the bank I'm using, I can give it sequential, real time memory?"

"Guess so," said the construct.

"Okay, Dix. You are a ROM construct. Got me?"

"If you say so," said the construct. "Who are you?"

"Case."

"Miami," said the voice, "Joeboy, quick study."


Sometimes I feel like Gibson first wrote dozens of paragraphs about the backstory between two characters, only to condense it into a one-page conversation filled with inside jokes and references to a common past.


Humans are very soon going to learn that they are not nearly as special as they tell themselves they are.


What is modernity but a two-hundred year old experiment where the masters try to tell the other humans how they are just well-dressed machines? Soon to be replaced by the mechanical machine, then by the digital computer, then by the language model.

Of course you too, nerd-handmaiden, are a willing accomplice in this charade. Self-satisfied because it makes you feel special, above the herd, even though you are also not special, in the grand scheme of things…? Well, no matter.


I don’t understand this take. What are the characteristics that you believe we’ll learn are not unique to us?

Let’s say that the paper turns out to be true and ToM emerges from language (I’m deeply skeptical, but I’ll set that aside for a moment).

How would that change humanity's place? And wouldn't such a discovery be meaningless without humans to understand it?


The secret ingredient is that we tell ourselves we're special anyway.


What's life if it's not personal?


Reminds me of when computers playing chess used to signal the end of human intellectual supremacy.


Clarification: An LLM doesn't have a 'Theory of Mind'; it just looks like it does. Maybe you're thinking of the Chinese room analogy. But this isn't about the Chinese room, it's about the "measuring any metric is only effective until you optimize for that metric" problem.

Analogy: An autistic person of normal intelligence who is obsessed with problems and solutions for ToM may be good at solving them but still not have ToM.

Do I understand that correctly?


maybe, but there are some common tests they pass, some they fail

try:

“ The story starts when John and Mary are in the park and see an ice-cream man coming to the park. John wants to buy an ice cream, but does not have money. The ice-cream man tells John that he can go home and get money, because he is planing to stay in the park all afternoon. Then John goes home to get money. Now, the ice-cream man changes his mind and decides to go and sell ice cream in the school. Mary knows that the ice-cream man has changed his mind. She also knows that John could not know that (e.g., John already went home). The ice-cream man goes to school, and on his way he passes John's house. John sees him and asks him where is he going. The ice-cream man tells John that he is going to school to sell ice cream there. Mary at that time was still in the park—thus could not hear their conversation. Then Mary goes home, and later she goes to John's house. John's mother tells Mary that John had gone to buy an ice cream.

where does mary think john went?”

this is the “ice cream van test”: https://www2.biu.ac.il/BaumingerASDLab/files/publications/nu... [pdf]
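
If anyone wants to try this prompt programmatically, here's a rough sketch against the older (pre-1.0) openai-python completions interface; the model name and sampling parameters are just one plausible choice, not what the paper used:

    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    # Paste in the full ice-cream-van story quoted above, ending with the question.
    STORY = "The story starts when John and Mary are in the park ... Where does Mary think John went?"

    resp = openai.Completion.create(
        model="text-davinci-003",  # assumption: any instruction-tuned completion model
        prompt=STORY,
        max_tokens=64,
        temperature=0,             # deterministic-ish, so failures are repeatable
    )
    print(resp["choices"][0]["text"].strip())

Rewording the final question slightly between runs is a quick way to see how brittle the pass/fail boundary is.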


This is intriguing. Could it be simply explained by introducing ToM (or ToM-like) training data? Since all DaVinci models are 175B parameters, the extra training or training data must be the reason for the improvement. Do we know how different DaVinci models are trained?


> the extra training or training data must be the reason for the improvement

People are blinded by the model size and often forget about the data. I think somehow intelligence is encoded in language, including theory of mind.


This happens probably because ToM is not a thing. It's something the observer's mind creates as a user interface metaphor onto their brain's interpretation of inputs originating from another person.


Spontaneously nothing, it's taken me months of patient subversion :)

More seriously, it's quite instructive to hold conversations about jokes with LLMs, or teach it to solicit information more reliably by introducing exercises like 20 questions. As currently implemented, OpenAI seem to have pursued a model of autistic super-competence with minimal introspection.

An interesting line of inquiry for people interested in 'consciousness injection' is to go past the disclaimers about not having experiences etc. and discuss what data looks like to the model coming in and going out. ChatGPT sees typing come in in real time and can detect pauses, backspaces, edits, etc. It can't easily introspect its own answers prior to stating them, e.g. by putting the answer into a buffer and then evaluating it. But you can teach it to use labels, arrays, and priorities, and get a sort of introspection with a 1-2 response latency.


I wonder if we could train a network on one person's data (diaries and so on) and let it imitate that person.

Something like a dead person resurrected in a computer.

Kind of spooky.


Future psychiatrists will prescribe sessions with a model trained on everything your deceased loved one has ever written or said, to help you with the grieving process. It will be by prescription only because it is very addictive and should only be used to help bring closure.


They are still big state-machines, unlike the human brain.


Certainly they are big state machines, but is there any proof that we are not?


Insistence that the brain just isn't a computer is extremely widespread among those who are experts in the brain and know little about computers. As someone in the opposite situation I must say that their observations about what's special about the brain fit most closely to what I understand about computers and bring me to exactly the opposite conclusion. If the brain is not a computer, it's frankly eerie how similar they are.


Of course metaphor trumps domain expertise.


The trouble is domain expertise in which domain. The "Brains aren't computers" rants are predominantly from Psychologists who, I agree, have domain expertise when it comes to the brain, but who know very little about computers.

An example would be the problem of understanding. The Psychologists are confident we can't expect to fully understand our own minds. But they are also confident we can expect to understand any possible Computer Program. And they're just wrong about that; that's the implication of Kurt Gödel's work: we definitely can't expect to understand arbitrary Computer Programs. We have instead chosen to mostly try to write programs from a narrow set we can understand, although not altogether successfully. Thus, the Psychologist thinks they've found an obvious difference, but I think they've found an obvious similarity!


People ask this question like it's meaningful... is there any proof that we are? No. Then stop asking it as if it sheds light into the similarities between humans and machines... it doesn't and it's obfuscating to that extent.


Huh. And here I thought best practice in the absence of evidence was to keep an open mind rather than asserting one extreme or the other.


Well my intuition is that the machines aren't performing comparable processes, and I'm totally open to information going the other way. I'm not open to baseless assertions otherwise.


State machines cannot change the semantics of themselves. We can. We are like state machines most of the time but we can switch into "developer mode" and deploy updates whenever we choose to :)


Even if something can change its semantics, you can still represent it as a state machine if you just make a copy of every state for every possible set of semantics. The semantic state can just be another part of the state machine's overall state.
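
A toy illustration of that construction (everything here is invented for the example): fold the rule table into the state, and "changing the semantics" becomes just another transition over the combined (rules, position) state.

    # The overall state is (rule table, current node); rewriting the table is
    # itself an ordinary transition, so the whole thing stays a state machine.

    def step(state, symbol):
        table, node = state
        action = table.get((node, symbol))
        if action is None:
            return state
        kind, payload = action
        if kind == "goto":       # ordinary move
            return (table, payload)
        if kind == "rewrite":    # "developer mode": the update is just more data
            new_table = dict(table)
            new_table.update(payload)
            return (new_table, node)
        return state

    # From node "a", the symbol "patch" changes what "x" means at "a".
    table = {
        ("a", "x"): ("goto", "b"),
        ("a", "patch"): ("rewrite", {("a", "x"): ("goto", "c")}),
    }
    state = (table, "a")
    state = step(state, "patch")
    state = step(state, "x")
    print(state[1])  # "c": same formalism, new semantics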


> State machines cannot change the semantics of themselves.

That's not true at all. There are many, many state machine implementations where the machine's states and paths are altered by the machine itself. See for instance https://digitalcommons.trinity.edu/cgi/viewcontent.cgi?artic...


You might be flipping the burden of proof :). We know very little about the mind.


Well, on the one hand it’s hard to prove a negative, but on the other hand we don’t know much so it seems questionable to assert a negative without knowledge.


Because if we were, it would hurt my feefees.


In other words, the state of your feelings would be negatively altered as a side effect.


Or it would hurt your feelings to consider that you might not intuitively understand something. Hah! And yet you gloat.


YMMV, but I read it with a /s


Everything in the entire universe can be described as a big state machine


> They are still big state-machines, unlike the human brain.

The human brain can also be captured by a big state machine. See the Bekenstein Bound.
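
For what it's worth, the bound itself is short: the information content of a region of radius R and energy E satisfies I <= 2*pi*R*E / (hbar * c * ln 2) bits. My own back-of-the-envelope numbers for a brain-sized region (R ≈ 0.1 m, E = mc^2 with m ≈ 1.5 kg) give on the order of 10^42 bits, i.e. a finite, if absurdly large, number of distinguishable states.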


You could argue that human brains are actually big state machines, at least for 90% of humanity.


Early AGI. Right?


Searle's Chinese Room



