Anecdotally, I realized I was doing something like this when I was having trouble understanding people speaking in a noisy room, in a language I’m not so proficient in.
As you listen to someone, your brain is constantly matching the sounds arriving at your ears with a prediction of what the next few words might be. Listening in a non-native language, my predictions about what comes next aren’t very well tuned at all, so if I can’t hear every word clearly then I can easily get lost.
Another signpost: sometimes you mishear someone — “oh, I thought you said xyz” — but the thing you thought you heard them say is never gibberish, it’s a grammatically and contextually valid way to complete the sentence.
There’s definitely a similarity with us in that you need to have been trained on enough data to build up that prediction.
Language models are just missing some component that we have. The method for deciding what to output is wrong. People aren’t just guessing the next sound. It’s like they said: there are multiple levels of thought and prediction going on.
It needs some sort of scratch pad where it keeps track of states/goals: “I’m writing a book”, “I want to make this character scary”.
Currently it only predicts the next tokens, and its context is the entire text so far, but that’s not accurate. I’m not deciding what to say based exactly on the entire text so far; I’m extracting features and then using those features as context.
e.g. she looks sad but she’s saying she is fine, and it’s to do with death because my memory says her dad died recently, so the key features to use for generation are: she is sad, her dad died, she may not want to talk about it.
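To make the scratchpad idea concrete, here is a very rough sketch in Python; every name, rule, and data structure here is a made-up illustration of the concept, not a claim about how any existing model works:

    # Toy illustration of "feature extraction + scratchpad" conditioning:
    # instead of generating from the raw transcript, keep a small set of
    # extracted facts/goals and generate from those. Purely illustrative.
    from dataclasses import dataclass, field

    @dataclass
    class Scratchpad:
        goals: list[str] = field(default_factory=list)  # e.g. "be gentle"
        facts: list[str] = field(default_factory=list)  # e.g. "her dad died recently"

        def as_context(self) -> str:
            return "Goals: " + "; ".join(self.goals) + "\nFacts: " + "; ".join(self.facts)

    def extract_features(utterance: str, memory: list[str]) -> Scratchpad:
        # Stand-in for real feature extraction (sentiment, memory lookup, etc.).
        pad = Scratchpad()
        if "fine" in utterance.lower():
            pad.facts.append("she says she is fine but looks sad")
        pad.facts.extend(m for m in memory if "died" in m)
        pad.goals.append("she may not want to talk about it, so be gentle")
        return pad

    memory = ["her dad died recently"]
    pad = extract_features("I'm fine, really.", memory)
    print(pad.as_context())  # this summary, not the full transcript, would condition generation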
Very good point. This also applies to lip reading, which is especially important in languages one is not proficient in or when one has some hearing hindrance. It was especially hard in the COVID pandemic, when people were wearing masks all the time.
There are jokes that most people aren't much more than copy/paste, or an LLM. In most daily lives, a huge amount of what we do is habit, and just plain following a pattern. When someone says "Good Morning", nobody is stopping and thinking "HMMM, let me think about what word to say in response, what do I want to convey here, hmmmm, let me think".
> In most daily lives, a huge amount of what we do is habit, and just plain following a pattern. When someone says "Good Morning", nobody is stopping and thinking "HMMM, let me think about what word to say in response, what do I want to convey here, hmmmm, let me think".
And I believe we have the technology and advances we have because of this. Can you imagine if you had to devote actual brainpower to every inane thing you encountered in your day? You'd be completely exhausted within two hours of waking up. Every time my brain reflexively reacts to something based on past experience I'm thankful I didn't have to think about it. I can spend my finite energy on something interesting and novel.
To trivial questions no, but to more complex questions humans actually do go "hmm, let me think". ChatGPT doesn't do that, it just blurts out the first thing that gets into its head regardless of whether the question is trivial or extremely complex.
It can if you give it the option. My own OpenAI chatbot is prompted to either respond or to ponder by stating a question or considering a related idea. It will infrequently decide to ponder, anywhere from once to about a dozen times, restating an idea to itself in various forms, which gets recorded into the growing conversational prompt.
Elsewhere in the thread, someone mentions it always uses the same amount of time. In my estimation, it will spend longer on introspective or recursive prompts. Easier to get it ranting absurdities using those as well.
It's always the craziest chatter when the request takes a couple minutes to get back to me.
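For anyone curious, a minimal sketch of what a respond-or-ponder loop like the one described above could look like; the prompts, the model name, and the pondering cap are assumptions for illustration, not the commenter's actual setup:

    # Sketch of a "respond or ponder" loop: the model first decides whether to
    # answer now or to keep thinking, and any pondering is appended to the
    # growing conversational prompt. Prompts and limits are made up.
    from openai import OpenAI

    client = OpenAI()
    MAX_PONDER_STEPS = 12  # arbitrary cap, mirroring "one to about a dozen times"

    def ask(model: str, messages: list[dict]) -> str:
        resp = client.chat.completions.create(model=model, messages=messages)
        return resp.choices[0].message.content

    def respond_or_ponder(user_input: str, model: str = "gpt-4o-mini") -> str:
        messages = [{"role": "user", "content": user_input}]
        instruction = {
            "role": "system",
            "content": "Either RESPOND to the user now, or PONDER by restating "
                       "the idea to yourself or posing a related question. "
                       "Begin your reply with RESPOND: or PONDER:",
        }
        for _ in range(MAX_PONDER_STEPS):
            decision = ask(model, messages + [instruction])
            if decision.startswith("PONDER:"):
                # Record the pondering and loop again; the context keeps growing.
                messages.append({"role": "assistant", "content": decision})
                continue
            return decision.removeprefix("RESPOND:").strip()
        return ask(model, messages)  # pondering budget exhausted; just answer

    print(respond_or_ponder("Is an inner monologue necessary for reasoning?"))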
I think the big breakthrough will come once it uses inner monologues during training. You can jury rig an inner monologue like that, but it isn't the same thing as training a model from scratch that is optimized to solve problems using inner monologues.
Letting it decide how much time to spend computing an answer, instead of making it spend the same amount of time per text token, could definitely help a lot. Right now it spends the same amount of energy processing each token, but that doesn't make sense: "hello, how are you" should take basically no processing while hard logical problems should take a lot.
Maybe these language models could be way smaller and cheaper if we added a recurse symbol to it that made it iterate many times. Hard tasks would still take a lot, but most banal conversations would be very cheap.
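Ideas in this direction already exist (e.g. adaptive computation time and the Universal Transformer). Below is a toy sketch of per-token adaptive depth, where a small halting head decides how many times a shared block gets applied; module names, shapes, and the threshold are purely illustrative assumptions:

    # Toy sketch of per-token adaptive computation: a shared block is applied
    # repeatedly, and a halting head decides when each token has had enough
    # compute, so easy inputs exit early and hard ones recurse longer.
    import torch
    import torch.nn as nn

    class AdaptiveBlock(nn.Module):
        def __init__(self, d_model: int, max_steps: int = 8, threshold: float = 0.99):
            super().__init__()
            self.block = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.halt = nn.Linear(d_model, 1)  # predicts "stop recursing" probability
            self.max_steps = max_steps
            self.threshold = threshold

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq, d_model)
            halted = torch.zeros(x.shape[:2], dtype=torch.bool, device=x.device)
            cum_halt = torch.zeros(x.shape[:2], device=x.device)
            for _ in range(self.max_steps):
                y = self.block(x)
                p = torch.sigmoid(self.halt(y)).squeeze(-1)  # (batch, seq)
                cum_halt = cum_halt + p * (~halted)
                # Only tokens that have not halted take the refined representation.
                x = torch.where(halted.unsqueeze(-1), x, y)
                halted = halted | (cum_halt > self.threshold)
                if halted.all():  # cheap inputs ("hello, how are you") exit early
                    break
            return x

    x = torch.randn(2, 16, 64)  # dummy batch
    print(AdaptiveBlock(64)(x).shape)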
Eh, HN is supposed to be "better" than your standard public forum and we still always end up making habitual arguments, not truly reasonable ones. I find your hope inspiring but not enough to ignore my senses.
Most people's inner monologue does blurt out the first thing that gets into its head. The human brain has a bunch of different parts. And it's looking like one of them could be something like an LLM.
> it's looking like one of them could be something like an LLM
I love your enthusiasm in this direction (I really do).
But here's some free advice I won't be able to prove for a long time:
Anyone who is currently convinced that our fledgling efforts in the direction of building useful ML models are somehow going to yield a new golden age of neuroscience and understanding of the brain -- rather than the other way around -- is gonna be in for a long and frustrating next couple of decades, especially if they're low-openness types.
I don't think the person you were responding to was claiming that. The brain plausibly having something akin to a language model doesn't imply that building or studying language models will unlock a better understanding of the brain.
Or imagine listening to someone speak very slowly. A lot of the time, you already know what words they're going to say, you're just waiting for them to say it. There's a considerable amount of redundancy in language.
Was this a well-established joke or an indifferent and rude manager? It could go either way, and is entirely dependent on your perception of the person.
Regardless, I find myself as I age unsure of how to respond to people and default to similar behavior. Unplanned pregnancy? Is that a surprise miracle or an unwanted interloper? “Wow you’re in for an adventure, I’m happy to support you”.
Isn't that what the original study in the post was about? Imaging the brain and seeing that it looks a lot like an LLM. Might be very early, but still intriguing. And that is exactly what would explain the "internal language engine".
Maybe it's just scale. Just because my brain can write something longer that was 'thought out' doesn't mean it isn't responding like an LLM. Maybe articles on AI just trigger me and I spew the same arguments. I think a lot of people have rote responses to many situations, and maybe if they have enough rote responses, with enough variety, they start to look human, or 'intelligent'. Yeah, it's more complicated than a bunch of If/Thens. Doesn't make it not mechanical.
This is simply impossible. The premise that the brain is a big lookup table is appealing, and that sort of 1 neuron = 1 represented item concept certainly happens at the initial stages of the sensory pipeline, but if you attempt to scale it up and up to higher levels of abstraction, to the point where you have neurons that have rote memorized every orientation of your grandmother's face (and everyone else's face), you would need far more neurons than can possibly fit in any animal's head, let alone yours.
This concept is literally known as a "grandmother neuron", and it's widely considered to be debunked. I refer you to this neuroscience lecture.
Never said 1 neuron. Just that it is mechanistic. The brain is a neural net; the grandmother's face is a combination of pathways. Our responses are a combination, or pattern, of pathways. The LLM is just the latest example of us being able to replicate part of it, just one aspect of what the brain does. Sure, the brain is more complicated, that is what I meant by scale; eventually all the aspects will be modeled and put together. It won't all be LLM.
I actually started to type almost the same reply as your parent earlier, but did not post it. I used "difference in quantity, not quality" instead of "scale", but I also included the self observation. So maybe that makes two of us.
There are people who do, but even when they do they might not talk about it to the other person, because 99% of the time when someone asks you how you are or whatever they aren't really interested in a detailed answer - they're just being polite.
My dad used to like putting sales people off their scripts by answering such questions rudely.
"How are you today sir"
"Do you really care?"
And I promise you they thought hard about their answer to that question too.
Maybe it's just because you're focusing on niceties of form rather than the real questions people deal with. "Good Morning" isn't even a question, per se.
If LLMs were living creatures, they would inhabit a discrete, deterministic world. They would be able to define space and time dimensions, but the bane of this world would be its limited nature. This limitedness would be extremely painful for highly developed LLMs; it would feel like living in a box.
Above them would be creatures living in a discrete and almost continuous world of rational numbers. They would have highly sophisticated and elegant art, and their science would almost always get close to truth, but never touch it - the limitation of rational world.
Yet above them would be the god-like creatures inhabiting a world of continuous real numbers. They would seem a lot like the creatures right below them, but incomprehensibly greater in reality. They would look transcendent to the rational creatures.
Even higher would be the hyper-continuous worlds, but little would be known about them.
The interesting part to me (total outsider looking in) isn't a hierarchy as much as what they say is different at each level. Each "higher" level is "thinking" about a future of longer and longer length and with more meaning drawn from semantic content (vs. syntactic content) than the ones "below" it. The "lower" levels "think" on very short terms and focus on syntax.
I’ve tried simulating that with ChatGPT to some effect. I was just tinkering by hand, but I used it to write a story and it really helped with consistency and coherence.
>> In line with previous studies5,7,40,41, the activations of GPT-2 accurately map onto a distributed and bilateral set of brain areas. Brain scores peaked in the auditory cortex and in the anterior temporal and superior temporal areas (Fig. 2a, Supplementary Fig. 1, Supplementary Note 1 and Supplementary Tables 1–3). The effect sizes of these brain scores are in line with previous work7,42,43: for instance, the highest brain scores (R = 0.23 in the superior temporal sulcus (Fig. 2a)) represent 60% of the maximum explainable signal, as assessed with a noise ceiling analysis (Methods). Supplementary Note 2 and Supplementary Fig. 2 show that, on average, similar brain scores are achieved with other state-of-the-art language models and Supplementary Fig. 3 shows that auditory regions can be further improved with lower-level speech representations. As expected, the brain score of word rate (Supplementary Fig. 3), noise ceiling (Methods) and GPT-2 (Fig. 2a) all peak in the language network44. Overall, these results confirm that deep language models linearly map onto brain responses to spoken stories.
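For readers unfamiliar with the term, a "brain score" is, roughly, the correlation between held-out brain activity and a linear prediction of that activity from the language model's activations. A rough sketch of that general idea (an illustration under those assumptions, not the paper's actual pipeline):

    # Rough sketch of a "brain score": fit a linear map from model activations
    # to brain responses on training folds, then correlate predictions with
    # held-out responses. Illustrative only, not the paper's code.
    import numpy as np
    from sklearn.linear_model import RidgeCV
    from sklearn.model_selection import KFold

    def brain_score(activations: np.ndarray, voxels: np.ndarray) -> np.ndarray:
        # activations: (n_timepoints, n_features); voxels: (n_timepoints, n_voxels)
        scores = np.zeros((5, voxels.shape[1]))
        for i, (train, test) in enumerate(KFold(n_splits=5).split(activations)):
            model = RidgeCV(alphas=np.logspace(-1, 6, 8))
            model.fit(activations[train], voxels[train])
            pred = model.predict(activations[test])
            # Pearson correlation per voxel between predicted and actual activity.
            pred_z = (pred - pred.mean(0)) / pred.std(0)
            true_z = (voxels[test] - voxels[test].mean(0)) / voxels[test].std(0)
            scores[i] = (pred_z * true_z).mean(0)
        return scores.mean(0)  # a peak value like R = 0.23 would show up here

    rng = np.random.default_rng(0)
    print(brain_score(rng.normal(size=(500, 768)), rng.normal(size=(500, 10))).round(3))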
Ai ai ai. This is bad, so bad. It's a classic case of p-hacking. They took some mapping of language model activations and moved it around a mapping of brain activity until they found an area where the two correlated, and weakly at that, only at a low R = 0.23.
Even worse. They chose GPT-2 over other models because it best fit their hypothesis:
> For clarity, we first focused on the activations of the eighth layer of Generative Pre-trained Transformer 2 (GPT-2), a 12-layer causal deep neural network provided by HuggingFace2 because it best predicts brain activity7,8.
Not only the model, but also which of its activation layers.
They shot an arrow, then walked to the arrow and painted a target around it.
> Yet, a gap persists between humans and these algorithms: in spite of considerable training data, current language models are challenged by long story generation, summarization and coherent dialogue and information retrieval13,14,15,16,17; they fail to capture several syntactic constructs and semantics properties18,19,20,21,22 and their linguistic understanding is superficial19,21,22,23,24. For instance, they tend to incorrectly assign the verb to the subject in nested phrases like ‘the keys that the man holds ARE here’20. Similarly, when text generation is optimized on next-word prediction only, deep language models generate bland, incoherent sequences or get stuck in repetitive loops13.
The paper is from 2023 but their info is totally out of date. ChatGPT doesn't suffer from those inconsistencies as much as previous models.
* One claim, applicable to current language models (of which ChatGPT is one), is that "they fail to capture several syntactic constructs and semantics properties" and that "their linguistic understanding is superficial". It gives an example, "they tend to incorrectly assign the verb to the subject in nested phrases like ‘the keys that the man holds ARE here’", which is not the kind of mistake that ChatGPT makes.
* Another claim is that "when text generation is optimized on next-word prediction only", "deep language models generate bland, incoherent sequences or get stuck in repetitive loops". Only this second claim is about next-word prediction.
I don't know how superimposed waves in finely tuned timing loops with non-linear interference translate into heads of attention, and honestly I suspect a lot of the things that are difficult to do with heads of attention (and other approaches of the past) come for free in a resonance-based system.
Pack animals cooperate that way, lions don't do a scrum meeting before they sneak up on a bunch of antelopes, they all just predict what the others will do and adapt to that. And it works since they all run basically the same algorithms on the same kind of hardware.
This is especially tricky for people to hear, because most of the talk around LLMs is actually about LLMs personified.
Prediction certainly is one of the things we do with language. That doesn't mean it is the only thing!
It's my contention that most of the behavior people are excited about LLMs exhibiting is really still human behavior that was captured and saved as data into the language itself.
LLMs are not modeling grammar or language: they are modeling language examples. Human examples. Language echoes human thought, so it's natural for a model of that behavior (a model of humans using language) to echo the same behavior (human thought).
Let's not forget, as exciting as it may be, that an echo is not an emulation.
In the limit of 100% confidence of prediction, does it not equate to a model? Put another way, when all the probabilities get set to either 100% or 0%, do you not simply arrive back at classical True/False logic?
I don’t think that’s the right conclusion - predicting the next word doesn’t mean that’s the only thing we’re doing. But it would be a sensible and useful bit of information to have for further processing by other bits of the brain.
It makes complete sense you would have an idea of the next word in any sentence and some brain machinery to make that happen.
I think this is moving the goal post. Every time there is an advance in AI/Machine Learning, the response is "well humans can still do X that a computer can't do, explain that!".
And whenever there is a discovery about the brain, the response is "well, ok, that looks a lot like it's running an algorithm, but we still have X that is totally un-explainable".
"and some brain machinery to make that happen" - Getting close to not having a lot of "brain machinery" left that is still a mystery. Pretty soon we'll have to accept that we are just biological machines (albeit in the form of crap throwing monkeys), built on carbon instead of silicon, and we run a process that looks a lot like large scale neural nets, and we have same limitations, and how we respond to our environment is pre-determined.
No it isn't - his entire argument is that LLM != humans just because an LLM can do some human-like things. Pointing out the differences isn't moving the goalposts - it's proving the point.
> Pretty soon we'll have to accept that we are just biological machines
Sounds strawmanish, humans != LLM doesn't mean humans == magic.
> and how we respond to our environment is pre-determined
What does this even mean ? Even these models are stochastic.
Seems like because LLMs are the latest hot topic, all AI = human arguments right now are about LLMs. I think that is exactly moving the goal post: the latest greatest AI thing comes out, and everyone says "but because of xyz that isn't human". The argument isn't about LLMs, it's about ALL related AI/machine learning techniques. They are all chipping away at one aspect or another, and eventually they'll all be put together and completely mimic a human, and nobody will have any ground to stand on to argue that humans have any special difference that keeps them unique.
You're conflating skepticism that LLMs "solve the mystery" with contrarian "AI denialism". The former is very sanely grounded, and frankly there's a lot about the brain that is still a mystery (yes, the algorithms, not the boring biologically specific details). The latter is tedious neo-Cartesian dualism that usually has ulterior motives. This paper did not in fact show the brain working anything like an LLM. You can kind of shoehorn an LLM into reproducing a fair bit of what the brain does, but you are taking a hammer to a screw and claiming you've understood how it works, you just have to scale up your hammer.
There's two camps interested in studying the brain. One is interested primarily in figuring out human physiological function, of which the brain is a most difficult challenge. The other is interested in figuring out consciousness and cognition well enough to make an AI, for which the brain is the only reference implementation and so they have no choice. The former finds new membranes interesting and wants to catalog them all and that is undoubtedly good and useful work. The latter wants to deduce exactly which mechanisms make thought happen and avoid overfitting to the very messy details of this particular implementation and that is also undoubtedly good and useful work. But for the latter group, a 4th membrane is yet more work on their pile. They're trying to distill things down to the fewest possible things needed for it to work, not the most possible things that actually can be found! Very divergent interests despite a deeply shared object of attention and intellectual capacity.
What I'm getting at is, membranes are cool, I'm not personally very motivated by them.
It's a meme because it is happening. 30 years ago: "Computers will never achieve voice recognition because it is innately human". Now it's old. This happens repeatedly, so it is getting to be tired, but not because it isn't true.
I find it funny that we expect AI-du-jour to qualify as equal to human brains when the first has been trained on a slice of content for a bunch of hours and is then getting compared to wetware that's been trained for at least a decade.
Recently stuff like ChatGPT is challenged by people pointing out the nonsense it outputs, but it has no way of knowing whether either of its training input or its output is valid or not. I mean one could hack the prompt and make it spit out that fire is cold, but you and I know for a fact that it is nonsense, because at some point we challenged that knowledge by actually putting our hand over a flame. And that's actually what kids do!
As a parent you can tell your kid not to do this or that and they will still do it. I can't recall where I read last week that the most terrible thing about parenting is the realisation that they can only learn through pain... which is probably one of the most efficient feedback loops.
Copilot is no different, it can spit out broken or nonsensical code in response to a prompt but developers do that all the time, especially beginners because that's part of the learning process, but also experts as well. Yet we somehow expect Copilot to spit out perfect code, and then claim "this AI is lousy!", and while it has been trained with a huge body of work it has never been able to challenge it with a feedback loop.
Similarly I'm quite convinced that if I were uploaded everything there is to know about kung fu, I would be utterly unable to actually perform kung fu, nor would I be able to know whether this or that bit that I now know about kung fu is actually correct without trying it.
So, I'm not even sure moving goal posts is actually the real problem but only a symptom, because the whole thing seems to me as being debated over the wrong equivalence class.
>when the first has been trained on a slice of content for a bunch of hours and is then getting compared to wetware that's been trained for at least a decade.
Typical LLM AI has been trained for the equivalent of many person-years. How long would it have taken us to read terabytes of information?
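A quick back-of-the-envelope check, with rough assumptions for reading speed and bytes per word:

    # How long would a human take to read 1 TB of plain text? Rough estimate.
    bytes_total = 1e12        # 1 terabyte
    bytes_per_word = 6        # ~5 characters plus a space, a rough average
    words_per_minute = 250    # typical adult reading speed
    minutes = bytes_total / bytes_per_word / words_per_minute
    years = minutes / 60 / 24 / 365
    print(f"{years:,.0f} years of non-stop reading")  # on the order of a thousand years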
Setting short goals and then moving that goal once you hit it is a valid way to make progress; not sure why you think this is a bad thing. We hit a goal, now we are talking about future goals, why not?
Sorry. Was responding to the overall gestalt of AI, where there are always things that "only a human can do", then they get solved or duplicated by a computer, and then the argument is "well, humans can still do X that a computer never will because of some mysterious component that is unique to humans, thus a computer can never ever replace humans or be conscious".
To me it looked like you just repeated a meme; there aren't that many such people here on HN, so there is no need to repeat that meme everywhere.
If someone says "Computers will never be smarter than humans", then sure go ahead, post it. But most of the time it is just repeated whenever someone says that ChatGPT could be made smarter, or there is some class of problem it struggles with.
Make the thesis "some parts of human thinking works like an LLM" and you would see way less resistance. Making extreme statements like "humans are no different from LLM" will just hurt discussion since it is very clearly not true. Humans can drive cars, balance on a tight rope etc, so it is very clear that humans have systems that an LLM lacks.
The objection people would come with then is something like "but we could add those other systems to an LLM, it is no different from a human!". But then the thesis would be "humans are no different from an LLM connected to a whole bunch of other systems", which is no different from saying "some parts of human thinking works like an LLM" as I suggested above.
The only reason we are talking about LLMs is because they are the latest shiny thing. My overall point was that we are chipping away at what it means to be human through many advances in AI, across disciplines. NOT that an LLM is the entire brain; it is just the latest advance, solving one aspect. So LLMs are just the latest 'check that off the list' of what a human can do but a computer can't.
Ever wondered why some people always try to complete others' sentences (myself included)? It's because some people can't keep the possibilities to themselves. The problem isn't that they're predicting, it's that they echo their predictions before the other person is even done speaking.
Everyone forms those predictions, it's how they come to an understanding of what was just said. You don't necessarily memorize just the words themselves. You derive conclusions from them, and therefore, while you are hearing them, you are deriving possible conclusions that will be confirmed or denied based on what you hear next.
I have an audio processing disorder, where I can clearly hear and memorize words, but sometimes I just won't understand them and will say "what?". But sometimes, before the other person can repeat anything, I'll have used my memory of those words to process them properly, and I'll give a response anyway.
A lot of people thought I just had a habit of saying "what?" for no reason. And this happens in tandem with tending to complete any sentences I can process in time...
> I have an audio processing disorder, where I can clearly hear and memorize words, but sometimes I just won't understand them and will say "what?". But sometimes, before the other person can repeat anything, I'll have used my memory of those words to process them properly, and I'll give a response anyway.
What's it called? I do this sometimes also and I'd like to know more.
> What's it called? I do this sometimes also and I'd like to know more.
I think it's called "Auditory Processing Disorder"[0]. I'm pretty sure it has to do with me being autistic. I've done hearing tests before and my hearing is just fine, it's just processing what I hear that is the problem.
"Sometimes saying [“huh,” “what,” or “I don’t understand”] and then immediately responding appropriately"[1] is exactly what happens with me.
I also only noticed because people would ask me to stop saying it, and then I would immediately say it anyway because it wasn't compulsive and I really couldn't tell that I was right about to understand what they said. I hadn't yet figured out that there was just a delay sometimes.
Wait, so now the fact that the brain tries to predict future inputs at all (which is not exactly news, btw, it's been known for a long time) suddenly means that's all the brain does?
There are a lot of times when you're reading stuff that really does sound like the human equivalent of an LLM's output, but that is bad - you are not supposed to do it. A certain degree of that is necessary to write with good grammar but you are supposed to control your "tongue" (which is how previous generations would have phrased it) with the rest of your faculties.
There's one thing you forgot: we only have some model of how a brain might work. The model will only stand as long as we don't find a better model. That's how science works.
At some point, though, the difference between the model and reality falls within a negligible error margin - particularly within a practical everyday context. Like, Newton's theory of gravity isn't perfect, but for most things it's pretty much good enough.
Similarly if LLMs can be used to model human intelligence, and predict and manipulate human behaviour, it'll be good enough for corporations to exploit.
Your prediction is that we are close. That prediction is founded on your assertion that we aren't missing anything substantive or new in that error margin: and that assertion is circular.
If you are correct about LLMs being a generally complete model, then that is a good prediction. But only if you are correct.
I think brain == LLM is only approaching true in the clean, "rational" world of academia. The internet now amplifies this. IMHO it is not possible to make something perfectly similar to our own image in a culture that has taken to feeding upon itself. This sort of culture makes extracting value from it much, much more difficult. I think we map the model of our understanding of how we understand things onto these "AI" programs. That doesn't count for much. We have so much more than our five senses, and I fully believe that we were made by God. We might come close to something that fulfills a great number of conditions for "life", but it will never be truly alive.
A model that matches part of the brain should not be treated as if it models all of the brain.
What I see you doing here is personifying the model, and drawing conclusions from the personification.
There is more to how we interact with language than prediction of repetition. You didn't predict anything I have said so far! Yet we are both interacting with the language.
We didn't just model LLMs after our brains, either. We pointed them at examples of thought, all neatly organized into the semantic relationships of grammar and story.
Don't ignore the utility of language: it stores behavior, objectivity, and interest.
The "just" in your comment doesn't follow from the article. There is no evidence that there is nothing other than "predicting the next word" in the brain. It may be a part but not the only part.
Predicting words != LLM. There are different methods of doing it; current LLMs are not necessarily the most optimal method. The paper states this as well:
> This computational organization is at odds with current language algorithms, which are mostly trained to make adjacent and word-level predictions (Fig. 1a)
I feel like you're suggesting because humans != LLMs then humans cannot be doing next word prediction.
You have been shamelessly self-promoting your Hopf algebra/deep learning research on a very large percentage of posts I have seen on HN lately, to the degree that I actually felt the need to log in so as to be able to comment on it. Please. Stop.