HN AGI discourse is full of statements like this (e.g. all the stuff about stochastic parrots), but to me this seems massively non-obvious. Mimicking and rephrasing pre-written text is very different from conceiving of and organizing information in new ways. Textbook authors are not simply transcribing their grad-school notes into a book and selling it. They are surveying a field, prioritizing its knowledge content for an intended audience, organizing that information based on their own experience with and opinions on the learning process, and presenting it in a way that engages the audience. LLMs are a long way off from this latter behavior, as far as I can tell.
> The best language models (eg GPT-4) have some understanding of the world
This is another statement I see variants of a lot, but which seems to greatly overstate the case. IMO it's like saying that a linear regression "understands" econometrics or a system of coupled ODEs "understands" epidemiology; it's at best an abuse of terminology and at worst a complete misapplication of the term. If I take a picture of a page of a textbook, the resulting JPEG is "reproducing" the text, but it doesn't understand the content it's presenting to me in any meaningful way. Sure, it has primitives with which it can store the content, but human understanding covers a far richer set of behaviors than merely storing/compressing training inputs. It implies being able to generalize and extrapolate the digested information to novel situations, effectively growing one's own training data. I don't see that behavior in GPT-4.
Hey, I wanted to reply to this, but just didn't find the time. I disagree with a lot of what you wrote, but your arguments are stated clearly. I appreciate that. Cheers :)
Yes, LLMs currently only deal with text. But GPT-5 will supposedly be multimodal, so it will also have visual and audio data to associate with many of the concepts it currently only knows as words. How many more modalities will we need to give it before we can say that it understands something?
Also, GPT-4 indeed doesn't do any additional training in real time. However, it is being trained on the interactions people have with it, and near-future models will most likely be able to train themselves continuously, so that's another step closer to how we function.
I took the liberty of asking GPT-4 to take the second statement and your response to it and turn it into a fable:
"Once upon a time, in a village nestled by a grand mountain, lived a wise Sage. The Sage was known throughout the lands for his vast knowledge and understanding, for he had spent his life studying the texts of old and the secrets of the world.
In the same village, an Artisan, skilled in the craft of making extraordinary mirrors, lived. These were no ordinary mirrors, for they were said to reflect not just physical appearances, but also knowledge and experiences. Intrigued by the wisdom of the Sage, the Artisan decided to make a mirror that would reflect the Sage's knowledge.
After many days and nights of meticulous work, the Artisan finally crafted a mirror so clear and pure, it held a reflection of all the knowledge and wisdom the Sage had gathered throughout his life. The mirror could answer questions about the world, cite ancient texts, and reflect the wisdom it was imbued with.
Word quickly spread about the Sage's Mirror, and villagers began to claim, "This mirror is as wise as the Sage himself! It knows and understands as much as he does!"
However, a wise Old Woman of the village, known for her insightful observations, gently corrected them, "This mirror, as remarkable as it is, contains a reflection of the Sage's knowledge. It can share what the Sage knows but doesn't truly understand the way the Sage does."
The Old Woman continued, "The Sage has spent years learning, pondering, and experiencing life, which the mirror cannot replicate. The Sage's understanding implies the ability to think, reason, and learn in ways that the mirror, no matter how complete its reflection, simply cannot. The mirror's reflection is static, a snapshot of a moment in time, while the Sage's wisdom continues to grow and adapt."
The villagers learned a valuable lesson that day. They realized the mirror was an extraordinary tool that held vast knowledge, but it was not a substitute for the genuine understanding and wisdom of the Sage."
- Not too bad for a mirror.
I'd be interested to hear what you think is so special about human understanding. We also just absorb a lot of data, make connections and inferences from it, and spit it out when prompted, or spontaneously due to some kind of cognitive loop. Most of it happens subconsciously, and if you stop to observe it, you may notice that you have no control over what your next conscious thought will be. We do have a FEELING that we associate with the cognitive event of understanding something, though, and I think many of us are prone to read a lot more into that feeling than is warranted.