> Those are semantic relationships with the rest of the world.

If this is all that counts as a semantic relationship, then I see no reason why a language model doesn't have this kind of semantic relationship, albeit one realized in a very different modality. Tokens and their co-occurrences are a kind of sensor to the world. In the same way that we discover quantum mechanics by induction over indirect relationships among the signals incident to our perceptual apparatus (the sensors and actuators that translate external signals into internal signals), a language model could learn much about the world by induction over token co-occurrences. Sure, there are limits, conscious perception of the world being the big one, but I see no reason to think conscious perception of X is required to know or understand X.




> I see no reason why a language model doesn't have this kind of semantic relationship

One certainly could hook up a language model to sensors and actuators to give it semantic relationships with the rest of the world. But nobody has done this. And giving it semantic relationships of the same order of complexity and richness that human brains have is an extremely tall order, one I don't expect anyone to come anywhere close to doing any time soon.

> Tokens and their co-occurrences are a kind of sensor to the world

They can be a kind of extremely low-bandwidth, low-resolution sensor, yes. But for that to ground any kind of semantic relationship to the world, the model would need the ability to frame hypotheses about what this sensor data means, and test them by interacting with the world and seeing what the results were. No language model does that now.
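To make concrete what "frame hypotheses and test them" would amount to, here is a deliberately tiny sketch. The world, the switch, and the hypothesis are all invented for illustration, and no current language model runs anything like this loop; the point is only the shape of it: predict what an intervention should do, perform it, compare.

    # Toy sketch: an agent grounds a hypothesis by intervening on a simulated
    # world and checking the outcome against its prediction. Everything here
    # is invented for illustration.

    class World:
        """Hidden rule the agent cannot read directly: light is on iff switch is up."""
        def __init__(self):
            self._switch_up = False

        def flip_switch(self):
            self._switch_up = not self._switch_up

        def read_light_sensor(self):
            return self._switch_up

    world = World()

    # Hypothesis: "flipping the switch toggles the light."
    before = world.read_light_sensor()
    world.flip_switch()                  # intervene on the world
    after = world.read_light_sensor()    # observe the consequence
    confirmed = (after != before)        # did the outcome match the prediction?
    print("hypothesis confirmed" if confirmed else "hypothesis refuted")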


So much rides on your implicit notion of a semantic relationship, but that dependence needs demonstration. The fact that some pattern of signals on my perceptual apparatus is caused by an apple in the real world does not mean that I have knowledge or understanding of an apple in virtue of this causal relation. That my sensory signals are caused by apples is an accident of this world, one we are completely blind to. If all apples in the world were swapped for fapples (fake apples), such that every sensory experience that has up to now been caused by an apple is now caused by a fapple, we would be none the wiser.

The semantics (i.e., wide content) of our perceptual experiences is irrelevant to literally everything we know and to how we interact with the world. Our knowledge of the world is limited to our sensory experiences and the deductions, inferences, etc. we derive from them. Our situatedness in the world is only relevant insofar as it determines the space of our sensory experiences.

> the model would need the ability to frame hypotheses about what this sensor data means, and test them by interacting with the world and seeing what the results were.

Why do we need to actively test our model to come to an understanding of the world? Yes, that is how we biological organisms happen to learn about the world, but it is not clear that it is required. Language models learn by developing internal models that predict the next token, and learning to predict implicitly builds representations of the processes that generate the tokens. There is no in-principle limit to the resolution of this model given a sufficiently large and diverse training set.
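A toy illustration of that last point (nothing here is anyone's actual training code; the corpus and tokenization are made up for the example): even a bigram counter, the crudest possible next-token predictor, ends up encoding structure about whatever process generated the tokens.

    # Toy sketch, not anyone's actual training code: a bigram counter as the
    # crudest possible next-token predictor. The corpus is made up.
    from collections import defaultdict

    corpus = "the apple fell from the tree . the apple was red .".split()

    # Count how often each token follows each other token (co-occurrence stats).
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1

    def next_token_distribution(prev):
        """Maximum-likelihood estimate of P(next token | previous token)."""
        total = sum(counts[prev].values())
        return {tok: c / total for tok, c in counts[prev].items()}

    # The model's "picture" of apples is just this induced predictive structure.
    print(next_token_distribution("apple"))  # {'fell': 0.5, 'was': 0.5}
    print(next_token_distribution("the"))    # 'apple' ~0.67, 'tree' ~0.33

A real language model replaces the counting with a network trained to minimize next-token prediction loss, but the inductive principle, structure recovered purely from co-occurrence statistics, is the same.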



