But an LLM isn't even trying to simulate cognition; it's a model that predicts language. It has all the problems of a predictive model... the
"hallucination" problem is just the tyranny of Lorenz.
This is plainly wrong because it mixes concepts. A language, technically, is an object in the Chomsky hierarchy, and predicting a language means being able to tell whether an input is valid or invalid. LLMs can do that, but they also build a statistical model over all valid inputs, and that distribution is not just the language.
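The distinction the comment draws can be sketched in a few lines. This is an illustrative toy, not anything an LLM actually does: the "language" here is the even-parity binary strings, `in_language` is a membership check (the Chomsky-hierarchy sense of recognizing a language), and `toy_model_score` is a hypothetical unnormalized score showing that a statistical model carries strictly more information than valid/invalid.

```python
import re

def in_language(s):
    """Membership check: is s a valid string of the toy language?

    The toy language: binary strings containing an even number of 1s.
    This is all that "recognizing a language" requires.
    """
    return bool(re.fullmatch(r"[01]*", s)) and s.count("1") % 2 == 0

def toy_model_score(s):
    """A toy statistical model over the same language.

    Assigns an (unnormalized, illustrative) score favoring shorter
    strings. Two strings can both be valid yet get different scores,
    which is information the membership check alone cannot express.
    """
    if not in_language(s):
        return 0.0
    return 0.5 ** (len(s) + 1)
```

Both functions agree on what is in the language, but only the model ranks valid strings against each other.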
>> Predicting language is being able to tell if input is valid or invalid.
If this were the case then the hallucination problem would be solvable.
That hallucination problem is not only going to be hard to detect in any meaningful way; it's going to be even harder to eliminate. The very nature of LLMs (mixing in noise via the sampling temperature) means they always risk going off the rails. This is the same thing Lorenz discovered in modeling weather...
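The "noise aka temperature" mechanism referred to above can be sketched as standard temperature sampling over a model's logits. This is a minimal illustration with made-up logits, not any particular model's implementation: the temperature divides the logits before the softmax, so at higher temperatures low-probability tokens get sampled more often, which is where the "going off the rails" risk comes in.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng=random):
    """Sample a token index from logits scaled by temperature.

    temperature <= 0 is treated as greedy argmax; higher temperatures
    flatten the distribution, giving unlikely tokens more weight.
    """
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

# Hypothetical logits: token 0 is strongly preferred by the model.
logits = [5.0, 1.0, 0.5]
```

At temperature 0 this always picks token 0; at temperature 2.0, repeated sampling will also produce tokens 1 and 2 with nontrivial frequency.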
I don't think the "hallucination problem" is worth addressing separately from just building bigger/better models that do the same thing, because 1) it is present in humans, and 2) bigger models clearly have less of it than smaller ones. If nothing changes at scale, LLMs will eventually just hallucinate less than humans.