Hacker News

There’s definitely a similarity with us, in that you need to have been trained on enough data to build up that predictive ability.

Language models are just missing some component that we have. The method for deciding what to output is wrong. People aren’t just guessing the next sound. It’s like they said, there are multiple levels of thought and prediction going on.

It needs some sort of scratch pad where it keeps track of states/goals: “I’m writing a book”, “I want to make this character scary”.
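A toy sketch of that scratch-pad idea, in Python. Everything here is hypothetical illustration (the `ScratchPad` class and `next_sentence` function are made up, not any real model's API): the point is just that goals live in a store separate from the token stream, and each generation step consults it.

```python
# Hypothetical sketch: a persistent store of states/goals that
# survives across generation steps, separate from the text itself.
class ScratchPad:
    def __init__(self):
        self.goals = []

    def add_goal(self, goal):
        self.goals.append(goal)

    def active_goals(self):
        return list(self.goals)


def next_sentence(pad):
    # At each step the generator consults the pad, not just the
    # previous tokens, to steer what comes next.
    if "I want to make this character scary" in pad.active_goals():
        return "A floorboard creaked behind her, though she lived alone."
    return "She poured her coffee."


pad = ScratchPad()
pad.add_goal("I'm writing a book")
pad.add_goal("I want to make this character scary")
```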

Currently it only predicts the next tokens, with the entire text so far as its context, but that’s not how we work. I’m not deciding what to say based exactly on the entire text so far; I’m extracting features and then using those features as context.

e.g. She looks sad, but she’s saying she’s fine, and it’s to do with death, because my memory says her dad died recently. So the key features to use for generation are: she is sad, her dad died, she may not want to talk about it.
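That example can be sketched as a two-stage pipeline: reduce the full conversational state to a handful of features, then generate from those features rather than from the raw transcript. This is a toy illustration with made-up rules and function names, not a description of how any actual language model works.

```python
# Hypothetical two-stage pipeline: feature extraction, then generation
# conditioned only on the extracted features.
def extract_features(utterance, memory):
    """Reduce the full state (words + memory) to a few key features."""
    features = []
    if "fine" in utterance and memory.get("observed_mood") == "sad":
        features.append("she is sad despite saying she's fine")
    if memory.get("recent_loss"):
        features.append("her dad died recently")
        features.append("she may not want to talk about it")
    return features


def generate_reply(features):
    """Generate from the features, not from the entire text so far."""
    if "she may not want to talk about it" in features:
        return "I'm here if you ever want to talk. No pressure."
    return "Glad to hear you're doing fine!"


memory = {"observed_mood": "sad", "recent_loss": True}
features = extract_features("I'm fine, really.", memory)
reply = generate_reply(features)
```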




