What GPT-3 lacks, and what seems to be crucial, is a consistent view of the world.
For example, if you ask me "what is the capital of France?" I will answer "Paris" with 100% probability. No matter how you phrase the question, there is simply no other answer that can be given, if I understand what you are asking.
It turns out that humans, despite their reputation, are actually very good at avoiding cognitive dissonance.
Whereas language models are very bad at it. We don't even think of GPT-3 as having cognitive dissonance because there is not a gram of consonance in there.
There are certainly contexts in which that question can be answered differently. Say the actual question was "what is the [cheese] capital of France?", where the bracketed word is simply evident from context. It's about more than having a consistent view of the world: you have a fundamentally better conceptual understanding that allows you to generalize the idea of political capitals to something unrelated like cheese and still understand the question. You also have a mental model of what the other person is thinking, so you know that "capital" probably isn't talking about cheese. GPT-3 doesn't have the former, and only knows the latter by statistical inference.
> For example, if you ask me "what is the capital of France?" I will answer "Paris" with 100% probability. No matter how you phrase the question, there is simply no other answer that can be given, if I understand what you are asking.
Context matters too. I could say “the letter F” is the capital of France, and it’s also true but not what you wanted.
Maybe that’s why GPT-3 is so funny occasionally. Comedians are professionals in cognitive dissonance, and every once in a while the million monkeys of the algorithm hit comedy gold.
If you ask people what the capitals of Turkey and Australia are, you might hear Istanbul and Sydney surprisingly often. There seems to be some probabilistic inference in our brain that the largest city in a country should also be the capital.
On the other hand, if you ask the same person what the capital of Turkey is in a hundred different ways, you'll get the same answer (possibly mistaken) every time. They may have a wrong internal model of country capitals, but all of their language communication works from that same model. A GPT-3-like model, by contrast, approaches each question as a separate probabilistic inference without ever attempting to "make up its mind" and settle on an internally consistent mental model, which is probably one of the key gaps that needs fixing to bring it up a level.
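To make that concrete, here is a toy, self-contained sketch of the "make up its mind" test (no real model involved; the responders and their answer distributions below are made up for illustration). A human-like responder maps every phrasing of the question to one internal belief, while a sampler-like responder draws an answer independently each time:

```python
import random
from collections import Counter

# Ask the same factual question in several phrasings, many times each.
PARAPHRASES = [
    "What is the capital of Turkey?",
    "Turkey's capital city is called what?",
    "Which city serves as the capital of Turkey?",
    "Name the capital of Turkey.",
] * 25

def human_like(question: str) -> str:
    # One internal (possibly wrong) belief, reused for every phrasing.
    return "Istanbul"

def sampler_like(question: str) -> str:
    # Each question gets an independent draw; no persistent belief.
    return random.choices(["Ankara", "Istanbul"], weights=[0.7, 0.3])[0]

def consistency(responder) -> float:
    """Fraction of answers that agree with the most common answer."""
    answers = [responder(q) for q in PARAPHRASES]
    top_count = Counter(answers).most_common(1)[0][1]
    return top_count / len(answers)

print("human-like consistency:  ", consistency(human_like))    # always 1.0
print("sampler-like consistency:", consistency(sampler_like))  # roughly 0.7
```

The point isn't accuracy (the human-like responder here is consistently wrong), it's that one responder has a single model it always answers from, while the other re-rolls the dice on every phrasing.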
> if you ask me "what is the capital of France?" I will answer "Paris" with 100% probability. No matter how you phrase the question, there is simply no other answer that can be given
There are many capitals of France, depending on the time period.
I would say the implied modifier only applies to that particular phrasing, but "no matter how you phrase the question" presumes a wide variety of phrasings whose implications may be very different.
The thing is, GPT-3 isn't a fact database, it's a storyteller. So depending on how you phrase the input, it may tell you vastly different stories. There's no reason to expect it to consistently reference the fact that Paris is the capital of France in every iteration of the story.
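A toy sketch of that "storyteller" point: the answer distribution a sampler draws from depends on the framing of the prompt, not on a single stored fact. No real model here; the per-framing distributions below are invented purely for illustration.

```python
import random

# Hypothetical answer distributions a completion model might induce per framing.
FRAMING_DISTRIBUTIONS = {
    "trivia quiz":       {"Paris": 0.95, "Lyon": 0.05},
    "cheese guidebook":  {"Roquefort-sur-Soulzon": 0.6, "Paris": 0.4},
    "alternate history": {"Versailles": 0.5, "Vichy": 0.3, "Paris": 0.2},
}

def sample_answer(framing: str) -> str:
    dist = FRAMING_DISTRIBUTIONS[framing]
    return random.choices(list(dist), weights=list(dist.values()))[0]

for framing in FRAMING_DISTRIBUTIONS:
    print(framing, "->", [sample_answer(framing) for _ in range(5)])
```

Each framing is its own story, with its own likely continuations, and nothing forces the stories to agree with each other.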