>The actual wide narrative is that the current language models hallucinate and lie
So do people. And ChatGPT is a whole lot smarter than most people I know. It's funny that we've blown the Turing test out of the water at this point, and people are still claiming it's not enough.
The comparison is not against "most people" though. When we search the web we usually want an answer from an "expert" not some random internet poster. If you compare ChatGPT even to something like WebMD, well, I'll trust the latter over ChatGPT in an instant.
It's no better for other domains either. It can give programming advice, but it's often wrong in important ways and so no, I'd rather have the answer verified by an actual developer who knows whatever technology I'm asking about.
And finally, when you talk about what is "enough", I'd ask "for what?" That's what people in this thread are saying: ChatGPT is not enough for the majority of what people wish it to be, but it may be enough for some tasks, such as a creative writing aid or other human-in-the-loop tasks.
Also, ChatGPT isn't smart. It is very good at stringing words together, a facility which, when over-developed in humans, is not generally termed "smart". We reserve that sort of term for the capacity to reason, something ChatGPT has absolutely no capability for.
You and I live in a rarefied world. It's far better at writing than most people I know, and I suspect it's (sadly) better at logic/math than many people, which, I fully admit, isn't saying much.
They are being integrated into the most widely used information retrieval systems (search engines). It's not enough that they are "smarter than most people"; they have to always be correct when the question asked of them has a definitive answer, otherwise they are just another dangerous avenue for misinformation.
Yes, not all questions have definitive answers, which is fine; then you can argue that they are better than going to the smartest human you know, and that might be enough. Although I personally would still disagree with this argument, since I think it's better that the answer provided is "I don't know".
We have not blown the Turing Test out of the water. I guarantee you that out of two conversations, I can tell which one is ChatGPT and which is human 95%+ of the time. (even leaving aside cheap tricks like asking about sensitive topics and getting the "I am a bot!" response)
The Turing test originated in the 1950s. The goal posts haven't moved much in 70 years. The development of these new language models is revealing that, as impressive as the models are at generating language, it is possible that the Turing test was misconceived if the goal was to identify AGI.
At the time, it was inconceivable that a program could interact the way that ChatGPT does (or the way that Dall-E does) without AGI. We now know that this is not the case, and that means that it might finally be time to recognize that the Turing test, while a brilliant idea at the time, doesn't actually differentiate in the way that we want to.
70 years without moving the goal posts is, frankly, pretty good.
Au contraire, the whole history of AI is one of moving goal posts. One professor I worked with quipped that a field is called AI only so long as it remains unsolved.
Logic arguments and geometric analogies were once considered the epitome of human thinking. They were the first to fall. Computer vision, expert systems, complex robotic systems, and automated planning and scheduling were all Turing-hard problems at some point. Even Turing thought that chess was a domain which required human intellect to master, until Deep Blue. Then it was assumed Go would be different. Even in the realm of chat bots, Eliza successfully passed the Turing test when it was first released. Most people who interacted with it could not believe that there was a simple algorithm underlying its behavior.
> One professor I worked with quipped that a field is called AI only so long as it remains unsolved.
Not just one professor you worked with, this has been a common observation across the field for decades.
But the deeper debate about this is absolutely not about moving goal posts, it is about research revealing that our intuitions were (and thus likely still are) wrong. People thought that very conscious, high-cognition tasks like playing chess likely represented the high water mark of "intelligence". They turned out to be wrong. Ditto for other similar tasks.
There have been people in the AI field, for as long as I've been reading pop-sci articles and books about it, who have cautioned against these sorts of beliefs, but they've generally been ignored in favor of "<new approach> will get us to AGI!". It didn't happen for "expert systems", it didn't happen for the first round of neural nets, it didn't happen for the game-playing systems, it didn't happen for the schedulers and route creators.
The critical thing that has been absent from all the high-achieving approaches to AI (or some subset of it) thus far is that the systems do not have a generalized capacity for learning (both cognitive learning and proprioceptive learning). We've been able to build systems that are extremely good at a task; we have failed (thus far) at building systems which start out with limited abilities and grow (exponentially, if you want to compare it with humans and other animals) from there. Some left-field AI folks would also say that the lack of embodiment hampers progress towards AGI, because actual human/animal intelligence is almost always situated in a physical context, and that for humans in particular, we manipulate that context ahead of time to alter the cognitive demands we will face.
Also, most people do not accept that Eliza passed the Turing test. The program was a good model of a Rogerian psychotherapist, but could not engage in generalized conversation (without sounding like a relentlessly monofocal Rogerian psychotherapist, to a degree that was obviously non-human). The program did "fool" people into feeling that they were talking to a person, but in a highly constrained context, which violates the premise of the Turing test.
Anyway, as is clear, I don't think that we've moved the goal posts. It's just that some hyperactive boys (and they've nearly all been boys) got over-excited about computer systems capable of doing frontal lobe tasks and forgot about the overall goal (which might be OK, if they did not make such outlandish claims).