And of course it will never improve as people work on it and invest in it? I do think this is more incremental than revolutionary, but progress continues to be made, and it's very possible that Bing and Google opening up a chatbot war over GPT models, with the further investment and development that brings, will be seen as a turning point.
There's a difference between working on something until it's a viable, usable product and throwing out trash while trying to sell it as gold. It's the difference between Apple developing self-driving cars in secret because they want to get it right vs. Tesla doing it with the public, on public roads, and killing people.
In its current state, Bing ChatGPT should not be anywhere near end users. Imagine it going on an unhinged depressive rant when a kid asks where their favorite movie is playing...
Maybe one day it will be usable tech, but as with self-driving cars, I'm skeptical. There are way too many people wrapped up in the hype around this tech. It feels like self-driving tech circa 2016 all over again.
Imagine it going on a rant when someone’s kid is asking roundabout questions about depression or SA and the AI tells them in so many words to kill themselves.
I have to say, I'm really enjoying this future where we shit on the AIs for being too human, and having depressive episodes.
This is a timeline I wouldn't have envisioned, and am finding it delightful how humans want to have it both ways. "AIs can't feel, ML is junk", and "AIs feel too much, ML is junk". Amazing.
I think you're mixing up concerns from different contexts. AI as a generalized goal, where there are entities that we recognize as "like us" in quality of experience, yes, we would expect them to have something like our emotions. AI as a tool, like this Bing search, we want it to just do its job.
Really, though, this is the same standard that we apply to fellow humans. An acquaintance who expresses no emotion is "robotic" and maybe even "inhuman". But the person at the ticket counter going on about their feelings instead of answering your queries would also (rightly) be criticized.
It's all the same thing: choosing appropriate behavior for the circumstance is the expectation for a mature intelligent being.
Well, that's exactly the point: we went from "AIs aren't even intelligent beings" to "AIs aren't even mature" without recognizing the monumental shift in capability. We just keep yelling that they aren't "good enough", while the goalposts for "enough" keep moving.
I'm glad to see this comment. I'm reading through all the nay-saying in this post, mystified. Six months ago these complaints would have read like science fiction, because what chatbots could do at the time was absolutely nothing like what we see today.
No, the goalposts are different according to the task. For example, Microsoft themselves set the goalposts for Bing at "helpfully responds to web search queries".
Who is "we"? I suspect that you're looking at different groups of people with different concerns and thinking that they're all one group of people who can't decide what their concerns are.
AI is a real-world example of Zeno's paradox. Getting to 90% accuracy is where we've been for years, and that's Uncanny Valley territory. Getting to 95% accuracy is not "just" another 5%; that phrasing makes it sound like it's only about 6% more work on top of getting to 90%. What you're actually doing is cutting the error rate in half, which is really difficult. So 97% isn't 2% harder than 95%, or even 40% harder; it's almost twice as hard.
The long tail is an expensive beast. If you used Siri or Alexa as much as they'd like you to, every user would run into one ridiculous answer per day. There's a psychology around failure clusters that leads people to claim a failure mode happens "all the time", and I've seen that kick in around the twice-a-week to once-a-day range. There's another around clusters that happen when the stakes are high, where the characterization becomes even more unfair. There are others around Dunbar numbers: public policy changes when everyone knows someone who was affected.
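To make that arithmetic concrete, here's a tiny Python sketch of the two claims above: moving up in accuracy means shrinking the error rate by a factor, not adding a few points, and even a high-accuracy assistant hands a heavy user roughly one bad answer a day. The accuracy and usage numbers are mine, picked purely for illustration.

    # Rough arithmetic behind "cutting the error rate in half" and
    # "one ridiculous answer per day". Illustrative numbers, not measurements.

    def difficulty_factor(acc_from: float, acc_to: float) -> float:
        """Factor by which the error rate must shrink to move between accuracies."""
        return (1 - acc_from) / (1 - acc_to)

    print(difficulty_factor(0.90, 0.95))  # 2.0   -> errors must be cut in half
    print(difficulty_factor(0.95, 0.97))  # ~1.67 -> most of another halving

    # Expected bad answers per day for a hypothetical heavy assistant user.
    queries_per_day = 30
    error_rate = 0.03                     # i.e. a 97%-accurate assistant
    print(queries_per_day * error_rate)   # ~0.9, about one per day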
I think this is starting to look accurate. The sudden progress of AI is more of an illusion. It is most readily apparent in the field of image generation: if you stand back far enough, the images look outstanding. However, any close inspection reveals small errors everywhere, because the AI doesn't actually understand the structure of things.
The same holds for data, it's just not as easily perceptible at first, since you sometimes have to know the domain to realize just how bad it is.
I've seen some online discussions starting to emerge that suggest this is indeed an architectural flaw in LLMs. That would imply fixing it is not just around the corner, but a significant effort that might even require rethinking the approach.
> but a significant effort that might even require rethinking the approach.
There’s probably a Turing award for whatever comes next, and for whatever comes after that.
And I don't think AI will replace developers at any rate. All it might do is show us how futile some of the work we get saddled with is: a new kind of framework for dealing with the sorts of things management believes are important but that actually have a high material cost relative to the value they provide. We all know people who are good at talking, and some of them are good at talking people into unpaid overtime. That's how they make the numbers work, by chewing developers up and spitting them out. Until we get smart and say no.
I don't think it's an illusion, there has been progress.
And I also agree that the AI-like thing we have is nowhere near AGI.
And I also agree about rethinking the approach. The problem here is that human intelligence is deeply entwined with, and optimized for, the problems of living things. Before we had humanlike intelligence we had "do not get killed" and "do not starve" intelligence. The general issue is that AI doesn't have these concerns, which causes a set of alignment issues between human behavior and AI behavior. AI doesn't have any "this causes death" filter inherent to its architecture, so we'll try to tack one on, poorly, and wonder why it fails.
My professional opinion is that we should be using AI like Bloom filters: can we detect whether the expensive calculation needs to be made or not? A 2% error rate in that situation is just an opex issue, not a publicity nightmare.
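A minimal sketch of the gating pattern I mean, with every name in it hypothetical (the stub scorer, the threshold, the stand-in expensive path): a cheap, imperfect model decides whether the expensive computation runs, and the threshold is set so its mistakes mostly cost extra compute rather than producing a wrong answer a user ever sees.

    # Using a model like a Bloom filter: a cheap, imperfect scorer gates an
    # expensive computation. All names and numbers here are hypothetical.
    import random

    GATE_THRESHOLD = 0.1  # tuned low so "skip" mistakes stay rare; most errors
                          # just trigger unnecessary expensive work (an opex cost)

    def cheap_score(item: str) -> float:
        """Stand-in for a small, fast model estimating P(expensive work is needed)."""
        return random.random()

    def expensive_calculation(item: str) -> str:
        """Stand-in for the slow path: a big model, a full scan, a human review."""
        return f"full result for {item!r}"

    def handle(item: str):
        if cheap_score(item) < GATE_THRESHOLD:
            return None                       # almost certainly not needed
        return expensive_calculation(item)    # pay the real cost only when flagged

    print(handle("user query"))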
Yes, didn't mean to imply there is no progress, just that some perceive that we are all of a sudden getting close to AGI from their first impressions of ChatGPT.
It's incremental between GPT-2, GPT-3, and ChatGPT. For people in the know, it's clearly incremental. For people out of the know, it's completely revolutionary.
That's usually how these technological paradigm shifts work. E.g., the iPhone was an incremental improvement on previous handhelds but blew consumers away.
It coalesced a bunch of tech that nobody had put into a single device before, and added a few things no one had seen before. The tap-to-zoom and the accelerometer are, IMO, what sold people. When the 3G came out with substantial battery-life improvements, it was off to the races.
At this point I'm surprised the Apple Watch never had its 3G version: better battery, slightly thinner. I still believe a millimeter or two would make more of a difference in sales than adding a glucose meter.
If haters talked about chefs the way they do about Apple, we'd think they were nuts. "Everyone's had eggs and sugar in food before, so boring."
Yeah I think iPhone is a very apt analogy: certainly not the first product of its kind, but definitely the first wildly successful one, and definitely the one people will point to as the beginning of the smartphone era. I suspect we'll look back on ChatGPT in a similar light ten years from now.