I think you're committing an as-yet-unnamed logical fallacy that I see a lot on HN. I'm going to call this "the logical fallacy of uncharitable interpretation." The basic premise is this:
Take an assertion like X implies Y. Unfortunately, due to the imprecision of language, X may have multiple definitions, e.g. D1 and D2, such that D1 implies Y but D2 does not. The fallacy of uncharitable interpretation is to read X under a definition where the claim does not hold, even though another definition exists under which it would.
In this case, large ML algorithms are commonly called AI. Even the company that made GPT-3 is called OpenAI. Even you yourself referred to them as "AI researchers." What do "AI researchers" produce if not AI? Finally, "talk" can colloquially mean to use words in a prompt-response pattern.
I'm not going to downvote you because I think you're respectful and thoughtful, but I want to vociferously disagree. These clickbait blogs know what they're doing. There's been so much contention about what is and isn't artificial intelligence (including the Google weirdo who thought it was sentient earlier this year) that just about anyone semi-educated on the subject knows how GPT and related models work. We know it's smoke and mirrors; we know it's plagiarizing and combining other, third-party materials half the time.
She's not "having a conversation with her younger self" and this is incredibly disingenuous and misleading, per @mavu's point. More importantly, it doesn't really educate the uneducated or move the technology forward in any meaningful way. It's pure garbage clickbait nonsense. I regret giving them the impression.
Agreed. The glut of AI text-generation programs all capitalize on the misrepresentation of what they do, because "AI" holds different meanings in academic and social spheres, and it's better to have a sci-fi marketing hook.
"we know it's plagiarizing and combining other, third-party, materials half the time" does not seem like an accurate description of what a model like GPT3 does either, though, since:
1) No explicit content (that could be plagiarized or combined) is actually stored by GPT3: only statistical relationships between tokens are retained. You can't point at a specific subset of weights and say "see... this is where phrase/idea X is stored". (See the toy sketch after this list for an illustration.)
2) GPT3 makes no specific claim to be generating "original, own work" (which would be required for something to be considered "plagiarism")
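To make point 1 concrete, here's a toy sketch in Python (my own illustration, not GPT-3's actual transformer architecture): after "training", the original sentences are gone and only co-occurrence counts between adjacent tokens remain, yet the program can still generate familiar-looking text.

    from collections import defaultdict, Counter
    import random

    # Hypothetical toy corpus; GPT-3's real training data is vastly larger.
    corpus = "there once was a cat named pat who wore a hat".split()

    # "Training": record only how often each token follows another.
    # The raw text is then discarded; only these statistical relationships survive.
    follows = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev][nxt] += 1

    # "Generation": repeatedly sample the next token in proportion to those counts.
    def generate(start, length=10):
        out = [start]
        for _ in range(length):
            options = follows.get(out[-1])
            if not options:
                break
            tokens, weights = zip(*options.items())
            out.append(random.choices(tokens, weights=weights)[0])
        return " ".join(out)

    print(generate("there"))  # e.g. "there once was a cat named pat who wore a hat"

Whether that counts as "storing" the limerick is, at a much smaller scale, the same question people are arguing about GPT-3.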
> No explicit content (that could be plagiarized or combined) is actually stored by GPT3: only statistical relationships between tokens are retained
When people ask GPT-3 to write a rhyming poem, you see plenty of examples of GPT-3 poems starting with "There once was a cat named Pat..." This is an extremely common first line of a limerick, found anywhere from here[1] to here[2]. I'm sure those "statistical relationships" are very strong; is it plagiarism? I'll leave it up to you to decide that, but I'm willing to bet that with enough finagling you can probably get it to spit out phrases from Moby Dick.
> When people ask GPT-3 to write a rhyming poem, you see plenty of examples of GPT-3 poems starting with "There once was a cat named Pat..." This is an extremely common first line of a limerick, found anywhere from here[1] to here[2].
Knowing that a "rhyming poem" is likely to start with a specific token (or set of tokens) does not exactly constitute "plagiarism", the same way that writing a poem that starts with "There once was a cat named Pat..." is not "plagiarism" by itself: it is just adhering to expected convention/norms of a specific literary format or genre.
Is using the basic-ass I-V-vi-IV chord progression in music "plagiarism", since it has been (and is) used by countless other people before?
> I'm sure those "statistical relationships" are very strong; is it plagiarism? I'll leave it up to you to decide that
Well... my claim is that it is clearly not plagiarism (and I gave specific arguments to support my claim). If you are not interested in arguing (which is fine), then I assume you accept that your characterization of what GPT3 does as being "plagiarism" is (at the very least) overly simplistic (i.e., just as simplistic as claiming that GPT-3 is sentient or actually intelligent).
> but I'm willing to bet that with enough finagling you can probably get it to spit out phrases from Moby Dick.
If GPT-3 (or most humans, for that matter) are asked to complete the phrase "To be or not to..." and decide that the word "be" is the most likely/reasonable completion, does it mean that GPT-3 (or any human, for that matter) is "plagiarizing" Shakespeare? Or does it simply mean that they are trying to address your question/problem to the best of their capabilities (and that they have probably read a passage or two of Shakespeare, or someone paraphrasing Shakespeare, before)? In other words, the fact that you can force GPT-3 to output a specific copyrighted work (or an excerpt of it) still doesn't mean that what GPT-3 is doing should be characterized as "plagiarism".
Again, for something to technically count as "plagiarism", someone (i.e., not a computer program) has to incorrectly pass something off as their own original work, which does not seem to be the case here. That was my main point.
EDIT: if you want to be derisive of things like GPT-3, while still being accurate, it makes more sense to say things like "it is simply imitating" or "has no actual creativity" (which seem defensible to me) than things like "it is literally plagiarizing and copying what it saw before" (which seems much less defensible/accurate).
> Is using the basic-ass I-V-vi-IV chord progression in music "plagiarism", since it has been (and is) used by countless other people before?
This is not at all what's happening here. Complete red herring.
> If GPT-3 (or most humans, for that matter) are asked to complete the phrase "To be or not to..." and decide that the word "be" is the most likely/reasonable completion, does it mean that GPT-3 (or any human, for that matter) is "plagiarizing" Shakespeare?
The short answer is yes (imho), but let me put it this way: does GPT-3 know that when it's regurgitating "to be or not to be" it's actually regurgitating Shakespeare? My argument is that no, it does not know, precisely because to the model this just happens to be a very strong statistical correlation between strings of words, when in fact it's a very famous phrase by a very famous person. So, in a way, it's "accidentally" plagiarizing, but plagiarizing nonetheless. Like if, for whatever reason, I had heard the phrase "it was the best of times, it was the worst of times" somewhere but couldn't remember where, my ignorance wouldn't preclude me from technically plagiarizing Charles Dickens if I blatantly reused the phrase without attribution.
> things like "it is literally plagiarizing and copying what it saw before" (which seems much less defensible/accurate).
This is literally what it's doing, though, under the guise of "statistical correlation." In fact, I've read reports of people using GPT-3-adjacent models who needed to add filters specifically to keep verbatim training data out of the output.
The same way that you disagree when someone stretches the meaning of the word "talk" to encompass what GPT-3 does, I also disagree when you try to stretch the meaning of the word "plagiarism" to encompass what GPT-3 does (and I've explained exactly why: GPT-3 generates sequences of tokens, but makes no specific claim about the originality of the generated sequences of tokens).
We can agree to disagree, if you can't accept that "plagiarism" literally involves more than just "copyright infringement" or "replicating someone else's work from statistical correlations" or anything along those lines: it must also involve fraud or some other form of misrepresentation.
Even if the headline isn't accurate by Hacker News technical standards, don't underestimate the power of believing this kind of experience.
For example, "What Happened to Make You Anxious?" by Jaime Castillo proposes a similar technique for addressing anxiety. The technique goes like this:
Think of a traumatic moment in your past, where you wish you could go back in time and comfort or offer yourself advice. Then imagine yourself stepping into the frame of your memory and giving your younger self the support you needed in that moment (but lacked). This might sound woo-woo, but the efficacy is documented.
Sure, GPT-generated text isn't actually equivalent to talking to yourself - but you can see the utility for therapeutic applications like this.
>Think of a traumatic moment in your past, where you wish you could go back in time and comfort or offer yourself advice. Then imagine yourself stepping into the frame of your memory and giving your younger self the support you needed in that moment (but lacked). This might sound woo-woo, but the efficacy is documented.
This is how the Internal Family Systems psychology technique deals with past trauma and it has been around since the early 1980s. No AI required. To go that deep though, it helps to have a therapist trained in the technique to guide the process.
Yeah, I'm torn here, because without a trained therapist supervising, this technique can easily trigger a spiral.
But health care costs in the US are extraordinarily steep. Mental health care is often out-of-network, if it's covered at all.
I have a hunch that market demand for AI-guided "mental health wellness exercises" will outpace regulations or ethics. Calm would be the best example of this kind of demand. I wonder if they're talking about this internally, and what line is being drawn.
Therapist costs tend to be capped in a way most US medical care isn’t.
Seeing someone once a week without insurance is on the order of $3,000 to $7,000 per year depending on area. Not trivial, but something most people could budget for even if they seek treatment for much of their life.
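A back-of-envelope check, with a per-session price that's my assumption rather than a figure from anywhere in this thread:

    ~$60 to ~$135 per weekly session × 52 weeks ≈ $3,100 to $7,000 per year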
It’s of course possible to spend more than that, especially in the short term. However, the major costs tend to be people being unable to work or take care of themselves rather than blockbuster drugs, emergency care, or surgery.
If you're upper-middle class or wealthier, I think you're right that the cost can be budgeted.
I just Googled the median US rent ($1,771). Based on that, if you frame the expense as an added 2 to 4 months' rent per year, I'm guessing there are a lot of people forgoing therapy because the immediate cost is too high.
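For what it's worth, that framing lines up with the parent comment's range:

    $3,000 / $1,771 ≈ 1.7 months of rent
    $7,000 / $1,771 ≈ 4.0 months of rent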
I wish I could choose where my taxes were allocated. I'd subsidize the cost of therapy.
To really drive this home, one of my own memories is my mom (ashamed, embarrassed, depressed) asking to borrow money from me to pay our bills. I was 9. She had issues, I inherited a few, and I've only recently been able to drop 5-10 dimes on therapy each year to sort it out. Money well spent!
But I wonder how my mom would've fared if even an AI facsimile of therapy had been cheaply available to her, instead of other cheap coping mechanisms. Churches and drugs are the cheapest coping tools available in the US.
I've heard this sentiment before (that AI != ML, always weirdly hostile) but I've also heard key figures in the AI world say strongly that ML of any kind is a subset of the broader AI umbrella. Are you sure it is strictly wrong to refer to this example as AI? Also why are you using such strong words? I'm genuinely curious why there is so much emotion when people, maybe, misuse these terms.
He's going for a middlebrow dismissal. This is like the "bitcoin isn't money!" or "twitter is not a serious medium for communication!" oldschoolism that some HN traditionalists defend in spite of real-world application. In the end it doesn't uncover a salient point but gets stuck on semantics to shut down an entire idea.
You're not 100% wrong, but I'm not trying to shut down an entire idea of AI.
Let me put it this way: it's called AI, artificial intelligence. Are trees intelligent?
Because they are a clump of cells that manages to achieve amazing results: growing to be some of the largest living things on the planet, extracting nutrients and transporting them 20 or more meters above ground, where they are used to harvest energy from the sun, all the while producing offspring every year and fighting off predators.
Yet I don't think many people would call them a "natural intelligence".
I admit that I may be applying a narrow definition of "intelligence", but I think the core concept is one of "understanding".
And we are not even close to an ML algo actually understanding anything.
And this is the problem: it masks the inherent shortcomings of ML.
People are given the impression that applications that use ML actually do what they are expected to do because, like a person, you train them and then they understand their job and do it.
This is NOT HOW THAT WORKS. The ML algo does not understand that it is asked to identify oncoming traffic. It does not know that it is looking at cancer cells that will soon kill someone.
And even worse, we humans who make these things are 100% unable to understand the models we create. We can feed them data and compare the results, but that is it. There is no real way to understand how they work in detail.
They get used anyway, with predictable results; see Tesla Autopilot for a prominent example.
Also, I call them AI researchers for the same reason I call Nuclear Fusion researchers that, not because they are doing it, but because they are researching it.
I'm not the OP, but have been a data person long enough to hazard a guess.
If you work as an engineer or scientist, the term "AI" is basically a synonym for "unrealistic executive expectations." That's a super triggering and stressful situation to be in, especially early in your career.
I've actually had a CEO describe expected output of my ML team as "magic AI shit" - I bet you can imagine the team's reaction and tone. I'm reading the same strong emotions and frustration here.
The good news is you can always course-correct expectations with communication. I've come to love talking about AI with people who are only somewhat technical, because their wildest dreams are sometimes totally doable with some duct tape and fine-tuning.
The way I and many others use the term "AI" is based on the understanding that AI is an umbrella term that includes modern deep learning as well as the broader field of machine learning, and also the other pre-machine-learning types of AI such as symbolic AI, expert systems, etc.
If the stuff running on Lisp Machines was referred to as AI, I don't see why GPT-3 can't be labelled as such.
To me, "an AI" implies an autonomous agent, even if it is dumb. But many of those you listed are techniques for AI, not AIs themselves. The same goes for these ML models: most are techniques or components suitable for AI, not AIs themselves.
I'd consider these language models AI if they initiated conversations themselves or used conversation as part of a plan to achieve some objective. But so far, game bots are more deserving of the title than these trendy language models and painting generators, IMO.
I dunno. ELIZA is not an AI, and doesn't understand anything, but colloquially it makes sense to say people chatted with ELIZA. https://en.wikipedia.org/wiki/ELIZA
I know this is a technical forum, but you are allowed to appreciate the subjective meanings an artist ascribes to technological output, as well as the sentiments it provoked.
It is her performing "talking to herself." She's as much "talking to herself" as if she had written and then performed a play about talking to herself. The computer/AI aspect of it is "just" a prop for the performance.
But there is also a broader absence (destruction?) of our classical understanding of thought and of other domains such as classical objective reality and the Platonic domain of ideas.
Not yet sure if this is some weird postmodernist view or intentional dumbing down; maybe two sides of the same coin?
But at least from what I can observe there is little push to teach people how to think.
Stuff a lot simpler than GPT-3 has been called AI by people much more accomplished than you for a lot longer than you've been alive. The term is not ambiguous, "Intelligence" does not exclusively mean human-level or human-like intelligence, and trying to redefine it now would only cause confusion.
Do you have some principled definition of what AI is or what it means to "understand"? GPT-3 outputs certainly show enough high-level non-scripted information processing to qualify as "understanding" in my book – not human understanding, clearly, but an artificial one.
AI researchers do in fact educate the public. [1] They face serious pushback from obscurantists like Hofstadter [2] and Marcus who refuse to commit to any definition and instead go by hunches about what architecture or data representation can support "understanding" and what certain behaviors indicate.
I hate this complete and utter misrepresentation of facts.
GPT is not an AI. It does not talk. It certainly does not understand.
I wish AI researchers would spend some effort to educate the public instead of bathing in all the attention.