> The fact that it can produce working code from a prompt in some cases shows rudimentary non-trivial reasoning.
It doesn't at all. It indicates that it read stackoverflow at some point, and that on a particular user run, it replayed that encoded knowledge. (I'd also argue it shows the banality of most React tutorials, but that's perhaps a separate issue.)
Quite a lot of these impressive achievements boil down to: "Isn't it super cool that people are smart and put things on the internet that can be found later?!"
I don't want to trivialize this stuff because the people who made it are smarter than I will ever be and worked very hard. That said, I think it's valid for mere mortals like myself to question whether or not this OpenAI search engine is really an advancement. It also grates on me a bit when everybody who has a criticism of the field is treated like a know-nothing Luddite. The first AI winter was caused by disillusionment with industry claims vs reality of what could be accomplished. 2020 is looking very similar to me personally. We've thrown oodles of cash and billions of times more hardware at this than we did the first time around, and the most use we've gotten out of "AI" is really ML: classifiers. They're super useful little machines, but they're sensors when you get right down to it. AI reality should match its hype, or it should have less hype (e.g. not implying GPT-3 understands how to write general software).
Assertions aren't particularly useful in this discussion. Nothing you said supports your claim that GPT-3 doesn't show any capacity for reasoning. The fact that GPT-3 can create working strings of source code from prompts it (presumably) hasn't seen before means it can compose individual programming elements into a coherent whole. If it looks like a duck and quacks like a duck, then it just might be a duck.
Here's an example of rudimentary reasoning I saw from GPT-2, in the context of a company that fine-tuned GPT-2 for code completion (a made-up example, but it captures the gist of the response):
[if (variable == true) {
    print("this sentence is true")
}
else] {
    print("this sentence is false")
}
Here's an example I tested using talktotransformer.com:
[If cars go "vroom" and my Ford is a car then my Ford] will also go "vroom"...
The bracketed parts were the prompt. If this isn't an example of rudimentary reasoning then I don't know what is. If your response is that this is just statistics, then you'll have to explain how the workings of human brains aren't ultimately "just statistics" at some level of description.
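To make explicit what rule that completion is instantiating: it's universal instantiation plus modus ponens ("all cars go vroom", "my Ford is a car", therefore "my Ford goes vroom"). Here's a toy forward-chaining sketch of that rule in Python; the names and fact strings are mine for illustration, not anything GPT-2/3 actually runs internally:

# Premise 1: for all x, car(x) -> vroom(x)
# Premise 2: car(my_ford)
# Conclusion the model completed: vroom(my_ford)
facts = {"car(my_ford)"}
rules = [("car", "vroom")]  # car(x) implies vroom(x)

def forward_chain(facts, rules):
    # Apply every rule to every matching fact until nothing new is derived.
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            for fact in list(derived):
                if fact.startswith(antecedent + "("):
                    argument = fact[len(antecedent) + 1:-1]
                    new_fact = consequent + "(" + argument + ")"
                    if new_fact not in derived:
                        derived.add(new_fact)
                        changed = True
    return derived

assert "vroom(my_ford)" in forward_chain(facts, rules)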
> working strings of source code from prompts it (presumably) hasn't seen before
I'm saying that "presumably" is wrong, especially on what it was: a simple React program. It would not surprise me if the amount of shared structure and text in the corpus is all over the place.
This can be tested by making more and more sophisticated programs in different languages and seeing how often it returns the correct result. I don't really care, because it can't reliably do basic arithmetic if the numbers are in different ranges. That is a dead giveaway that it hasn't learned the fundamental structure. If it hasn't learned that, it hasn't learned programming.
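Here's a rough sketch of that kind of probe, assuming some complete(prompt) function that wraps the model (the wrapper and the prompt format are placeholders of mine, not anything OpenAI ships): generate pairs of n-digit numbers, ask for the sum, and watch the accuracy as n grows.

import random

def arithmetic_probe(complete, digits, n_trials=100):
    # Ask the model to add two `digits`-digit numbers n_trials times and
    # return the fraction it gets right. `complete` is assumed to be any
    # prompt -> completion-text function; it is a placeholder here.
    correct = 0
    for _ in range(n_trials):
        a = random.randrange(10 ** (digits - 1), 10 ** digits)
        b = random.randrange(10 ** (digits - 1), 10 ** digits)
        answer = complete("Q: What is " + str(a) + " plus " + str(b) + "?\nA:")
        if str(a + b) in answer:
            correct += 1
    return correct / n_trials

# If the claim holds, accuracy falls off sharply once the numbers leave the
# ranges that show up constantly in web text, e.g.:
# for d in range(1, 7):
#     print(d, arithmetic_probe(my_model_wrapper, d))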
The examples are not really that impressive either. They are boolean logic. That a model like this can do copy-pasta and encode simple boolean logic and if-else is... well... underwhelming. Stuff like that has been happening for a long time with these models, and no one has made claims that the models were "reasoning".
The React programming example isn't the only example of GPT-3 writing code. There was an example of it writing Python programs going around before they opened up the API. It was impressive, and there was no reason to think it had seen the exact examples before.
Also, it isn't the case that one needs to be perfect at algorithmic thinking to be capable of some amount of reasoning. I don't claim that GPT-3 is perfect, but that it's not just copying and pasting pieces of text it has seen before. It is coming up with new sequences of text based on the structure of the surrounding text and the prompt, in a manner that indicates it has a representation (albeit imperfect) of the semantic properties of the text and can compose them in meaningful ways. Handwavy responses do nothing to undermine the apparent novelty it creates.
> encode simple boolean logic and if-else is... well... underwhelming. Stuff like that has been happening for a long time with these models, and no one has made claims that the models were "reasoning".
Seems like you're just moving the goalposts, as always happens when it comes to AI advances. What do you take to be "reasoning" if that isn't an example of it?
Logic is sensitive to the meaning of words to a degree, so if I can pick out the context in which to apply certain deductive rules, then I know what the relevant words mean, at least to the degree that they indicate logical structure.
It's possible that a program could learn when to apply certain rules based on its own if-then statements and bypass understanding, but that's not the architecture of GPT-3. If it learns the statistical/structural properties of a string of text such that it can apply the correct logical transformations based on context, the default assumption should be that it has some rudimentary understanding of the logical structure.