
> I'm not sure about its math, but GPT-4 fails miserably at simple arithmetic questions like 897*394=?

That's, um, about 300,000?

...

353,418 actually. But I'm not going to blame the AI too much for failing at something I can't do either.




One can resort to traditional vertical multiplication (which requires patience), or do

897*394 = (900-3) * (400-6) = 900*400 - 6*900 - 400*3 + 3*6 = 360,000 - (5,400 + 1,200) + 18 = 360,018 - 6,600 = 353,418
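The rounding trick above can be checked mechanically. Here is a minimal Python sketch; the function name `multiply_near_round` and its parameters are made up for illustration and are not part of the original comment:

```python
def multiply_near_round(a, b, round_a, round_b):
    """Compute a*b by expanding (round_a - da) * (round_b - db).

    round_a and round_b are nearby "nice" numbers chosen by the caller
    (an assumption of this sketch), e.g. 900 for 897 and 400 for 394.
    """
    da = round_a - a  # distance of a from its round number (3 for 897)
    db = round_b - b  # distance of b from its round number (6 for 394)
    # (R_a - da)(R_b - db) = R_a*R_b - db*R_a - da*R_b + da*db
    return round_a * round_b - db * round_a - da * round_b + da * db


result = multiply_near_round(897, 394, 900, 400)
print(result)
```

The expansion is exact for any choice of round numbers; picking ones close to the inputs just keeps the correction terms small enough to do in your head.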


   8*3=24 and 800*300 =240000
   8*9=72 and 800* 90 = 72000
   8*4=32 and 800*  4 =  3200
   9*3=27 and  90*300 = 27000
   9*9=81 and  90* 90 =  8100
   9*4=36 and  90*  4 =   360
   7*3=21 and   7*300 =  2100
   7*9=63 and   7* 90 =   630
   7*4=28 and   7*  4 =    28
   --------------------------
                       353418
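The nine partial products in the table above can be generated and summed with a short Python sketch. The helper names (`place_values`, `partial_products`) are hypothetical, and the code assumes non-negative integers:

```python
def place_values(n):
    """Split a non-negative integer into place-value terms, e.g. 897 -> [800, 90, 7]."""
    digits = str(n)
    return [int(d) * 10 ** (len(digits) - i - 1) for i, d in enumerate(digits)]


def partial_products(a, b):
    """Return every place-value product of a and b, in the same order as the table."""
    return [x * y for x in place_values(a) for y in place_values(b)]


parts = partial_products(897, 394)
total = sum(parts)
print(parts)
print(total)
```

Summing the list reproduces the bottom line of the table, since multiplication distributes over the place-value terms.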


But you are smart enough to use a computer or calculator. And AI is a computer. So the naive expectation would be that it would be capable of doing as well as a computer.

Also, you probably could do long multiplication with paper and pencil if you needed to. So a reasoning AI (which has read many many descriptions of how to do long multiplication) should be able to also.


> And AI is a computer. So the naive expectation would be that it would be capable of doing as well as a computer.

Why would you judge an AI against the expectations of a naive person who doesn't understand capabilities AIs are likely to have? If an alien came down to earth and concluded humans weren't intelligent because the first person it met couldn't simulate quantum systems in their head, would that be fair?


The original question was whether LLMs are "smart" in a human-like way. I think that if you gave a human a computer, he'd be able to solve 3-digit multiplications. If LLMs were human-like smart, they could do this too.


Did someone train LLMs with "access" to a computer? If not, why would you expect them to be able to use something they have never seen?


“It’s right there, you stupid llm! Dammit, YOU’RE RUNNING ON IT!”


I mean, I'm running on incredible amounts of highly complex physics and maths, but that doesn't mean I can give you the correct answer to all questions on those.


I dunno, I simulate quantum systems (you, myself, my friends) in my head all the time


An AI is a program running on a computer.

Minecraft runs on a computer too, but you don't expect the Minecraft NPCs to be able to do math.

So it's a very naive assumption.

Most people struggle with long multiplication despite not only having learnt the rules, but having had extensive reinforcement training in applying the rules.

Getting people conditioned to stay on task for repetitive and detail-oriented tasks is difficult. There's little reason to believe it'd be easier to get AIs to stay on task, in part because there's a tension between wanting predictability and wanting creativity and problem solving. Ultimately I think the best solution is the same as for humans: tool use. Recognise that the effort required to do some things "manually" is not worth it.


> But you are smart enough to use a computer or calculator. And AI is a computer. So the naive expectation would be that it would be capable of doing as well as a computer.

I disagree. The AI runs on a computer, but it isn't one (in the classical sense). Otherwise you could reduce humans the same way - technically our cells are small (non-classical) computers, and we're made up of chemistry. Yet you don't expect humans to be perfect at resolving chemical reactions, or computing complex mathematics in their heads.


They can reason through it; they just sometimes make mistakes along the way, which is not surprising. More relevant to your comment is that if you give GPT-4 a calculator, it'll use it in these cases.
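To make "give it a calculator" concrete, here is a minimal Python sketch of what the calculator side of such a setup might look like. This is purely an illustration: `calculator_tool` is a hypothetical name, it is not any vendor's actual tool interface, and it only handles basic arithmetic on numeric literals.

```python
import ast
import operator

# Operators this hypothetical calculator is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator_tool(expression):
    """Evaluate a plain arithmetic expression such as "897*394".

    Anything other than numbers, + - * /, unary minus, and parentheses
    is rejected with ValueError instead of being executed.
    """
    def evaluate(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -evaluate(node.operand)
        raise ValueError("unsupported expression")

    return evaluate(ast.parse(expression, mode="eval").body)


print(calculator_tool("897*394"))
```

The model's job in that arrangement is only to emit the expression string; the exact arithmetic is done by ordinary code.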


I am indeed smart enough to do that. And so is the AI, if you use the right AI. (I.e., Code Interpreter.)


I've got an engineer's mindset for these kinds of calculations.

897 is about 900. 394 is about 400. 900×400 = 360,000. Only 2% error!
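For anyone who wants to check the size of that error, a minimal Python sketch (variable names are arbitrary):

```python
exact = 897 * 394      # the true product
estimate = 900 * 400   # both factors rounded to one significant figure

# Relative error of the estimate, measured against the exact value.
relative_error = (estimate - exact) / exact
print(f"{relative_error:.2%}")
```

The estimate overshoots because both factors were rounded up, and the relative error comes out a little under 2%.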



