I compared GPT-4 Turbo against my previous tests on GPT-4, and the results are quite interesting. GPT-4 Turbo is better at arithmetic: it makes fewer errors when multiplying four-digit numbers, and significantly fewer with five-digit numbers. Its error rate on five-digit numbers is still high, but much lower than GPT-4's rate on four-digit numbers. Multiplying floats of the form XX.MMMM and YYY.ZZZZ produces errors in the 5th digit, which is an order of magnitude better than GPT-4.
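In case anyone wants to rerun this kind of comparison, the harness is roughly the following sketch (not my exact script; `ask_model` is a hypothetical stand-in for whatever API call you make to each model):

```python
import random

def error_rate(ask_model, digits: int, trials: int = 100) -> float:
    """Fraction of n-digit multiplications a model gets wrong.
    `ask_model` is a hypothetical stand-in: any function that sends
    a prompt to GPT-4 / GPT-4 Turbo and returns the reply text."""
    errors = 0
    for _ in range(trials):
        a = random.randint(10 ** (digits - 1), 10 ** digits - 1)
        b = random.randint(10 ** (digits - 1), 10 ** digits - 1)
        reply = ask_model(f"What is {a} * {b}? Reply with only the number.")
        try:
            wrong = int(reply.strip().replace(",", "")) != a * b
        except ValueError:
            wrong = True  # unparseable reply counts as an error
        errors += wrong
    return errors / trials

# e.g. compare error_rate(gpt4, 5) vs error_rate(gpt4_turbo, 5)
```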
But that's exactly the point I was getting at: the fact that it merely "improves" with slightly larger numbers yet still fails at really big ones shows that it isn't "reasoning" about math in a logical way.
For example, once you teach a grade schooler the basic process for addition, they can add two 30-digit numbers correctly fairly easily (whether they want to do it or not is a different story). The fact that LLMs still make errors on larger numbers suggests that they're not really "learning" the rules of arithmetic.
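To be concrete, the "basic process" is just digit-by-digit addition with a carry; a few lines of Python capture the entire rule set a grade schooler learns, and nothing about it changes between 3 digits and 30:

```python
def schoolbook_add(a: str, b: str) -> str:
    """Add two non-negative decimal numbers digit by digit,
    right to left, carrying exactly as taught in grade school."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry, digits = 0, []
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))

x = "123456789012345678901234567890"
y = "987654321098765432109876543210"
assert schoolbook_add(x, y) == str(int(x) + int(y))  # works at any length
```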
Of course it isn't. It approximates. I bet you'd get better results by increasing the depth of the network, since each additional layer yields a more accurate approximation. I have an idea for achieving this without significantly increasing the number of layers, and I'm currently working on it as a side project. However, the idea might prove to be useless after all, as it requires training the model from scratch with a lot of synthetic data mixed in. Experiments on small models look promising, but they're too small to be conclusive, and I can't afford to train a larger model from scratch for a side project.
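If anyone wants to poke at the depth claim on a toy scale, here's roughly the kind of throwaway experiment I mean (a sketch with made-up hyperparameters, not my actual project): train MLPs of different depths to approximate multiplication and compare their test error.

```python
import torch
import torch.nn as nn

def make_mlp(depth: int, width: int = 64) -> nn.Sequential:
    """MLP mapping (a, b) -> a*b; the hypothesis is that deeper
    nets approximate the product surface more accurately."""
    layers, d_in = [], 2
    for _ in range(depth):
        layers += [nn.Linear(d_in, width), nn.ReLU()]
        d_in = width
    layers.append(nn.Linear(d_in, 1))
    return nn.Sequential(*layers)

def train_and_eval(depth: int, steps: int = 2000) -> float:
    torch.manual_seed(0)
    model = make_mlp(depth)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        x = torch.rand(256, 2)                    # synthetic data: a, b in [0, 1)
        y = (x[:, 0] * x[:, 1]).unsqueeze(1)
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():                         # held-out test error
        x = torch.rand(4096, 2)
        y = (x[:, 0] * x[:, 1]).unsqueeze(1)
        return nn.functional.mse_loss(model(x), y).item()

for depth in (1, 2, 4):
    print(depth, train_and_eval(depth))
```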
Isn't it actually just impossible for it to do this well on arbitrarily large inputs, even from a computational complexity point of view, if it doesn't know it's allowed to do step-by-step multiplication (addition is maybe OK)? A transformer spends a fixed amount of compute per output token, so without writing out intermediate steps there's always an input size beyond its reach. I'm not sure this is a criticism of its ability to reason. It's similar to asking someone to do addition in 5 seconds with no paper: of course at some point they won't be able to do it for a large enough number. BTW, I strongly disagree that the average grade schooler can add two 30-digit numbers without making a mistake, even with paper.
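To put the complexity point concretely: schoolbook long multiplication of two n-digit numbers takes on the order of n² single-digit steps, which is exactly the work being skipped when you demand a one-shot answer. A quick sketch that counts those steps:

```python
def long_multiply(a: str, b: str) -> tuple[str, int]:
    """Schoolbook long multiplication of decimal strings; also counts
    single-digit multiplies to show the work grows quadratically."""
    steps, total = 0, 0
    for i, db in enumerate(reversed(b)):       # one partial row per digit of b
        row, carry = 0, 0
        for j, da in enumerate(reversed(a)):
            prod = int(da) * int(db) + carry
            row += (prod % 10) * 10 ** j
            carry = prod // 10
            steps += 1
        total += (row + carry * 10 ** len(a)) * 10 ** i
    return str(total), steps

for n in (4, 5, 30):
    a = "9" * n
    product, steps = long_multiply(a, a)
    assert product == str(int(a) ** 2)
    print(f"{n}-digit inputs: {steps} single-digit multiplies")
    # 4 -> 16 steps, 5 -> 25 steps, 30 -> 900 steps
```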
It isn't fair to expect an LLM to do arithmetic itself. It should be able to delegate to various specialized sub-processors; I don't think we humans really do anything different.
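Something in this spirit, as a toy sketch (the regex router here is obviously a stand-in for a real function-calling layer deciding when to hand off):

```python
import re

def exact_arithmetic(expression: str) -> str:
    """Specialized sub-processor: evaluate +, -, * on integers exactly.
    Toy version: strict left-to-right, no operator precedence."""
    tokens = re.findall(r"\d+|[+\-*]", expression)
    result = int(tokens[0])
    for op, num in zip(tokens[1::2], tokens[2::2]):
        n = int(num)
        result = result + n if op == "+" else result - n if op == "-" else result * n
    return str(result)

def answer(question: str) -> str:
    """Toy router: hand arithmetic to the sub-processor instead of
    asking the language model to 'intuit' the digits."""
    match = re.search(r"[\d+\-* ]{3,}", question)
    if match and any(op in match.group() for op in "+-*"):
        return exact_arithmetic(match.group())
    return "(defer to the LLM for everything else)"

print(answer("What is 48315 * 92734?"))  # 4480443210, exact every time
```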