There is a general result in machine learning known as "the bitter lesson"[1], w...

qsort · on Nov 23, 2023

As someone with a CS background myself, I don't think this is what GP was talking about.

Let's forget for a moment that stuff has to run on an actual machine. If you had to represent a quadratic equation, would you rather write:

(a) x^2 + 5x + 4 = 0

(b) the square of the variable plus five times the variable plus four equals zero

When you are trying to solve problems with a level of sophistication beyond the toy stuff you usually see in these threads, formal language is an aid rather than an impediment. The trajectory of every scientific field (math, physics, computer science, chemistry, even economics!) is away from natural language and towards formal language, even before computers, precisely for that reason.

We have lots of formal languages (general-purpose programming languages, logical languages like Prolog/Datalog/SQL, "regular" expressions, configuration languages, all kinds of DSLs...) because we have lots of problems, and we choose the representation of the problem that most suits our needs.

Unless you are assuming you have some kind of superintelligence that can automagically take care of everything you throw at it, natural language breaks down when your problem becomes wide enough or deep enough. In a way this is like people making Rube-Goldberg contraptions with Excel. 50% of my job is cleaning up that stuff.

seanhunter · on Nov 23, 2023

I quite agree and so would Wittgenstein, who (as I understand it) argued that precise language is essential to thought and reasoning[1]. I think one of the key things here is often what we think of as reasoning boils down to taking a problem in the real world and building a model of it using some precise language that we can then apply some set of known tools to deal with. Your example of a quadratic is perfect, because of course now I see (a) I know right away that it's an upwards-facing parabola with a line of symmetry at -5/2, that the roots are at -4 and -1 etc whereas if I saw (b) I would first have to write it down to get it in a proper form I could reason about.

I think this is a fundamental problem with the "chat" style of interaction with many of these models (that the language interface isn't the best way of representing any specific problem even if it's quite a useful compromise for problems in general). I think an intrinsic problem of this class of model is that they only have text generation to "hang computation off" meaning the "cognative ability" (if we can call it that) is very strongly related to how much text it's generating for a given problem which is why that technique of prompting using chain of thought generates much better results for many problems[2].

[1] Hence the famous payoff line "whereof we cannot speak, thereof we must remain silent"

[2] And I suspect why in general GPT-4 seems to have got a lot more verbose. It seems to be doing a lot of thinking out loud in my use, which gives better answers than if you ask it to be terse and just give the answer or to give the answer first and then the reasoning, both of which generally generate inferior answers in my experience and in the research eg https://arxiv.org/abs/2201.11903

qsort · on Nov 23, 2023

> I quite agree and so would Wittgenstein

It depends on whether you ask him before or after he went camping -- but yeah, I was going for an early-Wittgenstein-esque "natural language makes it way too easy to say stuff that doesn't actually mean anything" (although my argument is much more limited).

> I think this is a fundamental problem with the "chat" style of interaction

The continuation of my argument would be that if the problem is effectively expressible in a formal language, then you likely have way better tools than LLMs to solve it. Tools that solve it every time, with perfect accuracy and near-optimal running time, and critically, tools that allow solutions to be composed arbitrarily.

Alpha Go and NNUE for computer chess, which are often cited for some reason as examples of this brave new science, would be completely worthless without "classical" tree search techniques straight out of the Russel-Norvig.

Hence my conclusion, contra what seems to be the popular opinion, is that these tools are potentially useful for some specific tasks, but make for very bad "universal" tools.

vintermann · on Nov 23, 2023

There are some domains that are in the twilight zone between language and deductive, formal reasoning. I've been into genealogy last year. It's very often deductive "detective work": say there are four women in a census with the same name and place that are listed on a birth certificate you're investigating. Which of them is it? You may rule one out on hard evidence (census suggests she would have been 70 when the birth would have happened), one on linked evidence (this one is the right age, but it's definitively the same one who died 5 years later and we know the child's mother didn't), one on combined softer evidence (she was in a fringe denomination and at the upper end of the age range) then you're left with one, etc.

Then as you collect more evidence you find that the age listed on the first one in the census was wildly off due to a transcription error and it's actually her.

You'd think some sort of rule-based system and database might help with these sorts of things. But the historical experience of expert system is that you then often automate the easy bits at the cost of demanding even more tedious data-entry. And you can't divorce data entry and deduction from each other either, because without context, good luck reading out a rare last name in the faded ink of some priest's messy gothic handwriting.

It feels like language models should be able to help. But they can't, yet. And it fundamentally isn't because they suck at grade school math.

Even linguistics, not something I know much about but another discipline where you try to make deductions from tons and tons of soft and vague evidence - you'd think language models, able to produce fluent text in more languages than any human, might be of use there. But no, it's the same thing: it can't actually combine common sense soft reasoning and formal rule-oriented reasoning very well.

igleria · on Nov 23, 2023

> You'd think some sort of rule-based system and database might help with these sorts of things.

sounds like belief change systems (a bit) to me!

https://plato.stanford.edu/entries/logic-belief-revision/

ben_w · on Nov 23, 2023

I assumed seanhunter was suggesting getting the LLM to convert x^2 + 5x + 4 = 0 to a short bit of source code to solve for x.

IIRC Wolfram Alpha has (or had, hard to keep up) a way to connect with ChatGPT.

seanhunter · on Nov 23, 2023

It does. This is the plugins methodology described in the toolformers paper which I've linked elsewhere[1]. The model learns that for certain types of problems certain specific "tools" are the best way to solve the problem. The problem is of course it's simple to argue that the LLM learns to use the tool(s) and can't reason itself about the underlying problem. The question boils down to whether you're more interested in machines which can think (whatever that means) or having a super-powered co-pilot which can help with a wide variety of tasks. I'm quite biased towards the second so I have the wolfram alpha plugin enabled in my chat gpt. I can't say it solves all the math-related hallucinations I see but I might not be using it right.

[1] But here it is again https://arxiv.org/abs/2302.04761

vidarh · on Nov 23, 2023

GPT4 does even without explicitly enabling plugins now, by constructing Python. If you want it to actually reason through it, you now need to ask it, sometimes fairly forcefully/in detail, before it will indulge you and not omit steps. E.g. see [1] for the problem given above.

But as I noted elsewhere, training its ability to do it from scratch matters not for the ability to do it from scratch, but for the transferability of the reasoning ability. And so I think that while it's a good choice for OpenAI to make it automatically pick more effective strategies to give the answer it's asked for, there is good reason for us to still dig into its ability to solve these problems "from scratch".

[1] https://chat.openai.com/share/694251c9-345b-4433-a856-7c38c5...

Jeff_Brown · on Nov 23, 2023

Ideally we'd have both worlds -- but if we're aiming for AGI and we have to choose, using a language that lets you encode everything seems preferable to one that only lets you talk about, say, constrained maximization problems.

wegfawefgawefg · on Nov 23, 2023

the ml method doesnt require you to know how to solve the problem at all, and could someday presumably develop novel solutions. not just high efficiency symbolic graph search.

omnicognate · on Nov 23, 2023

The bitter lesson isn't a "general result". It's an empirical observation (and extrapolation therefrom) akin to Moore's law itself. As with Moore's law there are potential limiting factors: physical limits for Moore's law and availability and cost of quality training data for the bitter lesson.

rcarr · on Nov 24, 2023

Surely the "efficiency" is just being transferred from software to hardware e.g the hardware designers are having to come up with more efficient designs, shrink die sizes etc to cope with the inefficiency of the software engineers? We're starting to run into the limits of Moore's law in this regard when it comes to processors, although it looks like another race might be about to kick off for AI but with RAM instead. When you've got to the physical limits of both, is there anywhere else to go other than to make the software more efficient?

patrick451 · on Nov 23, 2023

When you say "a general result", what does that mean? In my world, a general result is something which is rigorously proved, e.g., the fundamental theorem of algebra. But this seems to be more along the lines of "we have lots of examples of this happening".

I'm certainly no expert, but it seems to me that Wolfram Alpha provides a counterexample to some extent, since they claim to fuse expert knowledge and "AI" (not sure what that means exactly). Wolfram Alpha certainly seems to do much better at solving math problems than an LLM.

seanhunter · on Nov 24, 2023

As someone else pointed out I've used that term wrong. Rule of thumb/observation you might better say.

rgavuliak · on Nov 23, 2023

I would mention, that while yes, you can just throw computational power at the problem, the addition of human expertise didn't disappear. It moved from creating an add instruction, to coming up with a new Neural Net Architecture, and we've seen a lot of the ideas being super useful and pushing the boundaries.