Most people, as far as I'm aware, don't have an issue with the idea that LLMs produce behaviour which gives the appearance of reasoning as far as we understand it today. Which essentially means: they make sentences that are grammatical, responsive, and contextual based on what you said (quite often). It's at least pretty cool that we've got machines to do that, most people seem to think.
The issue is that there might be more to reason than appearing to reason. We just don't know. I'm not sure how it's apparently so unknown or unappreciated by people in the computer world, but there are major unresolved questions in science and philosophy around things like thinking, reasoning, language, consciousness, and the mind. No amount of techno-optimism can change this fact.
The issue is we have not gotten further than more or less educated guesses as to what those words mean. LLMs bring that interesting fact to light, even providing humanity with a wonderful nudge to keep grappling with these unsolved questions, and perhaps make some progress.
To be clear, they certainly are sometimes passably good at selectively and responsively summarising the terabytes and terabytes of data they've been trained on, and I am enjoying that new thing in the world. And if you want to define reasoning like that, feel free.
If it displays the outward appearance of reasoning then it is reasoning. We don't evaluate humans any differently. There's no magic intell-o-meter that can detect the amount of intelligence flowing through a brain.
Anything else is just an argument of semantics. The idea that there is "true" reasoning and "fake" reasoning but that we can't tell the latter apart from the former is ridiculous.
You can't eat your cake and have it. Either "fake reasoning" is a thing that can be distinguished from the real kind, or it can't be, and it's just a made-up distinction.
If I have a calculator with a look-up table of all additions of natural numbers under 100, the calculator can "appear" to be adding despite the fact it is not.
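A minimal sketch of that thought experiment, in Python (the names and the under-100 bound are mine, purely illustrative). Real addition is used only to populate the table for brevity; in the thought experiment the table is simply given.

    # Look-up table of every sum with both operands under 100.
    # Real addition appears here only to fill the table; the "calculator"
    # itself never computes anything at query time.
    LOOKUP = {(a, b): a + b for a in range(100) for b in range(100)}

    def lookup_add(a, b):
        # "Adds" purely by table look-up; no arithmetic happens here.
        try:
            return LOOKUP[(a, b)]
        except KeyError:
            raise ValueError(f"no entry for {a} + {b}: operands must be under 100")

    print(lookup_add(47, 52))   # 99, indistinguishable from real addition
    print(lookup_add(150, 3))   # raises ValueError: the appearance breaks down here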
Yes, indeed. Bullets know how to fly, and my kettle somehow knows that water boils at 373.15K! There's been an explosion of intelligence since the LLMs came about :D
This argument would hold up if LLMs were large enough to hold a look-up table of all possible valid inputs they can correctly respond to. They're not.
Until you ask it to add numbers above 100 and it falls apart. That is the point here. You found a distinction. If you can't find one, then you're arguing semantics. People who say LLMs can't reason have yet to find a distinction that doesn't also disqualify a bunch of humans.
I guess you don't follow TCEC, or computer chess generally[0]. Chess engines have been _playing chess_ at superhuman levels using neural networks for years now, it was a revolution in the space. AlphaZero, Lc0, Stockfish NNUE. I don't recall yards of commentary arguing that they were reasoning.
Look, you can put as many underscores as you like, the question of whether these machines are really reasoning or emulating reason is not a solved problem. We don't know what reasoning is! We don't know if we are really reasoning, because we have major unresolved questions regarding the mind and consciousness[1].
These may not be intractable problems either; there's reason for hope. In particular, studying brains with more precision is obviously exciting there. More computational experiments, including the recent explosion in LLM research, are also great.
Still, reflexively believing in the computational theory of the mind[2] without engaging in the actual difficulty of those questions, though commonplace, is not reasonable.
[0] Jozarov on YT has great commentary of top engine games, worth checking out.
I am not implying that LLMs are conscious or something. Just that they can reason, i.e. draw logical conclusions from observations (or, in their case, textual inputs), and make generalizations. This is a much weaker requirement.
Chess engines can reason about chess (they can even explain their reasoning). LLMs can reason about many other things, with varied efficiency.
What everyone is currently trying to build is something like AlphaZero (adversarial self-improvement for superhuman performance) with the state space of LLMs (general enough to be useful for most tasks). When we have that, we'll have AGI.