
Exactly this. I too find this to be the best intuition for LLMs right now: they're not comparable to an entire combined human mind - they're comparable to the subconscious, or the inner voice (as in, the part of your subconscious that interfaces with your conscious mind using language - a.k.a. "the voice in your head", if you have one).

So, as you say, if we had as much training as those LLMs, we'd be similarly good at coding by gut feel, with barely a conscious thought - and that's across pretty much any domain and technology that exists today. Compare with generic LLMs: a typical adult will be quite adept at saying somewhat coherent things on autopilot when prompted (!), which is reasonable given nearly two decades of constant exposure to natural language, written and spoken - but that same adult will be nowhere near as good at this as GPT-4, and definitely not across so many different domains.




"Exactly this. I too find this to be the best intuition for LLMs right now: they're not comparable to an entire combined human mind - they're comparable to subconscious, or inner voice"

Strongly disagree.

An LLM traps you inside an intellectual bell curve.


What is your take then? And please don't say "stochastic parrot" or "hype train".


I view it as long-form autocomplete.


> I view it as long-form autocomplete.

My wife sometimes views me as long-form autocomplete, and sometimes as a spelling and grammar checker. Hell, my reply to your comment here is indistinguishable from a "long-form autocomplete".

Point being: that autocomplete has to work in some way. Our LLM autocompletes have been getting better and better at zero-shot completion of arbitrary long-form text, including arbitrary simulated conversations with a simulated human, without a commensurate increase in complexity or resource utilization. This means they're getting better and better at compressing their training data - but in the limit, what is the difference between compression and understanding? I can't prove it formally, but I rather strongly believe they are, fundamentally, the same thing.

Also: if it walks like a duck, quacks like a duck, swims like a duck, ducks like a duck, and is indistinguishable from a duck on any possible test you can think of or apply to it, then maybe your artificial faux-duck effectively turned into a real duck?


> what is the difference between compression and understanding? I can't prove it formally, but I rather strongly believe they are, fundamentally, the same thing.

I'm not sure this is true in general. I feel as if I understand something when I grasp it in its entirety, not when I've been able to summarize it concisely. And conceptually I can compress something without understanding it by manually implementing compression algorithms and following their instructions by rote.

I think understanding and compression are plausibly related; one test of whether I understand something is whether I can explain it to a layperson. But I don't see how they're equivalent even asymptotically.

> then maybe your artificial faux-duck effectively turned into a real duck?

I can't really get behind this sentiment. If a language model behaves like a duck in every readily observable particular then we can substitute language models for ducks, sure. But that does not imply that a language model is a duck, and whether it even could be a duck remains an interesting and important question. I'm sympathetic to the argument that it doesn't really matter in day-to-day practice, but that shouldn't stop us from raising the question.


> But I don't see how they're equivalent even asymptotically.

You wrote:

> I feel as if I understand something when I grasp it in its entirety, not when I've been able to summarize it concisely.

But what does it mean to "grasp it in its entirety"? To me, it means you've learned the patterns that predict the thing and its behavior. That understanding lets you say "it is ${so-and-so}, because ${reason}", and also "it will do ${specific thing} when ${specific condition} happens, because ${reason}", and have such predictions reliably turn out to be true.

To me, replacing a lot of memorized observations with more general principles - more general understanding - is compression.

A simplified model: you observe pairs of numbers in some specific context. You see (1, 2) and (3, 6), then (9, 18), then (27, 54), and then some more pairs, which you quickly notice all follow a pattern:

  Pair_n = (x, y), where:
  - y = 2*x
  - x = 3^n

A thousand such pairs pass you by before they finally stop. Do you need to remember them all? It's not a big deal once you've figured out the pattern - you don't need to remember all the number pairs, you only need to remember the formula above, and that n started at 0 and ended at 999.

This is what I mean by understanding being fundamentally equivalent to compression: each pattern or concept you learn lets you replace memorizing some facts with a smaller formula (program) you can use to re-derive those facts. It's exactly how compression algorithms work.
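
To make that concrete, here's a toy sketch (Python, purely illustrative) of that same pair stream - first memorized, then compressed into the formula:

  # "Memorization": all thousand observed pairs, stored explicitly.
  # (Built with the formula here only because typing them out would be silly.)
  memorized = [(3**n, 2 * 3**n) for n in range(1000)]

  # "Understanding": a tiny program that re-derives any pair on demand.
  def pair(n):
      return (3**n, 2 * 3**n)   # x = 3^n, y = 2*x

  # Same facts either way, but one is a thousand stored observations and
  # the other is a few bytes of formula plus the range of n.
  assert all(pair(n) == memorized[n] for n in range(1000))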

And yes, in this sense, we are lossy compressors.


The devil’s in the details, or, in this case, the joint distribution between what a person would produce and what the model produces. If you came up with a way to train monkeys to write Hamlet on a typewriter, it’s still Hamlet. We’re not there yet - to the point where they consistently expand human potential for thought - but we could be, someday.


I have been thinking along the same lines. The chains of thought that arise during meditation remind me a lot of language generators.


I've had that thought for over a decade now. I've felt that my inner voice is a bit of a Markov chain generator at the border between my conscious and unconscious mind, randomly stringing thoughts together in the form of sentences (often mixed-language, to boot), and that conscious-level thinking involves evaluating those thought streams - cutting some off completely, letting others continue, or mixing them and "feeding them back" to the generator so it iterates more on those.
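
(For anyone who hasn't played with one: a word-level Markov chain generator is only a few lines of Python. This is a toy sketch of the kind of "generator" I have in mind, nothing like the real machinery.)

  import random
  from collections import defaultdict

  def train(text):
      # Remember, for each word, which words were seen to follow it.
      follows = defaultdict(list)
      words = text.split()
      for a, b in zip(words, words[1:]):
          follows[a].append(b)
      return follows

  def babble(follows, start, length=20):
      # String words together by repeatedly sampling a plausible next word.
      out = [start]
      for _ in range(length):
          candidates = follows.get(out[-1])
          if not candidates:
              break
          out.append(random.choice(candidates))
      return " ".join(out)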

Markov chains (and a lot of caching) were a good high-level working model, but quite inadequate in power when inspected in detail. Deep language models I initially ignored, as they felt more like doubling down on caching alone and building convoluted lookup tables. But, to my surprise, LLMs turned out to be not just a better high-level analogy - the way they work in practice feels so close to my experience with my own "inner voice" that I can't believe it's just a coincidence.

What I mean here is, in short: whenever I read articles and comments about strengths and weaknesses of current LLMs (especially GPT-4), I find that they might just as well be talking about my own "inner voice" / gut-level, intuition-driven thinking - it has the same strengths and the same failure modes.



