
Call me when models understand when to convert tokens into actual letters and count them. You can't claim they're more than word calculators before that.



That's misleading.

When you read and comprehend text, you don't read it letter by letter, unless you have a severe reading disability. Your ability to comprehend text works more like an LLM.

Essentially, you can compare the human brain to a multi-model or modular system. There are layers or modules involved in most complex tasks. When reading, you recognize multiple letters at a time[1], and those letters are essentially assembled into tokens that a different part of your brain can deal with.

Breaking words down into letters is essentially a separate "algorithm". Just as with your brain, it will likely never make sense for a text comprehension and generation model to operate at the level of letters - it's inefficient.

A multi-modal model with a dedicated model for handling individual letters could easily convert tokens into letters and operate on them when needed. It's just not a high priority for most use cases currently.

[1] https://www.researchgate.net/publication/47621684_Letters_in...
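
To make the token-vs-letter point concrete, here is a minimal sketch (assuming the tiktoken library and its cl100k_base encoding, purely as an illustration; any BPE tokenizer behaves similarly) showing that a word reaches the model as opaque token IDs, and that recovering letters is a separate decoding step:

    # Minimal sketch, assuming the tiktoken library (pip install tiktoken)
    # and its "cl100k_base" encoding; other tokenizers behave similarly.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    word = "strawberry"
    token_ids = enc.encode(word)  # a short list of integer IDs, not letters
    print(token_ids)

    # Recovering letters is a separate decoding/lookup step the model never sees.
    pieces = [enc.decode([t]) for t in token_ids]
    print(pieces)      # sub-word chunks; the exact split depends on the encoding
    print(list(word))  # the letter-level view the model is never given directly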


I agree completely, but that wasn't the point: the point was that my 6 year old knows when to spell a word out when asked, and the blob of quantized floats doesn't, or at least not reliably.

So the blob wasn't trained to do that (yeah, low utility, I get that), but it also doesn't know that it doesn't know, which is another, much bigger and still unsolved problem.


I would argue that most SOTA models do know that they don't know this: when you give them a code interpreter as a tool, they choose to use it to write a script that counts the letters rather than trying to come up with an answer on their own.

(A quick demo of this in the langchain docs, using claude-3-haiku: https://python.langchain.com/v0.2/docs/integrations/tools/ri...)
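For concreteness, the kind of throwaway script a model tends to write in that situation is a one-liner like this sketch (the word and letter here are arbitrary examples, not taken from the linked demo):

    # Sketch of the sort of one-off script a model writes when handed a
    # code interpreter tool; the word and letter are arbitrary examples.
    word = "strawberry"
    letter = "r"
    count = sum(1 for ch in word if ch == letter)
    print(f"'{letter}' appears {count} times in '{word}'")  # -> 3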


The model communicates in a language, but our letters are not necessary for that and are in fact not part of the English language itself. You could write English using per-word pictographs and it would still be the same English and the same information/message. It's like asking you whether there is a '5' in 256 when you read binary.
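
To spell the analogy out, a quick sketch: whether a '5' shows up depends on the representation you're reading, not on the number itself.

    # The same number, two representations: the "5" exists only in the decimal spelling.
    n = 256
    print("5" in str(n))  # True  - the decimal string "256" contains a 5
    print("5" in bin(n))  # False - the binary string "0b100000000" does not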


Is anyone in the know, aside from mainstream media (god forgive me for using this term unironically) and civilians on social media, claiming LLMs are anything but word calculators?

I think that's a perfect description by the way, I'm going to steal it.


I think it's a very poor intuition pump. These 'word calculators' have lots of capabilities not suggested by that term, such as a theory of mind and an understanding of social norms. If they are "merely" a "word calculator", then a "word calculator" is a very odd and counterintuitively powerful algorithm that captures big chunks of genuine cognition.


Do they actually have those capabilities, or does it just seem like they do because they're very good calculators?


There is no philosophical difference. It's like asking if Usain Bolt is really a fast runner, or if he just seems like it because he has long legs and powerful muscles.


I think that's a poor comparison, but I understand your point. I just disagree about there being no philosophical difference. I'd argue the difference is philosophical rather than factual.

You also indirectly answered my initial question -- so thanks!


What is the difference?


I'm not sure I'm educated (or rested) enough to answer that in a coherent manner, certainly not in a comment thread typing on mobile. So I won't waste your time babbling.

I don't disagree that they produce astonishing responses, but the nuance of why they're producing that output matters to me.

For example, with regard to social mores, I think a good way to summarize my hang-up is that, as I understand it, LLMs just pattern-match their way to approximations.

That to me is different from actually possessing an understanding, even though the outcome may be the same.

I can't help but draw comparisons to my autistic masking.


They're trained on the available corpus of human knowledge and writing. I would think the word calculators had failed if they were unable to predict the next word or sentiment given the trillions of pieces of data they've been fed. Their training environment is literally people talking to each other and their social norms. That doesn't make them anything more than p-zombies, though.

As an aside, I wish we would call all of this stuff pseudo-intelligence rather than artificial intelligence.


I side with Dennett (and Turing for that matter) that a "p-zombie" is a logically incoherent thing. Demonstrating understanding is the same as having understanding because there is no test that can distinguish the two.

Are LLMs human? No. Can they do everything humans do? No. But they can do a large enough subset of the things that until now nothing but a human could do that we have no choice but to call it "thinking". As Hofstadter says, if a system is isomorphic to another one, then its symbols have "meaning", and this is indeed the definition of "meaning".




