
For anyone who still doesn't understand why more and more folks are pointing out that LLMs "hallucinate" 100% of the time, let me put it to you this way: what is the LLM doing differently when it generates tokens that are "wrong" compared to when the tokens are "right"? If there is a difference, where does that exist? In the mechanism of the LLM, or in your mind?

Bonus question 1: Why do humans produce speech? Is it a representation of text? Why then do humans produce text? Is there an intent to communicate, or merely to construct a "correct" symbolic representation?

Bonus question 2: It's been said that every possible phrase is encoded in the constant pi. Do we think pi has intelligence? Intent?




What is the difference between a google map style application that shows pixels that are "right" for a road, and pixels that are "wrong" for a road?

A pixel is a pixel, colors cannot be true or false. The only way we can say a pixel is right or wrong in google maps is whether the act of the human utilizing that pixel to understand geographical information results in the human correctly navigating the road.

In the same way, an LLM can tell me all kinds of things, and those things are just words, which are just characters, which are just bytes, which are just electrons. There is no truth or false value to an electron flowing in a specific pattern in my CPU. The real question is what I, the human, get out of reading those words, if I end up correctly navigating and understanding something about the world based on what I read.


Unfortunately, we do not want LLMs to tell us "all kinds of things"; we want them to tell us the truth, to give us the facts. Happy to read how this is the wrong way to use LLMs, but then please stop shoving them into every facet of our lives, because whenever we talk about real-life applications of this tech it somehow is "not the right fit".


> we want them to tell us the truth, to give us the facts.

That's just one use case out of many. We also want them to tell stories, make guesses, come up with ideas, speculate, rephrase, ... We sometimes want facts. And sometimes it's more efficient to say "give me facts" and verify the answer than to find the facts yourself.


What if other sources of facts switch to confabulating LLMs? How will you be able to tell facts from made up information?


how do you do that now?


I think the impact of LLMs is both overhyped and underestimated. The overhyping is easy to see: people predicting mass unemployment, etc., when this technology reliably fails very simple cognitive tasks and has obvious limitations that scale will not solve.

However, I think we are underestimating the new workflows this tech will enable. It will take time to search the design space and find where the value lies, as well as time for users to adapt to a different way of using computers. Even in fields like law where correctness is mission-critical, I see a lot of potential. But not from the current batch of products that are promising to replace real analytical work with a stochastic parrot.


That's a round peg in a square hole. As I've seen them called elsewhere today, these "plausible text generators" can create a pseudo-facsimile of reasoning, but they don't reason, and they don't fact-check. Even when they use sources to build consensus, it's more about volume than authoritativeness.


I was watching the show 3 Body Problem, and there was a great scene where a guy tells a woman to double-check another man's work, then goes to the man and tells him to triple-check the woman's work. MoE seems to work this way, but maybe by leveraging different models with different randomness we can get to a more logical answer.
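Roughly the shape of what I'm imagining, as a sketch (the ask() helper and the model names are made-up placeholders, not a real API):

    from collections import Counter

    def ask(model, question):
        # Hypothetical stand-in: send `question` to `model` and return its answer text.
        # Swap in whatever client you actually use.
        raise NotImplementedError

    def cross_check(question, models=("model-a", "model-b", "model-c"), samples=3):
        # Collect independently sampled answers from several different models...
        answers = [ask(m, question) for m in models for _ in range(samples)]
        # ...then treat the most common answer as the "double/triple-checked" one.
        best, count = Counter(answers).most_common(1)[0]
        return best, count / len(answers)  # consensus answer plus how much of the panel agreed

Majority voting won't turn guesses into logic, but disagreement between models is at least a cheap signal that an answer deserves a closer look.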

We have to start thinking about LLM hallucination differently. When it follows logic correctly and provides factual information, that is also a hallucination, but one that fits our flow of logic.


Sure, but if we label the text as “factually accurate” or “logically sound” (or “unsound”) etc., then we can presumably greatly increase the probability of producing text with targeted properties


What on Earth makes you think that training a model on all factual information is going to do a lick in terms of generating factual outputs?

At that point, clearly our only problem has been that we've done it wrong all along by not training these things only on academic textbooks! That way we'll only probabilistically get true things out, right? /s


> The real question is what I, the human, get out of reading those words

So then you agree with the OP that an LLM is not intelligent in the same way that Google Maps is not intelligent? That seems to be where your argument leads, but you're replying in a way that makes me think you are disagreeing with the OP.


I guess I am both agreeing and disagreeing. The exact same problem is true for words in a book. Are the words in a lexicon true, false, or do they not have a truth value?


The words in the book are true or false, making the author correct or incorrect. The question being posed is whether the output of an LLM has an "author," since it's not authored by a single human in the traditional sense. If so, the LLM is an agent of some kind; if not, it's not.

If you're comparing the output of an LLM to a lexicon, you're agreeing with the person you originally replied to. He's arguing that an LLM is incapable of being true or false because of the manner in which its utterances are created, i.e. not by a mind.


So only a mind is capable of making signs that are either true or false? Is a properly calibrated thermometer that reads "true" if the temperature is over 25C incapable of being true? But isn't this question ridiculous in the first place? Isn't a mind required to judge whether or not something is true, regardless of how this was signaled?


Read again; I said he’s arguing that the LLM (i.e. thermometer in your example) is the thing that can’t be true or false. Its utterances (the readings of your thermometer) can be.

This would be unlike a human, who can be right or wrong independently of an utterance, because they have a mind and beliefs.


A human can be categorically wrong? Please explain.

And of course a thing in and of itself, be it an apple, a dog or an LLM, can’t be true or false.


I’ll cut to the chase. You’re hung up on the definition of words as opposed to the utility of words.

That classical or quantum mechanics are at all useful depends on the truthfulness of their propositions. If we cared about the process, then we would let the non-intuitive nature of quantum mechanics enter into the judgement of the usefulness of the science.

The better question to ask is whether a tool, be it a book, a thermometer, or an LLM, is useful. Error rates affect utility, which means that distinctions between correct and incorrect signals are more important than attempts to define arbitrary labels for the tools themselves.

You’re attempting to discount a tool based on everything other than the utility.


Any reply will always sound like someone is disagreeing, even if they claim not to.

Though in this case I'm not even sure what the comment they're supposedly disagreeing with is even claiming. Is it even claiming anything?


> Any reply will always sound like someone is disagreeing, even if they claim not to.

Disagree! :)

> Though in this case I'm not even sure what the comment they're supposedly disagreeing with is even claiming. Is it even claiming anything?

It's offering support for the claim that LLMs hallucinate 100% of the time, even when their hallucinations happen to be true.


Ah okay, I understand, I think. So basically that's solipsism applied to an LLM?

I think that's taking things a bit too far though. You can define hallucination in a more useful way. For instance you can say 'hallucination' is when the information in the input doesn't make it to the output. It is possible to make this precise, but it might be impractically hard to measure it.

An extreme version would be a En->FR translation model that translates every sentence into 'omelette du fromage'. Even if it's right the input didn't actually affect the output one bit so it's a hallucination. Compared to a model that actually changes the output when the input changes it's clearly worse.

Conceivably you could check if the probability of a sentence actually decreases if the input changes (which it should if it's based on the input), but given the nonsense models generate at a temperature of 1 I don't quite trust them to assign meaningful probabilities to anything.
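Something like this is what I have in mind, as a rough sketch (the model is just an example choice, and the numbers only mean anything relative to each other):

    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    name = "Helsinki-NLP/opus-mt-en-fr"  # example En->Fr model
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)

    def logprob_of_output(source, target):
        # Total log-probability the model assigns to `target` given `source`.
        inputs = tok(source, return_tensors="pt")
        labels = tok(text_target=target, return_tensors="pt").input_ids
        with torch.no_grad():
            out = model(**inputs, labels=labels)
        return -out.loss.item() * labels.shape[1]  # loss is mean NLL per target token

    target = "Où est la bibliothèque ?"
    print(logprob_of_output("Where is the library?", target))
    print(logprob_of_output("I like cheese omelettes.", target))
    # If the second number isn't much lower than the first, the output
    # barely depends on the input, which is the failure mode I mean.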


No, your constant output example isn't what people are talking about with "hallucination." It's not about destroying information from the input, in the sense that if you asked me a question and I just ignored you, I'm not in general hallucinating. Hallucinating is more about sampling from a distribution which extends beyond what is factually true or actually exists, such as citing a non-existent paper, or inventing a historical figure.


> It's offering support for the claim that LLMs hallucinate 100% of the time, even when their hallucinations happen to be true.

Well this makes the term "hallucinate" completely useless for any sort of distinction. The word then becomes nothing more than a disparaging term for an LLM.


Not really. It distinguishes LLM output from human output even though they look the same sometimes. The process by which something comes into existence is a valid distinction to make, even if the two processes happen to produce the same thing sometimes.


Why is it a valid distinction to make?

For example, does this distinction affect the assessed truthfulness of a signal?

Does process affect the artfulness of a painting? “My five year old could draw that?”


It makes sense to do so in the same way that it’s useful to distinguish quantum mechanics from classical mechanics, even if they make the same predictions sometimes.


A proposition of any kind of mechanics is what can be true or false. The calculations are not what makes up the truth of a proposition, as you’ve pointed out.


But then again, according to a neighboring country that thinks it's their land, that road isn't a road at all. Depending on what country you're in, it makes a difference.


> The real question is what I, the human, get out of reading those words, if I end up correctly navigating and understanding something about the world based on what I read.

No, the real question is how you will end up correctly navigating and understanding something about the world from a falsehood crafted to be optimally harmonious with the truths that happen to surround it?


A pixel (in the context of an image) could be "wrong" in the sense that its assigned value could lead to an image that just looks like a bunch of noise. For instance, suppose we set every pixel in an image to some random value. The resulting image would look like total noise, and we humans wouldn't recognize it as a sensible image. By providing a corpus of accepted images, we can train a model to generate images (arrays of pixels) which look like images and not, say, random noise. Now it could still generate an image of some place or person that doesn't actually exist, so in that sense the pixels are collectively lying to you.


> What is the difference between a google map style application that shows pixels that are "right" for a road, and pixels that are "wrong" for a road?

Different training methodology and objective, and the possibility of correcting obviously wrong outcomes by comparing them against reality.


> let me put it to you this way: what is the LLM doing differently when it generates tokens that are "wrong" compared to when the tokens are "right"?

It is conditioning on latents about truth, falsity, reliability, and calibration. All of these inferred latents have been shown to exist inside LLMs, as they need to exist for LLMs to do their jobs in accurately predicting the next token. (Imagine trying to predict tokens in, say, discussions about people arguing or critiques of fictional stories, or discussing mistakes made by people, and not having latents like that!)

LLMs also model other things: for example, you can use them to predict demographic information about the authors of texts (https://arxiv.org/abs/2310.07298), even though this is something that pretty much never exists IRL, a piece of text with a demographic label like "written by a 28yo"; it is simply a latent that the LLM has learned for its usefulness, and can be tapped into. This is why an LLM can generate text that it thinks was written by a Valley girl in the 1980s, or text which is 'wrong', or text which is 'right', and this is why you see things like in Codex, where they found that if the prompted code had subtle bugs, the completions tended to have subtle bugs - because the model knows there's 'good' and 'bad' code, and 'bad' code will be followed by more bad code, and so on.

This should all come as no surprise - what else did you think would happen? - but pointing out that for it to be possible, the LLM has to be inferring hidden properties of the text like the nature of its author, seems to surprise people.
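To make "inferred latent" concrete: the usual demonstration is a linear probe on frozen activations, roughly like this toy sketch (gpt2 and four hand-written statements stand in for the larger models and curated datasets the papers actually use):

    import torch
    from transformers import AutoTokenizer, AutoModel
    from sklearn.linear_model import LogisticRegression

    tok = AutoTokenizer.from_pretrained("gpt2")
    lm = AutoModel.from_pretrained("gpt2")

    def last_token_hidden(text):
        # Frozen hidden state of the final token, used as the statement's representation.
        ids = tok(text, return_tensors="pt")
        with torch.no_grad():
            h = lm(**ids).last_hidden_state
        return h[0, -1].numpy()

    statements = ["The Earth orbits the Sun.", "The Sun orbits the Earth.",
                  "Water boils at 100C at sea level.", "Water boils at 10C at sea level."]
    labels = [1, 0, 1, 0]  # 1 = true, 0 = false (toy labels, far too few to prove anything)

    X = [last_token_hidden(s) for s in statements]
    probe = LogisticRegression(max_iter=1000).fit(X, labels)

If a simple linear classifier on activations like these separates true from false statements on properly held-out data, that is the kind of latent being talked about.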


> It is conditioning on latents about truth, falsity, reliability, and calibration. All of these inferred latents have been shown to exist inside LLMs, as they need to exist for LLMs to do their jobs in accurately predicting the next token.

No, it isn't, and no, they haven't [1], and no, they don't.

The only thing that "needs to exist" for an LLM to generate the next token is a whole bunch of training data containing that token, so that it can condition based on context. You can stare at your navel and claim that these higher-level concepts end up encoded in the bajillions of free parameters of the model -- and hey, maybe they do -- but that's not the same thing as "conditioning on latents". There's no explicit representation of "truth" in an LLM, just like there's no explicit representation of a dog in Stable Diffusion.

Do the thought exercise: if you trained an LLM on nothing but nonsense text, would it produce "truth"?

LLMs "hallucinate" precisely because they have no idea what truth means. It's just a weird emergent outcome that when you train them on the entire internet, they generate something close enough to truthy, most of the time. But it's all tokens to the model.

[1] I have no idea how you could make the claim that something like a latent conceptualization of truth is "proven" to exist, given that proving any non-trivial statement true or false is basically impossible. How would you even evaluate this capability?


This was AFAIK the first paper to show linear representations of truthiness in LLMs:

https://arxiv.org/abs/2310.06824

But what you should really read over is Anthropic's most recent interpretability paper.
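For a sense of what the "causal intervention" evidence looks like mechanically, it's roughly this kind of thing (a sketch only: gpt2 as a stand-in, and a random vector where the paper derives a direction from probes trained on true/false statements):

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    direction = torch.randn(model.config.hidden_size)  # placeholder "truth direction"
    alpha = 5.0                                        # intervention strength

    def add_direction(module, inputs, output):
        # Shift this block's hidden states along the chosen direction.
        hidden = output[0] + alpha * direction
        return (hidden,) + tuple(output[1:])

    handle = model.transformer.h[6].register_forward_hook(add_direction)  # some middle layer
    ids = tok("The Sun orbits the Earth. True or false?", return_tensors="pt")
    print(tok.decode(model.generate(**ids, max_new_tokens=5)[0]))
    handle.remove()

The interesting claim isn't that you can perturb activations (of course you can); it's the paper's claim that a direction found by a linear probe reliably flips how the model treats true vs. false statements.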


> In this work, we curate high-quality datasets of true/false statements and use them to study in detail the structure of LLM representations of truth, drawing on three lines of evidence: 1. Visualizations of LLM true/false statement representations, which reveal clear linear structure. 2. Transfer experiments in which probes trained on one dataset generalize to different datasets. 3. Causal evidence obtained by surgically intervening in a LLM's forward pass, causing it to treat false statements as true and vice versa. Overall, we present evidence that language models linearly represent the truth or falsehood of factual statements.

You can debate whether the 3 experiments cited back the claim (I don't believe they do), but they certainly don't prove what OP claimed. Even if you demonstrated that an LLM has a "linear structure" when validating true/false statements, that's a whole universe away from having a concept of truth that generalizes, for example, to knowing when nonsense is being generated based on conceptual models that can be evaluated to be true or false. It's also very different to ask a model to evaluate the veracity of a nonsense statement, vs. avoiding the generation of a nonsense statement. The former is easier than the latter, and probably could have been done with earlier generations of classifiers.

Colloquially, we've got LLMs telling people to put glue on pizza. It's obvious from direct experience that they're incapable of knowing true and false in a general sense.


> [...] but they certainly don't prove what OP claimed.

OP's claim was not: "LLMs know whether text is true, false, reliable, or is epistemically calibrated".

But rather: "[LLMs condition] on latents *ABOUT* truth, falsity, reliability, and calibration".

> It's also very different to ask a model to evaluate the veracity of a nonsense statement, vs. avoiding the generation of a nonsense statement [...] probably could have been done with earlier generations of classifiers

Yes. OP's point was not about generation, it was about representation (specifically conditioning on the representation of the [con]text).

Your aside about classifiers is not only very apt, it is also exactly OP's point! LLMs are implicit classifiers, and the features they classify have been shown to include those that seem necessary to effectively predict text!

One of the earliest examples of this was the so-called ["Sentiment Neuron"](https://arxiv.org/abs/1704.01444), and for a more recent look into the kind of features LLMs classify, see [Anthropic's experiments](https://transformer-circuits.pub/2024/scaling-monosemanticit...).

> It's obvious from direct experience that they're incapable of knowing true and false in a general sense.

Yes, otherwise they would be perfect oracles, instead they're imperfect classifiers.

Of course, you could also object that LLMs don't "really" classify anything (please don't), at which point the question becomes how effective they are when used as classifiers, which is what the cited experiments investigate.


> But rather: "[LLMs condition] on latents ABOUT truth, falsity, reliability, and calibration".

Yes, I know. And the paper didn't show that. It projected some activations into low-dimensional space, and claimed that since there was a pattern in the plots, it's a "latent".

The other experiments were similarly hand-wavy.

> Your aside about classifiers is not only very apt, it is also exactly OP's point! LLMs are implicit classifiers, and the features they classify have been shown to include those that seem necessary to effectively predict text!

That's what's called a truism: "if it classifies successfully, it must be conditioned on latents about truth".


> "if it classifies successfully, it must be conditioned on latents about truth"

Yes, this is a truism. Successful classification does not depend on latents being about truth.

However, successfully classifying between text intended to be read as either:

- deceptive or honest

- farcical or tautological

- sycophantic or sincere

- controversial or anodyne

does depend on latent representations being about truth (assuming no memorisation, data leakage, or spurious features)

If your position is that this is necessary but not sufficient to demonstrate such a dependence, or that reverse engineering the learned features is necessary for certainty, then I agree.

But I also think this is primarily a semantic disagreement. A representation can be "about something" without representing it in full generality.

So to be more concrete: "The representations produced by LLMs can be used to linearly classify implicit details about a text, and the LLM's representation of those implicit details condition the sampling of text from the LLM".


My sense is an LLM is like Broca's area. It might not reason well, but it'll make good sounding bullshit. What's missing are other systems to put boundaries and tests on this component. We do the same thing too: hallucinate up ideas reliably, calling it remembering, and we do one additional thing: we (or at least the rational) have a truth-testing loop. People forget that people are not actually rational, only their models of people are.


One of the surprising results in research lately was the theory of mind paper the other week that found around half of humans failed the transparent-boxes version of the theory of mind questions - something previously assumed to be uniquely an LLM failure case.

I suspect over the next few years we're going to see more and more behaviors in LLMs that turn out to be predictive of human features.


The terminology is wrong but your point is valid. There is no internal criterion or mechanism for statement verification. As the mind is likely also in part a high-dimensional construct, and LLMs to an extent represent our collective jumble of 'notions', it is natural that their output resonates with human users.

Q1: A "'correct' symbolic representation" of x. What is x? Your "Is there an intent to communicate, or" either/or construct is problematic. Why would one require a "symbolic representation" of x, x likely being a 'meaningful thought'? So this is a hot debate: whether semantics is primary or merely application. I believe it is primary in which case "symbolic representation" is 'an aid' to gaining a concrete sense of what is 'somehow' 'understood'. You observe a phenomenon and understand its dynamics. You may even anticipate it while observing. To formalize that understanding is the beginning of 'expression'.

Q2: because while there is a function LLM(encodings, q) that emits 'plausible' responses, an equivalent function for Pi does not exist outside of 'pure inexpressible realm of understanding' :)


>I believe it is primary in which case "symbolic representation" is 'an aid' to gaining a concrete sense of what is 'somehow' 'understood'.

There is nothing magic about perception to distinguish it meaningfully from symbolic representation; in point of fact, that which you experience is in and of itself a symbolic representation of the world around you. You do not sense the frequencies outside the physical transduction capabilities of the human ear, or the wavelengths similarly beyond what the human eye can transduce, or feel distinct contact beyond the density of haptic transduction of somatic nerves. Nevertheless, those phenomena are still there, and despite their insensible nature, have an effect on you. Your entire perception is a map, which one would be well advised not to mistake for the territory. To dismiss symbolic representation as something that only occurs in communication after perception is to "lose sight" of the fact that all the data your mind integrates into a perception is itself symbolic.

Communication and symbolic representation are all there is, and they happen long before we even get to the part of reality where I'm trying to select words to converse about it with you.


> There is nothing magic about perception to distinguish it meaningfully from symbolic representation; in point of fact, that which you experience is in and of itself a symbolic representation of the world around you.

You're right that there's nothing magic about it at all. The mind operates on symbolic representations, but whether those are symbolic representations of external sensory input or symbolic representations of purely endogenous stochastic processes makes for a night-and-day difference.

Perception is a map, but it's a map of real territory, which is what makes it useful. Trying to navigate reality with a map that doesn't represent real territory isn't just useless, it's dangerous.


> As the mind likely is also in part a high dimensional construct and LLMs to an extent represent our collective jumble of 'notions' it is natural that their emits resonate with human users.

But humans are equipped with sensory input, allowing us to formulate our notions by reference to external data, not just generate notions by internally extrapolating existing notions. When we fail to do this, and do formulate our notions entirely endogenously, that's when we say we are hallucinating.

Since LLMs are only capable of endogenous inference, and are not able to formulate notions based on empirical observation, they are always hallucinating.


> what is the LLM doing differently when it generates tokens that are "wrong" compared to when the tokens are "right"?

When they don't recall correctly, it is hallucination. When they recall perfectly, it is regurgitation/copyright infringement. We find issue either way.

May I remind you that we also hallucinate, memory plays tricks on us. We often google stuff just to be sure. It is not the hallucination part that is a real difference between humans and LLMs.

> Why do humans produce speech?

We produce language to solve body/social/environment related problems. LLMs don't have bodies but they do have environments, such as a chat room, where the user is the environment for the model. In fact chat rooms produce trillions of tokens per month worth of interaction and immediate feedback.

If you look at what happens with those trillions of tokens - they go into the heads of hundreds of millions of people, who use the LLM assistance to solve their problems, and of course produce real world effects. Then it will reflect in the next training set, creating a second feedback loop between LLM and environment.

By the way, humans don't produce speech individually, if left alone, without humanity as support. We only learn speech when we get together. Language is social. The human brain is not so smart on its own, but language collects experience across generations. We rely on language for intelligence to a larger degree than we like to admit.

Isn't it a mystery how LLMs learned so many language skills purely from imitating us, without their own experience? It shows just how powerful language is on its own. And it shows it can be independent of substrate.


> When they don't recall correctly, it is hallucination. When they recall perfectly, it is regurgitation/copyright infringement. We find issue either way.

You nailed it right there.


Bonus question 2 is the most ridiculous straw man I've seen in a very long time.

The existence of arbitrary string encodings in transcendental numbers proves absolutely nothing about the processing capabilities of adaptive algorithms.


Exactly. Reading digits of pi doesn’t converge toward anything. (And neither do infinite typewriter monkeys.) Any correct value they get is random, and exceedingly rare.

LLMs corral a similar randomness to statistically answer things correctly more often than not.


Here's the issue: humans do the same thing. The brain builds up a model of the world, but the model is not the world. It is a virtual approximation or interpretation based on training data: past experiences, perceptions, etc.

A human can tell you the sky is blue based on its model. So can any LLM. The sky is blue. So the output from both models is truthy.


> A human can tell you the sky is blue based on its model. So can any LLM. The sky is blue. So the output from both models is truthy.

But a human can also tell you that the sky is blue based on looking at the sky, without engaging in any model-based inference. An LLM cannot do that, and can only rely on its model.

Humans can engage in both empirical observation and stochastic inference. An LLM can only engage in stochastic inference. So while both can be truthy, only humans currently have the capacity to be truthful.

It's also worth pointing out that even if human minds worked the same way as LLMs, our training data consists of an aggregation of exactly those empirical observations -- we are tokenizing and correlating our actual experiences of reality, and only subsequently representing the output of our inferences with words. The LLM, on the other hand, is trained only on that second-order data -- the words -- without having access to the much more thorough primary data that it represents.


A witness to a crime thinks that there were 6 shots fired; in fact there were only 2. They remember correctly the gender of the criminal, the color of their jacket, the street corner where it happened, and the time. There is no difference in their mind between the true memories and the false memory.

I write six pieces of code that I believe have no bugs. One has an off-by-one error. I didn't have any different experience writing the buggy code than I did writing the functional code, and I must execute the code to understand that anything different occurred.

Shall I conclude that myself and the witness were hallucinating when we got the right answers? That intelligence is not the thing that got us there?


> Shall I conclude that myself and the witness were hallucinating when we got the right answers?

If you were recalling stored memories of experiences that were actual interactions with external reality, and some of those memories were subsequently corrupted, then no, you were not hallucinating.

If you were engaging in a completely endogenous stochastic process to generate information independently of any interaction with external reality, then yes, you were hallucinating.

> That intelligence is not the thing that got us there?

It's not. In both cases, the information you are recalling is stored data generated by external input. The storage medium happens to be imperfect, however, and occasionally flips bits, so later reads might not exactly match what was written. But in neither case was the original data generated via a procedural algorithm independently of external input.


People who are consistently unable to distinguish fiction from reality make terrible witnesses; even an obviously high crackhead would fare better than an LLM on the witness stand.


Do we actually think this way, though? When I am talking with someone, I am thinking about what information and emotion I want to impart to the person and about how they are perceiving me, and the sentence construction flows from these intents. Only the sentence construction is even analogous to token generation, and even then, we edit our sentences in our heads all the time before or while talking, instead of just emitting a constant forward stream of tokens like an LLM.


>what is the LLM doing differently when it generates tokens that are "wrong" compared to when the tokens are "right"? If there is a difference, where does that exist? In the mechanism of the LLM, or in your mind?

If there were a detectable difference within the mechanism, the problem of hallucinations would be easy to fix. There may be ways to analyze logits to find patterns of uncertainty characteristics related to hallucinations. Perhaps deeper introspection of weights might turn up patterns.
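The crudest version of that logit analysis is per-token entropy of the next-token distribution, something like this sketch (gpt2 as a stand-in; high entropy is at best a weak proxy, since confidently wrong continuations can still be low-entropy):

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def next_token_entropies(text):
        ids = tok(text, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits                      # (1, seq_len, vocab)
        probs = torch.softmax(logits, dim=-1)
        # Entropy of the predicted next-token distribution at each position.
        ent = -(probs * torch.log(probs + 1e-12)).sum(-1)
        return list(zip(tok.convert_ids_to_tokens(ids[0].tolist()), ent[0].tolist()))

    for token, h in next_token_entropies("The capital of Australia is"):
        print(f"{token:>12}  next-token entropy: {h:.2f}")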

The difference isn't really in your mind either. The difference is simply that the one answer correlates with reality and the other does not.

The point of AI models is to generalize from the training data; that implicitly means generating output that they haven't seen as input.

Perhaps the issue is not so much that it is generalizing/guessing, but that the degree to which making a guess is the right call depends on context.


If I make a machine that makes random sounds in approximately the human vocal range, and occasionally humans who listen to it hear "words" (in their language, naturally), then is that machine telling the truth when words are heard and "hallucinating" the rest of the time?


Now I'm imagining a device that takes as input the roaring of a furnace, and only outputs when it recognizes words.


(An aside: Writing is a representation of speech, not the other way around.)


>what is the LLM doing differently when it generates tokens that are "wrong" compared to when the tokens are "right"?

When an LLM is trained, it essentially compresses the knowledge of the training data corpus into a world model. "Right" and "wrong" are thereby only emergent when you have a different world model for yourself that tells you a different answer, most likely because the LLM was undertrained on the target domain. But to the LLM, the token with the highest probability will be the most likely correct answer, similarly to how you might have a "gut feeling" when asked about something which you clearly don't understand and have no education in. And you will be wrong just as often. The perceived overconfidence of wrong answers likely stems from human behaviour in the training data as well. LLMs are not better than humans, but they are also not worse. They are just a distilled encoding of human behaviour, which in turn might be all that the human mind is in the end.


No.

LLMs become fluent in constructing coherent, sophisticated text in natural language from training on obscene amounts of coherent, sophisticated text in natural language. Importantly, there is no such corpus of text that contains only accurate knowledge, let alone knowledge as it unambiguously applies to some specific domain.

It's unclear that any such corpus could exist (a millennias old discussion in philosophy with no possible resolution), but even if you take for granted that such a corpus could, we don't have one.

So what happens is that after learning how to construct coherent, sophisticated text in natural language from all the bullshit-addled general text that includes truth and fiction and lies and fantasy and bullshit and garbage and old text and new text, there is a subsequent effort to tune the model in on generating useful text towards some purpose. And here, again, it's important to distinguish that this subsequent training is about utility ("you're a helpful chatbot", "this will trigger a function call that will supplement results", etc.) and so still can't focus strictly on knowledge.

LLMs can produce intelligent output that may be correct and may be verifiable, but the way they work and the way they need to be trained prevents them from ever actually representing knowledge itself. The best they can do is create text that is more or less fluent and more or less useful.

It's awesome and has lots and lots of potential, but it's a radically different thing than a material individual that's composed of countless disparate linguistic and non-linguistic systems that have never yet been technologically replicated or modeled.


Wrong. This is the common groupspeak on uninformed places like HN, but it is not what the current research says. See e.g. this: https://arxiv.org/abs/2210.13382

Most of what you wrote shows that you have zero education in modern deep learning, so I really wonder what makes you form such strong opinions on something you know nothing about.


The person you are replying to, said it clearly: "there is no such corpus of text that contains only accurate knowledge"

Deep learning learns a model of the world, and this model can be as inaccurate as it gets. Earth may as well have 10 moons for a DL model. In order for Earth to have only 1 moon, there has to be a dataset which encodes only that information, without ever encoding more moons. A drunk person who stares at the moon, sees more than one moon, and writes about that on the internet has to be excluded from the training data.

Also the model of the Othello world is very different from a model of the real world. I don't know about Othello, but in chess it is pretty well known that there are more possible chess games than there are atoms in the universe. For all practical purposes, the dataset of all possible chess games is infinite.

The dataset of every possible event on Earth, every second, is also larger than the number of atoms in the universe. For all practical purposes, it is infinite as well.

Do you know that one dataset is more infinite than the other? Does modern DL state that all infinities are the same?


Wrong again. When you apply statistical learning over a large enough dataset, the wrong answers simply become random normal noise (a consequence of the central limit theorem) - the kind of noise which deep learning has always excelled at filtering out, long before LLMs were a thing - and the truth becomes a constant offset. If you have thousands of pictures of dogs and cats and some were incorrectly labelled, you can still train a perfectly good classifier that will achieve more or less 100% accuracy (and even beat humans) on validation sets. It doesn't matter if a bunch of drunk labellers tainted the ground truth as long as the dataset is big enough. That was the state of DL 10 years ago. Today's models can do a lot more than that. You don't need infinite datasets, they just need to be large enough and cover your domain well.
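The random-noise point in miniature, as a toy sketch with synthetic data (not a claim about any particular real dataset):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # A learnable synthetic "dogs vs cats" problem.
    X, y = make_classification(n_samples=20000, n_features=20, n_informative=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

    # "Drunk labellers": flip 20% of the training labels at random.
    rng = np.random.default_rng(0)
    flip = rng.random(len(y_tr)) < 0.20
    y_noisy = np.where(flip, 1 - y_tr, y_tr)

    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_noisy)
    print("accuracy on clean test labels:", clf.score(X_te, y_te))

Random label noise largely washes out with enough data; labels that are wrong in a systematic, correlated way are a different matter.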


> You don't need infinite datasets, they just need to be large enough and cover your domain well.

When you are talking about distinguishing noise from a signal, or truth from not-totally-truth, and the domain is sufficiently small, e.g. a game like Othello or data from a corporation, then I agree with everything in your comment.

When the domain is huge, then distinguishing truth from lies/non-truth/not-totally-truth is impossible. There will not be such a high quality dataset, because everything changes over time, truth and lies are a moving target.

If we humans cannot distinguish between truth and non-truth, but the A.I. is able to, then we are talking about AGI. Then we can put the machines to work discovering new laws of physics. I am all for it, I just don't see it happening anytime soon.


What you're talking about is by definition no longer facts but opinions. Even AGI won't be able to turn opinions into facts. But LLMs are already very good at giving opinions rather than facts thanks to alignment training.


> But to the LLM, the token with the highest probability will be the most likely correct answer

This is precisely what people are identifying as problematic


> When an LLM is trained, it essentially compresses the knowledge of the training data corpus into a world model

No, you added an extra 'l'. It's not a world model, it's a word model. LLMs tokenize and correlate objects that are already second-order symbolic representations of empirical reality. They're not producing a model of the world, but rather a model of another model.


Do you have a citation for the existence of an LLM "world model"?

My understanding of retrieval-augmented generation is that it is an attempt to add a world model (based on a domain-specific knowledge database) to the LLM; the result of the article is that the combination still hallucinates frequently.

I might even go further to suggest that the latter part of your comment is entirely anthropomorphization.



> If there is a difference, where does that exist? In the mechanism of the LLM, or in your mind?

Thank you for this sentence: it is hard to get across how often Gen-AI proponents are actually projecting perceived success onto LLMs while downplaying error.


I mostly see the reverse.


You mostly see people projecting perceived error onto LLMs?

I don't think I've seen a single article about an AI getting things wrong, recently, where there was a nuanced notion about whether it was actually wrong.

I don't think we're anywhere close to "nuanced mistakes are the main problem" yet.


I mostly see people ignoring successes and playing up every error.


But the errors are fundamental, and the successes actually subjective as a result.

That is, it appears to get things right, really a lot, but the conclusions people draw about why it gets things right are undermined by the nature of the errors.

Like, it must have a world model, it must understand the meaning of... etc.; the nature of the errors they are downplaying fundamentally undermines the certainty of these projections.




