Inductive or deductive? Rethinking the fundamental reasoning abilities of LLMs (arxiv.org)
107 points by belter 13 days ago | 169 comments

I'm really tired of these papers and experiments.

You cannot test reasoning when you don't know what's in the training set. You have to be able to differentiate reasoning from memorization, and that's not trivial.

Moreover, the results seem to confirm that at least some memorization is going on. Do we really not think GPT has been trained extensively on arithmetic in base 10, 8, and 16? This seems like a terrible prior. Even if not explicitly, how much code has it read that performs these tasks? How many web pages, tutorials, and Reddit posts cover oct and hex? They also haven't defined zero-shot correctly. Arithmetic in these bases isn't zero-shot; it's explicitly in distribution...

I'm unsure about base 9 and 11. It's pretty interesting to see that GPT-4 is much better at these. Anyone know why? Did they train on these? More bases? Doesn't seem unreasonable, but I don't know.

The experimentation is also extremely lacking. The arithmetic questions have only 1,000 tests where two digits are added; this is certainly in the training data. I'm also unconvinced by the syntax reasoning tasks, since the transformer (attention) architecture seems designed for exactly this, and I'm not convinced those tasks aren't in training either. Caesar ciphers are certainly in the training data too.
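
For concreteness, a Caesar shift is the kind of task at issue: a few lines of Python, with near-identical snippets all over tutorials and forum posts. A rough sketch (mine, not the paper's setup):

  def caesar(text: str, shift: int) -> str:
      # Shift each letter by `shift` positions, wrapping around the alphabet.
      a = ord("a")
      return "".join(
          chr((ord(c) - a + shift) % 26 + a) if c.isalpha() else c
          for c in text.lower()
      )

  print(caesar("attack at dawn", 3))  # "dwwdfn dw gdzq"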

The prompts are also odd, and I guess that's why they're in the appendix. For example, getting GPT to be better at math or many other tasks by having it write Python code is not novel.

There's some stuff here, but this really doesn't seem like a lot of work for 12 people from a top university and a trillion-dollar company. It's odd to see that many authors when the experiments can be run in a fairly short time.


We can tell some of what's in the training set. One of the answers for the inductive reasoning test begins "begin from the rightmost digit". Look that phrase up in Google. It shows up in Chegg, Course Hero, and Brainly content for elementary arithmetic. If you bash on those how-to articles, available for bases 2 and 10, you can probably generate the pattern for base 8.
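
For what it's worth, the procedure those how-to articles spell out generalizes mechanically to any base. A rough sketch (mine, not from the paper) of the "begin from the rightmost digit" pattern:

  def add_in_base(a: str, b: str, base: int = 8) -> str:
      digits = "0123456789abcdef"
      result, carry = [], 0
      # Walk both numbers from the rightmost digit, carrying exactly as in base 10.
      for i in range(1, max(len(a), len(b)) + 1):
          da = digits.index(a[-i]) if i <= len(a) else 0
          db = digits.index(b[-i]) if i <= len(b) else 0
          carry, d = divmod(da + db + carry, base)
          result.append(digits[d])
      if carry:
          result.append(digits[carry])
      return "".join(reversed(result))

  print(add_in_base("57", "64", base=8))  # "143", since 47 + 52 = 99 = 0o143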

This looks like an LLM doing the usual LLM thing - finding relevant items and combining them to fit. This doesn't require the level of abstraction and induction the authors impute to the LLM. Ordinary LLM behavior explains this, once you've found the relevant training data.

People often solve problems that way too, of course.


That reminds me of an old paper about "Benny's Rules", a case-study focused on a kid who seemed to be doing better than average in math tests when it came to final answers... but for all the wrong reasons, using an inferred set of arbitrary text manipulation rules.

The intent was to point out that the educational approach was flawed, but I think there are interesting parallels to token processing in LLMs, which--unlike a human child--are built in such a way that crazy partial-fit rules are likely their only option.

> Benny believed that the fraction 5/10 = 1.5 and 400/400 = 8.00 because he believed the rule was to add the numerator and denominator and then divide by the number represented by the highest place value.

https://blog.mathed.net/2011/07/rysk-erlwangers-bennys-conce...
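
The rule is mechanical enough to write down. A toy reconstruction (my reading of the quote above, not Erlwanger's formalization):

  def bennys_fraction(numerator: int, denominator: int) -> float:
      # Benny's (wrong) rule: add numerator and denominator, then divide
      # by the number represented by the highest place value.
      total = numerator + denominator
      highest_place_value = 10 ** (len(str(total)) - 1)
      return total / highest_place_value

  print(bennys_fraction(5, 10))     # 1.5, as Benny claimed for 5/10
  print(bennys_fraction(400, 400))  # 8.0, as Benny claimed for 400/400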


This is a problem with some tests. The students may detect a pattern in the test answers which reflects the work of those generating the answers, not the content.

See this article on SAT test prep.[1] The requirement that only one answer can be right means that wrong answers have easily identifiable properties.

[1] https://blog.prepscholar.com/the-critical-fundamental-strate...


You'll probably find this talk [1] interesting. They control all the training data for small LLMs and then perform experiments (including reasoning experiments).

[1] Physics of LLMs: https://www.youtube.com/watch?v=yBL7J0kgldU&t=7s


How do you define memorization and reasoning? There is a large grey area between them. Some say that if you can memorize facts and algorithms and apply them to new data, it is memorization. Some say that it is reasoning.

More than that -- it's not clear that what humans do is not "just" memorization. We can always look at human experience mechanistically and say that we don't think -- we just memorize thinking patterns and apply them when speaking and "thinking".


  >  It's not clear that what humans do is not "just" a memorization.
While I agree that there is a lot of gray in between, I think you are misrepresenting my comment. And I'm absolutely certain humans do more than memorization. Not all humans, but that's not the bar. Some humans are brain damaged and some are in fact babies (and many scientists do agree that sentience doesn't appear at birth).

If you doubt me, I very much encourage you to dive deeper into the history of science and gain deep knowledge of any subject, because you'll find this happens all the time. But if you apply a loose enough definition of memorization (one that wouldn't be generally agreed upon if you followed it to its logical conclusions) then yeah, everything is memorization. But everything is foo if I define everything to be foo, so let's not.


A lot of reasoning is similar to interpolation within a sparse set of observations. Memorization is rounding to the nearest known example. A basic guess is linear interpolation. And reasoning is about discovering the simplest rule that explains all the observations and using that rule to extrapolate.
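
A toy illustration of that framing (mine, with y = x * x standing in for the underlying rule and a query outside the observed range):

  xs = [1, 2, 3, 4]
  ys = [1, 4, 9, 16]   # observations of y = x * x

  def memorize(x):     # round to the nearest known example
      i = min(range(len(xs)), key=lambda j: abs(xs[j] - x))
      return ys[i]

  def interpolate(x):  # linear guess from the nearest segment
      i = max(1, min(len(xs) - 1, sum(xi < x for xi in xs)))
      slope = (ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1])
      return ys[i - 1] + slope * (x - xs[i - 1])

  def reason(x):       # the simplest rule that explains every observation
      return x * x

  print(memorize(6), interpolate(6), reason(6))  # 16, 30.0, 36 (truth: 36)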

>> Some say that if you can memorize facts and algorithms and apply them to new data, it is a memorization. Some say that it is reasoning.

Who are those "some" who say it is reasoning?

Here's a question. If you enter a command in your terminal do you expect your shell to have memorised the result of the command from some previous experience, or do you expect your shell to compute the result of your command according to its programming and your inputs? A rhetorical question: we all assume the latter.

Which one is more like what most peoples' informal conception of "reasoning"? Retrieving an answer from storage, or computing an answer from the inputs given to a program?

>> We can always look at human experience mechanisticly and say that we don't think -- we just memorized thinking patterns and apply them when speaking and "thinking"

I think this is confusing memorisation of the rules required to perform a computation, like a program stored in computer memory, with memorisation of the results of a computation. When we distinguish between memorisation and reasoning we usually make a distinction between computing a result from scratch and retrieving it from storage without having to re-compute it, like in caching or memoization, or getting data from a database.

For a real-world example, we memorise our times tables, but we don't memorise the result of every sum x + y = z; instead we memorise a summation algorithm that we then use to derive the sum of two numbers.
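
In code, that's roughly the difference between a plain function and a memoized one. A minimal sketch:

  from functools import lru_cache

  def add(x, y):            # computes the result from scratch on every call
      return x + y

  @lru_cache(maxsize=None)  # computes once per input, then retrieves from storage
  def add_memoized(x, y):
      return x + y

  add_memoized(2, 3)  # computed
  add_memoized(2, 3)  # retrieved, no re-computation
  add_memoized(2, 4)  # a new input still has to be computed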


> Some say that if you can memorize facts and algorithms and apply them to new data, it is a memorization. Some say that it is reasoning.

Memorizing facts and algorithms is memorization. The rest of what you are talking about is not.

Applying existing knowledge to new data without deriving new information is generalization. An example of this is a semantic segmentation model classifying a car that it has never seen. If the model was not trained on birds, it will never classify a bird as a bird.

Computation of decidable problems is a large, possibly the largest, subset of reasoning. Most humans do not struggle with solving decidable problems; the problem is that they are slow and can only solve small problem sizes. But most problems encountered in practice aren't one large decidable problem; they're a long chain of dozens to hundreds of small, heterogeneous problems that are seamlessly mixed with one another. LLMs struggle with decidable problems that are out of distribution, but you can give a human instructions on how to do something they have never done before and they will follow them with no problem.

> More than that -- It's not clear that what humans do is not "just" a memorization.

I hope it is clear that I did not memorize this message I am writing here and that it is the unique result of processes inside my brain that were not captured in the training process of an LLM.

>We can always look at human experience mechanisticly and say that we don't think -- we just memorized thinking patterns and apply them when speaking and "thinking"

Again you are trying to twist this in an absurd direction. Let's come up with a teleoperated humanoid robot on Mars that is controlled by a human on Earth. The robot acts exactly like a human does. Does this mean the robot is now capable of reasoning and thinking like a human, simply because it is replaying a recording of the human's body and speech? This is the argument you are making. You're arguing that the robot's ability to replay a human's actions is equivalent to the processes that brought about that human action.


  > Let's come up with a teleoperated humanoid robot on Mars
One example I've always liked is from Star Trek. They got holodecks and no one thinks those are sentient people even though they are adaptive.

I don't care what Ilya said: mimicking a human does not make a human. If it looks like a duck, swims like a duck, and quacks like a duck, then it's probably a duck, but you haven't ruled out an advanced animatronic. In fact, I'm betting people could make an animatronic right now that would convince most people it is a duck, because most people just don't know the nuances of duck behavior.


I think your "advanced animatronic" is a duck until you can devise a test that cleanly separates it from a "real duck". A test of "duckness".

If today's LLMs are not really intelligent (hardly anyone is arguing that LLMs are literally humans), then by all means, devise your intelligence test that will cleanly qualify humans (and all creatures you ascertain are capable of intelligence) and disqualify LLMs. In good faith, it should be something both or all can reasonably attempt.

I'll save you some time but feel free to prove me wrong. You will not be able to do so. Not because LLMs can solve every problem under the Sun but because you will at best create something that disqualifies LLMs and a good chunk of humans along the way.

Anybody who cannot manage this task (which should be very simple if the difference is as obvious as a lot of people claim) has no business saying LLMs aren't intelligent.


  > I think your "advanced animatronic" is a duck until you can devise a test that cleanly separates it from a "real duck". A test of "duckness".
I think you're reaching and willfully misinterpreting.

  > You will not be able to do so.
I am unable to because any test I make, or that any other researcher in the field makes, will result in an answer you don't like.

River crossing puzzles are a common test. Sure, humans sometimes fail even the trivial variations, but the important part is how they fail. Humans guess the wrong answer. LLMs will tell you the wrong answer while describing steps that are correct and result in a different answer. It's the inconsistency and contradiction that's the demonstration of lack of reasoning and intelligence, not the failure itself.


>I think you're reaching and willfully misinterpreting.

That's the spirit of the popular phrase, isn't it? I genuinely never read that and thought that literally only those three properties were the bar.

>I am unable to because any test I make or from any other researcher in the field makes will result in an answer you don't like.

This is kind of an odd response. I mean maybe. I'll be charitable and agree wholeheartedly.

But I don't know what my disappointment has to do with anything. I mean, if I could prove something I deeply believed to be true, and which would also be paper-worthy, I certainly wouldn't let the reactions of an internet stranger stop me from doing it.

>River crossing puzzles are a common test. Sure, humans sometimes fail even the trivial variations, but the important part is how they fail. Humans guess the wrong answer. LLMs will tell you the wrong answer while describing steps that are correct and result in a different answer. It's the inconsistency and contradiction that's the demonstration of lack of reasoning and intelligence, not the failure itself.

Inconsistency and contradiction between the reasoning people profess (and even believe) and the decisions they make is such a common staple of human reasoning that we have a name for it... At worst, you could say these contradictions don't always take the same form, but that just loops back to my original point.

Let me be clear here: if you want to look at results like these and say "this is room for improvement", then great, I agree. But it certainly feels like a lot of people have a standard of reasoning (for machines) that only exists in fiction or their own imaginations. A general reasoning engine that makes neither mistakes nor contradictions in output or process does not exist in real life, whether you believe humans are the only beings capable of reasoning or are gracious enough to extend this capability to some of our animal friends.

Also, I've seen LLMs fail trivial variations of said logic puzzles, only to get them right when you present the problem in a way that doesn't look exactly like the logic puzzle they've almost certainly memorized. Sometimes it's as simple as changing the names involved. Isn't that fascinating? Humans have a similar cognitive shortcoming.


I think the results still tell us something.

Discrepancies in mathematical ability between the various bases would seem to suggest memorization as opposed to generalization.


>> The arithmetic questions only have 1000 tests where they add two digits. This is certainly in the training data.

Yeah, it's confirmation bias. People do that sort of thing all the time in machine learning research, especially in the recent surge of LLM-poking papers. If they don't do that, they don't have a paper, so we're going to see much more of it before the trend exhausts itself.


Is there a good reason to exclude abductive reasoning from an analysis like this? It's even considered by at least one of the referenced papers (Fangzhi 2023a).

Abductive reasoning is common in day-to-day life. It seeks the best explanation for some (often incomplete) observations, and reaches conclusions without certainty. I would have thought it would be important to assess for LLMs.


My instinct is it is a distinction without a difference in this context. i.e. if deductive is "I watched the cue ball hit the 8 ball, therefore, the 8 ball is moving" and abductive is "the 8 ball is moving towards me, therefore the cue ball must have hit it. I cannot claim to have deduced this because I did not observe it", LLMs cannot observe the situation, so any deduction (in the binary induction/deductive sense) must be done by abduction.

https://en.wikipedia.org/wiki/Abductive_reasoning

Is abductive inference synonymous with Bayesian inference?


I like to think of abductive reasoning as the basis for science that explains natural processes that happened in the past -- like astronomy and geology and evolution -- where experiments are too big to conduct or processes too slow to observe in real-time. So we propose mechanistic explanations for nonobvious outcomes like the formations of stars, or motion of large land mass via plate tectonics or glaciation, or long-range organism speciation over millennia. That's the role for abduction, to explain how all that happened.


No, but agreement with priors is one way one might choose between possibilities.

For example suppose you go outside and the streets are wet. Perhaps it rained, or perhaps someone drove a fire truck around spraying water all over the streets. You might select the former because of its higher prior probability.
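
With made-up numbers, that selection is just Bayes' rule, and the prior dominates because the likelihoods are similar. A minimal sketch (the probabilities are assumptions, purely for illustration):

  prior = {"rain": 0.20, "fire truck spraying": 0.001}
  likelihood_of_wet_streets = {"rain": 0.95, "fire truck spraying": 0.90}

  # Posterior is proportional to prior times likelihood; normalize over hypotheses.
  unnormalized = {h: prior[h] * likelihood_of_wet_streets[h] for h in prior}
  total = sum(unnormalized.values())
  posterior = {h: p / total for h, p in unnormalized.items()}

  print(max(posterior, key=posterior.get), round(posterior["rain"], 3))  # rain 0.995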


Large Language Model algorithms do not reason.

They are statistical text generators, whose results are defined by their training data set. This is why the paper cited reads thusly:

  Despite extensive research into the reasoning capabilities
  of Large Language Models (LLMs), most studies have failed
  to rigorously differentiate between inductive and deductive
  reasoning ...
There is no differentiation because what was sought is the existence of something that does not exist.

The authors then postulate:

  This raises an essential question: In LLM reasoning, which
  poses a greater challenge - deductive or inductive reasoning?
There is no such thing as "LLM reasoning." Therefore, the greatest challenge is accepting this fact and/or that anthropomorphism is a real thing.

I really dislike these non-sequitur arguments of "LLMs do not reason, because <detail on how they work>", as if a detail on how they work unequivocally proves the impossibility of reasoning.

I've noticed that on Hacker News, about 80% of all debates or discussions where people disagree boil down to a disagreement about the definition of a word.

Feel free to extend that to other contexts. It was basically Socrates' argument.

That is certainly true, for some definition of 80%.

I see. Well, I claim my pet rock can think. What do you mean it doesn't have brain cells, what kind of argument is that?

My pet silicon rock was processed by TSMC in a design by Nvidia, and does this trick called inference that very much does look like thinking. Who made yours?

I deem your rock to be disqualified on the basis of its performance clearly having been enhanced by doping.

"This alien super-computer made from entangled iron atoms clearly is not thinking because it doesn't have neurons"

you mean like this?


ChatGPT - Our very own alien super-computer!

  > They are statistical text generators, whose results are defined by their training data set
Honestly I'm pissed at the research community. It's fraud. If you don't know what's in the training data you simply cannot differentiate reasoning from memorization.

Beyond that, the experiments are almost certainly in the training data. Like come on! I feel like I'm going crazy here. How can any coder not think LLMs are training on oct and hex‽

https://news.ycombinator.com/item?id=41422751


Of course they've been trained on oct and hex.

The question is: would the results be largely valid if this was done on a human who had learnt how to perform base 8, 9, 11, etc. arithmetic instead?

I mean, they're clearly not trying to test the ability to derive base arithmetic from scratch.


They are not statistical text generators. They are black boxes wrapped in an interface part of which is a statistical text generator.

You can write programs into transformer models and run them deterministically; they aren't "statistical".


Can you cite the deterministic part? And I don't think fixing your seeds counts as making a model non-statistical, though yes, deterministic.

It's also still statistical if it gives you the same answer 99999/100000 times. You can definitely tune these things to be much more certain about certain problems but it tends to decrease capabilities in others.

I also don't like the term "black box." While they aren't transparent, they aren't completely opaque either. A big reason I don't like the term is that I feel it encourages people not to research this more. While we don't completely know what's going on, I see no reason we can't. Is there some proof that their calculations are an unbreakable encryption, or is it just a hard problem? I'm pretty sure it's the latter, and I think it's good to differentiate "we're dumb" from "indeterminate".


> Can you cite the deterministic part? And I don't think fixing your seeds count as making a model not statistical, though yes deterministic.

I was thinking temperature=0, but what more do you need than that?

(nb there is also some randomness from non-associative FP operations happening in different orders)

> I also don't like the term "black box." While they aren't transparent they aren't completely opaque either.

A white box is a subset of a black box here. The important part is that the sampler/text generation is wrapped around the transformer model and is not actually the same thing as it.


  > I was thinking temperature=0, but what more do you need than that?
Temperature isn't a thing exclusive to LLMs or ML. It's a general technique from statistics where you're modifying the variance of the distribution (also see truncation). And as you're aware, temperature zero isn't actually temperature zero.

So to say such a thing makes something non-statistical is the same as saying prime numbers don't exist because any prime p times 0 is 0.

  > A white box is a subset of a black box here
No. A white box and a black box sit at opposite ends of a spectrum which is just saying how much we understand something. Complexity isn't what makes something a black box; it's our lack of understanding that does. Of course the two tend to coincide, but there are plenty of extremely complex things we have a great understanding of, even if not to infinite precision.

Please take your passion and use it to drive you to learn about these things more deeply. It's actually the same advice I give to the LLM fanboys who think it's AGI. But trying to dispel that myth without correct information makes it harder; it muddies the waters. These topics are extremely complex, and the issue with the ML field is its own hypocritical claim that it is easy ("scale is all you need" while calling these machines black boxes). I mean, we could talk about the elephant in the room, but certainly von Neumann has said enough.


> No. A white box and black box sit at opposite ends of a spectrum which is just saying how much we understand something.

So it's a subset, because "understanding something" is a refinement of the state of "not understanding something" - you can just pretend you don't know what's happening in that part of the system and it is now a black box.


Semantic word games aren't a crowd-winning form of argument.

I don't think the concept of encapsulation in programming is a semantic word game.

Any part of a program can become a black box if you decide it is one (if you can figure out an interface anyway…), and that's good because it means you can swap it out.


It's worse when you play that semantic game to make your point stronger (changing your original meaning) and willfully misinterpret the other person. Everyone loves being willfully misinterpreted.

The issue with comprehending LLMs' internal workings is just that they're so big. The operation of the network itself is deterministic, but its output is a set of probabilities over the value of the next token in the stream. At that point you have to choose one, based on that probability distribution, and then start the process again. That choice is where the randomness comes in.

Just to clarify, are you disagreeing with me? Correcting me? Adding context for others? I guess I'm missing the intent.

Just adding context, elaborating a little on the deterministic / statistical aspects of an LLM. I don't disagree with what you said, I'm afraid - so not much prospect for a back and forth :-)

LLMs are statistical next-word predictors, wrapped in a stochastic next-word generator.

The next-word predictions output by the model itself are based on the patterns and statistics of the training set. The next word that is actually generated is typically a random selection per a top-k or top-p sampling scheme.
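
A toy sketch of that wrapper (illustrative only, not any particular model's actual decoder); as temperature approaches 0 it collapses toward greedy argmax, the deterministic case discussed elsewhere in the thread:

  import math, random

  def sample_next_token(logits, temperature=0.8, k=3):
      # Top-k: keep only the k highest-scoring candidates.
      top = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:k]
      # Softmax over the survivors, sharpened or flattened by temperature.
      weights = [math.exp(score / temperature) for _, score in top]
      tokens = [tok for tok, _ in top]
      return random.choices(tokens, weights=weights)[0]

  # Hypothetical scores for the word after "The cat sat on the".
  logits = {"mat": 5.1, "sofa": 4.7, "roof": 3.9, "moon": 1.2}
  print(sample_next_token(logits))  # usually "mat", sometimes "sofa" or "roof"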


> Large Language Model algorithms do not reason.

My belief is that what LLMs do is best described as approximating the outputs of reasoning. This is different from reasoning itself and also different from simply regurgitating the most similar training examples.

I also think it's extremely challenging for people to understand the significance of LLM output because language is intertwined with meaning. So if you construct a valid and seemingly responsive sequence of words in response to a prompt, there is a strong illusion of purpose behind those words, even if they were assembled in a completely naive fashion.


This! One simple argument is that language is NOT a magical reasoning substance in itself, but a communication medium. It is a medium for passing (a) meaning. So first there is a meaningful thought (worth sharing), then an agent puts a SIGNIFIER on that meaningful thought, then communicates it to the recipient. The communication medium can be a sentence; it can also be an eyewink or a tail wiggle. Or a whistle. The "language" can be created on the spot, if two subjects get the meaning of a signifier by intuition (e.g. I look at an object, you follow my gaze).

So the fallacy of the whole LLM field is the belief that language has some intrinsic meaning, or that if you mix the artifacts of language in some very smart way, the meaning will emerge. But that doesn't work, because meaning occurs before the word. The text in books has no reasoning; it was the authors who reasoned. The machine shuffling the text fragments does not have a meaningful thought. The engineer who devised the shuffling machine had some meaningful thoughts, and the users of the machine have such thoughts, but not the machine itself. To put it another way, if there were an artificial system capable of producing meaningful thoughts, it is not the presence of language which provides the proof, it's communication. Communication requires an agent (as in "agency") and an intent. We have neither in an LLM. As to the argument that we ourselves are mere stochastic parrots: of course we can produce word salads or fake mimics of coherent text, but that is not a proof that an LLM IS the way our minds work. It is just a witness to the fact that language is a flexible medium for the meanings behind it; it can just as well be used for cheating, pretending, etc.


  > One simple argument is that language is NOT a magical reasoning substance in itself, but a communication medium.
I'm wildly impressed by how many people think language is thinking. My best guess is they're conflating inner speech with thinking. But if you can't figure out that the words you vocalize aren't an isomorphic representation of the things you try to convey, well... It's hard for me to believe you've spent enough time thinking about what it means to think. Miscommunication is quite common and so there's sufficient feedback to learn this without being explicitly taught. Then again, there are people in the world that I fear...

> So the fallacy of...

The text from this point on seems to have lost contextual awareness of what preceded it (which was excellent imho).


Define reason before making grandiose claims.

I think "reasoning" is the best word in the English language that I know of to describe what LLMs are doing. Lots of people disagree because for about 2000 years that word always implied consciousness, and LLMs definitely don't have consciousness.

It's not possible to know if anyone has consciousness, other than them themselves.

We must be *very* careful before making authoritative statements about who and what has qualia.


The logic goes the other way. The claim that is extraordinary is that a rock with electricity has consciousness. Nothing else we've built has shown evidence of consciousness and there's good explanations for the results we see without needing consciousness.

Sure, it's still debatable for us humans (the term isn't well defined) but that doesn't change the direction of required proof.


Any assertion has a burden of proof, you are likely referring to cultural/colloquial norms/beliefs on the matter.

There is cultural/normative truth, and then there are various stricter/technical interpretations (using forms of logic other than binary, which causes reality to appear very differently), and making such distinctions is typically culturally "discouraged" (it is rude, pedantic, not what HN is for, etc).


Consciousness is not binary.

The burden of proof does have a direction. It's in the direction of the more complicated thing. Do you think a rock is conscious? What about lightning? If you're 99.999% of people, then the answer is no. So if you want to claim AI is conscious, you need to make the claim of where inanimate becomes animate. And you know what they say about extraordinary claims... But personally, I don't believe in ghosts, even the ones in machines.


See my other comment too, but I agree. Rocks and lightning aren't conscious of anything, because they have no state that represents anything. Amoeba are conscious of some things, because they have state that represents a tiny slice of the world. Cats are conscious of much more, then LLMs and humans in turn.

It's not about animate vs inanimate, it's about models that represent something.

There's also some confusion because we have limited language here. A person can be "unconscious", but that doesn't mean they've lost their world model. It's just currently not executing.


Unconscious brains appear to be functioning normally as far as all of science can tell, except for one big difference: Brain Wave frequencies. Only the lower frequencies exist. Higher frequencies are scientifically proven to be required for consciousness. You can call that a meaningless correlation that happens to be totally inconsequential, but most neuroscientists strongly disagree with that.

EDIT: I mean neuroscientists know brain frequency is more than an accidental correlation with consciousness.


TBH, I don't really know what you're driving at with brain waves. They're not magic. I think you're trying to use "consciousness" to mean "high level brain activity", as opposed to something like the lizard brain? To which I say that you're making a distinction that isn't there. Your lizard brain is perfectly conscious of many things. It's just conscious of fewer things than your prefrontal cortex.

I mention "unconscious" only to help disambiguate an overloaded word. It's used for both "this model includes something in its representation" as well as "this model is currently executing".


I just don't think qualia emerge from computation. I think it's a wave phenomenon, because all the neuroscience evidence indicates that. There's nothing magical about EM waves. Waves are well understood physics, with well understood mathematics governing their behavior.

People often just hear the phrase "brain waves" and immediately think of "woo woo spirituality" because all the spiritual crack pots are constantly talking about waves without even having a physics background.


What you keep describing is computation. Or so it seems, as best as I can tell, because "brain waves" and "EM waves" aren't very detailed descriptions. Luckily, computation is just as vague.

I'm actually leaning towards a postulate that Schrodinger waves are the core of the dynamic we call qualia (which is quite different from the EM field itself), but at some point it has to interact with the EMF in order to affect motor neurons and cause actions.

As a physicist, I've NEVER heard anyone say "Schrodinger Waves." Surely you mean "probability waves".

How about picking up a physics textbook? The standard for quantum is Griffiths's Introduction to Quantum Mechanics. Might I also suggest his E&M book (again, a standard). If you're half as passionate as you portray, I'd also pick up Jackson. That's the standard graduate-level E&M book (but a warning on this: to get through Chapter 1 you're going to need to be intimately familiar with PDEs and Green's functions. To get further, I sure hope you like Bessel functions).

Take your passion and use it to drive you to prove. Don't just conjecture and don't try to derive things from first principles. You will get it wrong because you won't actually be working from first principles. Again, don't ignore humanity's greatest invention. It's so much easier to catch up than reinvent the wheel.

Also, you should become familiar with Penrose. I'll leave it to you to figure out why.


Stuart Hameroff's work on microtubules is, I think, the closest to discovering quantum effects in consciousness, and I tweet with him a lot on Twitter. He hasn't yet been convinced that brains are entangled with PAST copies of themselves, which is a core precept of my theory. The other person getting close to discovering the truth of my theory is Leroy Cronin, who has come up with "Assembly Theory," which is very reminiscent of my theory but lacks the aspect of entangled negentropic structures being able to self-resonate. And Michael Levin's work showing how he can get worms to have two heads even with the same exact DNA as the single-headed ones (just as one example) is something I can derive directly from my theory. Since you're obsessed with my credentials: a BS in Mech. Eng., decades of studying QM, and a software developer by profession since 1991. But thanks for the book recommendations, and most especially the genuine politeness.

I'm not obsessed with your credentials, I can just tell you haven't read much of the literature.

ad hominem for the win! Nice one.

Rhetoric FTW also.

Have you done substantial reading?


  > because they have no state that represents anything.
Are you sure about that? Both have lots of representation to them. Rocks have their atomic structures, and you can easily view those as code. You can also view lightning as a representation of the intersection of cloud and ground states.

The point I'm making is that representation is not enough. Storage (or memory) is not enough; even recall is not enough. If it were, then any mechanical process would be, because you can build computation machines out of nearly anything and memory out of almost anything. And the hello-world Python program I wrote has both!

You can argue scale, but we see animals with much smaller scale than these ML models perform unambiguous reasoning tasks and critical thinking. You can then argue architecture and data, but then that's not just scale.

The problem with all these people making claims is that they don't consider their base assumptions, and they don't follow through on what their model (your mental model of what consciousness is) implies and must mean! So many people argue that they are working from "first principles" but I don't see anyone writing down their fucking axioms.

Of course you should try to prove what you want, but not at the cost of being wrong. Doing so will prevent you from achieving what you want. I'm just so tired of people shooting themselves in the foot and calling it medicine. We need to stop with all the bullshit, because I actually want to build AGI.


AGI exists now. We need better terminology.

You want something more, and that's fine because I do too. Our state-of-the-art models are baby AGI, because they can accomplish a wide variety of tasks without being specifically trained for it. They can do limited reasoning on new tasks. That's intelligence that's both artificial and general. But it's just the start.

There are a few axes that need to be captured: breadth of knowledge, ability to reason, and ability to update the model based on new information. LLMs are great at 1, possibly superhuman. For 2, they're 1/10 on a scale of 0 to human. For 3, fine-tuning and prompt stuffing get the job done for now, but they're a far cry from human ability.

And yes, you can build computation machines out of nearly anything. If you simulate a human brain with sticks and stones, what does it matter the substrate?


  > AGI exists now
AGI exists if we redefine AGI to mean what we have now. But otherwise, no, it doesn't. What we have today does not meet the criteria that have been standard for the last half century[0]. I know people have a hard time keeping up, but no, the Turing Test is not the accepted threshold and hasn't really ever been. It was a starting point, and I should mention we've beaten it since at least 1967 (less than 20 years after Turing's proposal). Nor is chess, which we succeeded at in 1997.

What I hate about these conversations is that people learn the basics, but there have been a lot of advancements since the 50's. We've learned a lot since the 90's too! Yes, the metrics keep changing, but not in response to the current state of the art in a specific effort to make the systems not AGI[1]. Rather, the criteria keep changing because we keep learning. It would be ludicrous if they didn't! Because WE STILL HAVE NOT CONCRETELY DEFINED CONSCIOUSNESS, REASONING, THINKING, and many other similar terms. The criteria are generally guides created by best guesses. But until we have formal definitions, expect them to change (and when we do have formal definitions, be sure as hell to understand their assumptions!).

But I can confidently tell you that LLMs do not exhibit generalization in the way we've been using it for decades. This is easy to demonstrate but if you're a true believer then there's no convincing. And in that case, just don't bother arguing either.

I really do encourage people to take their passion and use it to drive gaining a deep expertise on these subjects. You're passionate right? I hope that passion is not about "being right" or "having answers" and is actually about "understanding." And if you can't differentiate that, I suggest starting at the beginning, since these topics are covered there.

[0] Sure, you can find examples of people stating criteria, but I can find you samples of idiots, even among scientists.

[1] There are some bad faith people who do this but they're terrible people too. You may freely disregard their opinions.


>Are you sure about that? Both have lots of representation to them. Rocks have their atomic structures and you can easily view this as code. You can also view lightning as a representative of the intersection of cloud and ground states.

Sure; then, following OP's model outlined here https://news.ycombinator.com/item?id=41427796, rocks are simply conscious of the passage of time. Perhaps it can be said that that is the one thing all matter in the universe is conscious of.

>The point I'm making is that representation is not enough. Storage (or memory) is not enough even recall is not enough.

You don't know what is or isn't enough.

And by the way, what 99% of people think of a concept they can scarcely define, test for or prove doesn't mean shit. We are regularly wrong about concepts with much stronger standings.

Humans don't have any special insight to consciousness. We can't even make decisions without unknowingly altering our preferences and fabricating reasons in the process. What makes the conscious mind tick was clearly lost along the way.

>We need to stop with all the bullshit because I actually want to build AGI

No, you want to build the idea of AGI that exists in your mind. Clearly not Artificial General Intelligence seeing as what exists today checks all those boxes.


> It's in the direction of the more complicated thing.

You are incorrect. Any claim of fact has a burden of proof.

> So if you want to claim AI is conscious

You made a claim that it is not. How did you determine that is True? Or did you?


It's not the silicon that's conscious of things, it's the model that happens to use that silicon.

Again, at what point does the program become conscious?

  ```python
  def main():
      print("Hello gaganyaan")

  if __name__ == '__main__':
      main()
  ```
That program is not conscious.

So... The burden of proof is about what level of complexity is sufficient.

Sure, no one has this answer. But also, despite a few billion people believing in ghosts, including a large portion of those saying they've seen ghosts, the burden of proof is to show that ghosts exist, not to show that ghosts don't exist. The reason is that one is falsifiable while the other isn't.

Don't confuse "the absence of proof is not proof of absence" with "the absence of evidence is not evidence of absence." If you do you might start a war in the middle east.


Computer chips don't have any brain waves. Even some theoretical computer chip able to run a "Perfect in Every Way" simulation of a brain including even simulating the physics of the actual brain waves...still has no brain waves. Nothing but 1s and 0s turning on and off, and not even turning on and off in anything resembling a wave pattern at all.

This is why people who believe the "Consciousness is a Computation" view never want to discuss brain waves, despite the overwhelming evidence that brain waves are precisely the thing out of which consciousness is composed.


Why would a perfect simulation of a brain not have brain waves? They're not magic. I'm perfectly happy to talk about brain waves. I suspect anyone else in the "Consciousness is a Computation" camp is happy to as well, I have no idea where you got that idea from.

I'd be interested in seeing any of this overwhelming evidence, but it's also not relevant, unless you're arguing that brain waves are supernatural in origin and can't be simulated. I think the crux of your argument can be boiled down to: "What's inferior about a perfect brain simulation, regardless of medium?"


I can write a computer program that analyzes, or even simulates, massive magnetic fields, but that program doesn't create any actual magnetic fields, right? Sometimes people say don't "Confuse the Map with the Territory" right? You get what that means surely.

You're correct about the magnetic fields[0]. But you can write a program to control things that do make magnetic fields.

But I don't think this is a relevant analogy. It seems that you're arguing a specific form of the embodiment hypothesis: that consciousness requires a body. Surely this is a sufficient condition (and is why I think roboticists have a good chance at cracking AGI), but is it a necessary condition? I'm not sure anyone has any convincing evidence that cognition requires "a body" (is a standard computer "a body"? What about a laptop that has a camera? What body parts are necessary?), and if it does, it is unclear what constitutes the minimal necessary requirements. It is quite possible that this is all "software" (e.g. a person in a vegetative state may be completely oblivious to their surroundings but still may have conscious experiences. Most people do this every night!).

I think you are erring too far in the other direction. While I highly disagree with the person you're responding to, I do not think you are making strong arguments yourself. With so little known about what actually constitutes consciousness, I think it is difficult to make bold claims about what it is (though we've ruled out things that it isn't, e.g. a rock; which is our typical method of reasoning in science: we generally disprove things, not prove them. Like you said, the map is not the territory, and in this same way our models (yes, plural) of physics are not physics itself).

I think there is evidence to believe that consciousness may be "software", but that is not proof. Similarly, I think we have lots of reasons to believe that embodiment is highly beneficial to creating consciousness, not just by the nature of the examples we have, but because our understanding of consciousness is in part about responding to environments and stimuli. But that's clearly not sufficient either, and pinning down where this lies is quite complex (even our definition of life is ill-defined). I'm highly convinced that embodiment is the best path to pursue to achieve AGI, but I'm not convinced it is required either.

[0] If we get pedantic, we should note that any simulation has to run on hardware and that creates magnetic fields but these are not in fact necessarily the fields being simulated. I don't think we need to pursue this avenue for this conversation.


I don't think consciousness/qualia requires a body, but definitely a brain (excluding plantlife to keep this short). I think it's a dynamic phenomenon involving waves. As I've said, all neuroscience data is consistent with it being waves.

I think neurons do 4 distinct types of things. They 1) gather data (sensory inputs) and route it to 2) locations where it's oriented into 3D spatial shapes that function essentially like "fractal radio antennas" [for lack of a better description] where the 3) waves resonate to create what we call qualia, and then 4) the waves induce a current/action going to motor neurons. It works similar to a television set working in reverse, with the screen being an eye, and everything going backwards in time, not forwards. Pixels are sensors, not lights, in the TV analogy. Thousands of pixels are input (rods/cones) and it all routes to (not from) an antenna.

You can explain all of that with Maxwellian Electrodynamics. There's no "woo woo" there. It's physics. If you look at it like this, you can see that most of the neural network wiring is I/O and signal routing, except for the antenna-like structures, the best of which being the Corpus Callosum and Hippocampus, which evolved first, which is why they're deepest in the brain. Their physical shape is critical, because they are both indeed functioning as a kind of resonator/antenna, and are getting memories (what most of consciousness is built on is memories) thru wave effects.

CORRECTION: You don't need a biological brain for qualia, but you need something that generates the same actual wave mechanics in actual physical reality. 0s and 1s in a computer ain't that.


  > but definitely a brain (excluding plantlife to keep this short)
"Brains" are quite diverse and come in many variations. Squids have donut-shaped brains. Sea squirts devour their own brains. Leeches have multiple brains. Octopuses (cephalopods) don't have a centralized brain; theirs is distributed, mostly in their tentacles. They also don't show cognitive decline when losing limbs. Maybe Wittgenstein couldn't understand a lion, but certainly a cephalopod is completely alien.

  > I think neurons do 4 distinct types of things. 
I'm a bit curious, how much have you read about brains? What about signal theory?

I ask because I suspect you're trying to solve the problem yourself and not building on the wide breadth of literature and research we have.

Brains do not operate in waves the same way a wave of water or an electromagnetic wave does. Anything can be approximated by waves, but signals need not be waves. Yes, neurons conduct electrical signals, but neuron-to-neuron communication (synaptic transmission) is chemical; these are discrete particles. And as I mentioned before, if we're talking about the electrical transmission inside a neuron, those signals are spikey [0, 1]. When you look at neurons firing, there is a digital-like signal. If anything, these look like Gaussian mixtures.

  > There's no "woo woo" there.
I agree that there's no magic, and so I ask that you don't introduce it. You're imbuing waves with magic while not recognizing what waves actually are. Analogue and digital signals are both waves. They both can integrate, resonate, interfere, and all that jazz. The same wave mechanics apply. Your call to Maxwell is meaningless because ALL OF ELECTROMAGNETISM (minus quantum) is explained through those 4 equations (and they have been extended to incorporate the latter). And just think about what you are suggesting. That digital signals are not described by physics? That software can't be described by physics? It's all physics. Sure, the abstract representations of these things are not physics, but that's true for any abstract representation.

So don't try to dispel people who are calling to magic by making a call to a different kind of magic. I don't think you're dumb, but rather it is easy to oversimplify and end up with something that requires magic. It's FAR easier to do this than to get the right answers. There's a reason people study very narrow topics for decades. Because it is all that complicated and trying to reinvent their work (while a good exercise) is just more likely to end up making many of the same mistakes that were then later resolved. We have the power of literature and the internet makes it highly available (though can be hard to differentiate from hogwash and it can be hard to find truth). Use the tools to your advantage instead of trying to do everything from scratch. Do the exercise, but also check and find the mistakes, improve, and do this until you catch up. Hell, you may even find mistakes the experts made along the way! The good news is that its SO much easier to catch up than it is to extend knowledge. You can learn what took thousands of years to get to in a week! It's hard, but in comparison catching up is trivial. So don't squander humanity's greatest innovation just because you worked hard at building your model. But refine it and make sure it doesn't invoke magic.

[0] https://en.wikipedia.org/wiki/Biological_neuron_model

[1] https://en.wikipedia.org/wiki/Soliton_model_in_neuroscience


Every sentence of that was pure strawmanning, so this is where we stop talking; but you were a good sounding board for a few back and forths up until here.

Asking you to step up your game is not strawmanning. But if you're unwilling to then I guess the other comments should have made it clear this would happen

There was no strawmanning

Strawmanning is when someone intentionally invents an incorrect interpretation of someone else's words, and then disproves or shows the faults with that wrong interpretation. That's precisely what we saw above.


Asking clarifying questions is the exact opposite of Strawmanning. Rather than assuming someone is wrong, it's better to ask for a clarifications first. That's what's happening in #41440133.

  > This is why people who believe the "Consciousness is a Computation" view never want to discuss brain waves, despite the overwhelming evidence that brain waves is precisely the thing out of which consciousness is composed.

Fwiw, I do actually believe that the brain is a computer. What I don't like about "consciousness is a computation" is that it's utterly meaningless. Everything is a computation, because computation is a fairly abstracted concept (same with simulation: yes, we live in "a simulation", but that doesn't mean there are entities on the "outside" programming it. A ball rolling down a hill is also a simulation...).

But I really disagree with your argument. The difference doesn't come down to analogue vs digital, and that seems like quite a bold assumption, especially considering that 1) digital signals can approximate continuous signals quite closely and can represent them, and 2) the brain doesn't function entirely continuously. Neuron signals are quite spikey. The neuromorphic-computing people drew their inspiration from these and other aspects of the brain. But the brain is not operating on just continuous signals; there's a lot going on in there.

But if your true complaint is about how few of the ML people, and far fewer of the ML enthusiasts, who passionately discuss these subjects will actually read the literature from neuroscience, mathematical logic and reasoning, and other domains, then yeah, I also have that complaint. When you're a hammer, I guess...


Do you think a computer simulation of an EMF wave is an actual EMF wave? If everything, according to you, is just a computation aren't you also claiming that actual EMF waves are "just computations"? Also does the substrate for the computation matter? What if I'm moving beads on billions of abacuses (as a thought experiment), are those abacuses therefore identical to an EMF wave if they're calculating wave mechanics? How does this abacus know I'm doing EMF wave calculations instead of, say, quantum probability wave calculations?

  > Do you think a computer simulation of an EMF wave is an actual EMF wave?
Of course not. I'm pretty sure we've also established this.

  > aren't you also claiming that actual EMF waves are "just computations"?
No. But actual real world EMF waves perform computation.

You are confusing what simulation means, which is why I explicitly put it in quotes. There's a dumb "simulation hypothesis" (the likes of which people like Elon Musk discuss) and there's a "yeah the terms are broad enough so it can be viewed as a simulation - simulation hypothesis". The latter isn't really contested because it also isn't very meaningful in the first place. But it does matter here for what you're talking about and why we brought up that a rock has "memory" since we can store information in the atomic structure. You're missing the abstraction of the stuff and how we're just talking about entropy.

Look at it this way: how does NAND flash store memory? You trap electrons in states, right? That is to say, you isolate charge on one side of a transistor gate from the other. You can also do this with magnetic storage. There are plenty of people who build macro-scale computers with things like water, and in those you're separating buckets of water. Physicists love to build computers out of things, and there are even thermal computers, using heat flow for computation.

That's the thing: computation is just about differentials in entropic states. So literally everything is a computer and everything is computing. Thanks, emergence (another extremely misinterpreted word that people believe holds much more meaning than it does). But this does not mean there's some programmer, god, or any of that. It just means that information exists... You're making the same mistake these Musk-like people talking about "living in a simulation" make: conflating the computer you're sitting in front of now, which you control and have power over, with just information flowing (obviously you can affect things in the physical world, and that's why experiments are physical simulations -- since they are controlled and limited settings -- but good luck programming "the universe", whatever the fuck that means. You're going to need more energy than it's got...).

So view physics as the instruction set of the universal computer or whatever. We're dealing with huge levels of abstraction so you can't go about understanding these things with reference to how we normally operate (this is why physics is so fucking hard and why advanced math -- well beyond calculus -- gets so crazy). Just like you can't work with high dimensional data by trying to relate to things in 2 or 3d, because the rules are very different at those abstraction levels (for example, the concept of a distance measure becomes useless. This is the curse of dimensionality that people very often misinterpret due to looking at it in a limited scope).


I think this field of discussion needs a new word for a new type of fallacy I'll name "Appeal to Base Reality" (ABR).

When someone makes a conjecture about physical reality like "Consciousness is made of EMF Waves, not Computations" (like I did) and then someone else tries to refute that with an appeal to the Simulation Theory (like you did) then that's called "Appeal to Base Reality".

Even if we're living in a simulation, we should still consider EMF waves a very different phenomenon from computation. Computation is not a phenomenon of physics. Computations can be done on an abacus.


Much of the confusion here is from terms that you use like "become conscious", which binarizes it.

When run, that program will have a very simple model that includes only a few things like stdout. More complex programs will have models that represent more things. A game of tic tac toe written in Python is conscious of the state of the board, because it models it.

It's not "when does it become conscious.", it's "how much is it conscious of?". Asking the prior question is like asking "When does someone become rich?" or "When is the apple ripe?"

This isn't about ghosts. This is exactly a way of looking at it that provides a concrete, falsifiable definition, because far too many people are talking right past each other whenever "consciousness" comes up. This is trying to get past the magic words, and get at something useful.


The Map is not the Territory. In all your posts you seem to believe a map of data equals consciousness/qualia experience of said data. Computer 0s and 1s make Maps. Qualia is an actual Territory. There is nothing you can do with maps that equals a territory.

How do you know that qualia is an actual Territory?

In that popular expression the word Territory means what is real, and Map means what is merely a symbolic representation of reality.

I'm aware. How do you know that qualia are truly what is real? I don't see any reason that should be so

Well, your sentence "I'm aware" is my most recent bit of evidence. lol.

  > like "become conscious", which binarizes it.
Only if you pigeonhole me.

There are countless things where we recognize they are on a spectrum yet we use shortcuts in language to specify a sufficient threshold. If you don't believe me, may I introduce to you the color blue. Or any other color for that matter.

There is still the point that even if we take consciousness to lie on a continuum, that does not mean the function yields a non-zero (or even approximately non-zero) value before some threshold. In that case there is no ambiguity in "becomes conscious": you can read it as the weakest form of consciousness. If you believe consciousness is only zero at the domain minimum then please say so, as I believe you are smart enough not to misinterpret my meaning and we can actually have a productive conversation. If you believe that, and also that it's a very slowly growing function until a certain point, well... if you're going to be fucking pedantic then be fucking pedantic. In either case, you don't have to force this conversation into an argument and reconstruct it so that you're smart and I'm dumb. Such relationships and definitions are typically obtuse anyways.

So please, engage with at least a little good faith here.

  > This isn't about ghosts.
If you want to purposefully misinterpret me, okay. We're done.

Don't talk about people "talking past one another" when you are egregiously misinterpreting. In any conversation you have to make sure that you are correctly decoding the other person's words. At least if you do indeed want to "get at something useful."


I realize you're using ghosts as an analogy. I'm saying this isn't about anything mystical. I'm exactly trying to avoid any woo, by introducing more concrete definitions.

I'm not trying to pigeonhole you. I do believe that consciousness is only zero at the domain minimum and it's a slowly growing function, but I disagree that there's any sort of "certain point". Humans are "more conscious" than cats (and better terminology is that they are conscious of more things than cats, or their world model encompasses more than cats, or something like that), but that doesn't mean humans are conscious and cats aren't.

My entire point is that "consciousness" is entirely unlike "blue". You can ask the question "is this blue?" and have a coherent answer. You can't ask the question "is this conscious?". You need to ask the question "what is this conscious of?".


  > I realize you're using ghosts as an analogy
I think you are missing the point. The analogy is about why the burden of proof lies in a certain direction: the error in logic you're making is the exact same one that those who argue for ghosts make. It is also a nod to the "ghost in the machine".

  > I'm saying this isn't about anything mystical.
You may think this, but you are relying on mysticism. Don't take this as me calling you dumb -- it's so fucking easy to accidentally invent ghosts. There's a reason it takes so long to invent new knowledge, and why it took humans so long to get to this point: we keep unintentionally inventing ghosts along the way. It's the same advice I gave to the other person you're arguing with: stop trying to do it all on your own. Maybe you are smarter than the millions of people who have tried to figure it out before you, but surely you recognize that leveraging the work of others will greatly increase your chances of success, right? You didn't try to invent calculus from scratch, so why this?

  > I'm not trying to pigeonhole you.
I don't believe you are trying to, but there are plenty of things I try not to do and fail at. I am frustrated, but not angry. What makes you think I don't think humans are more conscious than cats? I've explicitly stated we're in agreement on a continuum. But you do seem to have ignored the part about the semantics of thresholds, and come on, if you understand, don't argue something you know I'm not arguing. If you think the semantic difference is critical, then be fucking pedantic; don't keep talking at such a high level. But maybe you think our fictitious consciousness function grows linearly. I would highly doubt that. There's quite a large gap between many creatures, not to mention that even within a single person we see consciousness develop rapidly during childhood. So I don't know why you're harping on this point, because I haven't seen a single person in the comments contend with this argument.

  >  You can ask the question "is this blue?" and have a coherent answer. 
But this is wrong! This is demonstrably false. We don't even have to look at people with colorblindness, people from history[0] (I specifically used blue because the history makes this far more apparent), optical illusions[1], or other (mother) languages[2]. While everyone is going to agree that #0000ff is blue, people are not going to agree on #7B68EE, which plenty will call purple. Here's a literal example[3]. This is the point of the example: it is easy to think that these are discrete terms, but concepts like blue (or any other color) are categorical representations, not discrete ones. They also have multi-dimensional boundaries, and there's a continuum of disagreement among people as you move from one category to another. It helps to create subcategories, but turquoise is still a blue that many people will call green[4]. Worse, the same person can disagree with themselves, and no way is that coherent. Fwiw, I score 0 on this test[5], so if you want to see if we disagree you can check (I've also been tested in person).

  > You need to ask the question "what is this conscious of?".
Whatever answer anyone gives, is almost certainly wrong.

The thing is, everything is fuzzy. Embrace the chaos, because if we're going to talk about brains and consciousness and abstractions, you're not going to be able to work with things that are concrete. But this too is fuzzy, and I think you're likely to misinterpret it. Just because everything is fuzzy doesn't mean there aren't things that are more wrong than others; it's more that you can't have infinite precision. So stop trying to deal in absolute answers, especially on topics that people have been unable to solve for thousands of years.

[0] https://en.wikipedia.org/wiki/Blue#History

[1] https://en.wikipedia.org/wiki/The_dress

[2] https://en.wikipedia.org/wiki/Linguistic_relativity_and_the_...

[3] https://www.reddit.com/r/colors/comments/1ao2osm/is_this_col...

[4] https://www.color-meanings.com/shades-of-blue-color-names-ht...

[5] https://www.xrite.com/hue-test


Define consciousness before making that claim either.

I'm fine with the standard dictionary definition in this case. :P

There are as many definitions as there are dictionaries, but let's try picking "awareness". LLMs are aware of many things, so they are conscious of those things, meaning they display some level of consciousness.

Nice one. Now do "qualia"

Qualia is a silly god of the gaps style argument. There's no difference between "seeing red" and "having your neurons tickled so that you think you're seeing red".

People really don't understand consciousness, but it's actually quite simple when you realize that the question isn't "is this conscious?", it's "what is this conscious of?". A rock is not conscious of anything, because it has no state that represents anything. A sunflower is conscious of the sun's position, because it has state that represents that. A cat is conscious of much more, such as "this mouse is food". An LLM is conscious of much more, but less than a human.

The LLM is a model that is conscious of many things, as experienced through the medium of human text.


The fact that we can tell even from an EEG (a grossly inaccurate measurement of brain activity) whether a person is conscious or unconscious indicates there's a physical reality to it, one that science might be able to understand some day.

I think Qualia is a wave phenomenon, and all wave phenomena can simultaneously be described as "existing" and "not existing", as follows:

Think of a football stadium "wave". It has a beginning, ending, location, velocity, etc., all being scientifically measurable observables. Yet I ask you "Does the wave exist, or is it merely people in various states of arm positions?" There is no right or wrong answer. This is the nature of waves.

So if you want to argue consciousness "does not exist" I say fine, that is a valid "perspective", just like the denial of the existence of any other wave phenomena is a valid perspective. However keep in mind also that the entire universe is also made of waves.


Brain waves existing or even being crucial to consciousness in humans doesn't really say anything in itself.

Like for all anyone knows, they're simply present to trigger (and keep triggered) "consciousness computations".


If qualia just is the waves themselves, then that [arguably] disproves the "Substrate Independence" view, which holds that consciousness is a set of computations. You can build logic gates out of dominoes or even physical abacuses, but if qualia is made of waves, no amount of "simulating it" via calculations is going to truly conjure it into existence.

Sure, waves exist, whatever. Not to be rude, but so what? This kind of comes across as "quantum waves therefore the Universe Is Love" or something. What about the waves matters here?

I'm not arguing that consciousness doesn't exist, I'm saying that it's not a binary thing, and trying to provide a useful, falsifiable definition of it that provides a way to actually communicate about it instead of talking past each other.


I partially answered this on the other post. Some people think Qualia is emergent purely from computation. Some people (like me) think there's a different and very specific and very unique thing going on inside brains that is responsible for Qualia, and I think to understand it scientifically it's going to be a "Wave Theory".

I'm not even pushing the panpsychism view that "Everything has Consciousness, because all atoms have EM fields" at all, but when you get right down to it, EM fields are more "real" than a computation is. A computation is purely theoretical. We can build computer logic circuits out of water pipes or dominoes. So it's just obvious to me Qualia is not built on computations.

Believing Qualia is made of computations is more of a naive "Universe is Love" type position than believing that Qualia is built of something REAL like EM fields. Calculations are not even real. For example, the number 0100101110110 in a computer is purely a "label" for something real. Numbers in computers are just labels. EM waves are the actual thing, and actually real. The Map is not the Territory. Computers are Maps. Qualia is a Territory.


Well we're not going to convince each other of fundamentally different worldviews over the internet. Until proven otherwise, I think qualia is just a trick smart people play on themselves to convince themselves humans are special because it makes them feel better, and indelibly tied to the concept of a soul and like woo. I'm willing to be proven wrong, and that's what science is for. Hopefully we'll see answers in our lifetimes.

I think it takes more "faith" (belief without evidence) to deny that Qualia is real than it does to accept that it's real. Of course, even the word "real" is problematic because of the simple analogy: "Are football stadium waves real, or is it nothing but people's arms moving?" Anything that's purely a dynamical set of state changes can be legitimately argued to "not exist".

The grandiose claim would be that LLMs can reason to begin with. It is quite normal for things to lack reasoning capacity.

They can be trivially shown to reason. It's quite normal for many things to have reasoning capacity, including amoeba, sunflowers, cats, LLMs, and humans.

The real issue is establishing the limitations of their reasoning capabilities, and how to improve them.


As has been noted in other threads, for a low enough definition of reason, yes, certainly. But even an amoeba appears to reason better than an LLM, which only regurgitates by nature. There is no independent cognition whatsoever, no logical constructs and decisions, no abstractions.

Reducing reasoning beings to the level of AI is an affront to -- and demonstrates a genuine lack of understanding of -- the nature of organisms and reason itself, and of the nature of AI and its capabilities.


LLMs are statistical text generators whose results depend on the model and the context given. They have gotten so good because the context they can effectively operate over keeps getting really big. If you take the model and operate on a small context, you will get very uninteresting results.

The only reason it seems like it is reasoning is because it's probably stuffing a lot of reasoning into its context, and regurgitating that out in ways that are statistically weighted against other things in the context related to what is being reasoned about.

Frankly, even most commenters on HN don't get how LLMs operate, thinking the model itself is what knows about different bases like hex and oct, when really the system searched up a bunch of material on different bases to include in the context before the model was ever activated.


Brains do not reason - they are a neural network whose results are defined by their experiences, memories, and sensory inputs.

I'm tired of reading comments from people who keep repeating that LLMs don't think, don't reason, aren't intelligent because they are not human. If your definition of the above boils down to "it's not human", it's quite useless as a comment. We know LLMs aren't biological human brains. So what?

Define what reasoning is to you. Then tell us why LLMs don't reason and why it matters.


Not OP. But:

1. Reasoning is at least the ability to carry out proofs in FOL (first-order logic). FOL can simulate Turing Machines.

2. LLMs are formally equivalent to only a subset of FOL.

Why is this important? To model human mathematics, you need at least first-order logic.
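To be concrete about the kind of inference step I mean, here's a toy sketch of repeatedly applying modus ponens (forward chaining). Note that this is only a propositional fragment, well short of full FOL; the rule encoding is made up for illustration:

  # Toy sketch (illustrative only): forward chaining over implications,
  # i.e. repeatedly applying modus ponens (A -> B and A, therefore B).
  def forward_chain(facts, rules):
      facts = set(facts)
      changed = True
      while changed:
          changed = False
          for premise, conclusion in rules:
              if premise in facts and conclusion not in facts:
                  facts.add(conclusion)
                  changed = True
      return facts

  # "A -> B", "B -> C", plus the fact A, yields B and C as well.
  print(forward_chain({"A"}, [("A", "B"), ("B", "C")]))  # contains 'A', 'B', 'C'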

These arguments have been around for decades, e.g., Penrose. I am tired of people bringing up strawman arguments ("Not intelligent because not human!").


Is there a proof that collections of human neurons are capable of carrying out proofs in first order logic, in full generality?

Is anyone trying to prove that all humans can?

Transformers with chain of thought can use FOL: https://arxiv.org/abs/2310.07923

"Turing complete as long as you only use a polynomial number of steps in the size of the input" is another way of saying "not Turing complete" (doubly so when you look at the actual polynomials they're using). In fact, that paper just reaffirms that adding context doesn't improve formal expressive power all that much. And this is not equivalent to "well, Turing Machines are fake anyway because of finite space"--I can pretty trivially write programs in Coq, a total and terminating language, that will take far longer than the age of the universe to normalize but can be described in very little input.

Can a cat reason? It knows nothing of FOL or math, but can be very smart when figuring out how to catch a mouse.

The term 'reason' is being overloaded here.

The type of 'reason' a cat uses is different from the 'reason' used in math.

The information a cat uses is incomplete whereas the information used in math and logic is theoretically all accessible.

The reasoning a cat uses allows for inconsistencies with its model because of its use of incomplete information whereas no inconsistencies are permissible in math and logic.

Formally speaking, math uses deductive reasoning whereas the cat uses abductive reasoning.


> Brains do not reason ...

They can, at times, and do so best when emotion is not involved.

> I'm tired of reading comments from people who keep repeating that LLMs don't think, don't reason, isn't intelligence because it is not human.

LLMs represent a category of algorithm. Quite elegant and useful in some circumstances, but an algorithm nonetheless. A quick search produced this[0] example discussing same.

Another reference, which may not be authoritative based on whatever most recent edit the link produces, is[1]:

  A large language model (LLM) is a computational model
  capable of language generation or other natural language
  processing tasks. As language models, LLMs acquire these
  abilities by learning statistical relationships from vast
  amounts of text during a self-supervised and semi-supervised
  training process.
> Define what reasoning is to you.

Reasoning was the process I went through to formulate this response, doing so with intent to convey meaning as best as I can, and understand as best as possible the message to which I am replying.

> Then tell us why LLMs don't reason and why it matters.

LLMs do not possess the ability to perform the process detailed above.

This is why it matters.

0 - https://github.com/rasbt/LLMs-from-scratch

1 - https://en.wikipedia.org/wiki/Large_language_model


> Reasoning was the process I went through to ...

That's not a useful definition for judging whether LLMs reason or not. It's not something we can measure objectively, and it introduces another concept, intent, which is just as vague as reasoning.

Specifically, an LLM can produce a similar message to what you posted. Everything else about that process is not defined well enough to differentiate you.


> > Reasoning was the process I went through to ...

> That's not a useful definition for judging whether LLMs reason or not.

I formulated this portion of my post specifically to induce contemplation as to what it is when we, as humans, reason. My hope was that this exercise could provide perspective regarding the difference between our existence and really cool, emergent, novel mathematical algorithms.


Ok, that's cool, now ask GPT to solve any programming or logic problem at all and maybe you'll start to understand why you can reason and why it can't.

It can solve some problems and not others. But that would be again a different definition of reasoning, not what the parent wrote. And it would exclude animals/humans as reasoning, because they can't solve all logic problems.

Artificial intelligence is still intelligence, even if it is just a shallow copy of human intelligence.

What irritates me when I see comments like yours is that precise knowledge of the weaknesses of LLMs is necessary to improve them, yet most of the people who claim LLMs reason, or are already AGI, basically deny that they can be improved, since they are already perfect. Research into studying the limitations of the current generation of AI becomes unwanted, and by extension so does the next generation of AI.


The authors' conclusion that LLMs can reason inductively very well runs counter to what I've read elsewhere. A big part of doing induction is the ability to generalize a shared pattern from multiple disparate examples, recognizing the essential elements that are necessary and sufficient to satisfy that pattern's operators' constraints. To date, I've seen consensus that LLMs can match verbs or basic relational operators across examples, thereby associating the mechanisms in similar events that lead to similar outcomes. But extending that facility further, to employing predicate-logic operators, or even the simpler propositional ones, appears to fall largely outside LLM capabilities. To suggest, then, that LLMs can perform still higher-order reasoning, like the modeling of contrapositives, seems quite a stretch.

I've just successfully chatted with ChatGPT about equivalences, or at least strong similarities, between QFT, neural networks and cellular automata (referencing Wolfram's work). Does that pattern matching count?

And you were able to verify, of course, that anything new or surprising to you (as in, not a simple derivation of your own prompts) was true?

I noticed that if I ask it to tell me how good cryptocurrencies are, it'll do it, and then if I say I disagree and they're wrong, it'll simply switch and agree with me as well. The thing has no care for truth, no opinion of its own, no ability to insist, and just feeds you whatever is statistically close to your own questions.


No but GPT is really good at fooling laymen who are not experts of a field, and it stands to reason that it just fed you a bunch of bs

Can LLM-based tools even handle the donkey's bridge of syntactic consequence ("A->B & A; B?") yet?

Lagniappe: https://www.youtube.com/watch?v=9B2ww3fiX30


Confession: I haven't read the paper.

But any mention of LLM reasoning ability ought to address the obvious confound: the LLM is trained on examples of deductive reasoning, inductive reasoning, abductive reasoning, SAT-solver reasoning, geniuses' musings, etc. If they replicate one of those examples, then should that be called "reasoning" of any sort or not? Regurgitating those examples may even involve some generalization, if the original topics of an example are swapped out (perhaps by a nearby topic in latent space).

Given that it appears they're training and testing on synthetic problems, this objection probably does not apply to their actual results. But given the fuzziness it creates for the definition of "reasoning" of any sort, I would have expected some working definition of reasoning in the paper's abstract.

Training on Moby Dick and thus being able to regurgitate text from Moby Dick does not mean the LLM is capable of writing a new Moby Dick-like book. (Thankfully; one is more than enough!)


The tasks used are artificially created and don't exist in the training sets. For example, there's very little practical math in base 11 on the internet, or English with explicitly mixed-up but rule-based grammar.
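For anyone curious what such a task looks like mechanically, here's a quick sketch of addition in base 11 (the function names are mine, just for illustration):

  # Quick sketch: addition in an arbitrary base, e.g. base 11 (digits 0-9, A).
  DIGITS = "0123456789ABCDEFGHIJ"

  def to_base(n, base):
      if n == 0:
          return "0"
      out = ""
      while n:
          n, r = divmod(n, base)
          out = DIGITS[r] + out
      return out

  def add_in_base(a, b, base):
      # int() already parses arbitrary bases up to 36
      return to_base(int(a, base) + int(b, base), base)

  print(add_in_base("A7", "58", 11))  # -> "154", i.e. A7 + 58 in base 11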

LLMs don’t feel like a whole brain, they feel like the impulsive side that gets filtered by other processes.

> LLMs don’t feel like a whole brain ...

Because they are not.

LLMs are a better form of Bayesian inference[0] in that the generated tokens are statistically a better fit within a larger context.

0 - https://en.wikipedia.org/wiki/Bayesian_inference


This strikes me as saying "Brains aren't brains, they're merely a collection of neurons that output some voltage in response to their input voltage".

You're comparing the physical item with the reasoning capabilities of that item.

A popular LLM joke at the moment is "How many Rs in strawberry?"

You, a human being, understand this question. The purpose of the activity is to count the letter R in that word and give the one correct answer.

LLMs don't understand any of that. They don't know what words are, what letters are, what numbers are, what counting is, that a question can have one definite correct answer or many answers, what a question is, nor what an answer is.

They break your input into tokens and then look at the most likely set of output tokens given your input. That's all.

If you train a model on enough sensible input tokens and sensible responses then the output will mostly seem human and sensible. But it's never reasoning.


> They break your input into tokens and then look at the most likely set of output tokens given your input. That's all.

That isn't right: the preprocessor provides a lot of material on strawberries and counting r's, which is then prepended to the question... and then the model predicts the next sentence as an answer to the question. The model by itself doesn't know anything; it is just a statistical processor of context. Just tokenizing the question and using the model to predict the answer would actually give you less than a wrong answer -- it would be gibberish. It messes up on the question because the context it retrieves based on the question text isn't useful in producing the correct answer.


>because the context it retrieves based on the question text isn’t useful in producing the correct answer.

"retrieves" is the wrong word. Each token (in GQA a small tuple of tokens is summarized into a single token), becomes an element in the KV cache. The dot product of every token with every other token is taken (matrix multiplication) and then a new intermediate token is produced using softmax() and multiplication by the V matrix. What the attention mechanism is supposed to do is combine two tokens and form a new token. In other words, it is supposed to perform the computation of a function f(a,b) = c. The attention layer is supposed to see "count r" and "strawberry" and determine the answer 3.

Well, at least in theory. Given the combination "count r" and "r", it is practically guaranteed that the attention mechanism succeeds. What this tells us is that the tokenization of the word "strawberry" is causing the model to fail, since it doesn't actually see the letters on the character level. So it is correct to say that the attention mechanism does not have the correct context to produce the correct answer, but it is wrong to say that "retrieval" is necessary here.

The reason why it doesn't make sense to label what is happening as "reasoning" is that the model does not consider its own limitations and plan around them. Most of the work so far has been to brute-force more data and more FLOPS, with the hope that it will just work out anyway. This isn't necessarily a bad strategy, as it follows the harsh truth learned from the bitter lesson, but the bitter lesson never told us that we can't improve LLMs through human ingenuity, just that human ingenuity must scale with compute and training data. For example, the human ingenuity of "self play" training (as opposed to synthetic data) works just fine, precisely because it scales so well.

Instead of complaining so much about humans trying to "gotcha" the LLMs, what we really ought to build is an adversarial model that learns to "gotcha" LLMs automatically and include it in the training process.


> They break your input into tokens and then look at the most likely set of output tokens given your input. That's all.

That's humans as well, possibly. This description gets repeated over and over and is factually correct, but we still don't know if brains do anything more. The "it's not reasoning" may be true, but it doesn't follow from just that description.


They very much do understand words, letters, etc. Why do you think otherwise? Because of one trick question?

You are fooled into thinking that producing a set of output tokens which mimic actual human input and output tokens displays understanding. It doesn't.

"How many Rs in strawberry?" is not a trick question. That's about as straight and factual as a question could be.


Intelligence is a vast spectrum of capabilities. There are people who function normally but can't recognize faces. There are people who can't see or understand anything on their left side. Even savantism is an example where some brains have nearly superhuman capabilities in one area, but are almost mentally retarded in other areas, like common sense, and yes, even letter counting. All the letter-counting failure shows is that maybe all LLMs lean a bit more towards autistic brains than towards the actual average human brain.

Humans have many well-documented cognitive shortcomings and failure modes. Does that mean we don't reason either?

We aren't talking about a human mind which operates in unexpected ways or in ways which don't conform with society. Don't bring an appeal to neurodivergence to an LLM argument.

We are talking about a computer program whose operation we understand, can observe, and can debug. Modern LLMs certainly take a lot of human effort to do this, but it can be done.


This isn't about neurodivergence. Every single human suffers from a long list of cognitive shortcomings. We just call them things like "optical illusions" and find them interesting, but don't then go on to make silly claims like "humans can't reason".

Because humans can reason. There is a difference between faulty reasoning and no reasoning at all.

Ask it this.

"Please spell the word strawberry with a phonetic representation of each letter then tell me how many 'r's are in the word strawberry?"

It's gotten the answer right every time I've tried.


Another one is to just ask it to spell it out with dashes in between each letter. If anyone's wondering, it then gets it right because that changes the tokenization, changing how the model actually "sees" it.
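If you want to see the difference for yourself, something like this works with OpenAI's tiktoken library (just a sketch; the exact splits depend on the encoding, so I won't claim the precise token boundaries here):

  # Sketch: inspect how a BPE tokenizer splits the two phrasings.
  # Requires `pip install tiktoken`; exact boundaries depend on the encoding.
  import tiktoken

  enc = tiktoken.get_encoding("cl100k_base")
  for text in ["strawberry", "s-t-r-a-w-b-e-r-r-y"]:
      ids = enc.encode(text)
      pieces = [enc.decode([i]) for i in ids]
      print(text, "->", pieces)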

>"How many Rs in strawberry?" is not a trick question.

To a system that sees letters and words, sure. To a system that doesn't, you're just asking a blind man to count how many apples are on the table and patting yourself on the back for his failure. It just doesn't make any sense to begin with.

And this is before the fact that human intelligence is rife with absurd seeming failure modes and cognitive biases.

It's frankly very telling that these token related questions are the most popular kind of questions for these discussions.

There is no testable definition of reasoning or intelligence that will cleanly separate LLMs and humans. That is the reality today. And it should make anybody pause.


> you're just asking a blind man to count how many apples are on the table

Which input do you think an LLM lacks that it would need to count the number of letters in a word you literally provide it?


The text is converted to embeddings after tokenization. The neural network only sees vectors.

Imagine the original question is posed in English but it is translated to Chinese and then the LLM has to answer the original question based on the Chinese translation.

It's a flaw of the tokenization we choose. We can train an LLM using letters instead of tokens as the base units but that would be inefficient.


By that definition the LLM literally does not see anything. LLMs predict tokens. That's it.

The LLM sees tokens, and predicts next tokens. These tokens encode a vast world, as experienced by humans and communicated through written language. The LLM is seeing the world, but through a peephole. This is pretty neat.

The peephole will expand soon, as multimodal models come into their own, and as the models start getting mixed with robotics, allowing them to go and interact with the world more directly, instead of through the medium of human-written text.


It sees embeddings that are trained to encode semantic meaning.

The way we tokenize is just a design choice. Character-level models (e.g. karpathy's nanoGPT) exist and are used for educational purposes. You can train one to count the number of 'r's in a word.

https://x.com/karpathy/status/1816637781659254908?lang=en
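The character-level encoding itself is trivial -- roughly the kind of thing a char-level data prep step does (this is a sketch, not nanoGPT's actual code):

  # Sketch of character-level "tokenization": every character is its own token,
  # so the model literally sees each letter (unlike BPE). Not nanoGPT's actual code.
  text = "strawberry"
  vocab = sorted(set(text))
  stoi = {ch: i for i, ch in enumerate(vocab)}
  itos = {i: ch for ch, i in stoi.items()}

  ids = [stoi[ch] for ch in text]
  print(ids)                                   # one id per character
  print("".join(itos[i] for i in ids))         # round-trips to "strawberry"
  print(sum(ch == "r" for ch in text))         # 3 -- trivially countable at this level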


ChatGPT answer: There are *two* "R"s in the word "strawberry."

>> A popular LLM joke at the moment is "How many Rs in strawberry?"

> ChatGPT answer: There are two "R"s in the word "strawberry."

Given enough instances of the "LLM joke" in a training data set, the joke itself having a consistent form (sequence of tokens) and likely followed by the answer having a similarly consistent form (sequence of tokens), the probability of the latter being produced as quoted is high.


There is a lot of broken English on the internet and yet LLMs are better at English than the average native speaker. This failure mode has nothing to do with the training data.

While there is sometimes an exaggeration of the differences, I have always found LLMs to behave like (and have many of the weaknesses of) the left hemisphere of the brain as described in books like “The Master and His Emissary.” Considering the left is the language center, this shouldn’t be surprising.

Transformers are amazing pattern matchers and terrible use of GPUs for reasoning, which is mostly search + execution of highly non-linear programs (lambda calculus).

I love seeing Victor Taelin experimenting with parallelizing these programs (with HVM and other experiments with proof languages), but it's sometimes a bit sad how much time researchers spend making papers about existing things instead of trying to improve the state of the art in something that's most probably missing from the current models.


>> Reasoning encompasses two typical types: deductive reasoning and inductive reasoning.

I don't know about "typical" but every source that classifies reasoning (or, more appropriately, logical inference) as deductive and inductive, also includes the abductive category. This categorisation scheme goes all the way back to Charles Sanders Peirce:

'[Abduction] is logical inference( ... ) having a perfectly definite logical form. ( ... ) Namely, the hypothesis cannot be admitted, even as a hypothesis, unless it be supposed that it would account for the facts or some of them. The form of inference, therefore, is this:

The surprising fact, C, is observed;

But if A were true, C would be a matter of course,

Hence, there is reason to suspect that A is true.' (Collected Papers of Charles Sanders Peirce. Peirce, 1958)

(Quote copied from Abduction and Induction, Essays in their Relation and Integration, Peter Flach and Antonis Kakas eds., 2000)

Consider a logical theory, formed of rules in the form of implications like A -> B (premise A implies conclusion B). Abduction is the inference of the premises after observation of the conclusions, i.e. if A -> B AND B is observed, then A may be inferred.

That's a different inference mode than both deduction: inferring a conclusion from a premise, e.g. if A -> B AND A, then B may be inferred; and induction: inferring a rule from an observation, e.g. inferring A -> B after observing A and B. Note that this is a simplification: induction assumes a background theory of more rules A1 -> A2, .... An -> A that can be applied to the observation A and B to infer A -> B.

Anyway, abduction is generally associated with probabilistic reasoning, albeit informally so. That probably means that we should categorise LLM inference as abductive, since it guesses the next token according to a model of probabilities of token sequences. But that's just a, er, guess.
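If it helps to keep the three modes straight, here's a toy sketch over a single rule A -> B (purely illustrative; as noted above, real induction needs a background theory, which this crude version ignores):

  # Toy sketch of the three inference modes over a single rule A -> B.
  rule = ("rains", "street_wet")                 # A -> B

  def deduce(rule, observed):
      # deduction: from A -> B and A, conclude B
      return rule[1] if rule[0] in observed else None

  def abduce(rule, observed):
      # abduction: from A -> B and B, hypothesize A (a guess, not a proof)
      return rule[0] if rule[1] in observed else None

  def induce(examples):
      # induction (crudely): from repeatedly seeing A together with B, propose A -> B
      pairs = set(examples)
      return pairs.pop() if len(pairs) == 1 else None

  print(deduce(rule, {"rains"}))                 # street_wet
  print(abduce(rule, {"street_wet"}))            # rains (a hypothesis)
  print(induce([("rains", "street_wet")] * 3))   # ('rains', 'street_wet')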


Why does it have to be an either-or? Maybe LLMs are doing a bit of both, in a weird hybrid way; it is both doing a statistical calculation and yet the network is parameterized to do some very rudimentary (and nonhuman) reasoning computations. That's plausible to me, and explains why the reasoning is so hard to isolate... Just like how looking at a human brain it is hard to isolate the reasoning capacities.

The human mind wanders and takes time to dream autonomously. Perhaps the llm.c we need for the next breakthrough incorporates rounds of meditation in its training, in order to provoke more reasoning-like features in the next-gen LLM.

Asking them to make ASCII art is the final test, to me.

Neither. LLMs are just really, really good pattern matchers with an enormous set of patterns.

LLMs are incredible at mapping onto a space of already-seen things. When that space is unimaginably large, you can be fooled for a long time.

But, they clearly struggle with generalization and rule following. This failure to generalize (extrapolate, deduce, compute) is why we still can't fire all of our DBAs.

Has anyone encountered an LLM-based text-to-SQL engine that actually gets the job done? I think that's your best canary. I stopped caring somewhere around "transpose these 2 letters of the alphabet" not working consistently.



