Natural language instructions induce generalization in networks of neurons (nature.com)
186 points by birriel 9 months ago | 89 comments



>Tasks that are instructed using conditional clauses also require a simple form of deductive reasoning (if p then q else s)

> Our models offer several experimentally testable predictions outlining how linguistic information must be represented to facilitate flexible and general cognition in the human brain.

Aren't those claims falsified by more recent studies showing that, even in flies, preferred direction to a moving stimulus depends on the timing of spikes? And that fear conditioning, even in mice, uses dendritic compartmentalization?

Or that humans can even do xor with a single neuron.

If "must be represented" was "may be modeled by" I would have less of an issue and obviously spikey artificial NNs have had problems with riddled basins and make autograd problematic in general.

So ANNs need to be binary, and it is best to model biological neurons as such for practical models... but can someone please clarify why 'must' can apply when using what we now know are oversimplified artificial neuron models?

Here are a couple of recent papers, but I think dendritic compartmentalization and spike-timing sensitivity have been established for over a decade.

https://pubmed.ncbi.nlm.nih.gov/35701166/

https://www.sciencedirect.com/science/article/pii/S009286741...


There are also many issues with just interpreting the results of the work.

  Our best models can perform a previously unseen task with an average performance of 83% correct based solely on linguistic instructions (that is, **zero-shot learning**).
They used GPT-2 from HuggingFace. I'm unsure what data this model is trained on. If it is the original GPT-2 checkpoint then that data is unknown. I just refuse to let anyone casually claim "zero-shot" when the training data is unknown. GPT-2 was trained on 40GB of text data (which is A LOT! It includes 8 million documents and 45 million web pages). That may not be the crazy scale we see today, but even then the community was concerned about accurately stating what was in distribution and out of distribution. You can't know if you don't know what it was trained on AND how it was trained (since the mathematics can also put pressure on certain things that may not be realized at first).
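If the training corpus were available, even a crude overlap check would help ground a "zero-shot" claim. A minimal sketch of that kind of check (the corpus and instruction below are placeholders, and this is only a heuristic, not proof of anything either way):

  def ngram_overlap(instruction: str, corpus_docs: list[str], n: int = 5) -> float:
      """Fraction of the instruction's word n-grams found verbatim in any
      corpus document. A crude contamination heuristic, nothing more."""
      words = instruction.lower().split()
      grams = {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}
      if not grams:
          return 0.0
      docs = [d.lower() for d in corpus_docs]
      hits = sum(any(g in d for d in docs) for g in grams)
      return hits / len(grams)

  # Toy usage with made-up data.
  corpus = ["respond in the direction of the stimulus presented with greater strength"]
  print(ngram_overlap("respond in the direction of the stronger stimulus", corpus))  # 0.5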

In addition to this, their analysis looks to rely mainly on clustering techniques. CLIP itself is a clustering algorithm. ANNs frequently do clustering as well, though, you know, there's some black-box nature to them (but they're not entirely opaque either).

It is very hard to draw causal conclusions when you use either of these two things. Not to mention the fact that causality itself is difficult, given that different causal graphs can be observationally indistinguishable.


Yes, the word “represented” is too widely used and abused in neuroscience, to the point where a frog has “fly detector” neurons. Humberto Maturana pushed back against this pervasive idea. Chapter 4 of Terry Winograd and Fernando Flores's Understanding Computers and Cognition has a good overview of common presumptions.

Given that the CNS is a 700-million-year hack, there will be lots of odd tricks used to generate effective behaviors.


> Or that humans can even do xor with a single neuron.

That's news to me.

I'm not hugely surprised, given I've heard a biological neuron is supposed to be equivalent to a small ANN, but still, it's the first I've heard of that claim.



Thanks :)


It's a similar situation to terminology for the atom.

In Greek it originally meant "the smallest indivisible unit of matter".

Scientists then took the name and called the smallest units of the various elements (hydrogen, gold, etc.) atoms.

So, this is like when computing took the idea of a neuron as "the smallest indivisible unit of memory and calculation" and ran with it.

Fast forward to now, when we know that each "atom" has a bunch of smaller stuff internally, but by now it's too late to change the terminology.

And now we also know that a biological "neuron" is something more like an embedded CPU or FPGA in its own right, each with a bunch of computing and storage capability and modes.


There’s a long debate in neuroscience about whether information is encoded in timing of individual spikes or only their rates (where rate coding is a bit more similar to how ANNs work, but still different). It hasn’t been decided by any one paper, nor is it likely to be: it seems that different populations of neurons in different parts of the brain encode information through different means.
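A toy illustration of the distinction (the numbers are entirely made up): two spike trains with identical firing rates but different spike timing are indistinguishable to a rate decoder, while even a crude timing-sensitive readout separates them.

  import numpy as np

  # Two toy spike trains over a 100 ms window (1 ms bins). Both contain
  # exactly 5 spikes, so a pure rate code cannot tell them apart, but one
  # is an early burst and the other is spread evenly across the window.
  trains = np.zeros((2, 100), dtype=int)
  trains[0, [10, 12, 14, 16, 18]] = 1   # early burst
  trains[1, [10, 30, 50, 70, 90]] = 1   # evenly spread

  rate_readout = trains.sum(axis=1)            # [5 5] -> indistinguishable
  timing_readout = trains[:, :20].sum(axis=1)  # spikes in the first 20 ms: [5 1]
  print(rate_readout, timing_readout)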


Not either-or. It is both. Spike-rate variation is way too slow for some types of low-level compute. Spike timing is critical for actions as “simple” as throwing a fastball into the strike zone.


> Or that humans can even do xor with a single neuron.

having a single neuron that has learned xor != understanding xor

Function approximation is trivial; understanding what said functions can do and when to use them is much harder (though that is arguably still function approximation).


Well, XOR is not linearly separable, which makes it impossible for a single perceptron.
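To make that concrete, here's a quick numpy sketch (the grid search is coarse, so treat it as an illustration rather than a proof): no single linear threshold unit reproduces XOR, while one hidden layer does.

  import itertools
  import numpy as np

  X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
  y = np.array([0, 1, 1, 0])  # XOR truth table

  # Search a coarse grid of weights/biases for a single threshold unit.
  grid = np.linspace(-2, 2, 41)
  solvable = any(
      np.array_equal((X @ np.array([w1, w2]) + b > 0).astype(int), y)
      for w1, w2, b in itertools.product(grid, grid, grid)
  )
  print("single threshold unit solves XOR:", solvable)  # False

  # One hidden layer does it: XOR(a, b) = OR(a, b) AND NOT AND(a, b).
  h_or  = (X.sum(axis=1) - 0.5 > 0).astype(int)
  h_and = (X.sum(axis=1) - 1.5 > 0).astype(int)
  print("two-layer output:", (h_or - h_and > 0).astype(int))  # [0 1 1 0]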

> Our models by contrast make tractable predictions for what population and single-unit neural representations are required to support compositional generalization and can guide future experimental work examining the interplay of linguistic and sensorimotor skills in humans.

Do you see where that causes an issue with supervenience? Especially when mixed with STDP, which could change that further?

It is confusing the map with the territory, at least given the extreme strength of their claim.


I'm early, so to speak, in biochemistry, but could any of this relate to the level "below" neurons - to ion channels?


Birds inspired planes. Later, aerodynamics fed back into ornithology. I've been waiting for LLMs to be evaluated as a model of human thought. Complexity and scale have held neuroscience back. It's nowhere close to building a high-level brain model up from biological primitives. Like ornithology, it could use some feedback.

All arguments about AGI aside, a machine was built that writes like a human. Its design is very un-biological in places, so it's tempting to dismiss it. Why not see how deep the rabbit hole goes?

Like all conjectures, it just might surprise us. For one, the intuition that language is central to thought predates LLMs, but it's certainly consistent with them.


> a machine was built that writes like a human

But the real hero here is not the LLM, but the training set. It took ages to collect all the knowledge, ideas and methods we put in books. It cost a lot of human effort to provide the data. Without the data we would have nothing. Without GPT we could use RWKV, Mamba, S4, etc. and still get similar results. It's the data, not the model.

> the intuition that language is central to thought predates LLMs

Language carries AI and humans. The same distribution of language can be the software running in our brains and in LLMs. I think humans act like conditional language models with multimodality and actions. We use language to plan and solve our problems, work together and learn (a lot) from others.

Language itself is an evolutionary system and a self replicator. Its speed is much faster than biology. We've been on the language exponential for millennia, but just now hit the critical mass for LLMs to be possible.

It's not so important that GPT-4 is a 2T-weight model; what matters is that it was trained on 13T tokens of human experience and it now "writes like a human". Does that mean humans also learn the same skills GPT-4 has learned from its training set, mostly through language as well?


> But the real hero here is not the LLM, but the training set.

And in the case of windmills the hero is the wind. But the mill is still a fantastic achievement.


And if you rearrange the letters in “mill”? “I, LLM.”

qed


Correlation checks out.


But can mills easily move? No, people can, so people are still viable.


>But can mills easily move?

Yes, that's their whole thing: moving when the air blows.


But can they relocate?


Indeed, they are mostly parroting that knowledge with some reasoning from combinations of those patterns. That counts for something. It’s not what we do, though. Or not all we do.

Humans can just sit around aimlessly toying with stuff, reading books, etc. They’ll figure out some of these patterns on their own. Whereas we have to give these things a ton of highly curated, pre-processed data made by human minds of all kinds. Then, it’s usually 800GB-4TB for the good ones. They don’t appear to be in our league yet as learning machines.

We’ll be able to assess it better as multimodal models come online. We can train them like infants, then children, on books, random observations through cameras, TV, people reading to them, supervised feedback… all the stuff we do with humans. Then see if and how they match up in performance.


> Later, aerodynamics fed back into ornithology.

You mean to imply that birds have learned from jet fighter designs?

I fail to understand the point you're making.


I believe the point is that we understand birds' flight better by applying the principles we learnt designing and flying airplanes. Similarly, we can learn more about the human brain by applying things we learn from building ANNs back to the study of humans (anthropology in general, maybe neuroscience and psychology).


Concepts from the mechanics of flight (e.g. lift and drag) have helped ornithologists understand birds better. Birds themselves did not learn much.


He didn't say it fed back into birds, but into the study of birds.


Knowledge learned from airplane propeller development, as well as from jet plane wing aerodynamics, has had a great impact on the design of wind turbine blades, which are now affecting the evolution of birds.


Why assume this is sensible? Aerodynamics can help discover general principles that apply to both birds and planes or whatever else. I don't see how this holds for LLMs and brains. The similarity between LLMs and brains is superficial.

Besides, with human beings, we have a host of philosophical problems that undermine the neuroscientific presumption that a mechanistic and closed view of the brain can account for mental activity entirely, like the problem of intentionality.


>I don't see how this holds for LLMs and brains. The similarity between LLMs and brains is superficial.

It's really not any more superficial than planes and bird flight.


> We found that language scaffolds sensorimotor representations such that activity for interrelated tasks shares a common geometry with the semantic representations of instructions, allowing language to cue the proper composition of practiced skills in unseen settings.

Sapir-Whorf with the surprise comeback?


> comeback

Did it fall out of favour?


Strong Sapir-Whorf (linguistic determinism - language constrains thought) became pretty much seen as a joke by the 1980s. Linguistic relativism (weak Sapir-Whorf - language shapes thought) is still respectable (because, I mean, of course it does).

Actually, this research might just as well be evidence for linguistic universalism (Chomsky - language enables thought).

In general, linguistic philosophers have been coming out with either laughably obvious or utterly untestable hypotheses for a century, and it’s amusing to see how these AI studies stir up the hornets' nest.


Oftentimes I find myself understanding complex concepts before I can describe them, even internally. I am sure everyone has this, as I often read comments praising others' submissions for formulating their thoughts efficiently. So thoughts occur independent of language, but need language to be expressed and shared, even if through pictures and sounds.


Saying that thoughts occur independent of language is the same as saying sentient beings think. The question is: does the thought you have depend on the language?

I speak Tamil and English and can distinctly see how the language drives some of my understanding. If you have a language that has evolved to describe 3D space, would we understand spatial ideas better/faster?

If we are pattern-matching creatures, then the patterns are built over a period of time, and our earliest scaffolding for the patterns comes from our mother tongue (or the languages learnt in early childhood). Subsequent understanding depends on building and expanding on those patterns.


I grew up trilingual, and have noticed that I understand mechanical concepts better in one language, industrial concepts in another… but have mostly defaulted to English nowadays. I find learning new concepts easier by playing translation games: which language is the root for this word, and how does it mechanically relate to the concept?


As a programmer I have to make a sharp distinction between the "feeling of understanding" and understanding. The former can easily dissolve when you try to operationalize it, that is, to make something that works based on the feeling that you understand it, as opposed to producing a string of words based on that feeling.


>So thoughts occur independent of language

Independent of language as the conscious surface level mechanism, maybe - as in, they don't have to be in English, say. But independent of language altogether, including symbolic language encoded into brain structures, I wouldn't be so sure.

Language doesn't have to mean conscious internal monologue.


> Language doesn't have to mean conscious internal monologue.

Agreed, but boy do I wish we had better words (ha!) for this.

Calling everything "language" even if someone internally has a more visual or tactile or some other kind of "internal grammar" really gives an unfortunate tilt to casual conversation.

For most people, in everyday discussion, "language" means words/text. I wish we had some term for "structured knowledge" that did not rely on the words/text analogy, since it can leave different-minded people feeling a bit sidelined.


Sounds like a description of understanding a concept in some latent space while not having fully verbalized it yet. (:


Expression is not language. What you're having trouble doing is expressing what you understand.


But that would be an impossibility if understanding requires expressing it in language.


The thinking language doesn't have to be the same thing as the expression language.

It can still have a language form (manipulation of groups of symbolic structures, terms, and associations), but doesn't have to be English, or even at the conscious "internal monologue" level.


How can you know that it's symbolic if you have no conscious access to it? Processes happen in the brain; some result in explicit symbolic representation, others not so much. Bringing "language" into this does not achieve much beyond the fact that the language we use to communicate plays some role in internal monologue.


>How can you know that it's symbolic if you have no conscious access to it?

Well, it has to be able to refer to things (it can't include the actual objects), not to mention handle abstractions and concepts.

An animal can do that with direct response/manipulation/pointing etc.; humans must do it at a symbolic level to handle the world at the level we do.


>Strong Sapir-Whorf (linguistic determinism - language constrains thought) became pretty much seen as a joke by the 1980s

Was there any substantial empirical reason it was "seen as a joke", or just changing philosophical fashion?


In general, as I understand it, there was only ever evidence for a weak version, and many of the cited anthropological examples that make the case turn out to have dubious factual basis - the old ‘Eskimos have hundreds of words for snow’ and ‘there’s a tribe in Africa who have no word for numbers greater than three’ stuff, all filtered through layers of academic anecdote and institutional racism.


> institutional racism

are you absolutely certain that your thoughts are not being constrained by your language?


The distinction between language and "thought" to me is odd. Language and "thought" are the same thing. The mouth sounds or hand scribbles aren't the language, but expressions of it.


You don't need language to catch a ball, but clearly thinking is required to intercept its trajectory correctly.

Language is about communication.


There’s an argument that communication which is internal is still communication, and that a language of trajectories required for coordination is still linguistic in a meaningful sense. Most of the ways to differentiate thought from language are probably going to end up splitting hairs. It all comes back to Wittgenstein, and it’s arguable whether the POV is useful, but it’s certainly coherent and defensible.


I think this entire thread of discussion would benefit from remembering multimodal models exist. In other words, pictures are worth a thousand words and have their own place in thought. The existence of a way to translate between modalities doesn't make any of them superior overall--they each have their roles to play.


The idea that language is about thought is only referring to the kind of higher level thinking specific to humans. Any animal can catch a tennis ball, but only humans can construct and then execute complex plans of actions.

This line of thought stems mostly from two observations: one, that the vast majority of language you use is your internal monologue; and two, that this internal monologue would have been extremely helpful for the hominids that would have first developed it even while the rest of the population had not developed language (in contrast, if language is about communication, then it's only useful to a hominid if the whole population speaks language, but then it's hard for it to spread initially).


One issue here is semantics. The things that happen in our brains which we can put into words tend to be the things we categorize as ‘thoughts’. But there are things that happen in our brains which we struggle to connect to language too, and we might call those ‘feelings’ or ‘emotions’ or ‘instincts’ instead. So we’re trying to use language to think about how we think about language and I suspect this might be why that end of neurolinguistics falls off the deep end into philosophy.


>But there are things that happen in our brains which we struggle to connect to language too, and we might call those ‘feelings’ or ‘emotions’ or ‘instincts’ instead

Yes, and we could argue that those are not thoughts, while there still being a distinction between thought language (which could very well be subconscious) and inner monologue/spoken language.


I consider a sentence a formatted thought. That implies a thought exists before it is expressed in words. There are a ton of thoughts in my head which can't be transformed into any language I speak. I wish I could somehow acquire some proficiency in the thousands of other languages spoken by humans on this planet, just to prove their immense lack of features.

Also, our natural languages restrict information bandwidth to a few bytes per second. Imagine doing sports like tennis, chess or soccer at that speed...


It's as if language is itself the latent space for these psychophysical tasks, especially compositional instruction. Their description of it as a scaffolding also seems apt.


I've always assumed that language was required to give your brain the abstractions needed to reference things in the past, compared to your current perception (aka now), like an index. If you think about your earliest memories, they almost certainly came after language. I'd be interested to know if any of the documented 'wild child' cases (infants 'raised by wolves') ever delved into what the children remembered from before, after being taught language as adolescents.


There were efforts to teach them language as adolescents, but they didn't acquire it - as far as we know, it's not possible to acquire language if you don't do it as an infant.

This is similar to other brain functions that aren't present at birth and require stimulation, such as sight. That is, if your eyes are forced closed for the first few months of your life, you will never be able to see, even if later they are uncovered, and keep working perfectly. The brain functions responsible for interpreting visual signals can only develop if they get visual signals in a (quite short) developmental window - and we know this with quite a bit of certainty from quite cruel animal studies.

Language acquisition is not proven to be the same, as the required studies would be deeply unethical, but the few experiences with feral children are highly suggestive that the same applies.


Went down the rabbit hole and found the case of the Danish Bear Boy; he was reportedly taught to speak, but he claimed to have no memory of his time living with the bears. Fascinating stuff: https://books.google.com/books?id=k2MRHJuQiVEC&dq=hesse+wolf...


I just had the uncomfortable thought that it's possible a disease could kill off everyone older than 1, cutting the species off from language at some scale. For the few that survive, their world would be feral.


I hate the reductive nature of the concept of "latent spaces".

A good-enough formula for one task isn't a solution for every task. Yes, Newtonian mechanics works, but Einstein is a better reflection of reality.


I'm not sure I understand the analogy. The very idea of NNs is that they're not perfect; they're messy and not optimal, but very generalizable.


>> The very idea of NNs is that they're not perfect; they're messy and not optimal, but very generalizable.

Newton: Do you need more than that to describe the speed of a thrown baseball on a train? No. Do you need more than Newton to get to the moon? No. Is it going to be accurate at high speed in a large-scale system (anything traveling near c)? No, it fails spectacularly.

NNs are great at simulation, language, weather... But what people using them for weather seem to understand, and the ML folks (screaming about AI and AGI) don't, is that simulation is not a path to emulation. Lorenz showed that there were limits in weather; most other disciplines have embraced those limits.


The entire innovation (discovery?) of LLMs is that a good formula for the task of sequence completion turns out to also be a good formula for a wide range of AI tasks. That emergent property is why language models are called language models.


That's an inductive bias, not an emergent property. The inductive bias of transformers is that they're good at integrating global context from different parts of a sequence without a particular bias towards recent time steps or localized regularities. And that happens to be a good fit for many (but not all) real-world sequence learning tasks.

The "emergent property" aspect is when LLMs are good at a task at scale X*3 but were incompetent at scale X.


My point was that this particular inductive bias doesn't inherently beget a "language model".


It does beget a Transformer, which we choose to call a language model when it's applied to language data.


The usefulness is why the term is so widely familiar, but I think the term would have existed to describe the linguistic mapping even if these models hadn't proven to have direct problem-solving capabilities.


I'm not pretending to understand half the words uttered in this discussion, but I'm constantly reminded of how much it helps me to articulate things (explain them to others, write them down, etc.) in order to understand them. Maybe that thinking indeed happens almost entirely on a linguistic level and I'm not doing half as much other thinking (visualization, abstract logic, etc.) in the process as I thought. That feels weird.


Or is the real thinking sub-linguistic and “you” and those you talk to are the target audience of language? Sentences emerge from a pre-linguistic space we do not understand.


I do find it funny that this discussion thread has tried to represent language as a universal form of thought when it would be messy to encode the inner workings of an LLM (the weights/relationships) themselves as natural language.

You could sort of represent the deterministic contents of an LLM by compiling all the algorithms and training data in some form, or maybe a visual mosaic of the weights and tokens, or what have you...but that still doesn't really explain the outcome when a model is presented with novel strings. The patterns are emergent properties that converge on familiar language--they're something deeper than the individual words that result.


TL;DR: The authors embed task instructions in a vector space with a language model, and train a sensorimotor-controlling model on top to perform tasks given the instruction embeddings. The authors find that the models generalize to previously unseen tasks, specified in natural language. Moreover, the authors show that the hidden states learn to represent task subcomponents, which helps explain why the model is able to generalize.
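Rough shape of the setup as I read it; this is my own minimal sketch, not the authors' code, and every name and dimension here is invented: a frozen language-model embedding of the instruction conditions a recurrent sensorimotor network that maps stimuli to actions, and the hidden states are what get analyzed afterwards.

  import torch
  import torch.nn as nn

  class InstructedSensorimotorNet(nn.Module):
      def __init__(self, embed_dim=64, stim_dim=32, hidden_dim=256, act_dim=33):
          super().__init__()
          self.instr_proj = nn.Linear(embed_dim, hidden_dim)   # inject instruction embedding
          self.rnn = nn.GRU(stim_dim, hidden_dim, batch_first=True)
          self.readout = nn.Linear(hidden_dim, act_dim)

      def forward(self, instr_embedding, stimulus):
          # instr_embedding: (batch, embed_dim), e.g. a pooled language-model state
          # stimulus:        (batch, time, stim_dim)
          h0 = torch.tanh(self.instr_proj(instr_embedding)).unsqueeze(0)
          states, _ = self.rnn(stimulus, h0)
          return self.readout(states), states   # actions plus hidden states to probe

  # Toy usage with random tensors standing in for instruction embeddings and stimuli.
  model = InstructedSensorimotorNet()
  actions, hidden = model(torch.randn(8, 64), torch.randn(8, 50, 32))
  print(actions.shape, hidden.shape)   # (8, 50, 33) and (8, 50, 256)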


What is the language model pretrained on? Wouldn’t the far more robust LLM’s priors impact its ability to generalize to “unseen” tasks?


Here is a video of the author explaining the work: https://www.youtube.com/watch?v=miEwuSz7Pts


It's clear that both biological sentient beings and sentient beings made in factories in the future will essentially be two sides of the same coin, differing only in their physical composition. Humans today operate as biological AI, powered by cells, while those other sentient beings will operate on transistors. As we progress toward a future where both kinds of sentient being exhibit comparable intelligence, emotions, and learned experiences, the distinction between the two becomes increasingly blurred. It wouldn't be crazy to think that we'll program a person made out of transistors to go through the same life as a biological human. In such a scenario, why should we consider the sentient being made of cells inherently superior to its transistor-based counterpart?


Hm, no. To start, cells are able to replicate themselves, whereas most silicon used today is not even close to doing so.

The story you suggest seems to be built on a limited understanding of the processes involved. It's pretty hard to predict the future, especially given incorrect assumptions.


Compute is substrate independent.

We use transistors because they are wicked fast and efficient. But a 4090 built from metal balls and wood blocks would still be able to perform all the same calculations. Or a 4090 made by drawing X's and O's on a (really massive) piece of paper. Or one made by connecting a bunch of neurons together for that matter.

Saying cells can multiply doesn't really mean anything, unless it gives access to some higher form of compute that is outside the reach of Turing machines. Which it doesn't, because if it did, it would be supernatural.


Compute may be substrate independent, but that isn't the point that the person you're replying to is refuting.

> It's clear that both biological sentient beings and sentient being made in factories in the future will essentially be two sides of the same coin...

> why should we consider the sentient being made of cells inherently superior to its transistor-based counterpart?

The defining characteristic of life as we know it, and the unique quality of Earth compared to other planets as we have measured them, is the ability to self-replicate, and the presence of self-replicating matter.

Supposedly a single cell created all life on Earth and is therefore directly responsible for every facet of the nature of Earth today. Over hundreds of millions of years, the entire atmosphere and the entire surface of the Earth, as well as unmeasured parts of its subsurface, have been completely reshaped and defined by living cells, all descended from that original cell.

We're very likely able to create human-level or even superhuman-level intelligence that operates in a factory-produced robot body and can survive a human-level lifespan with some sort of maintenance comparable to that of a mechanic/surgeon, but if that being needs a factory to generate parts or progeny, then it is intrinsically lacking compared to self-replicating cellular life.

Self-replication is a powerful ability that is intrinsic to life and deeply related to intelligent systems. As fabulous as a 4090 may be to us at this point in time, it is incomparable to a group of entities such as humans that can create a civilization that can make a 4090, whatever material that 4090 may be made of.

A far more profoundly fascinating object is the factory that can make robots that can operate the factory that can make more of the robots, and better. And that's what multi-cellular life really is.

Until a 4090-type device can make more of itself and other by-products the way multi-cellular life does, the multi-cellular life that can make more of itself and by-products like a 4090-type device is intrinsically better.


Analogue computers aren't equivalent to Turing machines.


No physical computer is a perfect Turing machine, thanks to random noise.

Analogue computers can always implement a Turing machine to within the bounds of that noise (and that's how transistors, which are really analogue devices, get used for digital signal and information processing).

A computer that used infinite-precision real numbers would be more powerful than any Turing machine, however unlimited-precision real numbers in the physical universe are prohibited by the holographic principle and the Bekenstein bound so we can't have them.


Cell division is a solved problem. Install a new RAM module and Ctrl+C, Ctrl+V from backup, done.


That's "solved" in the way that viruses "solved" life; I think we might want higher standards than that.

Un/fortunately (depending on who you ask), there's also a lot of automation being developed for every stage of the industrial processes from "where do we even look for the right rocks to get out of the ground?" to "here's the RAM chip you wanted to stick in your socket".


My point is, it's irrelevant. Yes, we can't replicate the process of cell replication, but in this case it isn't necessary. We can achieve the end objective (which is to repair/extend) through other means; this is akin to just replacing organs with factory-built versions (memories pre-installed too).


proof left as exercise to the reader


I don’t think it’s a given that people do. Also what does that have to do with the article?


It’s not clear at all. So far, the only sentient, intelligent beings are the ones God made. His Word (Bible) said He made us for Him with predictions for the end times. AI takeover isn’t in there. Also, it implies (a) we won’t ever observe an act of evolution that produces anything like us and (b) any artificial intelligence will likewise require brilliant, intelligent designers to work and be maintained.

So far, all these AIs with a tiny fraction of human capabilities require brilliant designers in fine-tuned environments, like we did. We’ve also observed billions of human and non-human births with no new kinds of animals coming out of them. So, the Word of God is supported by billions of observations plus every AI ever designed, while evolutionary or singularity-type views are not. That’s despite so many comments in these discussions referencing evolutionary or mechanical explanations as if they’ve been proven instead of disproven by observations.

So, put your trust in Jesus Christ, receive the Spirit of God our Creator, and find out for yourself what’s special about us. Your life will be much more than the product of a biological, cellular process. God is powerful. You’ll see His work in a new light after you personally know Him.


I hate that we just turned out to be stochastic machines and not something more interesting.


This reminds me of how 'interesting' it is when people defend their use of language by downplaying the effects on other people. We really do have so many ripple effects from everything we broadcast to others, and tacking a negative, or a double/triple negative into a sentence doesn't change the fact that mentioning pink elephants will color an arbitrary portion of someone's day.

In other words, this is why tone absolutely matters, and I love how this awareness feeds into support for genuine expression. Sarcasm is chaotic with an unknown audience, who are less likely to recognize the intended subversion and toying with meaning, and instead take it in whatever direction they feel like, without the speaker being sure they are understood.


Delusions of grandeur


I think the title of this is wrong... it should say "Structured Language Patterns Facilitate Structured Generalization in Neural Networks".



