>Tasks that are instructed using conditional clauses also require a simple form of deductive reasoning (if p then q else s)
> Our models offer several experimentally testable predictions outlining how linguistic information must be represented to facilitate flexible and general cognition in the human brain.
Aren't those claims falsified by more recent studies showing that even in flies, the preferred direction to a moving stimulus depends on the timing of spikes, and that fear conditioning even in mice uses dendritic compartmentalization?
Or that humans can even do xor with a single neuron.
If "must be represented" were "may be modeled by" I would have less of an issue, and obviously spiking artificial NNs have had problems with riddled basins and make autograd problematic in general.
So ANNs need to be binary and it is best to model biological neurons as such for practical models... but can someone please clarify why 'must' can apply when using what we now know is an oversimplified artificial neuron model?
Here are a couple of recent papers, but I think dendritic compartmentalization and spike timing sensitivity have been established for over a decade.
There are also many issues with just interpreting the results of the work.
> Our best models can perform a previously unseen task with an average performance of 83% correct based solely on linguistic instructions (that is, **zero-shot learning**).
They used GPT-2 from HuggingFace. I'm unsure what data this model was trained on. If it is the original GPT-2 checkpoint, then that data is unknown. I just refuse to let anyone casually claim "zero-shot" when the training data is unknown. GPT-2 was trained on 40GB of text data (which is A LOT! That is about 8 million documents, scraped from 45 million outbound Reddit links). This may not be the crazy sizes we see today, but even then the community was concerned about accurately stating what was in distribution and out of distribution. You can't know if you don't know what it was trained on AND how it was trained (since the mathematics can also put pressure on certain things that may not be realized at first).
In addition to this, their efforts look to be mainly using clustering techniques. CLIP itself is a clustering algorithm. ANNs frequently do clustering as well, but you know, there's some black box nature to them (but not entirely opaque either).
It is very hard to draw causal conclusions when you use either of these two things. Not to mention the fact that causality itself is difficult given that different graphs can be indistinguishable.
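To make the "indistinguishable graphs" point concrete, here is a minimal sketch, assuming a two-variable linear-Gaussian model (my choice for illustration, nothing from the paper). The function names and the parameters `a` and `noise_var` are hypothetical. It checks that a model where X causes Y and a model where Y causes X imply the same covariance structure, which for zero-mean Gaussians means observational data alone cannot tell them apart.

```python
import math

def forward_cov(a, noise_var):
    """Model X -> Y: X ~ N(0, 1), Y = a*X + e, e ~ N(0, noise_var).
    Returns the implied (var_x, cov_xy, var_y)."""
    return (1.0, a, a * a + noise_var)

def reverse_cov(a, noise_var):
    """Model Y -> X, with parameters chosen to mimic the forward model:
    Y ~ N(0, var_y), X = b*Y + e2.
    Returns the implied (var_x, cov_xy, var_y)."""
    var_y = a * a + noise_var
    b = a / var_y                       # regression slope of X on Y
    resid_var = 1.0 - a * a / var_y     # variance of e2
    var_x = b * b * var_y + resid_var
    cov_xy = b * var_y
    return (var_x, cov_xy, var_y)

# Illustrative values only (assumed, not taken from any study).
fwd = forward_cov(a=0.8, noise_var=0.36)
rev = reverse_cov(a=0.8, noise_var=0.36)
same_distribution = all(math.isclose(f, r) for f, r in zip(fwd, rev))
```

Both causal directions yield the same joint distribution here, so any causal claim has to come from interventions or extra assumptions rather than from the fit itself.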
Yes, the word “represented” is too widely used and abused in neuroscience, to the point where a frog has “fly detector” neurons. Humberto Maturana pushed back against this pervasive idea. Chapter 4 of Terry Winograd and Fernando Flores's Understanding Computers and Cognition has a good overview of common presumptions.
Given that the CNS is a 700-million-year-old hack, there will be lots of odd tricks used to generate effective behaviors.
> Or that humans can even do xor with a single neuron.
That's news to me.
I'm not hugely surprised, given I've heard a biological neuron is supposed to be equivalent to a small ANN, but still, it's the first I've heard of that claim.
It's a similar situation to terminology for the atom.
In Greek it originally meant "the smallest indivisible unit of matter".
Scientists then took the name and named various elements (hydrogen, gold, etc) as various atoms.
So, this is like when computing took the idea of a neuron as "the smallest indivisible unit of memory and calculation" and ran with it.
Fast forward to now, when we know that each "atom" has a bunch of smaller stuff internally, but by now it's too late to change the terminology.
And now we also know that a biological "neuron" is something more like an embedded CPU or FPGA in its own right, each with a bunch of computing and storage capability and modes.
There’s a long debate in neuroscience about whether information is encoded in timing of individual spikes or only their rates (where rate coding is a bit more similar to how ANNs work, but still different). It hasn’t been decided by any one paper, nor is it likely to be: it seems that different populations of neurons in different parts of the brain encode information through different means.
Not either-or. It is both. Spike rate variation is way too slow for some types of low-level compute. Spike timing is critical for actions as “simple” as throwing a fastball into the strike zone.
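A toy illustration of the rate-versus-timing distinction, as a rough sketch only: the spike times below are made up, and `firing_rate_hz` and `first_spike_latency_ms` are hypothetical helper names, not anything from the papers discussed. The two trains have the same spike count in the window, so a pure rate readout sees them as identical, while a timing readout separates them.

```python
def firing_rate_hz(spikes_ms, window_ms=(0, 100)):
    """Rate code: spikes per second within the window, ignoring when they occur."""
    start, end = window_ms
    count = sum(start <= t < end for t in spikes_ms)
    return 1000 * count / (end - start)

def first_spike_latency_ms(spikes_ms):
    """Timing code: one simple timing feature, the time of the first spike."""
    return min(spikes_ms)

# Two made-up spike trains (times in ms): same count, shifted by 10 ms.
train_a = [5, 25, 45, 65, 85]
train_b = [15, 35, 55, 75, 95]

same_rate = firing_rate_hz(train_a) == firing_rate_hz(train_b)
same_timing = first_spike_latency_ms(train_a) == first_spike_latency_ms(train_b)
```

Here `same_rate` is true and `same_timing` is false: a downstream reader that only counts spikes loses the 10 ms shift, which is the kind of information a timing-sensitive circuit could use.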
> Or that humans can even do xor with a single neuron.
having a single neuron that has learned xor != understanding xor
Function approximation is trivial, understanding of what said functions can do and when to use them is much harder (though is arguably still function approximation)
Well, XOR is not linearly separable, so it is impossible with a single perceptron.
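For anyone who wants to poke at this, here is a small sketch contrasting the two kinds of unit. Assumptions are mine: the coarse weight grid is only an illustration (the real argument is the linear-separability proof, not the search), and `bump_unit` is a hypothetical non-monotonic activation that fires only for mid-range input. It is not the model from the linked papers, just the simplest thing that shows why a single unit with a non-monotonic response can do XOR.

```python
from itertools import product

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def threshold_unit(w1, w2, b):
    """Classic perceptron: fires iff the weighted sum plus bias is positive."""
    return lambda x1, x2: int(w1 * x1 + w2 * x2 + b > 0)

def bump_unit(low=0.5, high=1.5):
    """Hypothetical non-monotonic unit: fires only when total drive is mid-range."""
    return lambda x1, x2: int(low < x1 + x2 < high)

def computes_xor(unit):
    """True if the unit reproduces the full XOR truth table."""
    return all(unit(*inputs) == target for inputs, target in XOR.items())

# Coarse grid over weights and bias in [-2, 2]; illustrative, not a proof.
grid = [i / 4 for i in range(-8, 9)]
perceptron_hits = sum(
    computes_xor(threshold_unit(w1, w2, b))
    for w1, w2, b in product(grid, repeat=3)
)
bump_works = computes_xor(bump_unit())
```

No setting on the grid makes the threshold unit compute XOR, while the single bump unit does, because its output drops again when both inputs are active.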
> Our models by contrast make tractable predictions for what population and single-unit neural representations are required to support compositional generalization and can guide future experimental work examining the interplay of linguistic and sensorimotor skills in humans.
Do you see where that causes an issue with supervenience? Especially when mixed with STDP which could change that more?
It is confusing the map with the territory. At least with the extreme strength of their claim.
Here are a couple of recent papers, but I think dendritic compartmentalization and spike timing sensitivity have been established for over a decade.
https://pubmed.ncbi.nlm.nih.gov/35701166/
https://www.sciencedirect.com/science/article/pii/S009286741...