Can LLMs compute any computable function? I thought that an LLM can approximate any computable function, if the function is within the distribution it is trained on. I think it's jolly interesting to think about different axiomatizations in this context.
Also, we know that LLMs can't do a few things: arithmetic, inference, and planning are among them. They look like they can because they retrieve discussions from the internet that contain the problems, but when they are tested out of distribution they suddenly fail. However, some other NNs can do these things because they have the architecture, infrastructure, and training that enables it.
There is a question for some of these as to whether we want to make NNs do these tasks or just provide calculators, like we do for grade-school students, but on the other hand something like AlphaZero looks like it could find new ways of doing some problems in planning. The challenge is to find architectures that integrate the different capabilities we can implement in a useful and synergistic way. Lots of people have drawn diagrams about how this can be done, then presented them with lots of hand-waving at big conferences. What I love is that John Laird has been building this sort of thing for, like, forty years, and is roundly ignored by NN people for some reason.
Maybe because he keeps saying it's really hard and then producing lots of reasons to believe him?
Many of the "specialist" parts of the brain are still made from cortical columns, though. Also, they are in many cases partly interchangeable, with some reduction in efficiency.
Transformers may be like that, in that they can do generalized learning from different types of input, with only minor modifications needed to optimize for different input (or output) modes.
Cortical columns are one part of much more complex systems of neural compute that at a minimum include recursive connections with the thalamus, hypothalamus, midbrain, brainstem nuclei, cerebellum, basal forebrain, and the list goes on.
So it really does look like a society of networks, all working in functional synchrony (parasynchrony might be a better word), with some forms of “consciousness” updated in time slabs of about 200-300 milliseconds.
LLMs are probably equivalent now to Wernicke’s and Broca’s areas, but much more is needed “on top” and “on bottom”: motivation, affect, short- and long-term memory, plasticity of synaptic weighting and dynamics, and perhaps most important, a self-steering attentional supervisor or conductor. That attentional driver system is what we probably mean by consciousness.
> That attentional driver system is what we probably mean by consciousness.
You may know much more about this than me, but how sure are you about this? To me it seems like a better fit that the "self-steering attentional supervisor" is associated with what we mentally model (and oversimplify) as "free will", while "consciousness" seems to be downstream from the attention itself, and has more to do with organizing and rationalizing experiences than with directly controlling behavior.
This processed information then seems to become ONE input to the executive function in following cycles, but with a lag of at least 1 second, and often much more.
> one part of much more complex systems of neural compute
As for your main objection, you're obviously right. But I wonder how much of the computation that is relevant for intelligence is actually in those other areas. It seems to me that recent developments indicate that Transformer-type models are able to self-organize into several different types of microstructures, even within present-day transformer-based models [1].
Not sure at all. There are also some ambiguities in the definitions. Above I mean “consciousness” of the type many would be willing to assume operates in a cat, dog, or mouse: attentional and, occasionally, also intentional.
I agree that this is downstream of pure attention. Attention needs to be steered and modulated. The combination of the two levels working together recursively is what I had in mind.
“Free will” gets us into more than that. I’ve been reading Daniel Dennett on levels of “intention” this week. This higher domain of an intentional stance (nice Wiki article) might get labeled “self-consciousness”.
Most humans seem to accept this as a cognitive and mainly linguistic domain (the internal discussions we have with ourselves), although I think we also accept that there are major non-linguistic drivers. Language is an amazingly powerful tool for recursive attentional and semantic control.
My take on "free will" is definitely partly based on Dennett's work.
As for "consciousness", it seems to me that most if not all actions we take are decided BEFORE they hit our consciousness. For actions that are not executed immediately, the processing that we experience as "consciousness" may then raise some warning flags if the action our pre-conscious mind has decided on is likely to cause some bad consequences. This MAY cause the decision-making part (executive function) of the brain to modify the decision, but not because the consciousness can override the decision directly.
Instead, when this happens, it seems that our consciousness extrapolates our story into the future in a way that creates fear, desire, or similar more primal motivations that have more direct influence over the executive function.
One can test this by, for instance, standing near the top of a cliff (don't do this if suicidal): Try to imagine that you have decided to jump off the cliff. Now imagine the fall from the cliff and you hitting the rocks below. Even if (and maybe especially if) you managed to convince yourself that you were going to jump, this is likely to trigger a fear response strong enough to ensure you will not jump (unless you're truly suicidal).
Or, for a less synthetic situation: let's say you're a married man, but in a situation where you have an opportunity to have sex with a beautiful woman. The executive part of the brain may already have decided that you will. But if your consciousness predicts that your wife is likely to find out and starts to spin a narrative about divorce, losing access to your children and so on, this MAY cause your executive function to alter the decision.
Often in situations like this, though, people tend to proceed with what the preconscious executive function had already decided. Afterwards, they may have some mental crisis because they ended up doing something their consciousness seemed to protest against. They may feel they did it against their own will.
This is why I think that the executive function, even "free will", is not "inside" consciousness, but is separate from it. And while it may be influenced by the narratives that our consciousness spins up, it also takes many other inputs that we may or may not be conscious of.
The reason I still call this "free" will is based on Dennett's model, though. And in fact, "free" doesn't mean what we tend to think it means. Rather, the "free" part means that there is a degree of freedom (like in a vector space) that is sensitive to the kinds of incentives the people around you may provide for your actions.
For instance stealing something can be seen as a "free will" decision if you would NOT do it if you knew with 100% certainty that you would be caught and punished for it. In other words, "free will" actions are those that, ironically, other people can influence to the point where they can almost force you to take them, by providing strong enough incentives.
Afaik some are similar, yes. But we also have different types of neurons etc. Maybe we'll get there with a generalist approach, but imho the first step is a patchwork of specialists.
In a single run, obviously not any, because its context window is very limited. With a loop and access to an "API" (or a willing conversation partner agreeing to act as one) to operate a Turing tape mechanism? Then it becomes a question of your ability to coax it into complying. It trivially has the ability to carry out every step, and your main challenge becomes getting it to stick to it over and over.
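For what it's worth, here's a minimal sketch of that loop in Python, under my own assumptions about how you'd wire it up: `ask_llm` is a hypothetical stand-in for a call to whatever chat model you use (stubbed here with a fixed transition table so the scaffolding runs on its own), and a plain dict plays the role of the external tape "API". The model only has to act as the transition function; the tape, the head, and the step limit all live outside it.

```python
TRANSITIONS = {  # (state, symbol) -> (write, move, next_state); toy machine that inverts bits
    ("invert", "0"): ("1", "R", "invert"),
    ("invert", "1"): ("0", "R", "invert"),
    ("invert", "_"): ("_", "R", "HALT"),
}

def ask_llm(state, symbol):
    # Hypothetical stand-in: the real version would prompt the model with something like
    # "State: invert, symbol: 0. Reply with write,move,next_state." and return its reply.
    # Stubbed with the table above so the scaffolding is runnable by itself.
    return ",".join(TRANSITIONS[(state, symbol)])

def run(tape_str, state="invert", max_steps=1000):
    tape = dict(enumerate(tape_str))  # the external "tape API": all memory lives out here
    head = 0
    for _ in range(max_steps):
        if state == "HALT":
            break
        write, move, state = ask_llm(state, tape.get(head, "_")).split(",")
        tape[head] = write
        head += 1 if move == "R" else -1
    return "".join(tape[i] for i in sorted(tape))

print(run("0110"))  # -> 1001_ (bits inverted, then it hits a blank and halts)
```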
One step "up", you can trivially get GPT-4 to symbolically execute fairly complex runs of instructions in languages it cannot have seen before if you specify a grammar and then give it a program, with the only real limitation again being getting it to continue to adhere to the instructions for long enough before it starts wanting to take shortcuts.
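To make that concrete, here is the sort of thing I mean, using a completely made-up toy stack language (the language and its opcodes are invented for this example): put the grammar and a program in the prompt, ask for a step-by-step trace, and check the trace against a small reference interpreter like this one.

```python
# Reference interpreter for the invented toy language: "push N", "dup", "add", "mul",
# instructions separated by ";". Used only to check the model's symbolic trace.
def run_toy(program):
    stack = []
    for instr in program.split(";"):
        op, *arg = instr.split()
        if op == "push":
            stack.append(int(arg[0]))
        elif op == "dup":
            stack.append(stack[-1])
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "mul":
            stack.append(stack.pop() * stack.pop())
    return stack

print(run_toy("push 3; push 4; add; dup; mul"))  # [49], i.e. (3 + 4)^2
```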
In other words: it can compute any computable function about as well as a reasonably easily distractible/bored human.
What exactly is it you think it can't do? It can explain and apply a number of methods for calculating sin. For sin it knows the symmetry and periodicity, and so will treat requests for sin of larger values accordingly. Convincing it to keep writing out the numbers for an arbitrarily large number of values without emitting "... continue like this" or a similar shortcut (one that a human told to do annoyingly pointless, repetitive work would also be prone to prefer) is indeed tricky, but there's nothing to suggest it can't do it.
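For reference, the kind of method I mean is just the textbook one: use periodicity and symmetry to fold the argument into a small range, then evaluate a Taylor series. A quick sketch (nothing model-specific, just standard math):

```python
import math  # pi for range reduction; math.sin only as a cross-check

def sin_taylor(x, terms=10):
    # Periodicity: fold x into [-pi, pi]
    x = math.fmod(x, 2 * math.pi)
    if x > math.pi:
        x -= 2 * math.pi
    elif x < -math.pi:
        x += 2 * math.pi
    # Symmetry: sin(pi - x) = sin(x) folds the argument into [-pi/2, pi/2]
    if x > math.pi / 2:
        x = math.pi - x
    elif x < -math.pi / 2:
        x = -math.pi - x
    # Taylor series around 0: x - x^3/3! + x^5/5! - ...
    result, term = 0.0, x
    for n in range(terms):
        result += term
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return result

print(sin_taylor(1000.0), math.sin(1000.0))  # the two agree closely
```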