> GPT is basically the language portion of your brain. The language portion of your brain does not do logic. It does not do analyses.
I like this analogy as a simple explanation. To dig in though, do we have any reason to think we can’t teach a LLM better logic? It seems it should be trivial to generate formulaic structured examples that show various logical / arithmetic rules.
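Something like this toy sketch is what I have in mind; the format, operators, and ranges are just placeholders I made up, not any real training setup:

```python
# Toy sketch: generating synthetic arithmetic/logic examples as (prompt, answer) text pairs.
# The format, operators, and ranges are arbitrary assumptions for illustration only.
import random

def arithmetic_example(rng: random.Random) -> tuple[str, str]:
    a, b = rng.randint(0, 999), rng.randint(0, 999)
    op = rng.choice(["+", "-", "*"])
    result = {"+": a + b, "-": a - b, "*": a * b}[op]
    return (f"What is {a} {op} {b}?", str(result))

def logic_example(rng: random.Random) -> tuple[str, str]:
    p, q = rng.choice([True, False]), rng.choice([True, False])
    prompt = f"P is {p} and Q is {q}. What is (P AND Q) OR (NOT P)?"
    return (prompt, str((p and q) or (not p)))

rng = random.Random(0)
dataset = [arithmetic_example(rng) for _ in range(3)] + [logic_example(rng) for _ in range(3)]
for prompt, target in dataset:
    print(prompt, "->", target)
```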
Am I thinking about it right to envision that a deep NN has free parameters to create sub-modules like a “logic region of the brain” if needed to make more accurate inference?
"To dig in though, do we have any reason to think we can’t teach a LLM better logic?"
Well, one reason is that's not how our brains work. I won't claim our brains are the one and only way things can work; there's diversity even within human brains. But it's at least a bit of evidence that it's not preferable. If it were, it would be an easier design than what we actually have.
I also don't think AIs will be huge undifferentiated masses of numbers. I think they will have structure, again, just as brains do. And from that perspective, trying to get a language model to do logic would require a multiplicatively larger language model (minimum; I really want to say "exponentially" but I probably can't justify that... that said, O(n^2) for n = "amount of math understood" is probably not out of the range of possibility, and even that'd be a real kick in the teeth), whereas adjoining a dedicated logic module to your language model will be quite feasible.
AIs can't escape from basic systems engineering. Nothing in our universe works as just one big thing that does all the stuff. You can always find parts, even in biology. If anything, our discipline is the farthest exception in that we can build things in a fairly mathematical space that can end up doing all the things in one thing, and we consider that a serious pathology in a code base because it's still a bad idea even in programming.
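To make "adjoining a dedicated logic module" concrete, here's a toy sketch; the stub model and the routing rule are purely hypothetical, not a description of any existing system. Anything that parses as arithmetic goes to an exact evaluator, and everything else falls back to the language model:

```python
# Toy sketch of a "language model + dedicated logic module" composition.
# The stub model and routing rule are hypothetical illustrations, not any real system.
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def logic_module(expr: str) -> str:
    """Exact arithmetic over +, -, * via the AST, instead of token prediction."""
    def ev(node):
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        raise ValueError("unsupported expression")
    return str(ev(ast.parse(expr, mode="eval").body))

def language_model(prompt: str) -> str:
    """Stand-in for an LLM: handles the conversational part only."""
    return f"(fluent prose about: {prompt!r})"

def answer(prompt: str) -> str:
    try:
        return logic_module(prompt)    # route to the logic module if it parses
    except (SyntaxError, ValueError):
        return language_model(prompt)  # otherwise fall back to the language model

print(answer("123 * 456"))            # exact: 56088
print(answer("why is the sky blue"))
```

The point isn't the parser; it's that the composition boundary is explicit, so the logic part can stay small and exact while the language part stays big and fuzzy.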
This all matches my intuition as a non-practitioner of ML. However, isn’t a DNN free to implement its own structure?
Or is the point you’re making that full connectivity (even with ~0 weights for most connections) is prohibitively expensive, and that a system which prunes connectivity as the brain does will perform better? (It’s something like 1k dendrites per neuron max, right?)
The story of the recent AI explosion seems to be the surprising capability gains of naive “let back-prop figure out the structure,” but I can certainly buy that neuromorphic structure, or even just basic modular composition, can eventually do better.
(One thought I had a while ago is that a modular system would be much more amenable to hardware acceleration, and also to interpretability/safety inspection, being a potentially slower-changing system with a more stable “API” that other super-modules would consume.)
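Purely as a hypothetical illustration of what I mean by a stable module “API” (every name and field here is made up), something like:

```python
# Hypothetical sketch of a stable module "API": every module exposes the same small
# interface, which gives other super-modules (and interpretability/safety tooling)
# a fixed surface to consume. All names and fields are made up for illustration.
from typing import Protocol

class Module(Protocol):
    name: str
    def infer(self, query: str) -> str: ...
    def inspect(self) -> dict: ...  # hook for interpretability/safety inspection

class ToyLanguageModule:
    name = "language"
    def infer(self, query: str) -> str:
        return f"paraphrase of {query!r}"  # stand-in for a real language model
    def inspect(self) -> dict:
        return {"kind": "neural", "interface_version": 1}

def run(pipeline: list[Module], query: str) -> str:
    """A 'super-module' that only consumes the shared interface."""
    for module in pipeline:
        query = module.infer(query)
    return query

print(run([ToyLanguageModule()], "what is 2 + 2"))
```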
> do we have any reason to think we can’t teach a LLM better logic?
I'll go for a pragmatic approach: the problem is that there is no data to teach the models cause and effect.
If I say "I just cut the grass" a human would understand that there's a world where grass exists, it used to be long, and now it is shorter. LLMs don't have such a representation of the world. They could have it (and there's work on that) but the approach to modern NLP is "throw cheap data at it and see what sticks". And since nobody wants to hand-annotate massive amounts of data (not that there's an agreement on how you'd annotate it), here we are.
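For concreteness, here is one entirely made-up shape such an annotation could take: the utterance paired with an explicit before/after world state. The schema is an assumption for illustration, not an existing standard:

```python
# Entirely hypothetical sketch of annotating cause and effect: an utterance paired
# with an explicit before/after world state. The schema is made up for illustration,
# not an existing annotation standard.
from dataclasses import dataclass

@dataclass
class WorldState:
    grass_exists: bool
    grass_length_cm: float

@dataclass
class AnnotatedUtterance:
    text: str
    before: WorldState
    after: WorldState

example = AnnotatedUtterance(
    text="I just cut the grass",
    before=WorldState(grass_exists=True, grass_length_cm=12.0),
    after=WorldState(grass_exists=True, grass_length_cm=4.0),
)
print(example)
```

Even a toy schema like this makes it obvious why nobody wants to hand-annotate it at scale: every sentence implies a slightly different slice of world state.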
I call this the embodiment problem. The physical limitations of reality would quickly kill us if we didn't have a well-formed understanding of them. Meanwhile, AI is stuck in 'dream mode', much like when we're dreaming and can do practically anything without physical consequence.
To achieve full AI, I believe our AIs will eventually have to have a 'real world' set of interfaces to bounds-check information.