> how do you argue that these models are not able to reason?
They just don't have the right architecture to support it.
An LLM is just a fixed-size stack of N transformer layers, and has no working memory other than the temporary activations flowing between those layers. Exactly N steps of "logic" (embedding transformation) go into each output token, no more and no less, regardless of how hard the problem is.
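To make the fixed-depth point concrete, here's a toy sketch (not a real LLM - the layer count, sizes, and tanh stand-in are all made up) showing that every output token gets exactly the same number of layer applications, with no state other than the activation vector passed along the stack:

```python
import numpy as np

N_LAYERS = 4   # fixed when the network is built; a real model might use 32-100+
D_MODEL = 8    # embedding width (illustrative)

rng = np.random.default_rng(0)
weights = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.1 for _ in range(N_LAYERS)]

def forward(x):
    """One forward pass: a fixed-depth pipeline, the same depth for every token."""
    h = x                     # the only working "memory"
    for W in weights:         # always exactly N_LAYERS steps
        h = np.tanh(h @ W)    # stand-in for attention + MLP; old state is discarded
    return h                  # the next-token prediction is read off this

token_embedding = rng.standard_normal(D_MODEL)
print(forward(token_embedding))  # loop count never depends on problem difficulty
```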
You can use prompts like "think step by step" to work around these limitations, so that a complex problem can (with good planning by the model) be broken down into M steps of N layers each, with the model's own output from early steps acting as pseudo-memory for later steps, but this only gets you so far. It works around the fixed N layers and lack of memory, but it creates a critical dependency on the model's ability to plan and to maintain coherence while manipulating long contexts - both observed weaknesses of LLMs.
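A rough sketch of that workaround, with a hypothetical generate_step() standing in for one full forward pass of a model: earlier output is simply appended back into the context, and that appended text is the only "memory" later passes get.

```python
def generate_step(context: str) -> str:
    # Placeholder: a real model would run its N layers over `context`
    # and emit the next chunk of reasoning.
    return f"[step derived from {len(context)} chars of context]"

def chain_of_thought(problem: str, m_steps: int) -> str:
    context = problem + "\nLet's think step by step.\n"
    for _ in range(m_steps):           # M passes of N layers each
        step = generate_step(context)  # each pass is still only N layers deep
        context += step + "\n"         # earlier output becomes pseudo-memory
    return context                     # correctness hinges on planning + long-context coherence

print(chain_of_thought("What is 17 * 24?", m_steps=3))
```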
Human reasoning/planning isn't a linear process of N steps - in the general case it's more like an iterative/explorative process of what-if prediction/deduction, backtracking, etc., requiring working memory and sustained focus on the task. There's a lot more to the architecture of our brain than a stack of layers - a transformer is just not up to the job, nor was it built for it.
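For contrast, the kind of process I mean looks more like a backtracking search: explicit working memory (here, the recursion stack), as many steps as the problem demands, and abandoned branches revisited. A toy example only - it says nothing about how brains actually implement this:

```python
def backtracking_search(state, is_goal, expand, depth=0, max_depth=6):
    """Depth-first what-if exploration with backtracking."""
    if is_goal(state):
        return [state]
    if depth >= max_depth:
        return None
    for candidate in expand(state):      # "what if" exploration
        path = backtracking_search(candidate, is_goal, expand, depth + 1, max_depth)
        if path is not None:
            return [state] + path        # keep this branch
        # else: backtrack and try the next candidate
    return None

# Toy usage: reach 10 from 1 by doubling or adding 3.
print(backtracking_search(1, lambda s: s == 10, lambda s: [s * 2, s + 3]))
```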