Agreed on the first part, but as for LLMs not having correlated capabilities, I think we've seen that they do. As the GPTs progress, mainly through model size, their scores across a battery of tests go up, e.g. OpenAI's paper for GPT-4, which shows a leap in performance across a couple dozen tests.
Also found this, a Mensa test run across the top dozen frontier models.
https://www.maximumtruth.org/p/ais-ranked-by-iq-ai-passes-10...
That does seem to me to be demonstrating a global type of reasoning or generalization.
Also see the author's note that, at least with Claude, new versions seem to be released about every 20 IQ points.