Yup, missed that, thanks. Has anyone scored GPT-4 on the APPS benchmark?
I believe that multimodal GPT-4, integrated with ElevenLabs for speech output and Whisper for speech input, has a shot at passing that extended Turing test, if the test is designed fairly. The wording is still a bit ambiguous.
Also, assembling that particular scale model is probably challenging, but it's not really a general task. It could probably be achieved with simulated sensors and effectors, given a 3-4 month engineering effort applying advanced techniques to interpreting and acting on those kinds of instructions (maybe training an existing multimodal LLM and integrating it with some kind of RL-based robot controller?). It would be possible to integrate the controller with the LLM such that it could report its progress and identify objects during assembly.
So my takeaway is that with some serious attempts and an honest assessment of this bar, an AI would be able to pass it this year or next. I don't know how far GPT-4 is from the 75%/90% thresholds, but I doubt it is that far, so I expect that if not GPT-4, then GPT-4.5 or 5 could pass given some engineering effort aimed at the test competencies.
If people really are thinking 2030 or 2040 when they read "AGI" and respond to that poll (I suspect some didn't read the definition) then that would indicate that people are just ignorant of the reality of how far along we are, or in denial. Or a little of both.