Hacker News

I'm simply applying "We use this test to measure intelligence in humans; what does this AI do?" We accepted a priori that the test measures intelligence, before the AI existed. Now the AI scores high on this measurement. There is nothing left but to conclude the AI is intelligent.

You’re not understanding the tests. They were designed for humans, and they were designed to be taken once. LLMs have been trained on these tests numerous times. LLMs are also not humans. We can’t meaningfully compare the two.

Just saying these tests were designed for humans doesn't mean anything. You have to specify why exactly it doesn't work for an AI.

Or rather, let me pose this question: what intellectual test do you envision that would prove an AI is intelligent and that any non-disabled human can easily pass? I'm willing to bet 500 USD that AI will pass that hurdle within the coming 10 years, if you are willing to put your money where your mouth is.

To test intelligence in humans or AI, one needs a question whose answer hasn't been memorized (or already answered by someone in the training set).

Indeed, you can see something like ChatGPT fall down by simply asking a modified form of a real IQ test question.

For example, ChatGPT correctly answers a sample Stanford-Binet question, "Counting from 1 to 100, how many 6s will you encounter?", but if you slightly modify it and ask how many 7s instead, it will only count 19.
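For anyone who wants to check the arithmetic, here is a quick sketch. It assumes the question means "how many times is the digit written", not "how many numbers contain the digit"; the function names are mine, not part of any test. Under that reading, 6 and 7 give the same count, and 19 is what you get if you count numbers containing a 7, since 77 has two 7s but is only one number.

```python
def count_digit(digit: int, start: int = 1, stop: int = 100) -> int:
    """Count how many times `digit` is written when listing start..stop inclusive."""
    target = str(digit)
    return sum(str(n).count(target) for n in range(start, stop + 1))


def count_numbers_containing(digit: int, start: int = 1, stop: int = 100) -> int:
    """Count how many numbers in start..stop inclusive contain `digit` at least once."""
    target = str(digit)
    return sum(1 for n in range(start, stop + 1) if target in str(n))


if __name__ == "__main__":
    print(count_digit(6))               # 20
    print(count_digit(7))               # 20
    print(count_numbers_containing(7))  # 19 (77 is counted once)
```

So a model that answers 20 for 6s and 19 for 7s is not applying one consistent interpretation to both questions.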

Having written this out, however, I've now invalidated the question, since these models are trained on web crawls.

Yes, there is. I could conclude that the test wasn't actually measuring intelligence, but just one component that, combined with other components, amounts to intelligence. That is, if the test was purported to measure intelligence on its own, that was a flawed assumption.

We used to measure intelligence with IQ tests; those are now known to be largely bunk. What's to say our other intelligence tests aren't similarly flawed?

Huh? IQ tests are by far the best measure of intelligence we have. Where did you read that they are bunk?

