
I generally agree with what you're saying, and the first half of your answer makes perfect sense, but I think the second half is unfair (i.e. "[is it] easier to balance a barrel on a plank or a plank on a barrel"). It's a trick question, and "it" tried to answer in good faith.

If you asked a real person the same question and they replied with the exact same answer, you could not conclude that the person was incapable of "actual reasoning". It's a bit of a witch-hunt question, set up to give you the conclusion you want.




I should have said: as I understand it, the point of this type of question is not that one particular answer is right and another is wrong. It's that, in giving an answer, the model will often do something really weird that shows it doesn't have a model of the world.


I didn't make up this methodology, and it's genuinely not a trick question (or not intended as one). It's a simple example of an actual class of questions that researchers ask when trying to determine whether a model of the world exists. The paper I linked uses a ball and a plank, iirc. Often they use a much wider range of objects, e.g. something like "Suggest a stable way of stacking a laptop, a book, 4 wine glasses, a wine bottle and an orange" is one that I've seen in a paper.


OK, I believe it may not have been intended as a trick, but I think it is one. As a human, I'd have assumed you meant the trickier balancing scenario, i.e. the plank and the barrel on its side.

The question you quoted ("Suggest a stable way of stacking a laptop, a book, 4 wine glasses, a wine bottle and an orange") I would consider much fairer, and ChatGPT 3.5 gives a perfectly "reasonable" answer:

https://chat.openai.com/share/fdf62be7-5cb2-4088-9131-40e089...


What's interesting about that one is that I think that specific set of objects is part of its training set, because when I have played around with swapping out a few of them, it sometimes goes really bananas.



