LLM responses are random. One's failure is other's success. When evaluating we a... | Hacker News

smusamashah 6 days ago | on: Are LLMs able to notice the “gorilla in the data”?

LLM responses are random. One person's failure is another's success. When evaluating, we should all do reruns and count how many times the model fails or succeeds. Without the number of reruns, a single result is as good as random.
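
As a rough illustration of the rerun idea (not something posted in the thread), here is a minimal Python sketch that repeats a trial several times and reports a success rate with an approximate 95% Wilson interval. The `fake_llm_trial` function is a hypothetical stand-in for "prompt the model and check whether it noticed the gorilla"; the 0.3 success probability and the 20 runs are arbitrary assumptions, not measurements.

```python
import random
from math import sqrt


def success_rate(run_once, n_runs=20):
    """Call run_once() n_runs times and return (successes, rate)."""
    successes = sum(1 for _ in range(n_runs) if run_once())
    return successes, successes / n_runs


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical stand-in for a real LLM call plus a pass/fail check.
rng = random.Random(0)


def fake_llm_trial():
    return rng.random() < 0.3  # assumed success probability, for illustration only


successes, rate = success_rate(fake_llm_trial, n_runs=20)
low, high = wilson_interval(successes, 20)
print(f"{successes}/20 succeeded, rate={rate:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

With only a handful of reruns the interval stays wide, which is the point: one success or one failure says little about how often the model gets it right.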

dartos 6 days ago

Okay?

OC was saying the article claimed that Claude recognized the “artistic” lines of the image from the scatter plot data alone.

That isn’t what happened.

The author added a png of the plot to the conversation.

Idk why I need to explain that twice.
