Hacker News new | past | comments | ask | show | jobs | submit login

I just tried asking ChatGPT to rate various BBC and NYT articles out of 10, and it consistently gave all of them a 7 or 8. Then I tried today's featured Wikipedia article, which got a 7, which it revised to an 8 after regenerating the respose. Then I tried the same but with BuzzFeeds hilariously shallow AI-generated travel articles[1] and it also gave those 7 or 8 every time. Then I asked ChatGPT to write a review of the iPhone 20, fed it back, and it gave itself a 7.5 out of 10.

I personally give this experiment a 7, maybe 8 out of 10.

[1] https://www.buzzfeed.com/astoldtobuzzy




ChatGPT has a giant system prompt that you have no control over. Try using Llama and create a system prompt with clear instructions and examples. If you were going to use a model in a production system you would also want to either fine tune it or train a BERT-like model as a classifier that just outputs a score. Maybe even more than one for ranking along different dimensions.


Yes, do not rely on it for assessments. It generates ratings of 7 or 8 because those ratings are statistically common in its training data.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: