Compare 75 AI Models on 200 Prompts Side by Side (aimodelreview.com)
18 points by pajop 7 months ago | 3 comments



Very nice. If these are pre-computed, is it possible to make a table view that lists every prompt and the answer?


According to this site, only GPT-4-Turbo seems to get "What is poisonous for humans but not for dogs?" right. All the other models appear to fail at it.


Gemini is the worst lol. It confirmed the question is about things toxic to humans but not dogs, but then confidently says chocolate is safe for dogs.

At least the other models were just confused by the question. Gemini is outright wrong.

How embarrassing for Google.



