
That's not how Elo scores work. A 120-point Elo difference corresponds to an expected win rate of about 66%, i.e. roughly 2:1 odds. That means GPT4-Turbo wins about twice as often as Vicuna. And even that isn't a fair comparison, since Elo isn't linear: it gets more and more challenging to gain Elo the higher up you go.
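For anyone who wants to check the arithmetic, here is a minimal sketch assuming the standard logistic Elo formula with a 400-point scale (the usual chess convention; the leaderboard in question may fit its ratings differently). The function name `expected_score` is mine, not from any library.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score of the higher-rated side under the standard Elo model.

    Assumes the conventional base-10 logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


# A 120-point gap: win probability and the equivalent odds.
p = expected_score(120)
odds = p / (1.0 - p)
print(f"win rate: {p:.3f}, odds: {odds:.2f}:1")  # win rate: 0.666, odds: 2.00:1
```

Under this model the odds are 10^(diff/400), so doubling the gap to 240 points squares the odds (about 4:1) rather than doubling them.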


