> Which ones are you looking at? Since the benchmark comparison in the blogpost itself doesn't include Opus at all.

I manually compared it with the values from the benchmarks they published when they originally announced the Claude 3 model family[0].

Not every row has a 1:1 counterpart in the current benchmarks, but I think it paints a good enough picture.

[0]: https://www.anthropic.com/news/claude-3-family



