I manually compared it with the values from the benchmarks they published when they originally announced the Claude 3 model family[0].
Not every row has a 1:1 counterpart in the current benchmarks, but I think it paints a good enough picture.
[0]: https://www.anthropic.com/news/claude-3-family