
Opus has been stuck on 3.0, so Sonnet 3.5 is better for most things as well as cheaper.



> Opus has been stuck on 3.0, so Sonnet 3.5 is better

So, for example, is Perplexity wrong here in implying that Opus is better than Sonnet?

https://i.imgur.com/N58I4PC.png


I think as of this announcement that is indeed outdated information.


So Opus, which costs $15.00/$75.00 per 1M tokens (input/output), is now worse than the model that costs $3.00/$15.00?

That's according to https://docs.anthropic.com/en/docs/about-claude/models, which lists "claude-3-5-sonnet-20241022" as the latest model (as of today).


Yes, you will find similar things at essentially all other model providers.

The older/bigger GPT4 runs at $30/$60 and performs about on par with GPT4o-mini, which costs only $0.15/$0.60.

If you are currently integrating AI models, or have been at any point over the past ~2 years, you should definitely keep up with model capability and pricing developments. If you stay on old models, you are certainly overpaying or leaving performance on the table. It's essentially a tax on agility.


> The older/bigger GPT4 runs at $30/$60 and performs about on par with GPT4o-mini which costs only $0.15/$0.60.

I don't think GPT-4o Mini has comparable performance to GPT-4 at all, where are you finding the benchmarks claiming this?

Everywhere I look says GPT-4 is more powerful, while GPT-4o Mini is more cost-effective if you're OK with worse performance.

Even OpenAI themselves say about GPT-4o Mini:

> Our affordable and intelligent small model for fast, lightweight tasks. GPT-4o mini is cheaper and more capable than GPT-3.5 Turbo.

If it were "on par" with GPT-4, they would surely say so.

> should definitely keep up with model capability/pricing development

Yeah, I mean that's why we're both here and why we're discussing this very topic, right? :D


Just switch gpt-4o-mini out for gpt-4o; the point stands. Across the board, these foundation model companies have comparable, if not more powerful, models that are cheaper than their older ones.

OpenAI's own words: "GPT-4o is our most advanced multimodal model that’s faster and cheaper than GPT-4 Turbo with stronger vision capabilities."

gpt-4o: $2.50 / 1M input tokens, $10.00 / 1M output tokens

gpt-4-turbo: $10.00 / 1M input tokens, $30.00 / 1M output tokens

gpt-4: $30.00 / 1M input tokens, $60.00 / 1M output tokens

https://openai.com/api/pricing/
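To make the gap concrete, here is a minimal sketch that turns the per-1M-token prices listed above into a per-request cost (model names and prices taken from the list; the example token counts are made up):

```python
# Per-1M-token prices in USD, (input, output), as quoted above.
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4-turbo": (10.00, 30.00),
    "gpt-4": (30.00, 60.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request at the listed rates."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# A hypothetical 10k-input / 2k-output request:
# gpt-4o costs $0.045, gpt-4 costs $0.42 -- over 9x more.
print(request_cost("gpt-4o", 10_000, 2_000))
print(request_cost("gpt-4", 10_000, 2_000))
```

At any real volume that ratio dominates, which is the "tax on agility" point above.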


I found that gpt-4-turbo beat gpt-4o pretty consistently for coding tasks, but claude-3.5-sonnet beat both of them, so it's what I have been using most of the time. gpt-4o-mini is adequate for summarizing text.


> Yeah, I mean that's why we're both here and why we're discussing this very topic, right? :D

That wasn't specifically directed at "you", but more as a plea to everyone reading that comment ;)

I looked at a few benchmarks comparing the two, which, as with Opus 3 vs Sonnet 3.5, is hard, since the benchmarks the wider community is interested in shift over time. I think this page[0] provides the best overview I can link to.

Yes, GPT4 is better on the MMLU benchmark, but on all other benchmarks and the LMSys Chatbot Arena scores[1], GPT4o-mini comes out ahead. Overall, the margin between them is so thin that it falls under my definition of "on par". I think OpenAI is generally a bit more conservative with the messaging here (which is understandable), and they only advertise a model as "more capable" if it beats the other one in every benchmark they track, which AFAIK is the case for 4o mini vs 3.5 Turbo.

[0]: https://context.ai/compare/gpt-4o-mini/gpt-4

[1]: https://artificialanalysis.ai/models?models_selected=gpt-4o-...


Basically yeah



