
Exactly. To create larger and better-performing models, there is no lack of ideas or techniques. The real problem is getting the GPUs for it.



I disagree, mainly because Google, AWS, Apple, etc. all have similar or even greater access to GPU compute and funding for it, and Google in particular has also been one of the main research contributors, yet they still struggle to touch GPT-4's performance in practice.

If it were as simple as dropping tens of millions on compute, they could do that, yet Google's Bard/Gemini have been a year behind GPT-4's performance.

That said, I do agree that it's a moat for startups like Stability/Mistral, etc. They also have access to money/compute, albeit a lot less, and you can see this in their research: they've been focused on methods to lower training/inference costs.

*I'm measuring performance by Chatbot Arena's Elo rankings and r/LocalLLaMA
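
For anyone unfamiliar: the arena pits two anonymous models against each other, collects human votes on which answered better, and updates ratings with the standard Elo rule. A minimal sketch of that update in Python (the K-factor of 4 and the starting rating of 1000 are assumptions for illustration, not Chatbot Arena's documented parameters):

    # Standard Elo update from one pairwise vote.
    # K=4 is a hypothetical choice, not necessarily the arena's setting.
    def elo_update(r_a, r_b, score_a, k=4.0):
        # score_a: 1.0 if model A wins, 0.0 if it loses, 0.5 for a tie
        expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
        delta = k * (score_a - expected_a)
        return r_a + delta, r_b - delta

    # Example: both models start at 1000; A wins one vote -> (1002.0, 998.0)
    print(elo_update(1000.0, 1000.0, 1.0))

The nice property for comparing chatbots is that ratings come from relative judgments, so a model's score only moves when humans actually prefer (or reject) its answers head-to-head.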



