> The top couple LLMs are extraordinarily expensive - will get dramatically more expensive yet - and are one of the most challenging products that have been created in all of human history.
I disagree. The more we learn about LLMs, the more it appears that they're not as difficult to build as it initially seemed.
You need a lot of GPUs and electricity, so you need money, but the core ideas - dump in a ton of pre-training data, then run layers of instruction tuning on top - are straightforward enough that there are already 4-5 organizations capable of training GPT-4 class LLMs, and it's still a pretty young field.
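To make that recipe concrete, here's a toy sketch in PyTorch of what "pre-training then instruction tuning" boils down to: the same next-token objective run first over raw text, then over formatted prompt/response pairs. The tiny model, byte-level "tokenizer", and two-line datasets are made up for illustration - real labs add RLHF, curated data at enormous scale, and distributed training across GPU fleets - but this is the shape of the loop.

    import torch
    import torch.nn as nn

    VOCAB, DIM, CTX = 256, 64, 32  # byte-level vocab, model width, context length

    class TinyCausalLM(nn.Module):
        """A toy decoder-only language model (nowhere near GPT-4 scale)."""
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(VOCAB, DIM)
            layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
            self.blocks = nn.TransformerEncoder(layer, num_layers=2)
            self.head = nn.Linear(DIM, VOCAB)

        def forward(self, ids):
            # Causal mask: each position may only attend to earlier tokens.
            n = ids.size(1)
            mask = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
            return self.head(self.blocks(self.embed(ids), mask=mask))

    def next_token_loss(model, ids):
        # Both phases use the same objective: predict token t+1 from tokens <= t.
        logits = model(ids[:, :-1])
        return nn.functional.cross_entropy(
            logits.reshape(-1, VOCAB), ids[:, 1:].reshape(-1))

    def encode(text):
        # Byte-level "tokenizer", truncated to the context window.
        return torch.tensor([list(text.encode("utf-8"))[:CTX]])

    model = TinyCausalLM()
    opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

    # Phase 1: pre-training -- dump in raw text and learn next-token prediction.
    for doc in ["the cat sat on the mat", "GPUs turn electricity into matrices"]:
        opt.zero_grad()
        next_token_loss(model, encode(doc)).backward()
        opt.step()

    # Phase 2: instruction tuning -- same loss, but on formatted prompt/response
    # pairs so the model learns to answer instructions rather than continue text.
    for prompt, answer in [("Q: What do cats sit on?", "A: The mat.")]:
        opt.zero_grad()
        next_token_loss(model, encode(prompt + "\n" + answer)).backward()
        opt.step()

The hard part is the scale - data quality, GPU fleets, infrastructure - not the loop itself, which is why the recipe has proven reproducible.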
Compared to human endeavors like the Apollo project LLMs are pretty small fry.
100%. I don't think we should at all minimize the decades of research it took to get to the current "generative AI boom", but once transformers were introduced in 2017, we basically found that just throwing more and more data at the problem gave better results.
And not to discount other important advances like RLHF, but the reason everyone talks about the big model companies as having "no moat" is that it's not really a secret how to recreate these models. That is basically the complete opposite of, say, other companies that really do build "the most challenging products that have been created in all of human history." E.g. nobody has really been able to recreate TSMC's success, which requires not only billions of dollars but also a highly educated, trained, and specialized workforce.