1) They have the best-quality model. Better quality means more users, more users mean more data, and more data means higher quality...
2) Operationalizing and scaling these models is non-trivial. I'm not sure what the state of distillation/pruning is for GPT-3, but I imagine they have figured out some proprietary techniques (a rough sketch of the generic approach right after this list).
3) It's not just publishing a single model, but making it so people can fine-tune and deploy their own (sketch at the end of this comment). Because they've gotten good at 2, anyone can now create their own version of GPT customized for their use case.
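To make 2 concrete, here is a minimal sketch of knowledge distillation in PyTorch, assuming hypothetical teacher and student models and an input batch. This is the generic textbook recipe (Hinton et al.), not whatever OpenAI actually does internally:

    import torch
    import torch.nn.functional as F

    def distill_step(teacher, student, optimizer, batch, T=2.0):
        # The large teacher produces soft targets; no gradients on its side.
        with torch.no_grad():
            teacher_logits = teacher(batch)
        student_logits = student(batch)
        # KL divergence between temperature-softened distributions,
        # scaled by T^2 as in the original formulation.
        loss = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

The student learns to match the teacher's full output distribution rather than hard labels, which is how you get a smaller, cheaper-to-serve model that keeps most of the quality.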
Will Google or others be able to do the same eventually? Definitely.
The point I'm really making is that it's not just about training the model and running it.
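And to make 3 concrete, from the user's side fine-tuning-as-a-service looked roughly like this with the GPT-3-era openai Python package (the pre-1.0 SDK); train.jsonl is a hypothetical file of prompt/completion pairs:

    import openai

    openai.api_key = "sk-..."  # placeholder API key

    # Upload a JSONL file of {"prompt": ..., "completion": ...} records.
    training_file = openai.File.create(
        file=open("train.jsonl", "rb"), purpose="fine-tune"
    )

    # Kick off a fine-tune job against a base GPT-3 model.
    job = openai.FineTune.create(training_file=training_file.id, model="davinci")

    # Once the job finishes, the custom model is served like any other:
    # openai.Completion.create(model=job.fine_tuned_model, prompt="...")

All the serving machinery from 2 is what lets them offer this to everyone without per-customer infrastructure work.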
I don't view any of those things as a meaningful moat against the other companies with AI labs.
Specifically, training data does not primarily come from interactions with the model. RLHF may make that interaction data more important, but it is still a very small portion of the total.
I don't know either way, but as an example of how it might be: the Google PageRank patent has expired, yet Google remains valuable because their personalisation of results became a moat.
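For the sake of the analogy: the published PageRank algorithm fits in a dozen lines (a toy power-iteration version over a made-up three-page link graph below), which is exactly why the moat was never the algorithm but the data and personalisation on top of it:

    import numpy as np

    def pagerank(adj, d=0.85, iters=100):
        # adj[i][j] = 1 if page i links to page j
        n = adj.shape[0]
        out = adj.sum(axis=1, keepdims=True)
        out[out == 0] = 1                 # crude guard for dangling pages
        M = (adj / out).T                 # column-stochastic transition matrix
        r = np.full(n, 1.0 / n)
        for _ in range(iters):
            r = (1 - d) / n + d * M @ r   # damped power iteration
        return r

    links = np.array([[0, 1, 1],
                      [1, 0, 0],
                      [1, 0, 0]], dtype=float)
    print(pagerank(links))  # page 0, linked by both others, ranks highest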