> this means the ability to train 50 GPT-4 scale models every 90 days or 200 such models per year.
What it actually means is that they are training next-gen models that are roughly 50X larger in compute.
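A quick back-of-envelope check of the quoted figure makes the link explicit. All numbers below are rough public estimates and assumptions (GPT-4 training compute, per-GPU throughput, utilization), not confirmed specs:

```python
# Back-of-envelope sketch of the "50 GPT-4s per 90 days" claim.
# Every constant here is an assumption / public estimate, not a confirmed figure.
GPT4_TRAIN_FLOPS = 2e25          # rough public estimate of GPT-4's total training compute
GPU_FLOPS = 1e15                 # ~1 PFLOP/s dense BF16 per H100-class GPU (rounded)
UTILIZATION = 0.35               # assumed model FLOPs utilization at cluster scale
GPUS = 350_000                   # the 350K-GPU fleet in question
SECONDS_PER_90_DAYS = 90 * 24 * 3600

cluster_flops_per_quarter = GPUS * GPU_FLOPS * UTILIZATION * SECONDS_PER_90_DAYS
gpt4_equivalents = cluster_flops_per_quarter / GPT4_TRAIN_FLOPS

print(f"GPT-4-scale training runs per 90 days: {gpt4_equivalents:.0f}")
# ~48 under these assumptions, i.e. roughly 50 GPT-4s per quarter,
# or equivalently one model trained with ~50X GPT-4's compute in the same window.
```

The same quarter of cluster time can be spent on 50 separate GPT-4-scale runs or on a single run with ~50X the compute, which is the more likely use.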
And considering that MS and OpenAI are planning to build a $100 billion AI training computer, these 350K GPUs are just a tiny fraction of what they have in mind.
This isn't overkill. This is the current plan: throw as much compute as possible at the problem and hope intelligence scales with compute.