
What does “355 years” mean in this context? I assume it’s not human years



It's claimed here, so this is presumably the reference (355 GPU-years):

https://lambdalabs.com/blog/demystifying-gpt-3

"We are waiting for OpenAI to reveal more details about the training infrastructure and model implementation. But to put things into perspective, GPT-3 175B model required 3.14E23 FLOPS of computing for training. Even at theoretical 28 TFLOPS for V100 and lowest 3 year reserved cloud pricing we could find, this will take 355 GPU-years and cost $4.6M for a single training run. Similarly, a single RTX 8000, assuming 15 TFLOPS, would take 665 years to run."


That still includes the cloud vendors' margins. OpenAI had Microsoft providing the resources, which could be done at a much lower cost. It still wouldn't be cheap, but you'd end up well below $5M if you buy the hardware yourself, provided you can keep it utilized long enough. That's especially true if you set it up in a region with cheap electricity; latency doesn't matter for training anyway.
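A rough sketch of what the $4.6M works out to per GPU-hour, and how the total scales at a lower rate; the $0.60/GPU-hour "self-hosted" figure is purely an assumption for illustration, not an actual OpenAI/Microsoft number:

```python
# Back-of-envelope: implied cloud price per GPU-hour from the quoted figures,
# and how the total changes if compute is obtained more cheaply.

HOURS_PER_YEAR = 365 * 24

gpu_years = 355
cloud_cost_usd = 4.6e6  # quoted cost at 3-year reserved cloud pricing

gpu_hours = gpu_years * HOURS_PER_YEAR
implied_rate = cloud_cost_usd / gpu_hours
print(f"Implied cloud rate: ${implied_rate:.2f} per V100-hour")  # ~$1.48/hr

# Hypothetical self-hosted rate (hardware amortization + power); pure assumption.
assumed_owned_rate = 0.60  # USD per GPU-hour
print(f"Self-hosted estimate: ${gpu_hours * assumed_owned_rate / 1e6:.1f}M")  # ~$1.9M
```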


Cumulative time spent across the training hardware: 355 GPU-years means the total compute, summed over every GPU used, equals one GPU running for 355 years.



