Absolute best case in the cloud for the kind of GPUs this needs? ~$1/GPU/hr, but maybe up to $5/GPU/hr depending on provider and configuration. Companies or other organizations with spare capacity on in-house hardware might also just run the training script for a while, at which point the cost is closer to electricity + opportunity cost.
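For a sense of scale, here's a back-of-envelope sketch; the GPU count and wall-clock hours are purely hypothetical, only the $1–$5/GPU/hr range comes from above.

```python
def training_cost(num_gpus: int, hours: float, rate_per_gpu_hr: float) -> float:
    """Total cloud cost for a multi-GPU training run."""
    return num_gpus * hours * rate_per_gpu_hr

# Hypothetical run: 8 GPUs for 100 wall-clock hours.
for rate in (1.0, 5.0):  # best-case vs. high-end $/GPU/hr
    print(f"${training_cost(8, 100, rate):,.0f} at ${rate}/GPU/hr")
# Prints $800 at the $1 rate and $4,000 at the $5 rate.
```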
It might depend on what you mean by "full training" and "fine-tuning". They're not proposing to train a brand-new foundation model from scratch, like a new LLaMA. But they want to do something considerably more intensive than just building a LoRA.
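To make that gap concrete, here's a minimal sketch using the Hugging Face peft library, which is one common way to build a LoRA; the model checkpoint and LoRA hyperparameters are my own illustrative assumptions, not something from the article.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")

# Full fine-tuning touches every weight: roughly 7B parameters to update.
full_params = sum(p.numel() for p in model.parameters())
print(f"full fine-tune: {full_params:,} trainable parameters")

# A LoRA freezes the base model and trains small low-rank adapters instead.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
                  lora_dropout=0.05, task_type="CAUSAL_LM")
peft_model = get_peft_model(model, lora)
peft_model.print_trainable_parameters()  # typically well under 1% of the above
```

Holding gradients and optimizer state for all 7B+ parameters is a big part of why the full runs need the kind of GPU sponsorship described below.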
The article contains this:
We are currently seeking GPU compute sponsors for training OpenOrca on the following platforms:
* Falcon 7b, 40b
* LLaMA 7b, 13b, 33b, 65b
* MPT-7b, 30b
* Any other targets that get a sponsor. (RWKV, OpenLLaMA)
As I understand it, a full round of training on the OpenOrca dataset would be comparable to going from LLaMA to Vicuna, but hopefully with more dramatic effects, if the techniques proposed in the "Textbooks Are All You Need" paper work as well as advertised.
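For concreteness, a Vicuna-style round is supervised fine-tuning over instruction/response pairs, and a minimal data-prep sketch might look like the following. The dataset id and field names match the public Hugging Face release of OpenOrca as I understand it, but treat them as assumptions, and the prompt template is just illustrative.

```python
from datasets import load_dataset

# Stream the dataset so the sketch doesn't require downloading everything up front.
ds = load_dataset("Open-Orca/OpenOrca", split="train", streaming=True)

def to_prompt(example):
    # Collapse system prompt + question + model response into one training string.
    return {
        "text": f"{example['system_prompt']}\n\n"
                f"### Instruction:\n{example['question']}\n\n"
                f"### Response:\n{example['response']}"
    }

train_stream = ds.map(to_prompt)
print(next(iter(train_stream))["text"][:500])
```

The expensive part is then running next-token training over millions of these examples with the full model unfrozen, which is where the GPU sponsorship comes in.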