Absolute best case in the cloud for the kind of GPUs this needs? ~$1/GPU/hr, but maybe up to $5/GPU/hr depending on provider and configuration. Companies or other organizations with spare capacity on in-house hardware might also just run the training script for a while, at which point the cost is closer to electricity + opportunity cost.
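For a sense of scale, here's a back-of-envelope sketch; the GPU count and wall-clock hours are purely hypothetical, only the $1–$5/GPU/hr range comes from above.

```python
def training_cost(num_gpus: int, hours: float, rate_per_gpu_hr: float) -> float:
    """Total cloud cost for a multi-GPU training run."""
    return num_gpus * hours * rate_per_gpu_hr

# Hypothetical run: 8 GPUs for 100 wall-clock hours.
for rate in (1.0, 5.0):  # best-case vs. high-end $/GPU/hr
    print(f"${training_cost(8, 100, rate):,.0f} at ${rate}/GPU/hr")
# Prints $800 at the $1 rate and $4,000 at the $5 rate.
```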
It might depend on what you mean by "full training" and "fine-tuning". They're not proposing to train a brand-new foundation model from scratch, like a new LLaMA. But they want to do something considerably more intensive than just building a LoRA.
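To make that gap concrete, here's a minimal sketch using the Hugging Face peft library, which is one common way to build a LoRA; the model checkpoint and LoRA hyperparameters are my own illustrative assumptions, not something from the article.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")

# Full fine-tuning touches every weight: roughly 7B parameters to update.
full_params = sum(p.numel() for p in model.parameters())
print(f"full fine-tune: {full_params:,} trainable parameters")

# A LoRA freezes the base model and trains small low-rank adapters instead.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
                  lora_dropout=0.05, task_type="CAUSAL_LM")
peft_model = get_peft_model(model, lora)
peft_model.print_trainable_parameters()  # typically well under 1% of the above
```

Holding gradients and optimizer state for all 7B+ parameters is a big part of why the full runs need the kind of GPU sponsorship described below.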
The article contains this:
We are currently seeking GPU compute sponsors for training OpenOrca on the following platforms:
* Falcon 7b, 40b
* LLaMA 7b, 13b, 33b, 65b
* MPT-7b, 30b
* Any other targets that get a sponsor. (RWKV, OpenLLaMA)
As I understand it, a full round of training on the OpenOrca dataset would be comparable to going from LLaMA to Vicuna, but hopefully with more dramatic effects, if the techniques proposed in the "Textbooks Are All You Need" paper work as well as advertised.
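For concreteness, a Vicuna-style round is supervised fine-tuning over instruction/response pairs, and a minimal data-prep sketch might look like the following. The dataset id and field names match the public Hugging Face release of OpenOrca as I understand it, but treat them as assumptions, and the prompt template is just illustrative.

```python
from datasets import load_dataset

# Stream the dataset so the sketch doesn't require downloading everything up front.
ds = load_dataset("Open-Orca/OpenOrca", split="train", streaming=True)

def to_prompt(example):
    # Collapse system prompt + question + model response into one training string.
    return {
        "text": f"{example['system_prompt']}\n\n"
                f"### Instruction:\n{example['question']}\n\n"
                f"### Response:\n{example['response']}"
    }

train_stream = ds.map(to_prompt)
print(next(iter(train_stream))["text"][:500])
```

The expensive part is then running next-token training over millions of these examples with the full model unfrozen, which is where the GPU sponsorship comes in.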