
It might depend on what you mean by "full training" and "fine-tuning". They're not proposing to train a brand-new foundation model from scratch, like a new LLaMA, but they do want to do something considerably more intensive than just training a LoRA adapter.
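
To make the distinction concrete, here's a rough sketch (assuming the Hugging Face transformers and peft libraries; the model ID is just illustrative) of the difference between attaching a LoRA adapter and making every weight trainable for a full fine-tune:

  # Rough sketch: LoRA adapter vs. full fine-tuning (model ID is illustrative)
  from transformers import AutoModelForCausalLM
  from peft import LoraConfig, get_peft_model

  # LoRA: freeze the base weights and train only small low-rank adapter matrices
  base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")
  lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                        target_modules=["q_proj", "v_proj"],
                        task_type="CAUSAL_LM")
  lora_model = get_peft_model(base, lora_cfg)
  lora_model.print_trainable_parameters()  # typically well under 1% of the weights

  # Full fine-tuning: every parameter of the base model stays trainable,
  # which is far more compute- and memory-hungry but updates the whole network
  full_model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")
  print(sum(p.numel() for p in full_model.parameters() if p.requires_grad))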

The article contains this:

  We are currently seeking GPU compute sponsors for training OpenOrca on the following platforms:
  * Falcon 7b, 40b
  * LLaMA 7b, 13b, 33b, 65b
  * MPT-7b, 30b
  * Any other targets that get a sponsor. (RWKV, OpenLLaMA)
As I understand it, a full round of training on the OpenOrca dataset would be comparable to going from LLaMA to Vicuna, but hopefully with more dramatic effects, if the techniques proposed in the "Textbooks Are All You Need" paper work as well as advertised.
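
For scale, one such full supervised fine-tuning round with the Hugging Face Trainer might look roughly like the sketch below. The model ID, the OpenOrca column names (system_prompt, question, response), and the hyperparameters are my assumptions, and a real run at 13b+ would be sharded across many GPUs with something like DeepSpeed or FSDP rather than run as-is:

  # Sketch of one full fine-tuning round over the OpenOrca dataset
  # (model ID, column names, and hyperparameters are illustrative)
  from datasets import load_dataset
  from transformers import (AutoModelForCausalLM, AutoTokenizer,
                            DataCollatorForLanguageModeling,
                            Trainer, TrainingArguments)

  model_id = "huggyllama/llama-7b"
  tok = AutoTokenizer.from_pretrained(model_id)
  tok.pad_token = tok.eos_token
  model = AutoModelForCausalLM.from_pretrained(model_id)

  ds = load_dataset("Open-Orca/OpenOrca", split="train")

  def tokenize(ex):
      # Join system prompt, question, and response into one training example
      text = ex["system_prompt"] + "\n" + ex["question"] + "\n" + ex["response"]
      return tok(text, truncation=True, max_length=2048)

  ds = ds.map(tokenize, remove_columns=ds.column_names)

  args = TrainingArguments(output_dir="openorca-llama-7b",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=16,
                           num_train_epochs=1, bf16=True)
  Trainer(model=model, args=args, train_dataset=ds,
          data_collator=DataCollatorForLanguageModeling(tok, mlm=False)).train()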


