
That size is on the edge of something you can train at home

If you have modern hardware, you can absolutely train that at home. Or do it very affordably on a cloud service.

I’ve seen a number of “DIY GPT-2” tutorials that target this sweet spot. You won’t get amazing results unless you’re willing to leave a personal computer running for hours or days and you have solid data to train on locally, but fine-tuning should be well within a normal hobbyist’s patience.
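
For what it’s worth, here’s a minimal sketch of that kind of fine-tuning using Hugging Face transformers and datasets (the file name, sequence length, and hyperparameters are placeholders, not something any particular tutorial prescribes):

  # Fine-tune GPT-2 small (~124M params) on a local text file.
  from datasets import load_dataset
  from transformers import (AutoModelForCausalLM, AutoTokenizer,
                            DataCollatorForLanguageModeling, Trainer,
                            TrainingArguments)

  tokenizer = AutoTokenizer.from_pretrained("gpt2")
  tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
  model = AutoModelForCausalLM.from_pretrained("gpt2")

  # "my_corpus.txt" is a placeholder for whatever local data you have.
  dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})

  def tokenize(batch):
      return tokenizer(batch["text"], truncation=True, max_length=512)

  tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

  trainer = Trainer(
      model=model,
      args=TrainingArguments(
          output_dir="gpt2-finetuned",
          per_device_train_batch_size=4,
          num_train_epochs=1,
          fp16=True,  # assumes a GPU; drop this on CPU
      ),
      train_dataset=tokenized["train"],
      data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
  )
  trainer.train()

Runtime roughly matches the hours-to-days estimate above, depending mostly on hardware and corpus size.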


Hmm, is there anything reasonably ready-made* for this spot? Training and querying an LLM locally on an existing codebase?

* I don't mind compiling it myself, but I'd rather not write it.


Not even on the edge. That's something you could train on a 2 GB GPU.

The general guidance I've used is that training a model takes an amount of RAM (or VRAM) equal to roughly 8 bytes per parameter, so a 0.125B-parameter model would need about 1 GB of RAM to train.
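
To sanity-check that arithmetic (8 bytes per parameter is the heuristic above, not an exact figure; a full fp32 Adam setup is closer to 16 bytes per parameter, so treat it as a lower bound):

  # Back-of-the-envelope training memory estimate.
  params = 125_000_000   # a 0.125B-parameter model (GPT-2 small is ~124M)
  bytes_per_param = 8    # heuristic from the comment; fp32 Adam needs ~16
  print(f"{params * bytes_per_param / 1e9:.1f} GB")  # -> 1.0 GB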



