I wonder how big that model is in RAM/disk. I use LLMs for FFmpeg all the time, and I was thinking about training a model on just the FFmpeg CLI arguments. If it were small enough, it could be a package for FFmpeg, e.g. `ffmpeg llm "Convert this MP4 into the latest royalty-free codecs in an MKV."`
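Something like that could even just be a thin wrapper script around a tiny local model. Here's a minimal sketch; the wrapper itself, the `query_local_model` stub, and the hard-coded command it returns are all hypothetical stand-ins for whatever model you'd actually train:

```python
#!/usr/bin/env python3
"""Hypothetical `ffmpeg-llm` wrapper: natural-language request -> ffmpeg command."""
import shlex
import subprocess
import sys


def query_local_model(request: str) -> str:
    # Stand-in for the tiny local model; a real version would load the
    # fine-tuned weights and generate a command from the request.
    # The return value here is just a hard-coded example.
    return "ffmpeg -i input.mp4 -c:v libsvtav1 -c:a libopus output.mkv"


def main() -> None:
    request = " ".join(sys.argv[1:])
    command = query_local_model(request)
    print(f"Proposed command: {command}")
    if input("Run it? [y/N] ").strip().lower() == "y":
        subprocess.run(shlex.split(command), check=True)


if __name__ == "__main__":
    main()
```

You'd call it as `ffmpeg-llm "Convert this MP4 into the latest royalty-free codecs in an MKV."` and confirm before anything actually runs.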
Please submit a blog post to HN when you're done. I'd be curious to know the most minimal LLM setup needed to get consistently sane output for FFmpeg parameters.
That is easily small enough to host as a static SPA. My first thought was that it would be cool to make a static web app that runs the model locally: you'd type a query and it'd give you the FFmpeg commands.
You can train a model that size on ~1 billion tokens in ~3 minutes on a rented 8xH100 80GB node (~$9/hr on Lambda Labs, RunPod, etc.) using the NanoGPT speedrun repo: https://github.com/KellerJordan/modded-nanogpt
For a run that short, you'll spend more time waiting for the node to come up, downloading the dataset, and compiling the model than actually training, though.
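Back-of-envelope on those numbers (just restating the estimates above, all approximate):

```python
# Back-of-envelope for the quoted speedrun numbers (all approximate).
tokens = 1e9             # ~1B training tokens
train_minutes = 3        # quoted wall-clock training time
node_cost_per_hr = 9.0   # ~$9/hr for a rented 8xH100 node

tokens_per_sec = tokens / (train_minutes * 60)     # ~5.6M tokens/s across the node
run_cost = node_cost_per_hr * train_minutes / 60   # ~$0.45 of billed GPU time

print(f"~{tokens_per_sec / 1e6:.1f}M tokens/s, ~${run_cost:.2f} for the training itself")
```

So the training itself is well under a dollar; spin-up, dataset download, and compile overhead dominate the bill.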
If you have modern hardware, you can absolutely train that at home, or do it very affordably on a cloud service.
I’ve seen a number of “DIY GPT-2” tutorials that target this sweet spot. You won’t get amazing results unless you’re willing to leave a personal computer running for hours or days and you have solid data to train on locally, but fine-tuning should be well within a normal hobbyist’s patience.
Not even on the edge. That's something you could train on a 2 GB GPU.
The general guidance I've used is that training a model takes roughly 8 bytes of RAM (or VRAM) per parameter, so a 0.125B-parameter model would need about 1 GB to train.
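For what that heuristic is worth (the 8 bytes/parameter figure is just the rule of thumb above; real usage depends on optimizer state, precision, batch size, and activations):

```python
# Rule-of-thumb training memory: ~8 bytes per parameter.
# (Heuristic only; optimizer state, precision, and activation memory
# can push the real number well above or below this.)
BYTES_PER_PARAM = 8

for name, params in [("0.125B (GPT-2-small-ish)", 0.125e9),
                     ("1.5B (GPT-2-XL)", 1.5e9)]:
    gb = params * BYTES_PER_PARAM / 1e9
    print(f"{name}: ~{gb:.0f} GB to train")
```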
Hm... I wonder what your use case is. I do modern enterprise Java, and the tab completion is a major time saver.
While interactive AI is all about posing a question, meditating on the prompt, then trying to fix the outcome, IntelliJ tab completion just shows what it will complete as you type, and you hit Tab when you're 100% OK with the completion, which surprisingly happens 90-99% of the time for me, depending on the project.
For context, GPT-2-small is 0.124B params (w/ 1024-token context).
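For the curious, that 0.124B falls straight out of the published GPT-2-small config (12 layers, 768-dim embeddings, 50257-token vocab, 1024 positions, tied input/output embeddings); a quick count:

```python
# Parameter count for GPT-2-small from its published config.
vocab, n_ctx, d, n_layer = 50257, 1024, 768, 12

tok_emb = vocab * d                  # token embeddings (tied with the output head)
pos_emb = n_ctx * d                  # learned position embeddings
attn = 4 * d * d + 4 * d             # q/k/v/out projections + biases
mlp = 2 * 4 * d * d + 4 * d + d      # two d<->4d linears + their biases
ln = 2 * (2 * d)                     # two layer norms per block (scale + bias)
per_block = attn + mlp + ln

total = tok_emb + pos_emb + n_layer * per_block + 2 * d  # + final layer norm
print(f"{total / 1e6:.1f}M parameters")                  # ~124.4M
```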