
lol good luck running a 13B model on a single GPU



Seeing the performance of implementations like FlexGen [1], I don't think it would be unreasonable to run a 13B model on a single GPU for personal use. You're not going to run a public service off it, but it would probably be good enough for a local ChatGPT or Copilot. (Rough memory math after the link.)

[1]: https://github.com/FMInference/FlexGen
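
Quick back-of-the-envelope check (my numbers, not the commenter's) on why offloading or quantization is needed at all: 13B parameters at fp16 is ~26 GB of weights alone, which already overflows a 24 GB card before you even count the KV cache.

    # Rough weight-memory estimate for a 13B model; illustrative only.
    params = 13e9
    fp16_gb = params * 2 / 1e9  # 2 bytes/weight -> ~26 GB, too big for 24 GB
    int8_gb = params * 1 / 1e9  # 1 byte/weight  -> ~13 GB, fits with headroom
    print(f"fp16 weights: {fp16_gb:.0f} GB, int8 weights: {int8_gb:.0f} GB")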


You need an RTX 3090 (24 GB).
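
For what it's worth, 24 GB is enough even without FlexGen's offloading if you quantize the weights. A minimal sketch, assuming the Hugging Face transformers, accelerate, and bitsandbytes packages are installed; facebook/opt-13b is just an illustrative 13B checkpoint, not anything the thread specifies:

    # Sketch: run a 13B model on one 24 GB GPU via 8-bit weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "facebook/opt-13b"  # illustrative 13B checkpoint
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name,
        device_map="auto",   # let accelerate place layers on the GPU
        load_in_8bit=True,   # int8 weights: ~13 GB instead of ~26 GB fp16
    )

    inputs = tok("def fibonacci(n):", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64)
    print(tok.decode(out[0], skip_special_tokens=True))

The trade-off versus FlexGen is throughput for simplicity: int8 keeps everything on the GPU, while FlexGen spills weights and cache to CPU RAM and disk to fit even larger models.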



