Does this mean us plebs can run LLMs on Nvidia's lower-end, VRAM-gimped cards?



I don't think so. It seems to just lower the VRAM needed for the context window (the KV cache), not for loading the model weights themselves.
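
Rough numbers, since the two costs are easy to conflate. A back-of-the-envelope sketch in Python, using assumed Llama-2-7B-style shapes (32 layers, 32 KV heads, head dim 128, fp16; none of this is from the article):

    # Sketch: model weights and the KV cache are separate VRAM costs.
    # All shapes below are assumptions for a 7B-class model, not from the thread.

    def weights_gb(n_params: float, bytes_per_param: int = 2) -> float:
        """VRAM for model weights, e.g. fp16 = 2 bytes per parameter."""
        return n_params * bytes_per_param / 1024**3

    def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                    seq_len: int, bytes_per_value: int = 2) -> float:
        """VRAM for the KV cache: 2 tensors (K and V) per layer,
        each of shape (seq_len, n_kv_heads, head_dim)."""
        return (2 * n_layers * n_kv_heads * head_dim
                * seq_len * bytes_per_value) / 1024**3

    print(f"weights:  {weights_gb(7e9):.1f} GB")                 # ~13.0 GB
    print(f"KV @ 4k:  {kv_cache_gb(32, 32, 128, 4096):.1f} GB")  # ~2.0 GB
    print(f"KV @ 32k: {kv_cache_gb(32, 32, 128, 32768):.1f} GB") # ~16.0 GB

So at a 4k context the weights dominate (~13 GB vs ~2 GB), but at 32k the KV cache alone matches the weights. Shrinking the cache helps you run longer contexts, not fit a bigger model.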



