Does this mean us plebs can run LLMs on Nvidia's lower-end, VRAM-gimped cards?



I don't think so. It seems to just lower the VRAM needed for the context window (the KV cache), not for loading the model weights themselves.
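
Rough numbers, since the two costs are easy to conflate. A back-of-the-envelope sketch in Python, using assumed Llama-2-7B-style shapes (32 layers, 32 KV heads, head dim 128, fp16; none of this is from the article):

    # Sketch: model weights and the KV cache are separate VRAM costs.
    # All shapes below are assumptions for a 7B-class model, not from the thread.

    def weights_gb(n_params: float, bytes_per_param: int = 2) -> float:
        """VRAM for model weights, e.g. fp16 = 2 bytes per parameter."""
        return n_params * bytes_per_param / 1024**3

    def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                    seq_len: int, bytes_per_value: int = 2) -> float:
        """VRAM for the KV cache: 2 tensors (K and V) per layer,
        each of shape (seq_len, n_kv_heads, head_dim)."""
        return (2 * n_layers * n_kv_heads * head_dim
                * seq_len * bytes_per_value) / 1024**3

    print(f"weights:  {weights_gb(7e9):.1f} GB")                 # ~13.0 GB
    print(f"KV @ 4k:  {kv_cache_gb(32, 32, 128, 4096):.1f} GB")  # ~2.0 GB
    print(f"KV @ 32k: {kv_cache_gb(32, 32, 128, 32768):.1f} GB") # ~16.0 GB

So at a 4k context the weights dominate (~13 GB vs ~2 GB), but at 32k the KV cache alone matches the weights. Shrinking the cache helps you run longer contexts, not fit a bigger model.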



