We're NOT business users, we just want to run our own LLM at home. Given the siz...

enlyth · on March 21, 2023

Exactly, we're just below that sweet spot right now.

For example on 24GB, Llama 30B runs only in 4bit mode and very slowly, but I can imagine a RLHF finetuned 30B or 65B version running in at least 8bit would be actually useful, and you could run it on your own computer easily.

bick_nyers · on March 21, 2023

Do you know where the cutoff is? Does 32GB VRAM give us 30B int8 with/without a RLHF layer? I don't think 5090 is going to go straight to 48GB, I'm thinking either 32 or 40GB (if not 24GB).

riku_iki · on March 21, 2023

> For example on 24GB, Llama 30B runs only in 4bit mode and very slowly

why do you think adding vram, but not cores will make it run faster?..

enlyth · on March 21, 2023

I've been told the 4 bit quantization slows it down, but don't quote me on this since I was unable to benchmark at 8 bit locally

In any case, you're right it might not be as significant, however, the quality of the output increases with 8/16bit, and running 65B is completely impossible on 24GB

riku_iki · on March 22, 2023

It's not impossible, there are several projects which load model layer by layer for execution from the disk or ram, but it will be much slower.

bick_nyers · on March 21, 2023

I don't think you understand though, they don't WANT you. They WANT the version of you who makes $150k+ a year and will splurge $5k on a Quadro.

If they had trouble selling stock we would see this niche market get catered to.

koheripbal · on March 22, 2023

That IS me. $5K is not enough to run an LLM at home (beyond the non-functional reduced quantization smaller models).

bick_nyers · on March 22, 2023

Ahh yes, looks like I was too generous with my numbers, the new Quadro with 48GB VRAM is $7k, so you probably would need $14k and a Threadripper/Xeon/EPYC workstation because you won't have enough PCIE lanes/RAM/Memory Bandwidth otherwise.

So maybe more accurate is $200k+ a year and $20-30k on a workstation.

I grew up on $20k a year, the numbers in tech. are baffling!