Nvidia don't want consumers using consumer GPUs for business. If you are a busin...

Melatonic · on March 21, 2023

That was always why the Titan line was so great - they typically unlocked features in between Quadro and Gaming cards. Sometimes it was subtle (like very good FP32 AND FP16 performance), and sometimes it meant full 10-bit colour support that you only got with a Titan. Now it seems like they have opened up even more of those features to consumer cards (at least the creative ones) with the studio drivers.

andrewstuart · on March 21, 2023

Hmmm ... "Studio Drivers" ... how are these tangibly different to gaming drivers?

According to this, the difference seems to be that Studio Drivers are older and better tested, nothing else.

https://nvidia.custhelp.com/app/answers/detail/a_id/4931/~/n...

What am I missing in my understanding of Studio Drivers?

""" How do Studio Drivers differ from Game Ready Drivers (GRD)?

In 2014, NVIDIA created the Game Ready Driver program to provide the best day-0 gaming experience. In order to accomplish this, the release cadence for Game Ready Drivers is driven by the release of major new game content giving our driver team as much time as possible to work on a given title. In similar fashion, NVIDIA now offers the Studio Driver program. Designed to provide the ultimate in functionality and stability for creative applications, Studio Drivers provide extensive testing against top creative applications and workflows for the best performance possible, and support any major creative app updates to ensure that you are ready to update any apps on Day 1. """

koheripbal · on March 21, 2023

Isn't a new Titan RTX 4090 coming out soon?

enlyth · on March 21, 2023

An alleged photo of an engineering sample was spotted in the wild a while ago, but no one knows if it's actually going to end up being a thing you can buy.

koheripbal · on March 21, 2023

We're NOT business users, we just want to run our own LLM at home.

Given the size of LLMs, this should be possible with just a little bit of extra VRAM.

enlyth · on March 21, 2023

Exactly, we're just below that sweet spot right now.

For example on 24GB, Llama 30B runs only in 4bit mode and very slowly, but I can imagine an RLHF-finetuned 30B or 65B version running in at least 8bit would actually be useful, and you could run it on your own computer easily.

bick_nyers · on March 21, 2023

Do you know where the cutoff is? Does 32GB VRAM give us 30B int8 with/without an RLHF layer? I don't think the 5090 is going to go straight to 48GB; I'm thinking either 32 or 40GB (if not 24GB).
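A rough way to reason about the cutoff is to count only the bytes needed for the weights: parameters times bits per weight, divided by 8. The sketch below does that arithmetic. It assumes the nominal parameter counts from the model names (30B, 65B) and 1 GB = 1e9 bytes; the `overhead_gb` parameter is a hypothetical placeholder for activations, context cache, and framework overhead, which vary by setup and are not measured here.

```python
def weight_gb(params_billion: float, bits: int) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits / 8


def fits(params_billion: float, bits: int, vram_gb: float,
         overhead_gb: float = 0.0) -> bool:
    """True if the weights plus an assumed overhead fit in vram_gb.

    overhead_gb is a made-up knob for illustration, not a measured value.
    """
    return weight_gb(params_billion, bits) + overhead_gb <= vram_gb


if __name__ == "__main__":
    # Nominal sizes from the thread; real checkpoints may differ slightly.
    for params in (30, 65):
        for bits in (4, 8, 16):
            print(f"{params}B @ {bits}-bit: {weight_gb(params, bits):.1f} GB of weights")
```

Under these assumptions, 30B at 8 bits needs about 30 GB for weights alone, so a 32GB card would leave very little room for anything else, and 65B at 4 bits (about 32.5 GB) does not fit in 24GB.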

riku_iki · on March 21, 2023

> For example on 24GB, Llama 30B runs only in 4bit mode and very slowly

Why do you think adding VRAM, but not cores, will make it run faster?

enlyth · on March 21, 2023

I've been told the 4 bit quantization slows it down, but don't quote me on this since I was unable to benchmark at 8 bit locally.

In any case, you're right that it might not be as significant. However, the quality of the output increases with 8/16bit, and running 65B is completely impossible on 24GB.

riku_iki · on March 22, 2023

It's not impossible. There are several projects that load the model layer by layer from disk or RAM for execution, but it will be much slower.
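The layer-by-layer idea can be shown with a toy sketch: each layer's weights live in their own file, and the forward pass loads one layer, applies it, and drops it before loading the next, so only one layer is in memory at a time. Everything here is hypothetical and for illustration only (the function names, the JSON file layout, plain Python lists instead of GPU tensors); it is not how any particular project implements it.

```python
import json
import tempfile
from pathlib import Path


def save_layers(layers, directory):
    """Write each layer (a weight matrix as nested lists) to its own file."""
    paths = []
    for i, weights in enumerate(layers):
        path = Path(directory) / f"layer_{i}.json"
        path.write_text(json.dumps(weights))
        paths.append(path)
    return paths


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def streamed_forward(paths, x):
    """Run the layers in order, holding only one layer's weights at a time."""
    for path in paths:
        weights = json.loads(path.read_text())  # load this layer from disk
        x = matvec(weights, x)
        del weights  # drop the reference before loading the next layer
    return x


with tempfile.TemporaryDirectory() as tmp:
    # Two tiny 2x2 "layers" standing in for a real model's weights.
    toy_layers = [
        [[1, 0], [0, 2]],
        [[0, 1], [1, 0]],
    ]
    paths = save_layers(toy_layers, tmp)
    result = streamed_forward(paths, [3, 4])
    print(result)
```

The cost is visible in the loop: every forward pass rereads every layer from storage, which is why this approach trades speed for the ability to run a model that does not fit in memory.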

bick_nyers · on March 21, 2023

I don't think you understand though, they don't WANT you. They WANT the version of you who makes $150k+ a year and will splurge $5k on a Quadro.

If they had trouble selling stock we would see this niche market get catered to.

koheripbal · on March 22, 2023

That IS me. $5K is not enough to run an LLM at home (beyond the non-functional reduced quantization smaller models).

bick_nyers · on March 22, 2023

Ahh yes, looks like I was too generous with my numbers. The new Quadro with 48GB VRAM is $7k, so you would probably need $14k plus a Threadripper/Xeon/EPYC workstation, because you won't have enough PCIe lanes/RAM/memory bandwidth otherwise.

So maybe $200k+ a year and $20-30k on a workstation is more accurate.

I grew up on $20k a year, the numbers in tech are baffling!