>If you can field a competitively priced consumer card if this unicorn were to s...

AnthonyMouse · on Oct 6, 2023

The thing that causes it to be competitively priced is having enough production capacity to prevent that from happening.

One way to do that may be to produce a card on an older process node (or the existing one when a new one comes out) that has a lot of VRAM. There is less demand for the older node so they can produce more of them and thereby sell them for a lower price without running out.

Havoc · on Oct 6, 2023

>if this unicorn were to show up

A unicorn like that showed up a couple of hours ago. Someone posted a guide for getting llama to run on a 7900 XTX:

https://old.reddit.com/r/LocalLLaMA/comments/170tghx/guide_i...

It's still slow and janky but this really isn't that far away.

I don't buy that AMD can't make this happen if they actually tried.

Go on Fiverr, get them to compile a list of the top 100 people in the DIY LLM space, and send them all free 7900 XTXs. Doesn't matter if half of the list is wrong, just send them. Next take 1.2m USD and post a dozen 100k bounties against llama.cpp that are AMD specific - support & optimise the gear. Rinse and repeat with every other hobbyist LLM/Stable Diffusion project. A lot of these are zero-profit open source / passion / hobby projects. If six-figure bounties show up it'll absolutely raise pulses. Next do all the big YouTubers in the space - carefully on that one so that it doesn't come across as an attempted pay-off...but you want them to know that you want this space to grow and are willing to put your money where your mouth is.

That'll cost AMD what, 2m? 3m? To move the needle on a multi-billion market? That's the cheapest marketing you've ever seen.

As I said, the datacenter & enterprise market is another beast entirely, full of moats and strategy, but I don't see why a suitably motivated senior AMD exec can't tackle the enthusiast market single-handedly with a couple of emails, a cheque book and a T-shirt that has the Nike slogan on it.

>what's to say that all the non-consumers won't just scarf up these equally performant yet lower priced cards

It doesn't matter. They're in the business of selling cards. To consumers, to datacenters, to your grandmother. For a profit-driven capitalist company the details don't matter as long as there is traction & volume. The above - opening up even the possibility of a new market - is gold from that perspective. And from a consumer perspective anything that breaks the Nvidia CUDA monopoly is a win.

lhl · on Oct 6, 2023

llama.cpp, ExLlama, and MLC LLM have all had ROCm inferencing for months (here are a bunch of setup instructions I've written up, for Linux and Windows: https://llm-tracker.info/books/howto-guides/page/amd-gpus ) - but I don't think that's the problem (and fixing it wouldn't drive lots of volume or have downstream impact in any case).

The bigger problem is with training/research support. E.g., there's no official support for AMD GPUs in bitsandbytes, and no support at all in FlashAttention/FA2 (nothing that 100K in hardware/grants to Dettmers' or Dao's labs wouldn't fix, I suspect).

The real elephant though is that AMD still doesn't seem to grasp that its lack of support for consumer cards and home/academic devs in general has been disastrous (while Nvidia supports CUDA on basically every single GPU they've made since 2010) - just last week there was this mindblowing thread where it turns out an AMD employee is paying out of pocket for AMD GPUs to support build/CI for drivers on Debian. I mean, WTF, that's stupidity that's beyond embarrassing and gets into negligence territory IMO: https://news.ycombinator.com/item?id=37665784

bornfreddy · on Oct 7, 2023

Wow, that is really awkward... AMD should be donating the cards and even paying extra for the privilege - this is an important step for getting satisfied consumers. I hope they notice and rectify this situation so that Debian (and with it all downstream distros, like Ubuntu) can provide better support for their cards. I mean, that's a no-brainer...

dylan604 · on Oct 6, 2023

>an AMD employee is paying out of pocket for AMD GPUs

I hope he's at least getting an employee discount! I guess AMD is not a fan of the 20% concept either

65a · on Oct 7, 2023

I was running llama on a W7900 a month ago, with 48GB of VRAM and excellent performance. ROCm support got a lot better very recently.

Zetobal · on Oct 8, 2023

Inference ≠ Training