
I would sell a kidney for one of these. It's basically impossible to train language models on a consumer 24 GB card. The next step up is the RTX 6000 Ada, at 48 GB for $8,000. This one will probably be priced somewhere in the $100k+ range.



Use 4 consumer-grade 4090s, then. It would be much cheaper and better in almost every respect. And even with this card, forget about training foundation models: Meta spent 82k GPU-hours on the smallest LLaMA and 1M GPU-hours on the largest.
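
A rough back-of-envelope on what those GPU-hours cost (the $2/GPU-hour rate below is an assumed cloud price for illustration, not Meta's actual spend):

    # Sketch: training cost at an assumed ~$2 per GPU-hour (assumption, not Meta's figure)
    GPU_HOUR_COST = 2.00  # USD, assumed cloud rate

    llama_gpu_hours = {
        "smallest": 82_000,     # figure quoted above
        "largest": 1_000_000,   # figure quoted above
    }

    for model, hours in llama_gpu_hours.items():
        print(f"{model}: {hours:,} GPU-hours ~= ${hours * GPU_HOUR_COST:,.0f}")

So even at rental prices, the smallest model is in the six-figure range before you've run a single ablation.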


Go with 2x 3090s instead. The 4000 series doesn't support NVLink/SLI, so you're stuck with the memory of whichever single card you get.
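
If you do go the 2x 3090 route, here's a minimal sanity check (a sketch assuming PyTorch is installed) that the two cards can actually reach each other peer-to-peer over the NVLink bridge rather than bouncing everything through PCIe:

    import torch

    # Sketch: check that two GPUs are visible and support peer-to-peer access.
    # With an NVLink bridge on 3090s this should report True; without it,
    # transfers fall back to PCIe (and possibly a trip through host memory).
    assert torch.cuda.device_count() >= 2, "need at least two GPUs"
    print("P2P 0 -> 1:", torch.cuda.can_device_access_peer(0, 1))

`nvidia-smi topo -m` will also show NV# entries between the two cards when the bridge is working.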


If I remember correctly, NVLink adds about 100 GB/s (where PCIe 4.0 is 64 GB/s). Is it really worth settling for 3090 performance (roughly half a 4090's) for that extra bus bandwidth?


Ampere NVLink (NVLink 3) was 600 GB/s; with Hopper (NVLink 4) it's 900 GB/s. https://www.nvidia.com/en-us/data-center/nvlink/


That's for the data-center NVLink. According to Wikipedia, for GA102 (the 3090) it's 56.25 GB/s per direction, yielding 112.5 GB/s total bidirectional bandwidth.


Ah, that's true, thanks. It's the same type of NVLink as on the A40 GPU. https://images.nvidia.com/content/Solutions/data-center/a40/...


PCIe 4.0 x16 is 32 GB/s (per direction).
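
For anyone following along, the arithmetic behind these numbers (lane rate and encoding per the PCIe 4.0 spec; the per-direction vs. total distinction is what keeps tripping people up):

    # Sketch of where the bandwidth figures in this thread come from.
    # PCIe 4.0: 16 GT/s per lane with 128b/130b encoding.
    pcie4_lane = 16e9 * (128 / 130) / 8   # bytes/sec per lane, per direction
    pcie4_x16 = pcie4_lane * 16           # ~31.5 GB/s per direction
    print(f"PCIe 4.0 x16: {pcie4_x16 / 1e9:.1f} GB/s per direction, "
          f"{2 * pcie4_x16 / 1e9:.1f} GB/s both directions")

    # RTX 3090 (GA102) NVLink: 56.25 GB/s per direction (Wikipedia figure above).
    nvlink = 56.25e9
    print(f"3090 NVLink: {nvlink / 1e9:.2f} GB/s per direction, "
          f"{2 * nvlink / 1e9:.1f} GB/s both directions")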


You think? It's double 48 GB (per card), so why wouldn't it be in the $20k range?


Machine learning is so hyped right now (with good reason) that customers are price-insensitive.


I guess we'll see.


Tom's Hardware is estimating $80k.



