
AFAIK in research, nothing.

AMD had competitive GPGPUs AFAIK, they were just only relevant to a small number of very large customers.

The problems were mostly outside of research.

Mainly, there wasn't much incentive (potential profit) for AMD to bring their GPGPU tooling to the consumer/small-company market and polish it for LLMs (to be clear, I do not mean OpenCL, which has long been available but is generally subpar and badly supported).

Nvidia's mindshare was just too dominant, and a few years ago it wasn't that uncommon for researchers to, idk, create new building blocks or do manual optimizations involving direct work with CUDA and similar.

But that's exactly what changed. By now, especially with LLMs, research nearly always involves only the use of "high-level abstractions" which are quite independent of the underlying GPU compute code ("high-level" might not be the best description, as many of these GPU-independent abstractions are still quite low level).
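As a rough illustration of what such backend-independent code looks like, here is a minimal sketch, assuming a working PyTorch install. The helper names `pick_device`, `backend_name` and `tiny_forward` are made up for this example. It relies on ROCm builds of PyTorch reusing the `cuda` device name and setting `torch.version.hip`, so the same script should run unchanged on Nvidia, AMD, or plain CPU:

```python
import torch


def pick_device() -> torch.device:
    # ROCm builds of PyTorch expose AMD GPUs under the same "cuda" device name,
    # so this single check covers both Nvidia and AMD hardware.
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


def backend_name() -> str:
    # Hypothetical helper: torch.version.hip is a version string on ROCm
    # builds and None otherwise.
    if not torch.cuda.is_available():
        return "cpu"
    return "rocm" if getattr(torch.version, "hip", None) else "cuda"


def tiny_forward(device: torch.device) -> torch.Tensor:
    # Nothing here mentions CUDA kernels or HIP; the framework dispatches
    # the matmul and ReLU to whatever backend the device belongs to.
    x = torch.randn(4, 8, device=device)
    w = torch.randn(8, 2, device=device)
    return torch.relu(x @ w)


device = pick_device()
y = tiny_forward(device)
print(backend_name(), tuple(y.shape))
```

The research code itself never touches vendor-specific APIs, which is the point: whether this runs well on a given GPU depends on how polished the vendor's backend is, not on the script.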

AMD has already shown that they can support that quite well, and it seems to mainly be a question of polishing before it becomes more widely available.

Another problem is that in the past AMD had decent GPU (compute/server) parts and GPU (gaming) parts, but their GPU (gaming) parts were not that usable for compute. On the other hand, Nvidia sold high-end GPUs which can do both and can be "good enough" even for a lot of smaller companies. So a ton of researchers had easy access to those GPUs, whereas access to specialized server compute cards is always complicated and often far more expensive (e.g. due to only being sold in bulk). This still somewhat holds up for the newest generation of AMD GPUs, but much less so. At the same time, LLMs became so large that even using the highest-end Nvidia GPU became ... too slow. And selling an even higher-end consumer GPU isn't really viable either IMHO. Additionally, local inference seems to be becoming much more relevant, and new AMD laptop CPU/GPU bundles and dedicated GPUs seem to be quite well equipped for that.

Also, the market is growing a lot, so even if you just manage to get a smaller % cut of the market share it might now be profitable. I.e. they don't need to beat Nvidia in that market anymore to make a profit; grabbing a bit of market share can now already be worthwhile.

---

> port torch

Idk if it's already publicly available/published, but AMD has demoed proper, well-working torch support based on ROCm (instead of OpenCL).