NVIDIA is the leader because most academic AI setups are NVIDIA-based.
When AI moves further away from academia, NVIDIA will have less of a grip.
Proprietary, hardware-specific APIs never stand the test of time. Ask 3Dfx.
Either CUDA will open up, if it is to survive, or open API use will spread.
Weirdly, NVIDIA hardware only outperforms competitors on its own API. When you compare NVIDIA on a level playing field, they aren't the clear winners. Nobody is right now.
I suspect the battleground for AI will be accuracy rather than speed in the medium term, and on paper AMD could win there...purely because they aren't shy about over-speccing the RAM in their kit at certain price points.
For me, I want to run the largest models I can with the least amount of quantization for the best bang for the buck...and AMD is right there as soon as people start picking up APIs outside of CUDA.
I work for one of their big competitors, and all my conversations with customers tend to follow the same script: "NVIDIA is milking us dry, we want an alternative, but all the alternatives require significant redesign in languages and tools people are unfamiliar with and we can't afford that overhead". It tends to be very cut and dried.
Until university labs get people working in open frameworks and not CUDA, every student joining the industry will default to NVIDIA GPUs until they're forced otherwise. The few people I've managed to convert have been forced by supply constraints, not any desire to innovate or save themselves money. As long as NVIDIA can keep the market satiated with a critical mass of compute, they'll sit on their throne for a long ol' while.
> but all the alternatives require significant redesign in languages and tools people are unfamiliar with and we can't afford that overhead
Where I work, we've made it a principle to stay OpenCL-compatible even while going with NVIDIA due to their better-performing GPUs. I even go as far as writing kernels that can be compiled as either CUDA C++ or OpenCL-C, with a bit of duct-tape adapter headers:
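Roughly along these lines, as a simplified sketch (the macro names here are invented for illustration; the real headers cover far more constructs):

    /* portable_kernels.h -- illustrative adapter header.
     * The same kernel source compiles either as OpenCL C or as CUDA C++,
     * depending on which compiler picks it up. */
    #if defined(__OPENCL_VERSION__)
      /* OpenCL C: kernels are __kernel, buffer pointers need __global,
       * and the flat work-item index comes from get_global_id(). */
      #define KERNEL        __kernel
      #define GLOBAL_MEM    __global
      #define GLOBAL_INDEX  get_global_id(0)
    #elif defined(__CUDACC__)
      /* CUDA C++: kernels are __global__, no address-space qualifier,
       * and the index is computed from block/thread coordinates. */
      #define KERNEL        __global__
      #define GLOBAL_MEM
      #define GLOBAL_INDEX  (blockIdx.x * blockDim.x + threadIdx.x)
    #endif

    /* One kernel source, two toolchains. */
    KERNEL void saxpy(GLOBAL_MEM float* y, GLOBAL_MEM const float* x,
                      float a, unsigned int n)
    {
        unsigned int i = GLOBAL_INDEX;
        if (i < n)
            y[i] = a * x[i] + y[i];
    }

The host side needs its own shim as well (clEnqueueNDRangeKernel on one path, a <<<...>>> launch or cuLaunchKernel on the other), which is where most of the duct tape tends to end up.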
Of course, if you're working with higher-level frameworks then it's more difficult, and you depend on whether or not they provide different backends. So, no Thrust for AMD GPUs, for example, but PyTorch and TensorFlow do let you use them.
Yeah - which is to say that competitors aren't actually going to create a CUDA replacement any time soon. And, correct me if I'm wrong, it would be quite possible to create such a thing - AMD had a system with a tool to do the conversion a while back, but I recall them not supporting it seriously.
The problem is that when one company has made a serious capital investment to advance a market, anyone who invests an equivalent amount won't reap the same rewards - competition would just eat away at each company's profits, so no one will challenge the incumbent.
It's being done. The Mesa project has drivers for OpenCL (RustiCL) and Vulkan under development on any hardware that can provide the underlying facilities for that kind of support. This provides the basic foundation (together with other projects like SYCL) for a high-level alternative that can be properly supported across vendors (minus the expected hardware-specific quirks).
> Either CUDA will open up, if it is to survive, or open API use will spread.
I don't really think so, at least not anytime soon, while the hardware functionality continues to evolve so much and while they seem to be concentrating on high-end devices/architectures rather than low-end stuff.
I've been more or less exclusively writing CUDA for the past decade in the AI/ML space (though I have spent some time with OpenCL, Vulkan and other things along the way too). I don't think what a GPU is or should be has reached an evolutionary end yet. CUDA is not a static thing either; it has co-evolved with the hardware rather than being locked into some static industry standard with a boatload of annoying glExtWhatever dangling off of it. Over the past decade or so, Nvidia has introduced new ways the register file can be used (Kepler shuffles), changed the memory model of GPUs and the warp execution model (breaking the lockstep behavior somewhat to avoid deadlock/starvation), slowly changed the grid/CTA model (cf. cooperative groups, CTA clusters), added more asynchronous components to the host APIs and the hardware (async DMAs), and constantly changed the underlying instruction set, all of which leaks into CUDA in some way.
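To give a concrete taste of that co-evolution, here's a minimal sketch (not from any particular codebase) of the warp shuffles mentioned above; the _sync suffix and the explicit lane mask are exactly the kind of thing the relaxed warp execution model forced onto existing code:

    // Warp-level sum using register shuffles (a Kepler-era feature).
    // __shfl_down_sync superseded the older __shfl_down once lockstep warp
    // execution was relaxed; the 0xffffffff mask names the lanes that must
    // participate in the exchange.
    __device__ float warp_sum(float v)
    {
        for (int offset = 16; offset > 0; offset >>= 1)
            v += __shfl_down_sync(0xffffffffu, v, offset);
        return v;   // lane 0 ends up holding the sum over all 32 lanes
    }

None of that existed in the original CUDA model, and it doesn't map cleanly onto an API that was frozen years earlier.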
> Either CUDA will open up, if it is to survive, or open API use will spread.
CUDA won't die if open APIs take over AI inference. It's still used in so many niche industries that it can only be "replaced" in fields like AI, where companies will invest in moving digital mountains. Stuff like Microsoft's ONNX project will go a long way towards making CUDA unnecessary for AI acceleration, but it won't ever kill the demand for CUDA.
Just look at how lethargic the industry's response has been in the wake of AI, and look at how other companies like AMD and Apple abandoned OpenCL before it was ready. Now Apple is banking on CoreML as an integration feature and AMD is segmenting their consumer and server hardware like crazy.
> Weirdly, NVIDIA hardware only outperforms competitors on its own API. When you compare NVIDIA on a level playing field, they aren't the clear winners.
That does not reflect any of the benchmarks I've seen at all, unless by "level playing field" you mean comparing old Nvidia chips to modern AMD ones. The only systems comparable to the DGX pods Nvidia sells are Apple's, which lack the networking and OS support to be competitive server-side.
AMD is an amazing company for being open and transparent with their approach, but nice guys always finish last. This is a race between the highest-density TSMC customers, which means it's Apple and Nvidia laughing their respective paths to the bank.
Calling AMD a nice guy is a huge stretch in my opinion. From my understanding, they didn't even officially support ROCm on consumer GPUs until this year... CUDA is and always was very accessible to a broad audience.
CUDA runs on most recent Nvidia GPUs, which are ubiquitous on college campuses and well-supported in server software. AMD's GPGPU compute support differs from GPU to GPU, and Apple didn't start contributing acceleration patches to PyTorch and TensorFlow until stuff like Llama and Stable Diffusion took off.