> and of those vendors, it seems like Nvidia is the only one that shipped anything truly usable. Medium-sized LLMs, Stable Diffusion, and probably even stuff like OAI Whisper run faster on Apple's GPUs than on their AI coprocessor.
Be careful not to have NVIDIA-shaped tunnel vision. Performance isn't the whole story. It's very telling that approximately everybody making SoCs for battery-powered devices (phones, tablets, laptops) has implemented an AI coprocessor that's separate from the GPU. NVIDIA may take exception, but the industry consensus is that GPUs aren't the right solution to every AI/ML-related problem.
Ideally, you're right. Realistically, Apple has to choose between using their powerful silicon (the GPU) for high-quality results and their weaker silicon (the Neural Engine) for lower-power inference. Devices designed around a single power profile (e.g. desktop GPUs) can integrate the AI logic into the GPU and get both high-quality and high-speed inference. iPhones gotta choose one or the other.
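For what it's worth, Core ML exposes this exact choice to developers through MLModelConfiguration.computeUnits: you can steer inference toward the Neural Engine for low power or toward the GPU for throughput. A minimal Swift sketch, assuming a compiled model at a hypothetical path "model.mlmodelc":

```swift
import Foundation
import CoreML

// The same model can be dispatched to different silicon depending on
// the power/performance profile you want.
let config = MLModelConfiguration()

// Low-power path: prefer the Neural Engine, falling back to CPU.
config.computeUnits = .cpuAndNeuralEngine

// High-throughput path: prefer the GPU instead.
// config.computeUnits = .cpuAndGPU

do {
    // "model.mlmodelc" is a placeholder for any compiled Core ML model.
    let url = URL(fileURLWithPath: "model.mlmodelc")
    let model = try MLModel(contentsOf: url, configuration: config)
    print("Loaded:", model.modelDescription)
} catch {
    print("Failed to load model:", error)
}
```

This flag is, more or less, the trade-off described above in API form: nothing about the model changes between the two paths, only which silicon runs it.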
There's not nothing you can run on that Neural Engine, but it's wildly undersized for the AI applications people are excited about today. Again: if chucking a few TOPS of optimized AI compute onto a mobile chipset were all we needed, everyone would already be running float16 Llama on their smartphone. Very clearly, something must change.
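To put rough numbers on that (a back-of-the-envelope sketch, assuming the smallest 7B-parameter Llama variant): float16 weights alone come to roughly 13 GiB, before counting the KV cache or activations, against the 6-8 GB of RAM in current flagship phones.

```swift
import Foundation

// Weights-only memory for a float16 7B model (KV cache and
// activations add more on top of this).
let parameters = 7_000_000_000.0   // 7B: smallest Llama variant (assumption)
let bytesPerParameter = 2.0        // float16 = 2 bytes per weight
let weightGiB = parameters * bytesPerParameter / 1_073_741_824.0
print(String(format: "%.1f GiB of weights", weightGiB))  // "13.0 GiB of weights"
```

Even aggressive 4-bit quantization only cuts that to about 3.5 GiB, which is why on-device LLMs today are quantized, small, or both.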