I don't see CPUs being competitive for low-latency inference in the web-accessible SaaS ('software as a service') space. They can certainly be attractive for specialized backend applications where batch processing (in the macro-scheduling sense) can be utilized. The author also overlooks the investment other GPU makers, particularly AMD, are putting into their software stacks to compete directly with Nvidia.