Mostly true, and you'll get no argument from me on the "AMD & Intel are fuckwits" front. Intel does OK, but AMD in particular has completely dropped the ball on the SW front, and has been doing so for at least 25 years.
The point I was glibly trying to get across was that even a small effort on AMD's part to treat the SW side as seriously as Nvidia does would have yielded great benefits, and would not have left them so far behind.
Also, there is a lot of work going on in the GCC and LLVM toolchains, not only to use OpenMP to target accelerators in computationally intensive loops but, in the case of LLVM, also to target tensor instructions for more efficient code generation (https://lists.llvm.org/pipermail/llvm-dev/2021-November/1537...).
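For anyone who hasn't seen the OpenMP side of this, here is roughly what offloading a hot loop looks like. This is only a minimal sketch: `saxpy` is an illustrative name of my own, not something from the linked thread, and whether the loop actually lands on an accelerator depends on the compiler being built with offload support for your device.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * With an offload-capable OpenMP toolchain the loop may run on an
 * accelerator; otherwise it falls back to the host. Built without
 * OpenMP at all, the pragma is ignored and this is a plain serial loop. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    /* map(to:) copies x to the device before the loop;
     * map(tofrom:) copies y to the device and back afterwards. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The appeal is that the source stays ordinary C: the same function compiles with or without `-fopenmp`, and the vendor-specific part is pushed down into the toolchain.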
It took the AI folk less than 18 months to almost completely move away from CUDA to TensorFlow and then PyTorch... LLVM, imho, is going to do the same for Sci/Eng and general code bases in the next 2 years.