
Is there any reason OpenCL is not the standard in implementations like PyTorch? Similar performance, open standard, runs everywhere - what's the downside?



IIRC, ease of implementation (for the GPU kernels), and cross-compatibility (the same bytecode can be loaded by multiple models of GPU).


How is CUDA-C that much easier than OpenCL? Having ported back and forth myself, the base C-like languages are virtually identical. Just sub "__syncthreads();" for "barrier(CLK_LOCAL_MEM_FENCE);" and so on. To me the main problem is that Nvidia hobbles OpenCL on their GPUs by not updating their CL compiler to OpenCL 2.0, so some special features are missing, such as many atomics.


Never used it myself; these are just the main reasons I've heard from friends.


The ease of implementation using CUDA means that your code becomes effed for life, because it is no longer valid C/C++, unless you totally litter it with #ifdefs to special-case CUDA. In my own proprietary AI inference pipeline I've ended up code-generating to a bunch of different backends (OpenCL SPIR-V, Metal, CUDA, HLSL, CPU with OpenMP), giving no special treatment to CUDA, and the resulting code is much cleaner and builds with standard open source toolchains.
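
For anyone who hasn't seen the #ifdef littering in practice, here is a minimal sketch of what it tends to look like. This is not the pipeline mentioned above; the names HOST_DEVICE, saxpy_one, saxpy_kernel and saxpy_cpu are made up for illustration. The only real pieces are the __CUDACC__ macro (defined by nvcc) and the CUDA qualifiers and built-in index variables.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// __CUDACC__ is defined when nvcc compiles the file; a plain C++ compiler
// never sees the CUDA-only qualifiers.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Shared math: callable from host code and, under nvcc, from device code.
HOST_DEVICE inline float saxpy_one(float a, float x, float y) {
    return a * x + y;
}

#ifdef __CUDACC__
// CUDA-only kernel: __global__, blockIdx, blockDim and threadIdx are not
// standard C++, so this has to be hidden from other compilers.
__global__ void saxpy_kernel(float a, const float* x, float* y, std::size_t n) {
    std::size_t i = static_cast<std::size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (i < n) y[i] = saxpy_one(a, x[i], y[i]);
}
#endif

// Hypothetical CPU fallback so the same file still builds with a standard toolchain.
inline void saxpy_cpu(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < y.size(); ++i)
        y[i] = saxpy_one(a, x[i], y[i]);
}
```

Every function shared between host and device needs the macro, and every kernel needs its own guard, which is the clutter that generating each backend separately avoids.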


> The ease of implementation using CUDA means that your code becomes effed for life

Yes, yes it absolutely does. That's how the market dominance gets established: everyone wants to use CUDA, but almost nobody wants to write their kernels twice.


The downsides are that it can't express a bunch of stuff CUDA or OpenMP can, plus the Nvidia OpenCL implementation is worse than their CUDA one. So OpenCL is great if you want a lower-performance way of writing a subset of the programs you want to write.



