Hacker News

I'd love an alternative to CUDA.

The problem is that, as far as I can see, OpenCL is in no way that. The oceans of boilerplate it requires make development hard, and they effectively lock you into a specific vendor too, since all that boilerplate ends up setting things up for one specific vendor.




Nope. I have some substantial simulation code written against OpenCL that runs on Intel OpenCL and NVIDIA without modification, and it's quite performant on both. The only vendor-specific part of the code is the platform selection, which is one line.

OpenCL falls down in terms of standard libraries such as cu{dnn,sparse,blas} but if you're writing everything from scratch it's fine.


I'm an independent developer in the process of choosing a GPGPU library.

I can find simple, comprehensible 20-50 line CUDA samples that do most basic tasks. With OpenCL, all I get are references to versions, boilerplate, and modes, with nothing that boils down to simple code.

If you have a simple sample, you should post it here or blog about it.


Look at PyOpenCL; it makes it quite easy to prototype OpenCL apps in tens of lines of code.

Later, it's straightforward to port to C or C++ if that's your thing, though I find having NumPy et al. handy even in production code.


Boilerplate is one thing—this is what transpilers are for, among other solutions—but can OpenCL match CUDA on a performance level?


CL is comparable to CUDA where the CUDA code doesn't employ NVIDIA-specific primitives.


Interesting...so do you use the vector subset designed for CPUs or the wavefront subset designed for GPUs?


I write code which maximizes coalesced memory access and uses plain old 32-bit floats... nothing special. The respective drivers do a good job of mapping that onto the hardware, given sufficient work group sizes.
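To illustrate the pattern being described: each work-item reads the element at its own global id, so adjacent work-items touch adjacent 32-bit floats and the hardware can coalesce the loads into wide transactions. A sketch (the kernel is shown as a source string since no OpenCL runtime is assumed here; the plain-Python function below is only a reference for what it computes per element):

```python
# OpenCL C kernel: consecutive work-items access consecutive floats,
# so a warp/wavefront's loads and stores coalesce on both vendors' GPUs.
SAXPY_KERNEL = """
__kernel void saxpy(const float alpha,
                    __global const float *x,
                    __global float *y) {
    int gid = get_global_id(0);        // this work-item's global index
    y[gid] = alpha * x[gid] + y[gid];  // stride-1 access: coalesced
}
"""

def saxpy_reference(alpha, x, y):
    """Plain-Python reference: what one kernel launch computes overall."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

out = saxpy_reference(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
# out == [12.0, 24.0, 36.0]
```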


Only with OpenCL 2.x did they come around and start to support C++ for writing kernels, as well as a standard bytecode for other languages to target. Most vendors still don't support either.

Whereas CUDA supported C++ and Fortran from day 1, with the PTX support added a few versions later.

Also the debugging tools, from the presentations I have seen, are much more developer friendly on CUDA.

Of course developers would rather use APIs that offer more modern experiences than ones still stuck in pure C, with a compiler at the driver level, forcing each programmer to write the boilerplate to compile and link.

Now it might already be too late for OpenCL in spite of the latest improvements.


Can anyone tell me what exactly is missing from OpenCL to be able to run the primitives of deep learning frameworks? Like, does it not have some kind of operation that is essential for matrix manipulation?


Up to OpenCL 2.1 there was no support for C++ (let alone Fortran), forcing everyone to either use C or generate C code from their compilers.
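To the parent's question: nothing fundamental is missing at the kernel-language level. A matrix multiply, the workhorse primitive of the deep learning frameworks, is straightforward OpenCL C; what's missing is the tuned library ecosystem (cuDNN, cuBLAS) layered on top. A naive sketch (kernel shown as a source string, with a plain-Python reference since no OpenCL runtime is assumed):

```python
# Naive OpenCL C matrix multiply (C = A * B), one work-item per output
# element. Frameworks need heavily tuned variants of this (tiling, local
# memory, fused ops), which is exactly what cuBLAS/cuDNN supply on CUDA.
MATMUL_KERNEL = """
__kernel void matmul(const int N,
                     __global const float *A,
                     __global const float *B,
                     __global float *C) {
    int row = get_global_id(1);
    int col = get_global_id(0);
    float acc = 0.0f;
    for (int k = 0; k < N; ++k)
        acc += A[row * N + k] * B[k * N + col];
    C[row * N + col] = acc;
}
"""

def matmul_reference(n, a, b):
    """Row-major N x N reference for what the kernel computes."""
    c = [0.0] * (n * n)
    for row in range(n):
        for col in range(n):
            c[row * n + col] = sum(a[row * n + k] * b[k * n + col]
                                   for k in range(n))
    return c

# 2x2 identity times a matrix returns the matrix unchanged.
c = matmul_reference(2, [1.0, 0.0, 0.0, 1.0], [1.0, 2.0, 3.0, 4.0])
# c == [1.0, 2.0, 3.0, 4.0]
```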



