Hacker News

I'd love an alternative to CUDA.

The problem is that, as far as I can see, OpenCL is in no way that. The oceans of boilerplate it requires make development hard, and they effectively lock you into a specific vendor too, since all that boilerplate ends up setting things up for one specific vendor.




Nope. I have some substantial simulation code written against OpenCL that runs on Intel OpenCL and NVIDIA without modification, and it's quite performant on both. The only vendor-specific part of the code is the platform selection, which is one line.

OpenCL falls down in terms of standard libraries such as cu{dnn,sparse,blas} but if you're writing everything from scratch it's fine.


I'm an independent developer in the process of choosing a GPGPU library.

I can find simple, comprehensible 20-50 line CUDA samples that do most basic tasks. With OpenCL, all I get are references to versions, boilerplate, and modes, with nothing that boils down to simple code.

If you have a simple sample, you should post it here or blog about it.


Look at PyOpenCL; it makes it quite easy to prototype OpenCL apps in tens of lines of code.

Later, it's straightforward to port to C or C++ if that's your thing, though I find having NumPy et al. handy even in production code.


Boilerplate is one thing—this is what transpilers are for, among other solutions—but can OpenCL match CUDA on a performance level?


CL is comparable to CUDA where the CUDA code doesn't employ NVIDIA-specific primitives.


Interesting...so do you use the vector subset designed for CPUs or the wavefront subset designed for GPUs?


I write code which maximizes coalesced memory access and uses plain old 32-bit floats... nothing special. The respective drivers do a good job of mapping that onto the hardware, given sufficient work group sizes.
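To illustrate the pattern being described: each work-item reads the element at its own global id, so adjacent work-items touch adjacent 32-bit floats and the hardware can coalesce the loads into wide transactions. A sketch (the kernel is shown as a source string since no OpenCL runtime is assumed here; the plain-Python function below is only a reference for what it computes per element):

```python
# OpenCL C kernel: consecutive work-items access consecutive floats,
# so a warp/wavefront's loads and stores coalesce on both vendors' GPUs.
SAXPY_KERNEL = """
__kernel void saxpy(const float alpha,
                    __global const float *x,
                    __global float *y) {
    int gid = get_global_id(0);        // this work-item's global index
    y[gid] = alpha * x[gid] + y[gid];  // stride-1 access: coalesced
}
"""

def saxpy_reference(alpha, x, y):
    """Plain-Python reference: what one kernel launch computes overall."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

out = saxpy_reference(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
# out == [12.0, 24.0, 36.0]
```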


Only with OpenCL 2.x did they come around and start to support C++ for writing kernels, as well as a standard bytecode for other languages to target. Most vendors still don't support either.

Whereas CUDA supported C++ and Fortran from day 1, with the PTX support added a few versions later.

Also the debugging tools, from the presentations I have seen, are much more developer friendly on CUDA.

Of course developers would rather use APIs that offer more modern experiences than ones still stuck in pure C, with a compiler at the driver level, forcing each programmer to write the boilerplate to compile and link.

Now it might already be too late for OpenCL in spite of the latest improvements.


Can anyone tell me what exactly is missing from OpenCL to be able to run the primitives of deep learning frameworks? Like, does it not have some kind of operation that is essential for matrix manipulation?


Up to OpenCL 2.1 there was no support for C++ (let alone Fortran), forcing everyone to either use C or generate C code from their compilers.
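To the parent's question: nothing fundamental is missing at the kernel-language level. A matrix multiply, the workhorse primitive of the deep learning frameworks, is straightforward OpenCL C; what's missing is the tuned library ecosystem (cuDNN, cuBLAS) layered on top. A naive sketch (kernel shown as a source string, with a plain-Python reference since no OpenCL runtime is assumed):

```python
# Naive OpenCL C matrix multiply (C = A * B), one work-item per output
# element. Frameworks need heavily tuned variants of this (tiling, local
# memory, fused ops), which is exactly what cuBLAS/cuDNN supply on CUDA.
MATMUL_KERNEL = """
__kernel void matmul(const int N,
                     __global const float *A,
                     __global const float *B,
                     __global float *C) {
    int row = get_global_id(1);
    int col = get_global_id(0);
    float acc = 0.0f;
    for (int k = 0; k < N; ++k)
        acc += A[row * N + k] * B[k * N + col];
    C[row * N + col] = acc;
}
"""

def matmul_reference(n, a, b):
    """Row-major N x N reference for what the kernel computes."""
    c = [0.0] * (n * n)
    for row in range(n):
        for col in range(n):
            c[row * n + col] = sum(a[row * n + k] * b[k * n + col]
                                   for k in range(n))
    return c

# 2x2 identity times a matrix returns the matrix unchanged.
c = matmul_reference(2, [1.0, 0.0, 0.0, 1.0], [1.0, 2.0, 3.0, 4.0])
# c == [1.0, 2.0, 3.0, 4.0]
```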



