Not really a problem. NVIDIA sells cards built around 32-bit (and now, increasingly, 16-bit) ALUs for desktop usage, while offering more expensive cards with a heavier complement of 64-bit ALUs for workstations and compute. Compute is important enough to their bottom line to justify it.
The real problem is that NVIDIA has compute locked down with CUDA. Mobile chipset vendors can't expand into compute if they're barred from entry at the API level.
Vulkan/SPIR-V looks promising; it just needs the chip vendors (ARM, Qualcomm, AMD, Intel) to come together and invest in cuDNN equivalents.
Although I reckon deep learning on mobile (at least for some use cases, like cameras) will use dedicated silicon from Movidius etc. and ultimately be embedded in the camera chips directly.
The cross-platform nature is actually part of the problem--the whole point of doing GPGPU work is that you're playing to the hardware's strengths, which can be difficult when the hardware can be nearly anything from a CPU to a GPU to an FPGA.
It doesn't help that until recently, AMD didn't push OpenCL nearly as hard as NVIDIA pushes CUDA.
Modern AMD and NVIDIA GPUs are fairly similar hardware-wise, and it is not hard to write OpenCL code that executes efficiently on both. I agree that it is pretty hopeless to write performance-portable OpenCL across entirely different architectures, however.
Sure, but if you go with NVIDIA, you also get access to all the other goodies they distribute (Thrust, cuFFT, cuDNN, etc.) and all the CUDA-compatible stuff other people have written, like Theano and TensorFlow.
It does seem like people have gotten a little more interested in OpenCL lately, but it still lags pretty far behind. As dharma1 says below, AMD seems weirdly uninterested in catching up. If I were in charge of AMD, I'd be throwing money and programmers at this: "Want to port your library to OpenCL? Here, have a GPU! We'll help."
AMD management has completely missed the memo on deep learning. No mention of deep learning or FP16 performance yesterday when Polaris was announced - it was all about VR.
They are just not turning up to the party, and as a company they are running out of time if Polaris and Zen don't sell.
> Given the quality of OpenCL and its cross platform nature
I'm sorry, WHAT? OpenCL is absolute shit. Cumbersome API definition, lack of low-level control, stringly typed programs (all programs are provided as strings, and kernels are identified by string name too), which means nearly no compile-time feedback and makes it hard to embed GPU kernels in a single binary. The API is woefully lacking in flexibility (no dynamic launch). OpenCL 2.0 is better (EDIT: apparently AMD supports it now; I'd have to check whether Intel/NVIDIA have also added support), but hardly anyone supports it, so it's also irrelevant.
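To make the "stringly typed" complaint concrete, here's a minimal host-side sketch (assuming a context `ctx`, device `dev`, and error variable `err` are already set up): the kernel body is a C string compiled at runtime, and the kernel is then looked up by its name as another string.

    /* Sketch only: the kernel source is just a string the driver
       compiles at runtime. */
    const char *src =
        "__kernel void vadd(__global const float *a,\n"
        "                   __global const float *b,\n"
        "                   __global float *c) {\n"
        "  int i = get_global_id(0);\n"
        "  c[i] = a[i] + b[i];\n"
        "}\n";

    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, &err);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL); /* runtime compile */
    cl_kernel k = clCreateKernel(prog, "vadd", &err); /* looked up by string */

A typo in "vadd" or a type mismatch inside the source string surfaces as a runtime build error, never a compile error in your own binary.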
Not only that, AMD hardware is terrible. Atomics on NVIDIA's Maxwell are orders of magnitude faster than on AMD (to the point of being comparable to non-atomic operations under low contention).
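For context, the kind of kernel where this shows up is a global histogram built with atomics (a hypothetical CUDA sketch, not a benchmark):

    // Hypothetical sketch: one atomicAdd per input element. Under low
    // contention, Maxwell runs this at close to non-atomic throughput;
    // the claim above is that AMD hardware is far slower here.
    __global__ void hist(const unsigned char *data, int n,
                         unsigned int *bins /* 256 entries */) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) atomicAdd(&bins[data[i]], 1u);
    }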
CUDA's environment provides: better documentation, better feature support, saner development and debugging, the possibility to ship both generic and specialized binary kernels, JITtable kernels in an intermediate representation, better compile-time sanity checking, and the ability to generate your own IR/CUDA assembly from non-CUDA languages...
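Compare the same vector add written as single-source CUDA (a sketch, assuming device pointers `d_a`, `d_b`, `d_c` have already been allocated): the kernel and its launch are ordinary typed code, so nvcc catches mismatched arguments at compile time instead of at runtime.

    // Sketch: kernel and host code live in the same .cu file.
    __global__ void vadd(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    // A typed call, checked by the compiler -- no strings involved.
    vadd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);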
The reason everyone does CUDA and uses NVIDIA is that there's zero real competition. AMD is the only company that cares about OpenCL; Intel and NVIDIA just implement the bare minimum so that AMD's OpenCL code is portable to them. Intel has OpenMP and TBB for the Phi; NVIDIA has CUDA.
To me it's crazy that anyone keeps mentioning OpenCL as a serious alternative. In theory I agree that an open standard would be nice, but over here in reality where I have to actually write code there is no realistic alternative to CUDA if you want to stay sane.
You write OpenCL if you want to target anything other than AMD/NVIDIA/Intel. If you're writing code for an embedded application (with some heterogeneous core) or for a mobile application, you absolutely have to write OpenCL code, as there's no alternative. OpenCL is shit, but it's cross-platform shit.
If your aim is to get 100% performance out of a GPU-heavy cluster, then sure, you're going to need to write CUDA code and buy some NVIDIA GPUs. However, there are a lot of applications running in entirely different environments which _only_ support OpenCL.
Does anyone actually implement OpenCL 2.0 yet? Last I checked, not even AMD supported it, and they're the only company that has a reason to care about advancing OpenCL.