
IMO the trouble is that CUDA is too low level to allow emulation without a major loss of performance, and even if there were a choice of CUDA-compatible vendors, people are ultimately going to vote with their wallets. It's not enough to be compatible - you need to be compatible while providing the same or better performance (else why not just use NVIDIA).

A better level to target compatibility would be the framework level, such as PyTorch, where the building blocks of neural networks (convolution, multi-head attention, etc.) are high level and abstract enough to allow flexibility in mapping them onto AMD hardware without compromising performance.
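
To illustrate what "framework level" means here: code written purely against PyTorch's high-level ops doesn't care which backend runs it, so a vendor only has to supply a fast implementation of each building block. A minimal sketch, assuming PyTorch 2.x (the "cuda"/"mps"/"cpu" device names are the standard ones):

    import torch
    import torch.nn.functional as F

    # Pick whichever backend is available; the code below is identical on all of them.
    if torch.cuda.is_available():
        device = torch.device("cuda")    # NVIDIA (ROCm builds also expose "cuda")
    elif torch.backends.mps.is_available():
        device = torch.device("mps")     # Apple Silicon
    else:
        device = torch.device("cpu")

    # A multi-head-attention-style building block, expressed only in framework ops.
    q = torch.randn(2, 8, 128, 64, device=device)   # (batch, heads, seq, head_dim)
    k = torch.randn(2, 8, 128, 64, device=device)
    v = torch.randn(2, 8, 128, 64, device=device)

    # PyTorch dispatches this to whatever fused kernel the backend provides
    # (FlashAttention-style kernels on CUDA, Metal kernels on MPS, a reference path on CPU).
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
    print(out.shape, out.device)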

However, these frameworks are forever changing, so playing continual catch-up there still wouldn't be a great place to be, especially without a large staff dedicated to the effort (writing hand-optimized kernels), which AMD don't seem able or willing to muster.

So, finally, perhaps the strategically best place for AMD to invest would be in compilers and software tools to allow kernels to be written in a high level language. Becoming a first class Mojo target wouldn't be a bad place to start, assuming they are not already in partnership.
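
Triton (a Python-embedded kernel DSL) is an existing example of that approach, used below purely as an illustration of the idea -- the comment names Mojo, which aims at the same space with a different language. A minimal sketch of a vector-add kernel written once at a high level and lowered to the GPU by the compiler:

    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
        pid = tl.program_id(axis=0)                        # which block this program instance handles
        offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
        mask = offsets < n_elements                        # guard the tail of the array
        x = tl.load(x_ptr + offsets, mask=mask)
        y = tl.load(y_ptr + offsets, mask=mask)
        tl.store(out_ptr + offsets, x + y, mask=mask)

    def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        out = torch.empty_like(x)
        n = out.numel()
        grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
        add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)    # the compiler generates the device code
        return out

The point is that nothing here mentions warps, shared memory, or a specific vendor's ISA; that's the level where a different backend could in principle be swapped in underneath.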



> However, these frameworks are forever changing, so playing continual catch-up there still wouldn't be a great place to be, especially without a large staff dedicated to the effort (writing hand-optimized kernels), which AMD don't seem able or willing to muster.

The situation in reality is actually quite bad.

Given that I have an M2 Max and no NVIDIA cards, I've tried enough PyTorch-based ML libraries that at this point I basically expect them to flat out show an error saying CUDA 10.x+ is required once the dependencies are installed (e.g. the bitsandbytes library -- in fairness, there's apparently some effort to port the code to other platforms as well).
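
The failure usually comes from an import-time or call-time guard along these lines (a hypothetical sketch, not bitsandbytes' actual code): the optimized kernels only exist as compiled CUDA extensions, so on an M2 Max (MPS backend) there is simply nothing to fall back to.

    import torch

    # Hypothetical guard of the kind many CUDA-only libraries ship:
    # the hand-written 8-bit/4-bit kernels are compiled CUDA code,
    # so any non-CUDA backend gets a hard error rather than a slow fallback.
    if not torch.cuda.is_available():
        raise RuntimeError(
            "CUDA 10.x+ is required; these kernels have no MPS/CPU implementation."
        )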

As of today, the whole field is moving so fast that it's simply not worth it for a solo dev or even a small team to attempt getting a non-CUDA stack up and running, especially with the other major GPU vendors not hiring (or not able to hire?) people to port the hand-optimized CUDA kernels.

Hopefully the situation will change after these couple of years of frenzy, but in the meantime I don't see any viable way to avoid a CUDA stack if one is serious about getting ML stuff done.



