Why does publishing papers require the latest and greatest GPUs? My understanding is that the paper talks about very general principles.
> So I guess I'm wondering if I should buy a GPU myself or should I just rent on the cloud if I wanted to start getting some experience in this field. How do you even get experience in this normally anyways, do you get into really good schools and into their AI labs which have a lot of funding?
Unless you have money to throw around, you'd better start working on something: write some code and get it running on a rented GPU before committing to a long-term plan.
> My understanding is that the paper talks about very general principles.
This isn't really true.
In this case the work is specific to NVIDIA's tensor matrix multiply-add (MMA) instructions, which let it use silicon that would otherwise sit unused at that point.
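To make that concrete, here's a rough sketch of what targeting those tensor-core MMA units looks like through CUDA's WMMA intrinsics. The 16x16x16 tile shape, types, and kernel name are just illustrative choices on my part, not anything taken from the paper:

```cuda
// Minimal sketch: one warp computes D = A * B + C on a single 16x16x16 tile
// using the tensor cores via the CUDA WMMA API. Launch with one warp (32 threads).
#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

__global__ void wmma_tile(const half *a, const half *b, const float *c, float *d) {
    // Per-warp register fragments for the A, B, and accumulator tiles.
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    // Load the tiles from memory (leading dimension 16 for a packed 16x16 tile).
    wmma::load_matrix_sync(a_frag, a, 16);
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::load_matrix_sync(c_frag, c, 16, wmma::mem_row_major);

    // This is the call that lowers to the tensor-core MMA instructions.
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);

    wmma::store_matrix_sync(d, c_frag, 16, wmma::mem_row_major);
}
```

The point of showing it: the tensor cores are a separate execution path from the regular CUDA cores, so work routed through MMA instructions can overlap with other work that would otherwise leave that silicon idle.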
> Why does publishing papers require the latest and greatest GPUs?
You really do need to test these things on real hardware, and across different hardware. When you're doing unexpected things, there are lots of unexpected interaction effects.
As a reminder, the context is "require the latest and greatest GPUs", responding to the parent comment. "General" doesn't mean "you can do this on an Intel Arc GPU" level of general.
That said, my comment could have used a bit more clarity.