On an Apple M1 with 16 GB of RAM, without a PyTorch build that takes advantage of Metal, it could take 12 minutes to generate an image from a tweet-length prompt. With Metal, it takes less than 60 seconds.
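For anyone wanting to try this: recent PyTorch builds (1.12+) expose Metal through the `mps` backend, so no custom compile is needed. A minimal sketch of routing work to it (the tensor shape here is just illustrative):

```python
import torch

# Use the Metal GPU if this PyTorch build supports it, else fall back to CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Example: allocate a latent-sized tensor directly on that device.
x = torch.randn(1, 4, 64, 64, device=device)
print(f"Running on: {device}")
```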
And PyTorch on the M1 (even without Metal) uses the fast AMX matrix multiplication units through the Accelerate framework. Matrix multiplication on the M1 is on par with ~10 threads/cores of a Ryzen 5900X.
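If you want to sanity-check that comparison on your own hardware, something like the following rough timing loop works on any machine (numbers are machine-dependent, and this is a sketch, not a rigorous benchmark):

```python
import time
import torch

n = 4096
a = torch.randn(n, n)
b = torch.randn(n, n)

torch.matmul(a, b)  # warm-up so one-time setup cost isn't measured

iters = 10
t0 = time.perf_counter()
for _ in range(iters):
    torch.matmul(a, b)
elapsed = (time.perf_counter() - t0) / iters

# An n x n matmul is ~2*n^3 floating-point operations.
gflops = 2 * n**3 / elapsed / 1e9
print(f"{elapsed * 1000:.1f} ms per matmul, ~{gflops:.0f} GFLOP/s")
```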
I was using a roughly six-year-old AMD CPU with 16 GB of RAM, and generating an image from a prompt would take about half an hour. Which is still massively impressive for what it is.
Yes, and if he does it on a paid machine with a better GPU, it'll be even faster!
While true, neither your statement nor mine above is germane to the discussion. It was never about how long it takes; it's about how cool it is that it can be done on that machine at all.
On 21 April 2023, Google blocked usage of Stable Diffusion with a free account on Colab. You now need a paid plan to use it.
Apparently there are ways around it, but I just switched to runpod.io. It's very cheap (around $0.80/h for a 4090, including storage), and having a real terminal is worth it.
I wonder how fast a consumer PC with no GPU and, say, 16 GB of RAM would generate an image?
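Easy enough to measure yourself with Hugging Face diffusers pinned to the CPU. A hedged sketch (the model ID and step count are just illustrative choices; expect on the order of minutes per image on a typical desktop CPU):

```python
import time
from diffusers import StableDiffusionPipeline

# Load the pipeline and force CPU execution (no GPU assumed).
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("cpu")

t0 = time.perf_counter()
image = pipe(
    "a photo of an astronaut riding a horse",
    num_inference_steps=20,  # fewer steps trades quality for speed
).images[0]
print(f"Generated in {time.perf_counter() - t0:.0f} s")
image.save("out.png")
```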