
Okay... what's the downside?


That is, what's the tradeoff for the time decrease?

Apples to oranges, they're comparing 11 hours on a Raspberry Pi Zero to:

- 10 seconds on Intel i7-13700

- 3 seconds on Intel i9-9990XE

- 5 seconds on Ryzen 9 5900X

Additionally, the 2048x2048 output is produced by a 2x RealESRGAN upscale, which isn't close to the quality a native 2048 diffusion pass would give you.
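For reference, that kind of 2x pass looks roughly like this with the Python Real-ESRGAN package; this is only a minimal sketch, assuming the pip realesrgan/basicsr packages and the RealESRGAN_x2plus weights (the project itself does its work in pure C++, this is just to illustrate the step):

    import cv2
    from basicsr.archs.rrdbnet_arch import RRDBNet
    from realesrgan import RealESRGANer

    # Assumes RealESRGAN_x2plus.pth (the 2x RRDBNet variant) is downloaded locally.
    model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                    num_block=23, num_grow_ch=32, scale=2)
    upsampler = RealESRGANer(scale=2, model_path="RealESRGAN_x2plus.pth",
                             model=model, tile=0, half=False)

    img = cv2.imread("sd_1024.png")              # 1024x1024 diffusion output (BGR)
    out, _ = upsampler.enhance(img, outscale=2)  # 2048x2048; sharpened, but no new detail
    cv2.imwrite("sd_2048_upscaled.png", out)

The upscaler can sharpen and fill in texture, but it can't invent the detail a diffuser sampling natively at 2048 would.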

It does look interesting and is an achievement, in the sense that it's hard to write this stuff from scratch, much less in pure C++ without relying on a GPU.


Ah. I use RealESRGAN (or one of its descendants, rather) as a first-pass upscaler before high-resolution diffusion. If you skip the diffusion step, of course it'll be faster.
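For anyone who hasn't seen that pattern: the second step is just an img2img pass at the upscaled resolution with a low denoising strength, so the diffusion model regenerates fine detail on top of the upscale. A rough sketch with diffusers; the checkpoint, prompt, and strength value are placeholders, and unlike the article's setup this assumes a GPU:

    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    # Placeholder checkpoint; any SD-style model works the same way here.
    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")

    upscaled = Image.open("sd_2048_upscaled.png")  # output of the first-pass upscaler

    # Low strength keeps the composition from the upscale but regenerates fine detail.
    refined = pipe(prompt="a photo of a lighthouse at dusk",
                   image=upscaled, strength=0.3, guidance_scale=7.5).images[0]
    refined.save("sd_2048_refined.png")

(In practice, at sizes this far above the model's training resolution people usually tile the img2img pass or use a model trained for larger canvases, but the idea is the same.)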


Unrelated, but now I'm curious how long it would take on an RPi 4 or 5.


Yeah, me too... I've been very negative about the edge; it got overhyped along with the romanticization of local LLMs, but there's a bunch of stuff coming together at the same time... the Raspberry Pi 5... Mistral 7B Orca was my 20th attempt at a local LLM, and the first one that handled simple conversation with RAG. And fresh diffusion output, even if only every 2 hours, is a credible product, arguments about power consumption aside...
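The RAG setup I mean is nothing exotic; a minimal sketch, assuming llama-cpp-python with a quantized OpenOrca GGUF plus sentence-transformers for retrieval (file name, embedding model, and corpus are all just example placeholders):

    import numpy as np
    from sentence_transformers import SentenceTransformer
    from llama_cpp import Llama

    # Toy corpus; in practice these chunks come from your own documents.
    docs = ["The greenhouse sensor logs temperature every 5 minutes.",
            "Watering runs at 6am and 8pm unless the soil is already wet.",
            "The Pi uploads a camera snapshot once per hour."]

    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    doc_vecs = embedder.encode(docs, normalize_embeddings=True)

    query = "When does watering happen?"
    q_vec = embedder.encode([query], normalize_embeddings=True)[0]
    top = np.argsort(doc_vecs @ q_vec)[::-1][:2]   # cosine similarity, top 2 chunks
    context = "\n".join(docs[i] for i in top)

    # Placeholder path to a quantized Mistral 7B OpenOrca GGUF build.
    llm = Llama(model_path="mistral-7b-openorca.Q4_K_M.gguf", n_ctx=2048)
    prompt = (f"Answer using only the context.\n\nContext:\n{context}\n\n"
              f"Question: {query}\nAnswer:")
    out = llm(prompt, max_tokens=128, stop=["\n\n"])
    print(out["choices"][0]["text"].strip())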


Also, it's $29 to get the pre-trained model assets needed to run the code.


Why does this one need pretrained models? Can't we use any of the thousands that are already available?


These are mostly Stable Diffusion-architecture models, but it's not the only game in town.


Hard to tell, since there's zero documentation regarding the models.



