
> Every 10x increase in model size requires 10x more power

Does it? I’ll be the first to admit I’m far behind in this area, but doesn’t this assume the hardware isn’t improving over time as well? Or am I missing the boat here?



Hardware isn’t improving exponentially anymore, especially not on the FLOPS/watt metric.

That’s part of what motivated the transition to bfloat16 and even smaller minifloat formats, but you can only quantize so far before you’re just GEMMing noise.
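To make the "GEMMing noise" point concrete, here is a rough sketch (my own toy, not how any real accelerator or library does it). It assumes inputs are float32 with the low mantissa bits simply truncated, keeps the full 8-bit exponent (so it ignores the range loss of real FP8 formats), and accumulates in float64. The names `truncate_mantissa` and `dot_error` are made up for illustration.

```python
import random
import struct


def truncate_mantissa(x: float, mantissa_bits: int) -> float:
    """Convert x to float32, then zero all but the top `mantissa_bits` mantissa bits.

    Assumption: plain truncation, not round-to-nearest. The 8-bit exponent is
    kept, so only precision loss is modeled, not reduced dynamic range.
    """
    if not 0 <= mantissa_bits <= 23:
        raise ValueError("mantissa_bits must be in [0, 23]")
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    drop = 23 - mantissa_bits
    mask = (0xFFFFFFFF >> drop) << drop  # keeps sign, exponent, top mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def dot_error(n: int, mantissa_bits: int, seed: int = 0) -> float:
    """Error of a length-n dot product with truncated inputs, relative to sum(|x*y|)."""
    rng = random.Random(seed)
    a = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Both sums accumulate in Python floats (float64).
    exact = sum(x * y for x, y in zip(a, b))
    approx = sum(
        truncate_mantissa(x, mantissa_bits) * truncate_mantissa(y, mantissa_bits)
        for x, y in zip(a, b)
    )
    scale = sum(abs(x * y) for x, y in zip(a, b))
    return abs(approx - exact) / scale


if __name__ == "__main__":
    # Mantissa widths: float32 = 23, float16 = 10, bfloat16 = 7,
    # FP8 E4M3 = 3, FP8 E5M2 = 2.
    for m in (23, 10, 7, 3, 2):
        print(f"{m:2d} mantissa bits: normalized error {dot_error(4096, m):.2e}")
```

Each mantissa bit you drop roughly doubles the worst-case per-element error, so there are only so many bits left to give up before the rounding error is comparable to the signal.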


Hardware gets faster, but efficiency is stalling, if not getting worse.



