
You're right, that's a good point. It is possible to make a model dumber via quantization.

But even going from F16 to llama.cpp Q4 (3.8 bits per weight) incurs negligible perplexity loss.
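
For context, perplexity is just the exponentiated average negative log-likelihood the model assigns to a held-out text, so "negligible perplexity loss" means the quantized model predicts that text almost exactly as well as the F16 original. A minimal sketch of the metric itself (illustrative only, not llama.cpp's implementation; the numbers in the comment are made up):

    import math

    def perplexity(token_logprobs):
        # token_logprobs: natural-log probabilities the model assigned
        # to each ground-truth token of a held-out corpus.
        avg_nll = -sum(token_logprobs) / len(token_logprobs)
        return math.exp(avg_nll)

    # Hypothetical example: if the F16 model scores 5.90 and the Q4
    # quant scores 5.97 on the same corpus, that ~1% gap is the kind
    # of "negligible" difference being described.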

Theoretically, a leading AI lab could quantize absurdly poorly after the initial release, once they know they're going to have huge usage.

Theoretically, they could be lying even though they said nothing changed.

At that point, I don't think there's anything to talk about. I agree both of those things are theoretically possible. But it would be very unusual: two colossal screwups, then active lying, with many observers not leaking a word.
