Hacker News

There is no “one size fits all” here. A bigger model is just a bigger hammer, one that in many cases is too bulky and slow to be the right tool.

At my job, I can’t casually fire up 8×A100 80GB instances. And even if I could, it wouldn’t deliver the throughput I need to be useful. Big models are operationally much more expensive.

The smallest/fastest model that is accurate enough for your use case is ideal.



> The smallest/fastest model that is accurate enough for your use case is ideal.

Sure.

…but it’s also fair to say that the capability of the smallest model that can fit your use case is bounded by its parameter count.

No amount of training data can make a 100-parameter model do text summarisation.
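To see why a 100-parameter budget is hopeless, a quick back-of-the-envelope count helps: even the cheapest possible input embedding plus output head, with no layers in between, blows past it by an order of magnitude. The vocabulary size and embedding width below are illustrative assumptions, not numbers from any real model.

```python
# Rough parameter count for the smallest plausible text model.
# Assumptions (hypothetical, for illustration): byte-level vocab, tiny width.
vocab_size = 256   # byte-level vocabulary, the cheapest option
embed_dim = 8      # an extremely narrow embedding

embedding_params = vocab_size * embed_dim   # input embedding table: 2048
output_head_params = embed_dim * vocab_size # untied output projection: 2048
total = embedding_params + output_head_params

print(total)  # 4096 -- ~40x a 100-param budget, before adding a single layer
```

And that is before any attention or feed-forward layers, which is where the actual modelling capacity lives.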

If you have a 3B param model and you want ChatGPT-style chat embedded in your app, do you think it’ll do?

I don’t.

The output is not at that quality level, because the model is too small.

Not everyone needs that; but these 3B / 7B models don’t have the capability to do everything.


You could have it predict which letter is most likely to come next





