
This doesn't work like that. An analogy would be giving a 5 year old a task that requires the understanding of the world of an 18 year old. It doesn't matter whether you give that child 5 minutes or 10 hours, they won't be capable of solving it.


I think the question of what can be achieved with a small model comes down to what needs knowledge vs. what needs experience. A small model can use tools like RAG if it is just missing knowledge, but it seems hard to avoid training/parameters where experience is needed - knowing how to perceive and then act.

There is obviously also some amount (maybe a lot) of core knowledge and capability needed even to be able to ask the right questions and utilize the answers.
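To make the "small model + RAG" split concrete: the model stays small, and missing knowledge is injected into the prompt at query time rather than baked into the weights. Here's a minimal sketch of that idea; the retrieval is naive word-overlap and all names (`retrieve`, `build_prompt`, the toy corpus) are hypothetical illustration, not any particular library's API.

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank corpus passages by naive word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Prepend retrieved passages so the small model need not memorize them."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "The capital of Australia is Canberra.",
    "Photosynthesis converts light into chemical energy.",
    "The Great Barrier Reef lies off the coast of Queensland.",
]
print(build_prompt("What is the capital of Australia?", corpus))
```

A real system would use embeddings and a vector index instead of word overlap, but the division of labor is the same: retrieval supplies the facts, while knowing how to use the retrieved context is exactly the "experience" part that still has to live in the parameters.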


Small models handle simple, low-context tasks correctly most of the time. But on more complex tasks they fail, because they lack the training capacity and have too few parameters to integrate the necessary relationships.


What if you give them 13 years?


Nothing will change. They will go out of context and collapse into loops.


I mean the 5-year-old child, not the LLM.


Then they're not a 5-year-old anymore.


But in 13 years, will they be capable?


No. They will go out of context and collapse into loops.



