
Not the GP, but I believe that they are talking about the size of the training data set in relation to the model size.


You don't need to load all the training data at once, and in practice you can't.

For LLMs you only need to load a single row of context length, i.e. a vector of e.g. 8k numbers, which is 32 kB in single-precision floats.
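
Back-of-the-envelope sketch in Python, assuming an 8k-entry vector of fp32 values as in the comment (the numbers here are just the comment's example, not anything specific to a particular model):

  # one row of context length, one fp32 value per position
  context_length = 8 * 1024      # "8k numbers"
  bytes_per_float = 4            # single precision (fp32)
  size_kb = context_length * bytes_per_float / 1024
  print(size_kb, "kB")           # -> 32.0 kB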



