Do you know how the memory demands compare to LLMs at the same number of parameters? For example, Mistral 7B quantized to 4 bits works very well on an 8GB card, though there isn’t room for long context.
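A rough sketch of the arithmetic behind that 8 GB figure. The model shape below (32 layers, 8 KV heads with GQA, head dim 128) is an assumed Mistral-7B-like configuration, not a measured profile:

```python
# Back-of-envelope VRAM estimate for a quantized transformer.
# All numbers are illustrative assumptions, not measured values.

def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in bytes."""
    return n_params * bits_per_weight / 8

def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: int = 2) -> float:
    """KV cache: 2 tensors (K and V) per layer, per cached token."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

# Assumed Mistral-7B-like shape: ~7.2B params, 32 layers, 8 KV heads, head_dim 128.
weights_gb = weight_bytes(7.2e9, 4) / 1e9        # ~3.6 GB at 4-bit
kv_gb = kv_cache_bytes(32, 8, 128, 8192) / 1e9   # ~1.07 GB at 8k context, fp16 cache

print(f"weights ~ {weights_gb:.2f} GB, KV cache (8k ctx) ~ {kv_gb:.2f} GB")
```

So 4-bit weights alone fit comfortably in 8 GB, but the KV cache grows linearly with context length (and activations plus runtime overhead eat the rest), which is why long context is the first thing to go.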

