Hacker News

Nice to see such speedups for CPUs. Are these changes available as a branch or pull request in llama.cpp itself? I'd like to make use of them in that form if possible (since that's what I'm used to using).


Yes, this is really a phenomenal effort! And it's what open source is about: bringing improvements to so many use cases, so that Intel and AMD chip users can take full advantage of their hardware's high-performance capabilities, making even older parts competitive.

Two PRs have been raised to merge these changes into llama.cpp:

https://github.com/ggerganov/llama.cpp/pull/6414

https://github.com/ggerganov/llama.cpp/pull/6412

Hopefully these can be accepted without drama, as the many downstream projects that depend on llama.cpp will also benefit.

Though of course everyone should also look directly at releases from llamafile: https://github.com/mozilla-Ocho/llamafile



