Hacker News

Nice to see such speedups for CPUs. Are these changes available as a branch or pull request in llama.cpp itself? I'd like to make use of them in that form if possible (since that's what I'm used to using).


Yes, this is really a phenomenal effort! And it's what open source is about: bringing improvements to so many use cases, so that Intel and AMD chip users can take full advantage of their hardware's high-performance capabilities, making even older parts competitive.

Two PRs have been raised to merge these changes into llama.cpp:

https://github.com/ggerganov/llama.cpp/pull/6414

https://github.com/ggerganov/llama.cpp/pull/6412

Hopefully these can be accepted without drama, as the many downstream projects that depend on llama.cpp will also benefit.

Though of course everyone should also look directly at releases from llamafile: https://github.com/mozilla-Ocho/llamafile



