Releases: jhen0409/llama.cpp
b4635
readme : add llm_client Rust crate to readme bindings (#11628)
[This crate](https://github.com/ShelbyJenkins/llm_client) has been in a usable state for quite a while, so I figured it's fair to add it now. It installs from crates.io and automatically downloads the llama.cpp repo and builds it for the target platform, with the goal of making the user experience as easy as possible. It also integrates model presets and chooses the largest quant that fits in the target's available VRAM. So a user just has to specify one of the presets (I manually add the most popular models), and it will download the model from Hugging Face. It's like a Rust Ollama, but it's not really for chatting: it makes heavy use of llama.cpp's grammar system to produce structured output for decision-making and control-flow tasks.
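The "largest quant given the target's available VRAM" idea above can be sketched as a simple lookup: order the quantization levels from largest to smallest and take the first one whose memory footprint fits. This is a minimal illustrative sketch, not llm_client's actual API; the `Quant` type, the quant names, and the VRAM figures are all hypothetical.

```rust
// Hypothetical sketch of picking the largest quantization that fits in
// available VRAM, in the spirit of llm_client's preset selection.
// Names and sizes below are illustrative, not taken from the crate.

/// A quantization level and the approximate VRAM it needs, in gigabytes.
struct Quant {
    name: &'static str,
    vram_gb: f64,
}

/// Return the largest quant whose VRAM requirement fits within `available_gb`.
/// `quants` is assumed to be sorted from largest to smallest.
fn pick_quant<'a>(quants: &'a [Quant], available_gb: f64) -> Option<&'a Quant> {
    quants.iter().find(|q| q.vram_gb <= available_gb)
}

fn main() {
    // Illustrative sizes for a mid-sized model; real numbers vary per model.
    let quants = [
        Quant { name: "Q8_0", vram_gb: 8.5 },
        Quant { name: "Q6_K", vram_gb: 6.6 },
        Quant { name: "Q4_K_M", vram_gb: 4.9 },
        Quant { name: "Q2_K", vram_gb: 3.2 },
    ];
    match pick_quant(&quants, 7.0) {
        Some(q) => println!("selected {}", q.name), // picks Q6_K for 7.0 GB
        None => println!("no quant fits"),
    }
}
```

Because the list is pre-sorted by size, the first fitting entry is also the highest-precision one the hardware can hold, which is why a linear `find` suffices here.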
b4066
metal : more precise Q*K in FA vec kernel (#10247)
b4062
Revert "convert : fix missing ftype for gemma (#5690)" This reverts commit 54fbcd2ce6c48c9e22eca6fbf9e53fb68c3e72ea.
b4012