Releases · jhen0409/llama.cpp · GitHub

Releases: jhen0409/llama.cpp

b4635

04 Feb 14:04
106045e
readme : add llm_client Rust crate to readme bindings (#11628)

[This crate](https://github.com/ShelbyJenkins/llm_client) has been in a usable state for quite a while, so I figured it's fair to add it now.

It installs from crates.io and automatically downloads the llama.cpp repo and builds it for the target platform, with the goal of making the user experience as easy as possible.

It also integrates model presets and automatically chooses the largest quant that fits the target's available VRAM. A user just specifies one of the presets (I manually add the most popular models), and it downloads the model from Hugging Face.

So, it's like a Rust Ollama, but it's not really for chatting. It makes heavy use of llama.cpp's grammar system to produce structured output for decision-making and control-flow tasks.
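For context, llama.cpp's grammar system uses GBNF files to constrain token sampling to a formal grammar. Below is an illustrative sketch of a decision-making grammar in that format; it is not taken from llm_client itself, which generates its grammars internally:

```
# Constrain output to a yes/no decision plus a short free-text reason.
# "root" is the mandatory start symbol in GBNF.
root ::= answer " because " reason
answer ::= "yes" | "no"
reason ::= [a-zA-Z ,.]+
```

Passed to llama.cpp at generation time, a grammar like this guarantees the model's output is machine-parseable, which is what makes it suitable for control-flow tasks rather than open-ended chat.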

b4066

11 Nov 09:01
b0cefea
metal : more precise Q*K in FA vec kernel (#10247)

b4062

09 Nov 14:16
Revert "convert : fix missing ftype for gemma (#5690)"

This reverts commit 54fbcd2ce6c48c9e22eca6fbf9e53fb68c3e72ea.

b4012

02 Nov 04:15