Releases · jhen0409/llama.cpp · GitHub

Releases: jhen0409/llama.cpp

b4635

04 Feb 14:04
106045e
readme : add llm_client Rust crate to readme bindings (#11628)

[This crate](https://github.com/ShelbyJenkins/llm_client) has been in a usable state for quite a while, so I figured it's fair to add it now.

It installs from crates.io and automatically downloads the llama.cpp repo and builds it for the target platform, with the goal of making the user experience as easy as possible.

It also integrates model presets and automatically chooses the largest quant that fits the target's available VRAM. A user just specifies one of the presets (I manually add the most popular models), and it downloads the model from Hugging Face.

So, it's like a Rust Ollama, but it's not really for chatting. It makes heavy use of llama.cpp's grammar system to produce structured output for decision-making and control-flow tasks.
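For context, llama.cpp's grammar system uses GBNF files to constrain token sampling to a formal grammar. Below is an illustrative sketch of a decision-making grammar in that format; it is not taken from llm_client itself, which generates its grammars internally:

```
# Constrain output to a yes/no decision plus a short free-text reason.
# "root" is the mandatory start symbol in GBNF.
root ::= answer " because " reason
answer ::= "yes" | "no"
reason ::= [a-zA-Z ,.]+
```

Passed to llama.cpp at generation time, a grammar like this guarantees the model's output is machine-parseable, which is what makes it suitable for control-flow tasks rather than open-ended chat.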

b4066

11 Nov 09:01
b0cefea
metal : more precise Q*K in FA vec kernel (#10247)

b4062

09 Nov 14:16
Revert "convert : fix missing ftype for gemma (#5690)"

This reverts commit 54fbcd2ce6c48c9e22eca6fbf9e53fb68c3e72ea.

b4012

02 Nov 04:15