Releases · ggml-org/llama.cpp

28 May 14:55

c962ae3

b5522 Latest

server: fix remove 'image_url'/'input_audio' json-object effectively fo…

Assets 18

cudart-llama-bin-win-cuda-11.7-x64.zip: 303 MB, 2025-05-28T14:55:01Z
cudart-llama-bin-win-cuda-12.4-x64.zip: 373 MB, 2025-05-28T14:55:16Z
llama-b5522-bin-macos-arm64.zip: 10.9 MB, 2025-05-28T14:55:30Z
llama-b5522-bin-macos-x64.zip: 26.1 MB, 2025-05-28T14:55:31Z
llama-b5522-bin-ubuntu-arm64.zip: 11.8 MB, 2025-05-28T14:55:33Z
llama-b5522-bin-ubuntu-vulkan-x64.zip: 20.1 MB, 2025-05-28T14:55:34Z
llama-b5522-bin-ubuntu-x64.zip: 12.3 MB, 2025-05-28T14:55:35Z
llama-b5522-bin-win-cpu-arm64.zip: 11.1 MB, 2025-05-28T14:55:36Z
llama-b5522-bin-win-cpu-x64.zip: 13.7 MB, 2025-05-28T14:55:37Z
llama-b5522-bin-win-cuda-11.7-x64.zip: 109 MB, 2025-05-28T14:55:38Z
Source code (zip): 2025-05-28T14:33:54Z
Source code (tar.gz): 2025-05-28T14:33:54Z

28 May 13:06

github-actions

b5519

a682474

b5519

CUDA: fix FA tg at long context for CC >= 8.9 (#13852)

Assets 18

28 May 04:13

github-actions

b5517

1e8659e

b5517

CANN: Add SOC TYPE printing in cmake configuration (#13837)

Assets 18

27 May 20:31

github-actions

b5516

a3c3084

b5516

opencl: add new ops - `argsort`, `div`, `sub`, `addrows`, `sigmoid`, …

Assets 18

27 May 20:21

github-actions

b5515

1701d4c

b5515

opencl: mark `mul_mat` `f32f32` as supporting non-contiguous tensors …

Assets 18

27 May 18:47

github-actions

b5514

bef8176

b5514

vulkan: use timestamp queries for GGML_VULKAN_PERF (#13817)

Also change it to be controlled by an environment variable rather than a CMake flag.

Assets 18

27 May 18:23

github-actions

b5513

34b7c04

b5513

cmake : add llama-cparams.cpp to build (#13832)

Assets 18

27 May 18:01

github-actions

b5512

f3101a8

b5512

SYCL: add gelu_erf kernel (#13749)

* SYCL: add gelu_erf kernel

* refactor code

Co-authored-by: Atharva Dubey <atharva.dubey@codeplay.com>

* Use scope_op_debug_print

---------

Co-authored-by: Atharva Dubey <atharva.dubey@codeplay.com>

Assets 18

27 May 16:05

github-actions

b5510

a8ea03d

b5510

ggml : add ggml_repeat_4d (#13824)

Assets 18

27 May 14:04

github-actions

b5509

05f6ac6

b5509

ggml : riscv: add xtheadvector support (#13720)

* ggml : riscv: add xtheadvector support

* ggml : clean up some macro usage

Assets 18
