Releases · ochafik/llama.cpp · GitHub

Releases: ochafik/llama.cpp

b5546

30 May 17:11
dd665cc
parallel : increase the variability of the prompt lengths (#13927)

ggml-ci

b5537

29 May 21:49
e83ba3e
llama : add support for jina-reranker-v2 (#13900)

b5500

26 May 20:41
a26c4cc
scripts : add option to compare commits in Debug (#13806)

* scripts : add option to compare commits in Debug

* cont : reuse existing CMAKE_OPTS

b5497

26 May 15:34
03f582a
server: fix streaming crashes (#13786)

* add preludes to content on partial regex match

* allow all parsers to parse non-tool-call content.

* tweak order of `<|python_tag|>` vs `<function=` parsing for functionary v3.1 format; still not ideal, but hopefully less prone to crashing

b5495

26 May 14:22
d74e94c
`server`: fix format of streamed tool call deltas (diff name, fix id …

b5494

26 May 14:03
f13847c
server: fix regression on streamed non-chat completion w/ stops (#13785)

* more forgiving message diffs: partial stop words aren't erased, full stops are

* Add (slow) server test for completion + stream + stop

b5493

26 May 11:57
79c137f
examples : allow extracting embeddings from decoder contexts (#13797)

ggml-ci

b5488

25 May 23:42
e121edc
`server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3…

b5479

25 May 07:21
server: fix/test add_generation_prompt

b5478

25 May 07:18
f5cd27b
`server`: streaming of tool calls and thoughts when `--jinja` is on (…