Tags · ochafik/llama.cpp · GitHub

Tags: ochafik/llama.cpp

b5537

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
llama : add support for jina-reranker-v2 (ggml-org#13900)

b5500

Verified

scripts : add option to compare commits in Debug (ggml-org#13806)

* scripts : add option to compare commits in Debug

* cont : reuse existing CMAKE_OPTS

b5497

Verified

server: fix streaming crashes (ggml-org#13786)

* add preludes to content on partial regex match

* allow all parsers to parse non-tool-call content.

* tweak order of <|python_tag|> vs <function= parsing for functionary v3.1 format. still not ideal but hopefully less prone to crash

b5495

Verified

`server`: fix format of streamed tool call deltas (diff name, fix id location) (ggml-org#13800)

* fix deltas of tool_call.function.name

* fix tool_call.id (was in tool_call.function.id!) + add function type

* add tool_call.type

* populate empty tool_call.function.arguments on first delta
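
The delta layout described by these bullets can be illustrated from the client side. The following is a minimal sketch, assuming an OpenAI-style stream in which `id` and `type` sit directly on the tool call, `function.name` arrives once (or as fragments to concatenate), and `function.arguments` starts as an empty string and grows with each delta. The helper name `merge_tool_call_deltas` and the sample payload are hypothetical and are not taken from the llama.cpp sources.

```python
import json
from typing import Any


def merge_tool_call_deltas(deltas: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Fold streamed tool-call deltas into complete tool calls, keyed by index."""
    calls: dict[int, dict[str, Any]] = {}
    for delta in deltas:
        call = calls.setdefault(
            delta["index"],
            {"id": None, "type": None, "function": {"name": "", "arguments": ""}},
        )
        # id and type live on the tool call itself, not inside "function".
        if "id" in delta:
            call["id"] = delta["id"]
        if "type" in delta:
            call["type"] = delta["type"]
        fn = delta.get("function", {})
        # Name and arguments are treated as string fragments and concatenated.
        call["function"]["name"] += fn.get("name", "")
        call["function"]["arguments"] += fn.get("arguments", "")
    return [calls[i] for i in sorted(calls)]


# Hypothetical deltas for a single tool call (illustrative values only).
SAMPLE_DELTAS = [
    {"index": 0, "id": "call_0", "type": "function",
     "function": {"name": "get_weather", "arguments": ""}},
    {"index": 0, "function": {"arguments": '{"city": '}},
    {"index": 0, "function": {"arguments": '"Paris"}'}},
]

merged = merge_tool_call_deltas(SAMPLE_DELTAS)
parsed_arguments = json.loads(merged[0]["function"]["arguments"])
```

Because the first delta already carries an empty `arguments` string, a client like this one can concatenate fragments without special-casing a missing field.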

b5494

Verified

server: fix regression on streamed non-chat completion w/ stops (ggml-org#13785)

* more forgiving message diffs: partial stop words aren't erased, full stops are

* Add (slow) server test for completion + stream + stop

b5493

Verified

examples : allow extracting embeddings from decoder contexts (ggml-org#13797)

ggml-ci

b5488

Verified

`server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3 w/ enable_thinking:false) (ggml-org#13771)

---------

Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

b5479

server: fix/test add_generation_prompt

b5478

Verified

`server`: streaming of tool calls and thoughts when `--jinja` is on (ggml-org#12379)

* add common_json w/ support for truncated json healing

* add common_chat_msg_diff

* partial common_chat_parse

* refactor parser w/ optionals

* server: wire chat diffs in stream mode

* fix trigger of thinking models (must happen after thoughts are closed)

* fix functionary v3.2 raw python!

* rename: common_chat_syntax (now contains format)

* rm common_regex.at_start

* don't return empty <think></think>

* accommodate yet another deepseek r1 distill fantasy syntax (`<|tool▁calls|>`)

* fix QwQ 32B tool call parsing after thoughts (hermes2)

* better logs for grammar triggers

* consume spaces after parse_json_tool_calls

* fix required tool calls w/ thinking models that have pre-opened thinking tags

* fix thinking model's initial trigger + test qwq's template

* run most test_tool_call tests in stream + non-stream modes

* make functionary v3.2 parsing more strict (differentiate first match from others)

* send final diff from server, to close off raw python arguments

* support partial content streaming in Generic mode

* tool-call: allow content prelude before hermes2 tool calls (for Qwen2.5)

* Update function-calling.md

* Update tool_bench.py

* chat-parser: remove input from exception (llm output may contain PII)

---------

Co-authored-by: ochafik <ochafik@google.com>
Co-authored-by: Olivier Chafik <ochafik@users.noreply.github.com>
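
The first bullet of this commit mentions healing truncated JSON, which is what lets a partially streamed tool call be parsed before the model has finished emitting it. The sketch below only illustrates the general idea and is not the `common_json` implementation: the function name `heal_truncated_json` is hypothetical, and the approach (close any open string, then close open brackets in reverse order) is a simplifying assumption. It does not handle input cut off right after a key, a colon, a comma, or in the middle of a literal such as `tru`.

```python
import json


def heal_truncated_json(text: str) -> str:
    """Close an unterminated string and any open brackets in truncated JSON."""
    closers: list[str] = []  # closing brackets still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()
    healed = text
    if in_string:
        if escaped:
            healed = healed[:-1]  # drop a dangling backslash
        healed += '"'
    return healed + "".join(reversed(closers))


# Hypothetical partial outputs, cut off mid-string and mid-array.
partial_object = json.loads(heal_truncated_json('{"city": "Par'))
partial_array = json.loads(heal_truncated_json('{"a": [1, 2'))
```

A streaming parser built this way would re-run the healing step on each new chunk and emit only the difference from the previously parsed value.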

b5470

Verified

ci : enable winget package updates (ggml-org#13734)
