8000 Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars by ochafik · Pull Request #9639 · ggml-org/llama.cpp · GitHub
[go: up one dir, main page]

Skip to content

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars #9639

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 375 commits into from
Jan 30, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
375 commits
Select commit Hold shift + click to select a range
ec9f3b1
nits
Oct 27, 2024
9a86ea7
`tool-call`: slow tool call integration tests
Oct 28, 2024
c88095e
space nits
Oct 28, 2024
7fde6d0
`tool_call`: test no tool call on a real model + rename scenarios
Oct 28, 2024
dd6d024
`tool-call`: script to prefetch models used in server tests
Oct 28, 2024
168add7
Update tool_call.feature
Oct 28, 2024
ec547e4
`tool-call`: add tests: tool_call=none, parallel_tool_calls=true
Oct 28, 2024
b51c71c
`tool-call`: remove duplicate script to fetch templates
Oct 28, 2024
74d71a6
`agent`: simplify syntax (default tools to local w/ default port)
Oct 28, 2024
b825440
`tool-call`: use Q4_K_M models
Oct 28, 2024
aefac1e
`tool-call`: update scripts/fetch_server_test_models.py
Oct 28, 2024
64287a3
`tool-call`: test Hermes-3-Llama-3.1-8B
Oct 29, 2024
fa4c111
`tool-call`: use functionary-small-v3.2-Q8_0.gguf in test (Q4_K_M too…
Oct 29, 2024
773ff91
`tool-call`: force printing of lazy grammar trigger tokens to regular…
Oct 29, 2024
92c384a
nits
Oct 29, 2024
3ebdb2b
`tool-call`: support tool_use variant in llama_chat_template_from_mod…
Oct 30, 2024
35ac17f
`tool-call`: fix missing initializer errors
Oct 30, 2024
5227321
`tool-call`: when slow server tests fail, hint to run `python scripts…
Oct 30, 2024
e4d5449
`tool-calls`: test Qwen2.5-7B-Instruct-Q4_K_M.gguf
Oct 30, 2024
61655b9
Merge remote-tracking branch 'origin/master' into tool-call
Oct 31, 2024
be9de3e
Update llama-sampling.cpp
Oct 31, 2024
542853b
`tool-call`: greedy sampling in server tests + tweak prompt
Oct 31, 2024
7d9c90f
`tool-call`: nemo tweak (accept raw sql again)
Oct 31, 2024
e8d9d71
Update tool_call.feature
Oct 31, 2024
c395d48
`tool-call`: behaviour-based detection of template features
Oct 31, 2024
f5b7825
`tool-call`: code_interpreter & system + tool call support for all ji…
Oct 31, 2024
c773516
`tool-call`: don't use -fa w/ Mistral-Nemo (hard crashes?)
Oct 31, 2024
b35aa4a
`tool-call`: add LLAMA_UPDATE_GOLDENS env for test-chat-template
Oct 31, 2024
9477c54
`tool-call`: functionary-small-v3.2 test now green
Oct 31, 2024
c4a8050
Update README.md
Oct 31, 2024
f5f7475
nits
Oct 31, 2024
fe967b6
Update README.md
Oct 31, 2024
479c152
`tool-call`: fix qwen template test
Oct 31, 2024
bc52c0a
`agent`: add missing tool name in response!
Oct 31, 2024
c059aec
`agent`: memorize, search_memory (sqlite-vec + sqlite-lembed), fetch …
Nov 9, 2024
5789f69
`minja`: don't explode upon referencing a field on an array (fixes He…
Nov 9, 2024
f9b1969
Update README.md
Nov 9, 2024
adc673c
agent: add --think "tool", default to local tools endpoint, support -…
Dec 5, 2024
1afa312
Merge remote-tracking branch 'origin/master' into tool-call
Dec 6, 2024
30fbcb2
agent: more robust squid config
Dec 6, 2024
a469f53
agent: update readme
Dec 6, 2024
cbe395d
minja: remove tests (now in https://github.com/google/minja)
Dec 6, 2024
1fd5f1a
Update README.md
Dec 6, 2024
5d0033f
minja: sync @ https://github.com/google/minja/commit/916c181c0d4a6f96…
Dec 7, 2024
1f0b157
tool-call: add firefunction-v2 style
Dec 7, 2024
93a5245
tool-calls: migrate tests to pytest
Dec 10, 2024
055053c
Merge remote-tracking branch 'origin/master' into tool-call
Dec 14, 2024
1e2115f
tool-calls: shorter name: grammar_triggers
Dec 14, 2024
7bfcd0a
Merge remote-tracking branch 'origin/master' into tool-call
Dec 14, 2024
7e3feff
tool-call: stabilize server tests
Dec 15, 2024
e70ce3f
Merge remote-tracking branch 'origin/master' into tool-call
Dec 26, 2024
f0bd693
Update test-tool-call.cpp
Dec 26, 2024
f645887
Update minja.hpp https://github.com/google/minja/commit/202aa2f3de21b…
Dec 26, 2024
0e87ae2
rm trailing spaces
Dec 27, 2024
0a5d527
Update fetch_server_test_models.py
Dec 27, 2024
a2fe8a4
Fix tool-call server tests
Dec 27, 2024
523ebf8
Simplify tool call grammars when there's only 1 tool
Dec 27, 2024
abd274a
Copy minja from https://github.com/google/minja/commit/58f0ca6dd74bcb…
Dec 30, 2024
e5113e8
Add --jinja and --chat-template-file flags
Dec 30, 2024
80138d9
Add missing <optional> include
Dec 30, 2024
06b5159
Avoid print in get_hf_chat_template.py
Dec 30, 2024
ce48584
No designated initializers yet
Dec 30, 2024
389d79b
Try and work around msvc++ non-macro max resolution quirk
Dec 30, 2024
238b968
Update test_chat_completion.py
Dec 30, 2024
cb72cf1
Merge remote-tracking branch 'origin/master' into jinja
Jan 13, 2025
78861a3
Wire LLM_KV_TOKENIZER_CHAT_TEMPLATE_N in llama_model_chat_template
Jan 13, 2025
1aac99a
Refactor test-chat-template
Jan 13, 2025
7c84ebc
Test templates w/ minja
Jan 13, 2025
18f257b
Fix deprecation
Jan 13, 2025
8dd4f33
Add --jinja to llama-run
Jan 13, 2025
c04c50e
Merge remote-tracking branch 'origin/master' into jinja
Jan 13, 2025
a6afb27
Update common_chat_format_example to use minja template wrapper
Jan 13, 2025
b4083e4
Test chat_template in e2e test
Jan 13, 2025
b7e2171
Update utils.py
Jan 13, 2025
a57bb94
Update test_chat_completion.py
Jan 13, 2025
4daae0b
Update run.cpp
Jan 13, 2025
1b3bb7e
Update arg.cpp
ochafik Jan 14, 2025
e7ff6ec
Merge branch 'jinja' into tool-call
Jan 14, 2025
7a7d6f6
Fix merge
Jan 14, 2025
e183fa9
Update test-chat-template.cpp
Jan 14, 2025
010726c
Merge remote-tracking branch 'origin/master' into tool-call
Jan 14, 2025
d47f40c
Update test-chat-template.cpp
Jan 14, 2025
3ed670b
Merge remote-tracking branch 'origin/master' into jinja
Jan 14, 2025
3c7784c
Refactor common_chat_* functions to accept minja template + use_jinja…
Jan 18, 2025
b75d062
Refactor common_chat_* functions to accept minja template + use_jinja…
Jan 18, 2025
40db789
Merge remote-tracking branch 'origin/master' into jinja
Jan 18, 2025
81c0d43
Attempt to fix linkage of LLAMA_CHATML_TEMPLATE
Jan 18, 2025
138a4ba
Merge branch 'jinja' into tool-call
Jan 18, 2025
d5fa351
Revert LLAMA_CHATML_TEMPLATE refactor
Jan 18, 2025
045edd1
Merge branch 'jinja' into tool-call
Jan 18, 2025
2ceabee
Fix fetch_server_test_models.py (avoid conv trap)
Jan 18, 2025
259d9e4
tools: greedy sampling in tests
Jan 18, 2025
acf7c24
tools: run tool call slow tests when SLOW_TESTS=1 (+ prefetch models)
Jan 18, 2025
ee1e10e
Normalize newlines in test-chat-templates for windows tests
Jan 18, 2025
e63520f
Forward decl minja::chat_template to avoid eager json dep
Jan 18, 2025
33322e8
Flush stdout in chat template before potential crash
Jan 18, 2025
5074e6f
Fix copy elision warning
Jan 18, 2025
76893f5
Merge branch 'jinja' into tool-call
Jan 18, 2025
fc60802
Rm unused optional include
Jan 18, 2025
0e74c9d
Add missing optional include to server.cpp
Jan 18, 2025
d6f058d
Merge branch 'jinja' into tool-call
Jan 18, 2025
e3c475c
Disable jinja test that has a cryptic windows failure
Jan 18, 2025
cc50356
minja: fix vigogne (https://github.com/google/minja/pull/22)
Jan 18, 2025
c207fdc
Merge branch 'jinja' into tool-call
Jan 18, 2025
0401a83
agent: add --greedy, --top-p, --top-k options
Jan 19, 2025
153e852
Apply suggestions from code review
ochafik Jan 20, 2025
db9dd0c
Finish suggested renamings
Jan 20, 2025
c9e8fdd
Move chat_templates inside server_context + remove mutex
Jan 20, 2025
8c84aef
Update --chat-template-file w/ recent change to --chat-template
Jan 20, 2025
154bfaa
Refactor chat template validation
Jan 20, 2025
099f983
Merge remote-tracking branch 'origin/master' into jinja
Jan 20, 2025
54a669e
Guard against missing eos/bos tokens (null token otherwise throws in …
Jan 20, 2025
8348c60
Warn against missing eos / bos tokens when jinja template references …
Jan 20, 2025
ee475d2
rename: common_chat_template[s]
Jan 20, 2025
8a7c89e
reinstate assert on chat_templates.template_default
Jan 20, 2025
9bab693
Merge branch 'jinja' into tool-call
Jan 20, 2025
b110374
apply renames from jinja branch
Jan 20, 2025
8347da9
Update minja to https://github.com/google/minja/commit/b8437df626ac6c…
Jan 20, 2025
7ea6a06
Merge branch 'jinja' into tool-call
Jan 20, 2025
56aa93c
fix std imports for gcc build
Jan 21, 2025
ff2cce5
Update minja to https://github.com/google/minja/pull/25
Jan 21, 2025
ba8dd66
Merge branch 'jinja' into tool-call
Jan 21, 2025
9d8ebd6
Update minja from https://github.com/google/minja/pull/27
Jan 21, 2025
c606255
Merge branch 'jinja' into tool-call
Jan 21, 2025
fec0260
Merge remote-tracking branch 'origin/master' into tool-call
Jan 21, 2025
b49d052
rm tests/test-minja from makefile
Jan 21, 2025
f6e73da
Remove examples/agent (moved to https://gist.github.com/ochafik/9246d…
Jan 21, 2025
77f4098
Delete update_jinja_goldens.py
Jan 21, 2025
dbf841b
Push laziness down to grammar impl
Jan 22, 2025
ef61a4c
minimize diffs
Jan 22, 2025
3972945
common_tool_call rename
Jan 22, 2025
d77fecc
shrink diff in json conversion code
Jan 22, 2025
5268ec8
Refactor string helpers into common
Jan 22, 2025
9e8b43f
follow enum naming style for tool call styles
Jan 22, 2025
9a5acbb
Factor string_join, string_split, string_repeat into common
Jan 22, 2025
4de5cf8
json: refactor to surface a versatile builder
Jan 22, 2025
03fe80f
drop unused fs_list_files
Jan 22, 2025
41a613b
Merge branch 'string_utils' into tool-call
Jan 22, 2025
5140d7a
Update common.cpp
Jan 22, 2025
e211629
Merge branch 'string_utils' into tool-call
Jan 22, 2025
28cac49
drop llama_sampler_accept_str
Jan 22, 2025
2dd09c7
more cleanups
Jan 22, 2025
01b345b
Merge remote-tracking branch 'origin/master' into tool-call
Jan 22, 2025
82b6e9a
merge common_tool_calls into common_chat_msg
Jan 22, 2025
63387c6
smaller diff
Jan 22, 2025
a422636
nits
Jan 22, 2025
cce1166
Update tool-call.cpp
Jan 22, 2025
c6a22ed
Greedy sampling in tool call tests
Jan 22, 2025
30d33d9
Update test_chat_completion.py
Jan 22, 2025
9ccc62b
Sync minja after https://github.com/google/minja/pull/29
Jan 22, 2025
d186721
Merge remote-tracking branch 'origin/master' into tool-call
Jan 22, 2025
f0231a5
fix common_chat_msg invocations
Jan 22, 2025
5e358ad
fix msg init warning
Jan 22, 2025
cdfa8b9
Update chat-template.hpp
Jan 22, 2025
a46de6a
Add grammar options + rename builder to common_grammar_builder
Jan 22, 2025
c2d836f
Update real tool call tests (use less models)
Jan 22, 2025
46415d7
Fix lazy trigger handling
Jan 22, 2025
36ed106
WIP chat handlers
Jan 24, 2025
c479d39
tool-call: allow special tokens that are grammar triggers
Jan 25, 2025
0208b20
Update test_chat_completion.py
Jan 25, 2025
a6463c1
jinja: don't add bos when jinja enabled
Jan 25, 2025
51b7aab
Update test_chat_completion.py
Jan 25, 2025
3f3fc03
nit: trailing spaces
Jan 26, 2025
1159455
Merge branch 'tool-call' into tool-call-handler
Jan 26, 2025
43385b2
sync: minja
Jan 26, 2025
5ec4c5e
reshuffle chat handlers
Jan 26, 2025
f7078ca
tool-call: fix functionary v3.1 required test
Jan 26, 2025
ca0c837
nits
Jan 27, 2025
bddc1be
tool-call: fix special handling of special trigger tokens (Nemo)
Jan 27, 2025
da606d8
tool-call: remove nonsensical code_interpreter code
Jan 27, 2025
15ec01e
jinja: only add special tokens if template doesn't seem to handle them
Jan 27, 2025
2efa0c2
tool-call: add weather tool e2e tests
Jan 27, 2025
57f40e3
tool-call: fix lazy grammar & mixed content + tool calls parsing
Jan 27, 2025
6770955
tool-call: compact json output to cap # tokens generated
Jan 27, 2025
09971e6
Update test_chat_completion.py
Jan 27, 2025
92ac336
Prepare DeepSeek-R1-Distill-Llama-8B support
Jan 27, 2025
118f799
DeepSeek-R1: implement grammar constraints
Jan 27, 2025
add9124
fix test-chat-handler grammar tests
Jan 27, 2025
fa065eb
Rehabilitate test_format_detection
Jan 27, 2025
ad22978
updated tool call example to be less ambiguous (deepseek likes to ran…
Jan 27, 2025
90effb8
Pass grammar laziness all the way down to sampler (need to print spec…
Jan 27, 2025
cafea60
Split e2e test_tool_call from test_chat_completion
Jan 27, 2025
b565ab2
comment out broken tests in test_tool_call.py
Jan 27, 2025
2d607f1
Update test-chat-handler.cpp
Jan 27, 2025
ef9efc9
Fix Llama 3.1 (incl. constrained builtin tools e.g. `<|python_tag|>fo…
Jan 28, 2025
6271714
Allow tool use + streaming
Jan 28, 2025
6d56829
Cleanup dead code in llama_3_1 tool call code
Jan 28, 2025
2f99236
Tool-call: do last partial parse upon limit stop
Jan 28, 2025
0a51e51
Update test-chat-handler.cpp
Jan 28, 2025
d274ffc
build: Add missing optional include for gcc
Jan 28, 2025
62d45a5
Disable slow tests where appropriate, + nits
Jan 28, 2025
ec4aeaf
Revert "Allow tool use + streaming"
Jan 28, 2025
b5a74d1
Simplify parser defs (incremental parsing for streaming will need mor…
Jan 28, 2025
ba10b47
Add missing link dep for windows build
Jan 28, 2025
cd63ba4
beef up test-chat-handler w/ delta expectations
Jan 28, 2025
cad1448
Disable test-chat-handler on win32 like the other grammar-related tests
Jan 28, 2025
4f25755
minja: sync on https://github.com/google/minja/pull/33
Jan 28, 2025
d603d06
sync: minja
Jan 28, 2025
6426391
Fix firefunction w/ jinja: requires two variables, use the chat handl…
Jan 29, 2025
4cdbb8c
Revert breaking minja change
Jan 29, 2025
47be437
Text fireworks v2 template
Jan 29, 2025
18d5a1b
nits
Jan 29, 2025
4a1e8e9
refactor test-chat-handler
Jan 29, 2025
923c805
rm dead code + nits
Jan 29, 2025
384f54a
Split bulk of tool call tests to slow lane
Jan 29, 2025
40cc3f2
Merge branch 'tool-call' of github.com:ochafik/llama.cpp into tool-call
Jan 29, 2025
41eec46
rm unused templates, rename one
Jan 29, 2025
76f6ab1
Update test_tool_call.py
Jan 29, 2025
77dd67c
tool-calls: disable crashing tests
Jan 29, 2025
0f8af53
nits
Jan 29, 2025
babdefc
Merge remote-tracking branch 'origin/master' into tool-call
Jan 29, 2025
682026f
Create meta-llama-Llama-3.1-8B-Instruct.jinja
Jan 29, 2025
7b5e080
Move templates/ under models/
Jan 29, 2025
ba27e98
Unify llama 3.x chat handling again (allow `{"type": "function", "nam…
Jan 29, 2025
6e676c8
sync: minja
Jan 29, 2025
ed7c622
Rename: common/chat.*, common_chat_{inputs -> params}
Jan 29, 2025
36c776f
Finish renaming of chat inputs vs. params [skip ci]
Jan 29, 2025
bc8a611
nits
Jan 29, 2025
84bc083
Remove server tests LLAMA_CACHE override (tests are serial, and the c…
Jan 29, 2025
2b24569
Add cli mode to test-chat to generate template summaries markdown
Jan 29, 2025
64545ac
Somehow /* bad inside block comments, ok fine.
Jan 29, 2025
cbecb35
Add tool call to hot topics
Jan 29, 2025
a810c37
Partial revert of LLAMA_CACHE=tmp (unless set explicitly in env)
Jan 29, 2025
77c60e6
Avoid passing tools twice in generic handler (now that minja passes t…
Jan 30, 2025
d86a1ae
Unify content + message in server_task_result_cmpl_final (+ avoid str…
Jan 30, 2025
774557c
llama 3.1: allow `{name:` & `{function:` syntax even w/ builtin tools…
Jan 30, 2025
590c979
Update tests readme + add raw output to verbose log
Jan 30, 2025
f8e14bf
split chat handler vs. parser around enum again
Jan 30, 2025
81547e6
nits
Jan 30, 2025
18450e6
debug logs are back
Jan 30, 2025
b831a6e
rm unused llama_param
Jan 30, 2025
7635912
llama 3.2 1b now fails the weather tool call?
Jan 30, 2025
9591af1
increase http timeout to 12
Jan 30, 2025
8ef37a3
Merge remote-tracking branch 'origin/master' into tool-call
Jan 30, 2025
2d51c45
code style changes on test
ngxson Jan 30, 2025
c88f4a7
simplify handle_apply_template
ngxson Jan 30, 2025
3dcde9e
Fix debug + verbose
Jan 30, 2025
06c4ca5
Update test_chat_completion.py
Jan 30, 2025
0c171f5
Update test_chat_completion.py
Jan 30, 2025
9685043
Update scripts/fetch_server_test_models.py to new compact hf_repo syn…
Jan 30, 2025
2bb3fed
nit: fix py import
Jan 30, 2025
7d59bf4
deprecate llama_sampler_init_grammar -> llama_sampler_grammar_init
Jan 30, 2025
5a64af6
add llama_sampler_init_grammar_lazy instead of renaming the non-lazy
Jan 30, 2025
f223df0
Format test-chat.cpp
Jan 30, 2025
8205246
log prompt + nits
Jan 30, 2025
5add261
test: leave model_hf_file blank
ngxson Jan 30, 2025
1029ff9
force printing </tool_call> on hermes 2 model if/as it's a special token
Jan 30, 2025
3bd6abe
try and avoid weird server test failure (spillage / parallelism betwe…
Jan 30, 2025
729d2d3
Disable chat_completion tests of non-tool jinja mode
Jan 30, 2025
34f54dd
Fix typo
Jan 30, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions .editorconfig
Original file line number Diff line number Diff line change
Expand Up @@ -40,3 +40,11 @@ indent_style = tab
[examples/cvector-generator/*.txt]
trim_trailing_whitespace = unset
insert_final_newline = unset

[models/templates/*.jinja]
indent_style = unset
indent_size = unset
end_of_line = unset
charset = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
2 changes: 1 addition & 1 deletion .github/workflows/server.yml
1C6A
Original file line number Diff line number Diff line change
Expand Up @@ -205,7 +205,7 @@ jobs:
run: |
cd examples/server/tests
$env:PYTHONIOENCODING = ":replace"
pytest -v -x
pytest -v -x -m "not slow"

- name: Slow tests
id: server_integration_tests_slow
Expand Down
9 changes: 9 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ TEST_TARGETS = \
tests/test-arg-parser \
tests/test-autorelease \
tests/test-backend-ops \
tests/test-chat \
tests/test-chat-template \
tests/test-double-float \
tests/test-grammar-integration \
Expand Down Expand Up @@ -983,6 +984,7 @@ OBJ_COMMON = \
$(DIR_COMMON)/ngram-cache.o \
$(DIR_COMMON)/sampling.o \
$(DIR_COMMON)/speculative.o \
$(DIR_COMMON)/chat.o \
$(DIR_COMMON)/build-info.o \
$(DIR_COMMON)/json-schema-to-grammar.o

Expand Down Expand Up @@ -1361,6 +1363,8 @@ llama-server: \
examples/server/httplib.h \
examples/server/index.html.hpp \
examples/server/loading.html.hpp \
common/chat.cpp \
common/chat.hpp \
common/chat-template.hpp \
common/json.hpp \
common/minja.hpp \
Expand Down Expand Up @@ -1471,6 +1475,11 @@ tests/test-json-schema-to-grammar: tests/test-json-schema-to-grammar.cpp \
$(CXX) $(CXXFLAGS) -Iexamples/server -c $< -o $(call GET_OBJ_FILE, $<)
$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)

tests/test-chat: tests/test-chat.cpp \
$(OBJ_ALL)
$(CXX) $(CXXFLAGS) -Iexamples/server -c $< -o $(call GET_OBJ_FILE, $<)
$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)

tests/test-opt: tests/test-opt.cpp \
$(OBJ_GGML)
$(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
Expand Down
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ Inference of Meta's [LLaMA](https://arxiv.org/abs/2302.13971) model (and others)

- **How to use [MTLResidencySet](https://developer.apple.com/documentation/metal/mtlresidencyset?language=objc) to keep the GPU memory active?** https://github.com/ggerganov/llama.cpp/pull/11427
- **VS Code extension for FIM completions:** https://github.com/ggml-org/llama.vscode
- Universal tool call support in `llama-server`: https://github.com/ggerganov/llama.cpp/pull/9639
- Vim/Neovim plugin for FIM completions: https://github.com/ggml-org/llama.vim
- Introducing GGUF-my-LoRA https://github.com/ggerganov/llama.cpp/discussions/10123
- Hugging Face Inference Endpoints now support GGUF out of the box! https://github.com/ggerganov/llama.cpp/discussions/9669
Expand Down
2 changes: 2 additions & 0 deletions common/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,8 @@ add_library(${TARGET} STATIC
arg.cpp
arg.h
base64.hpp
chat.cpp
chat.hpp
chat-template.hpp
common.cpp
common.h
Expand Down
Loading
Loading
0