Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
When I submit a chat request with a single tool in the tools array (and tool_choice set to "auto"), but the chat message does not lead the model to choose a tool, I expect the server to return a successful response with an empty tool_calls array.
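To make the expected contract concrete, here is a sketch using the OpenAI Python client pointed at the local server (the base URL assumes the default llama_cpp.server host/port, and the API key is a placeholder):

```python
# Sketch of the expected behavior, assuming the server runs on the default
# host/port and the OpenAI Python client (>= 1.0) is installed.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-placeholder")

resp = client.chat.completions.create(
    model="meetkai/functionary-medium-v2.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "say hello"},
    ],
    tools=[{
        "type": "function",
        "function": {
            "name": "GetDirections",
            "description": "Provides on screen directions.",
            "parameters": {},
        },
    }],
    tool_choice="auto",
)

# When the model does not pick a tool, I expect a plain assistant message
# with an empty or absent tool_calls field, not a 500 error.
message = resp.choices[0].message
assert message.tool_calls in (None, [])
print(message.content)
```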
Current Behavior
The server throws three Pydantic validation errors and returns a 500 Internal Server Error to the client (see below).
Environment and Context
macOS 14.3.1, Apple M1 Max with 64 GB memory.
$ python3 --version
Python 3.11.0
Xcode 15.3
llama-cpp-python installed with:
CMAKE_ARGS="-DLLAMA_METAL_EMBED_LIBRARY=ON -DLLAMA_METAL=on" pip install git+https://github.com/abetlen/llama-cpp-python.git --no-cache-dir --force-reinstall
Failure Information (for bugs)
Exception: 3 validation errors:
{'type': 'list_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'tool_calls'), 'msg': 'Input should be a valid list', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/list_type'}
{'type': 'dict_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'function_call'), 'msg': 'Input should be a valid dictionary', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/dict_type'}
{'type': 'string_type', 'loc': ('response', 'str'), 'msg': 'Input should be a valid string', 'input': {'id': 'chatcmpl-a56acfc6-30fc-4624-9bbb-e32bcf931207', 'object': 'chat.completion', 'created': 1711744793, 'model': 'meetkai/functionary-medium-v2.2', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'function_call': None, 'tool_calls': None}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 156, 'completion_tokens': 10, 'total_tokens': 166}}, 'url': 'https://errors.pydantic.dev/2.6/v/string_type'}
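Judging from the errors, the chat handler seems to return the message with explicit 'tool_calls': None and 'function_call': None, while the response schema apparently requires those keys to be a list/dict when present. As a rough illustration of what I mean (my guess at the failure mode, not a proposed patch to the actual code), dropping the None-valued keys before validation would avoid the errors:

```python
# Hypothetical workaround sketch: strip optional fields that are explicitly
# None so the completion dict matches a schema where those keys must be a
# list/dict whenever they appear. This only illustrates my reading of the
# validation errors above.
def strip_none_tool_fields(completion: dict) -> dict:
    for choice in completion.get("choices", []):
        message = choice.get("message", {})
        for key in ("tool_calls", "function_call"):
            if message.get(key) is None:
                message.pop(key, None)
    return completion
```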
Steps to Reproduce
- Start the server with:

```shell
python3 -m llama_cpp.server \
  --model models/functionary-medium-v2_2-q4_0/functionary-medium-v2.2.q4_0.gguf \
  --chat_format functionary-v2 \
  --hf_pretrained_model_name_or_path models/functionary-medium-v2_2-q4_0/ \
  --n_gpu_layers 1
```
- POST the following request to /v1/chat/completions (a script equivalent is sketched after the JSON):

```json
{
  "model": "meetkai/functionary-medium-v2.2",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "say hello"}
  ],
  "temperature": 0.7,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "GetDirections",
        "description": "Provides on screen directions.",
        "parameters": {}
      }
    }
  ],
  "tool_choice": "auto"
}
```
If I POST a similar request without the tools parameter, the server responds successfully.
Failure Logs
llama_print_timings: load time = 11453.19 ms
llama_print_timings: sample time = 0.29 ms / 3 runs ( 0.10 ms per token, 10380.62 tokens per second)
llama_print_timings: prompt eval time = 11452.99 ms / 153 tokens ( 74.86 ms per token, 13.36 tokens per second)
llama_print_timings: eval time = 219.31 ms / 2 runs ( 109.66 ms per token, 9.12 tokens per second)
llama_print_timings: total time = 11879.75 ms / 155 tokens
Llama.generate: prefix-match hit
llama_print_timings: load time = 11453.19 ms
llama_print_timings: sample time = 0.89 ms / 10 runs ( 0.09 ms per token, 11248.59 tokens per second)
llama_print_timings: prompt eval time = 0.00 ms / 1 tokens ( 0.00 ms per token, inf tokens per second)
llama_print_timings: eval time = 1011.59 ms / 10 runs ( 101.16 ms per token, 9.89 tokens per second)
llama_print_timings: total time = 1035.41 ms / 11 tokens
Exception: 3 validation errors:
{'type': 'list_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'tool_calls'), 'msg': 'Input should be a valid list', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/list_type'}
{'type': 'dict_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'function_call'), 'msg': 'Input should be a valid dictionary', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/dict_type'}
{'type': 'string_type', 'loc': ('response', 'str'), 'msg': 'Input should be a valid string', 'input': {'id': 'chatcmpl-a56acfc6-30fc-4624-9bbb-e32bcf931207', 'object': 'chat.completion', 'created': 1711744793, 'model': 'meetkai/functionary-medium-v2.2', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'function_call': None, 'tool_calls': None}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 156, 'completion_tokens': 10, 'total_tokens': 166}}, 'url': 'https://errors.pydantic.dev/2.6/v/string_type'}
Traceback (most recent call last):
File "/opt/homebrew/anaconda3/envs/llama/lib/python3.11/site-packages/llama_cpp/server/errors.py", line 171, in custom_route_handler
response = await original_route_handler(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/anaconda3/envs/llama/lib/python3.11/contextlib.py", line 222, in __aexit__
await self.gen.athrow(typ, value, traceback)
File "/opt/homebrew/anaconda3/envs/llama/lib/python3.11/site-packages/fastapi/concurrency.py", line 35, in contextmanager_in_threadpool
raise e
fastapi.exceptions.ResponseValidationError: 3 validation errors:
{'type': 'list_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'tool_calls'), 'msg': 'Input should be a valid list', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/list_type'}
{'type': 'dict_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'function_call'), 'msg': 'Input should be a valid dictionary', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/dict_type'}
{'type': 'string_type', 'loc': ('response', 'str'), 'msg': 'Input should be a valid string', 'input': {'id': 'chatcmpl-a56acfc6-30fc-4624-9bbb-e32bcf931207', 'object': 'chat.completion', 'created': 1711744793, 'model': 'meetkai/functionary-medium-v2.2', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'function_call': None, 'tool_calls': None}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 156, 'completion_tokens': 10, 'total_tokens': 166}}, 'url': 'https://errors.pydantic.dev/2.6/v/string_type'}
INFO: ::1:56668 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
Git HEAD at time of install: 1e60dba
Environment info:
llama-cpp-python$ python3 --version
Python 3.11.0
llama-cpp-python$ pip list | egrep "uvicorn|fastapi|sse-starlette|numpy"
fastapi 0.110.0
numpy 1.26.4
sse-starlette 2.0.0
uvicorn 0.29.0