Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
When I submit a chat request with a single tool in the tools array (and tool_choice set to "auto"), but the chat message does not lead the model to choose a tool, I expect the server to return a successful response with an empty tool_calls array.
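To make the expected contract concrete, here is a sketch using the OpenAI Python client pointed at the local server (the base URL assumes the default llama_cpp.server host/port, and the API key is a placeholder):

```python
# Sketch of the expected behavior, assuming the server runs on the default
# host/port and the OpenAI Python client (>= 1.0) is installed.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-placeholder")

resp = client.chat.completions.create(
    model="meetkai/functionary-medium-v2.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "say hello"},
    ],
    tools=[{
        "type": "function",
        "function": {
            "name": "GetDirections",
            "description": "Provides on screen directions.",
            "parameters": {},
        },
    }],
    tool_choice="auto",
)

# When the model does not pick a tool, I expect a plain assistant message
# with an empty or absent tool_calls field, not a 500 error.
message = resp.choices[0].message
assert message.tool_calls in (None, [])
print(message.content)
```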
Current Behavior
The server throws three Pydantic validation errors and returns a 500 Internal Server Error to the client (see below).
Environment and Context
macOS 14.3.1, Apple M1 Max with 64 GB memory.
$ python3 --version
Python 3.11.0
Xcode 15.3
llama-cpp-python installed with:
CMAKE_ARGS="-DLLAMA_METAL_EMBED_LIBRARY=ON -DLLAMA_METAL=on" pip install git+https://github.com/abetlen/llama-cpp-python.git --no-cache-dir --force-reinstall
Failure Information (for bugs)
Exception: 3 validation errors:
{'type': 'list_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'tool_calls'), 'msg': 'Input should be a valid list', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/list_type'}
{'type': 'dict_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'function_call'), 'msg': 'Input should be a valid dictionary', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/dict_type'}
{'type': 'string_type', 'loc': ('response', 'str'), 'msg': 'Input should be a valid string', 'input': {'id': 'chatcmpl-a56acfc6-30fc-4624-9bbb-e32bcf931207', 'object': 'chat.completion', 'created': 1711744793, 'model': 'meetkai/functionary-medium-v2.2', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'function_call': None, 'tool_calls': None}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 156, 'completion_tokens': 10, 'total_tokens': 166}}, 'url': 'https://errors.pydantic.dev/2.6/v/string_type'}
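Judging from the errors, the chat handler seems to return the message with explicit 'tool_calls': None and 'function_call': None, while the response schema apparently requires those keys to be a list/dict when present. As a rough illustration of what I mean (my guess at the failure mode, not a proposed patch to the actual code), dropping the None-valued keys before validation would avoid the errors:

```python
# Hypothetical workaround sketch: strip optional fields that are explicitly
# None so the completion dict matches a schema where those keys must be a
# list/dict whenever they appear. This only illustrates my reading of the
# validation errors above.
def strip_none_tool_fields(completion: dict) -> dict:
    for choice in completion.get("choices", []):
        message = choice.get("message", {})
        for key in ("tool_calls", "function_call"):
            if message.get(key) is None:
                message.pop(key, None)
    return completion
```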
Steps to Reproduce
- Start the server with:

```shell
python3 -m llama_cpp.server \
  --model models/functionary-medium-v2_2-q4_0/functionary-medium-v2.2.q4_0.gguf \
  --chat_format functionary-v2 \
  --hf_pretrained_model_name_or_path models/functionary-medium-v2_2-q4_0/ \
  --n_gpu_layers 1
```
- POST the following request to /v1/chat/completions (a script equivalent is sketched after the JSON):

```json
{
  "model": "meetkai/functionary-medium-v2.2",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "say hello"}
  ],
  "temperature": 0.7,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "GetDirections",
        "description": "Provides on screen directions.",
        "parameters": {}
      }
    }
  ],
  "tool_choice": "auto"
}
```
If I POST a similar request without the tools parameter, the server responds successfully.
Failure Logs
llama_print_timings: load time = 11453.19 ms
llama_print_timings: sample time = 0.29 ms / 3 runs ( 0.10 ms per token, 10380.62 tokens per second)
llama_print_timings: prompt eval time = 11452.99 ms / 153 tokens ( 74.86 ms per token, 13.36 tokens per second)
llama_print_timings: eval time = 219.31 ms / 2 runs ( 109.66 ms per token, 9.12 tokens per second)
llama_print_timings: total time = 11879.75 ms / 155 tokens
Llama.generate: prefix-match hit
llama_print_timings: load time = 11453.19 ms
llama_print_timings: sample time = 0.89 ms / 10 runs ( 0.09 ms per token, 11248.59 tokens per second)
llama_print_timings: prompt eval time = 0.00 ms / 1 tokens ( 0.00 ms per token, inf tokens per second)
llama_print_timings: eval time = 1011.59 ms / 10 runs ( 101.16 ms per token, 9.89 tokens per second)
llama_print_timings: total time = 1035.41 ms / 11 tokens
Exception: 3 validation errors:
{'type': 'list_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'tool_calls'), 'msg': 'Input should be a valid list', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/list_type'}
{'type': 'dict_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'function_call'), 'msg': 'Input should be a valid dictionary', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/dict_type'}
{'type': 'string_type', 'loc': ('response', 'str'), 'msg': 'Input should be a valid string', 'input': {'id': 'chatcmpl-a56acfc6-30fc-4624-9bbb-e32bcf931207', 'object': 'chat.completion', 'created': 1711744793, 'model': 'meetkai/functionary-medium-v2.2', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'function_call': None, 'tool_calls': None}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 156, 'completion_tokens': 10, 'total_tokens': 166}}, 'url': 'https://errors.pydantic.dev/2.6/v/string_type'}
Traceback (most recent call last):
File "/opt/homebrew/anaconda3/envs/llama/lib/python3.11/site-packages/llama_cpp/server/errors.py", line 171, in custom_route_handler
response = await original_route_handler(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/anaconda3/envs/llama/lib/python3.11/contextlib.py", line 222, in __aexit__
await self.gen.athrow(typ, value, traceback)
File "/opt/homebrew/anaconda3/envs/llama/lib/python3.11/site-packages/fastapi/concurrency.py", line 35, in contextmanager_in_threadpool
raise e
fastapi.exceptions.ResponseValidationError: 3 validation errors:
{'type': 'list_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'tool_calls'), 'msg': 'Input should be a valid list', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/list_type'}
{'type': 'dict_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'function_call'), 'msg': 'Input should be a valid dictionary', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/dict_type'}
{'type': 'string_type', 'loc': ('response', 'str'), 'msg': 'Input should be a valid string', 'input': {'id': 'chatcmpl-a56acfc6-30fc-4624-9bbb-e32bcf931207', 'object': 'chat.completion', 'created': 1711744793, 'model': 'meetkai/functionary-medium-v2.2', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'function_call': None, 'tool_calls': None}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 156, 'completion_tokens': 10, 'total_tokens': 166}}, 'url': 'https://errors.pydantic.dev/2.6/v/string_type'}
INFO: ::1:56668 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
Git HEAD at time of install: 1e60dba
Environment info:
llama-cpp-python$ python3 --version
Python 3.11.0
llama-cpp-python$ pip list | egrep "uvicorn|fastapi|sse-starlette|numpy"
fastapi 0.110.0
numpy 1.26.4
sse-starlette 2.0.0
uvicorn 0.29.0