Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
Calling `create_chat_completion` with `response_format={ "type": "json_object" }` should constrain the model's output to valid JSON and return a normal chat completion, exactly as the unconstrained call below does.
Current Behavior
Passing `response_format={ "type": "json_object" }` to `create_chat_completion` crashes the Python process with an uncaught C++ exception (full error below).
Environment and Context
I'm using:

```python
from llama_cpp import Llama

llm = Llama(
    "/Users/shakedz/local_models/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
    n_ctx=4096,
    verbose=False,
)
```
The model was downloaded from here:
https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/tree/main
This fails:

```python
print(llm.create_chat_completion(
    [{'role': 'user', 'content': 'What is the capital of France? Replay using a JSON: {"answer": "YOUR_ANSWER"}!'}],
    response_format={"type": "json_object"},
))
```

aborting with:

```
libc++abi: terminating due to uncaught exception of type std::out_of_range: vector
[1]    25897 abort      /Users/shakedz/bitbucket/achilles/.venv/bin/python
```
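For reference, here is a minimal workaround sketch that I have not verified against this crash: it builds the JSON-constraining grammar explicitly from a schema via `LlamaGrammar.from_json_schema` and passes it through the `grammar` parameter instead of `response_format`. The schema and prompt are just illustrative, and if the bug is in the grammar sampler itself this may hit the same code path:

```python
import json

from llama_cpp import Llama
from llama_cpp.llama_grammar import LlamaGrammar

# Illustrative schema for the expected {"answer": "..."} reply.
schema = {"type": "object", "properties": {"answer": {"type": "string"}}}

# Build the grammar explicitly instead of letting response_format do it.
grammar = LlamaGrammar.from_json_schema(json.dumps(schema))

llm = Llama(
    "/Users/shakedz/local_models/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
    n_ctx=4096,
    verbose=False,
)
print(llm.create_chat_completion(
    [{'role': 'user', 'content': 'What is the capital of France? Reply using a JSON: {"answer": "YOUR_ANSWER"}!'}],
    grammar=grammar,
))
```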
But without the `response_format`:

```python
print(llm.create_chat_completion(
    [{'role': 'user', 'content': 'What is the capital of France? Replay using a JSON: {"answer": "YOUR_ANSWER"}!'}],
))
```

it works:

```python
{'id': 'chatcmpl-987cdac0-5398-44a7-b6ca-36c1121392dc', 'object': 'chat.completion', 'created': 1722861547, 'model': '/Users/shakedz/local_models/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': '{"answer": "Paris"}'}, 'logprobs': None, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 56, 'completion_tokens': 6, 'total_tokens': 62}}
```

So the model itself happily produces valid JSON; the crash appears specific to the grammar-constrained sampling path that `response_format` enables.
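To narrow it down, here is a sketch (assuming the `JSON_GBNF` grammar constant bundled in `llama_cpp.llama_grammar` on 0.2.85, which I understand `response_format` falls back to when no schema is given) that exercises the grammar sampler directly, without the chat-format layer. If this also aborts, the problem is in the grammar machinery rather than in the chat handler:

```python
from llama_cpp.llama_grammar import JSON_GBNF, LlamaGrammar

# Load the generic JSON grammar and drive a plain completion with it,
# reusing the `llm` instance created above.
grammar = LlamaGrammar.from_string(JSON_GBNF)
print(llm.create_completion(
    "What is the capital of France? Answer in JSON: ",
    grammar=grammar,
    max_tokens=32,
))
```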
System
- Apple M2 Max, 32 GB, macOS Sonoma 14.5
- Python 3.12.1
- llama_cpp_python==0.2.85
- installed with:

```bash
CMAKE_ARGS="-DGGML_METAL=on" pip install llama-cpp-python
```
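In case it matters for reproducing, this is the clean rebuild I would use to test a newer release (standard pip flags, nothing specific to this project):

```bash
CMAKE_ARGS="-DGGML_METAL=on" pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python
```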