I would like to use function calling with stream=True, but I get this error:
Automatic streaming tool choice is not supported
From here: llama-cpp-python/llama_cpp/llama_chat_format.py, line 3751 at commit 816d491
What needs to be done to support "auto" tool choice with streaming? Why was it not implemented from the start? Why is it not supported? @abetlen
I can try to implement it, but it would be useful to have the history of the feature, so that if it turns out to be impossible I won't waste my time.
Thanks!
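For context, a minimal reproduction looks roughly like this (the model path and the tool definition are placeholders; the chatml-function-calling chat format is one of the handlers that raises this error):

```python
from llama_cpp import Llama

# Placeholder path; any chat-capable GGUF model.
llm = Llama(
    model_path="./model.gguf",
    chat_format="chatml-function-calling",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not from this thread
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Fails with: Automatic streaming tool choice is not supported
for chunk in llm.create_chat_completion(
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",
    stream=True,
):
    print(chunk)
```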
This is my assumption about it.
llama-cpp-python's function calling seems to force the LLM to output JSON and then parses it. The combination of streaming and a JSON parser turns out to be really tricky once you try to implement it yourself. Consider the following incomplete output under a JSON schema:
{ "arg_1": "value 1", "arg_2": "val
The LLM is dutifully trying to obey the schema, but this intermediate state is not yet valid, neither as a function call nor as normal text. With stream=False this is not a problem, since only finished outputs are fed into the parser. With stream=True, however, these immature outputs would be exposed in the user interface, which would look weird.
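To make the failure mode concrete, here is a tiny illustration (plain Python, not llama-cpp-python code):

```python
import json

partial = '{ "arg_1": "value 1", "arg_2": "val'

# Once generation has finished, the complete response parses fine...
complete = partial + 'ue 2" }'
print(json.loads(complete))  # {'arg_1': 'value 1', 'arg_2': 'value 2'}

# ...but every intermediate chunk raises, so there is nothing
# sensible to emit for each streamed delta.
try:
    json.loads(partial)
except json.JSONDecodeError as e:
    print("not yet parseable:", e)
```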
What we can do to avoid this is auto-complete the LLM's output (e.g. appending the closing sequence "\n} to the example above, to close the open string and the object). Honestly, I think it's tough to implement decent auto-completion for JSON, so I decided to write an XML function-calling pipeline on my own.
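For anyone curious, a best-effort auto-completer along those lines could look like this. It is only a rough sketch: it closes open strings and containers, and ignores edge cases such as truncated keys, trailing escapes, or half-written literals like tru:

```python
import json

def autocomplete_json(partial: str) -> str:
    """Best-effort completion of truncated JSON: close any open
    string, then close open containers in reverse order."""
    stack = []          # open containers, '{' or '['
    in_string = False
    escaped = False
    for ch in partial:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append(ch)
        elif ch in "}]":
            if stack:
                stack.pop()
    completed = partial
    if in_string:
        completed += '"'
    for opener in reversed(stack):
        completed += "}" if opener == "{" else "]"
    return completed

fixed = autocomplete_json('{ "arg_1": "value 1", "arg_2": "val')
print(json.loads(fixed))  # {'arg_1': 'value 1', 'arg_2': 'val'}
```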
If this is the only problem, I think it's easily solvable. There are even libraries like https://pypi.org/project/json-stream/ for iteratively parsing JSON.
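A sketch of how that could fit an LLM token stream, assuming json-stream's documented load() interface, which takes a file-like object and returns a lazy dict-like value (the chunk generator below is a stand-in for model output):

```python
import json_stream

def llm_chunks():
    # Stand-in for tokens arriving from the model over time.
    yield '{ "arg_1": "val'
    yield 'ue 1", "arg_2"'
    yield ': "value 2" }'

class ChunkReader:
    """Minimal file-like adapter: each read() hands the parser the
    next available chunk; "" signals end-of-stream."""
    def __init__(self, chunks):
        self._chunks = iter(chunks)

    def read(self, size=-1):
        return next(self._chunks, "")

# Keys and values become available as soon as enough chunks
# have arrived, instead of waiting for the full response.
data = json_stream.load(ChunkReader(llm_chunks()))
for key, value in data.items():
    print(key, "=", value)
```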
@samuelint Are you still interested in giving this a go? We ran into this limitation with RAGLite as well. Streaming tool_choice="auto" would be an awesome improvement to llama-cpp-python!
Unfortunately, I've moved on to something else. I'm not going to implement this.
OK, thanks for letting me know!
For those interested, I implemented streaming tool use for llama-cpp-python models in RAGLite and just pushed #1884 to contribute this improvement back to llama-cpp-python.