Description
Name and Version
version: 5481 (d785f9c)
built with cc (GCC) 15.1.1 20250425 for x86_64-pc-linux-gnu
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-server
Command line
Problem description & steps to reproduce
Testing the recently merged support for streaming tool calls (#12379) with pydantic-ai.
I am not sure whether this is a bug in pydantic-ai, a bug in llama.cpp, or a misunderstanding on my side. pydantic-ai interprets the tool name in each tool-call delta as a partial fragment and concatenates the deltas to build the final tool name. However, the deltas returned from the API contain the complete tool name in every delta; only the arguments are chunked. As a result, a tool name that should be `final_result` becomes `final_resultfinal_resultfinal_result` (when there were three deltas).
What is the expected behavior? Should consecutive tool-call deltas contain the tool name (redundantly), or should it be empty on the second, third, etc.?
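To make the failure mode concrete, here is a minimal sketch (not pydantic-ai's actual code; the delta payloads are illustrative, mimicking the shape described above) of how a client that treats every delta field as a partial fragment ends up duplicating the name:

```python
# Illustrative simulation, not pydantic-ai's real accumulator.
# Each delta repeats the full tool name, as observed from llama-server;
# only the arguments are genuinely chunked.
deltas = [
    {"index": 0, "function": {"name": "final_result", "arguments": '{"answer'}},
    {"index": 0, "function": {"name": "final_result", "arguments": '": 42'}},
    {"index": 0, "function": {"name": "final_result", "arguments": "}"}},
]

name, arguments = "", ""
for delta in deltas:
    # Concatenating every field as if it were partial duplicates the name:
    name += delta["function"].get("name") or ""
    arguments += delta["function"].get("arguments") or ""

print(name)       # final_resultfinal_resultfinal_result
print(arguments)  # {"answer": 42}
```

For what it's worth, OpenAI's streaming API appears to send the function name only in the first delta of a given tool call, with subsequent deltas carrying only arguments fragments, which would explain why pydantic-ai concatenates unconditionally.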
First Bad Commit
No response
Relevant log output
.venv/lib/python3.11/site-packages/pydantic_ai/result.py:464, in StreamedRunResult.validate_structured_output(self, message, allow_partial)
462 match = self._output_schema.find_named_tool(message.parts, self._output_tool_name)
463 if match is None:
--> 464 raise exceptions.UnexpectedModelBehavior( # pragma: no cover
465 f'Invalid response, unable to find tool: {self._output_schema.tool_names()}'
466 )
468 call, output_tool = match
469 result_data = output_tool.validate(call, allow_partial=allow_partial, wrap_validation_errors=False)
UnexpectedModelBehavior: Invalid response, unable to find tool: ['final_result']