Fix and optimize functionary chat handler by jeffrey-fong · Pull Request #1282 · abetlen/llama-cpp-python · GitHub

Fix and optimize functionary chat handler #1282

Merged · 4 commits · Mar 18, 2024

Conversation

jeffrey-fong
Contributor

There is a breaking bug in the functionary_v1_v2_handler that prevents it from handling some chat requests, as mentioned here. It is caused by incorrect logic in the handler.

The following optimizations were made:

  • Fixed the incorrect logic (certain if-else statements were wrong)
  • Changed the logic when tool_choice="auto". In a while loop:
    1. Generate the function name first
    2. Switch the grammar depending on whether the model chose a function or not, then generate the content
    3. Check whether the model wants to generate another turn by looking for the presence of \n<|from|>assistant
    4. Break out of the loop if the stop token is generated

The second optimization allows both a text response and tool calls to be filled in a single assistant turn. A rough sketch of the loop follows.
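
For illustration only, here is a minimal sketch of the tool_choice="auto" loop described above. The helpers `generate`, `name_grammar`, and `args_grammar`, and the marker constants, are assumptions made for the sketch and do not correspond one-to-one to the actual handler internals.

```python
from typing import Callable, Optional

# Hypothetical sketch of the tool_choice="auto" loop described in this PR.
# `generate`, `name_grammar`, and `args_grammar` are illustrative stand-ins,
# not the real llama-cpp-python / functionary APIs.

CONTENT_MARKER = "\n<|content|>"          # assumed functionary v2 content marker
NEXT_TURN_MARKER = "\n<|from|>assistant"  # model wants to start another assistant turn


def auto_tool_loop(
    generate: Callable[[str, Optional[str]], str],  # (prompt, grammar) -> completion text
    name_grammar: str,                              # grammar constraining output to a function name or "all"
    args_grammar: Callable[[str], str],             # function name -> grammar for its JSON arguments
    prompt: str,
):
    """Collect text content and tool calls across one or more assistant turns."""
    text_parts: list[str] = []
    tool_calls: list[dict] = []
    while True:
        # 1. Generate the function name first ("all" means a plain text reply).
        name = generate(prompt, name_grammar).strip()
        prompt += name + CONTENT_MARKER

        # 2. Switch the grammar depending on whether a function was chosen,
        #    then generate the content (JSON arguments or free-form text).
        grammar = None if name == "all" else args_grammar(name)
        content = generate(prompt, grammar)
        prompt += content

        if name == "all":
            text_parts.append(content)
        else:
            tool_calls.append({"name": name, "arguments": content})

        # 3. If the model emitted another assistant-turn marker, keep looping.
        if content.rstrip().endswith("<|from|>assistant"):
            continue
        # 4. Otherwise the stop token ended generation, so break out of the loop.
        break
    return text_parts, tool_calls
```

Because the loop only exits when no further turn marker is produced, a single call can accumulate both free-form text and one or more tool calls, which is what enables mixed responses in one assistant turn.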
