Added --chat-template-file to llama-run #11922
Conversation
@ericcurtin PTAL
Force-pushed from 36a0c4f to 2ec2107.

Force-pushed from 2ec2107 to 6e50443.
This code LGTM, waiting on builds.
Should pass now. Didn't review the latest changes from master thoroughly. Not sure why I get
examples/run/run.cpp (Outdated)

```cpp
    return "";
}

FILE* file = ggml_fopen(chat_template_file.c_str(), "r");
```
This would be better if it used the File class from this code. It would automatically fclose the file when necessary. I would also change fopen -> ggml_fopen in that File class.
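For context, the kind of wrapper being suggested looks roughly like this; a minimal sketch only, not the exact File class from run.cpp (the member names are illustrative, and it assumes ggml_fopen is available via ggml.h):

```cpp
#include <cstdio>
#include <string>

#include "ggml.h" // assumption: provides ggml_fopen, which handles UTF-8 paths on Windows

// Minimal RAII sketch: owns a FILE* and closes it in the destructor,
// so early returns cannot leak the handle.
class File {
  public:
    FILE * file = nullptr;

    FILE * open(const std::string & path, const char * mode) {
        file = ggml_fopen(path.c_str(), mode);
        return file;
    }

    ~File() {
        if (file) {
            fclose(file);
        }
    }
};
```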
The File class from this code is behind the LLAMA_USE_CURL macro. So I'd need to move it, since the --chat-template should also work without that macro. What do you think?
Yeah I'd move it out of the LLAMA_USE_CURL macro. It doesn't actually do any curl stuff.
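In other words, something along these lines, where only the curl-dependent pieces stay behind the guard (the HttpClient name and its contents are assumptions about the surrounding code, not taken from the diff):

```cpp
#include <cstdio>

// File only depends on plain C stdio, so it can sit outside the guard and be
// reused by the --chat-template-file code path even in non-curl builds.
class File {
  public:
    FILE * file = nullptr;
    ~File() {
        if (file) {
            fclose(file);
        }
    }
};

#ifdef LLAMA_USE_CURL
#include <curl/curl.h>

// Only the download code actually needs libcurl, so only it stays guarded.
class HttpClient {
  public:
    CURL * curl = nullptr; // illustrative member only
};
#endif
```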
Done. PTAL @ericcurtin
Force-pushed from 6e50443 to f01a139.
examples/run/run.cpp (Outdated)

```cpp
    printe("Error reading chat template file '%s': %s", chat_template_file.c_str(), strerror(errno));
    return "";
}
return std::string(data.begin(), data.end());
```
One could actually fread directly into a std::string and save a copy. It's an optimisation, a small one. A std::string is basically a vector of chars.
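A rough sketch of that, assuming the helper reads the whole template file at once (the function name and error handling here are illustrative, not the PR's exact code):

```cpp
#include <cstdio>
#include <string>

// Sketch: size the string up front and let fread() fill its buffer directly,
// instead of reading into a std::vector<char> and copying into a std::string.
static std::string read_chat_template_file(const std::string & path) {
    FILE * file = ggml_fopen(path.c_str(), "r"); // assumption: ggml_fopen is in scope
    if (!file) {
        return "";
    }

    fseek(file, 0, SEEK_END);
    const long size = ftell(file);
    fseek(file, 0, SEEK_SET);
    if (size <= 0) {
        fclose(file);
        return "";
    }

    std::string content(static_cast<size_t>(size), '\0');
    const size_t n_read = fread(&content[0], 1, content.size(), file);
    fclose(file);

    // In text mode fewer bytes may be read than ftell() reported (e.g. CRLF
    // translation on Windows), so shrink to what was actually read.
    content.resize(n_read);
    return content;
}
```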
Changed.
Never seen that before 🤷
Relates to: ggml-org#11178

Added --chat-template-file CLI option to llama-run. If specified, the file will be read and its content passed to common_chat_templates_from_model to overwrite the model's chat template.

Signed-off-by: Michael Engel <mengel@redhat.com>
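Roughly how the override then flows through, assuming common_chat_templates_from_model accepts an override string and that the option value lives in an opt struct; both the signature and the variable names are assumptions here, not quoted from the diff:

```cpp
// Sketch: prefer the template read from --chat-template-file; an empty string
// means "no override", in which case the model's built-in template is used.
std::string chat_template;
if (!opt.chat_template_file.empty()) {
    chat_template = read_chat_template_file(opt.chat_template_file);
}
auto chat_templates = common_chat_templates_from_model(model, chat_template);
```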
Force-pushed from f01a139 to 86f68a8.
Hmm... closing and reopening it. Hopefully this fixes that error.
Relates to: #11178

Added --chat-template-file CLI option to llama-run. If specified, the file will be read and its content passed to common_chat_templates_from_model to overwrite the model's chat template.

This also enables running the granite-code model from ollama: