feat: upgrade llama.cpp by wsxiaoys · Pull Request #645 · TabbyML/tabby · GitHub

Conversation

@wsxiaoys (Member) commented Oct 26, 2023:

This change will be merged once we've updated the GGUF files for all models listed at https://tabby.tabbyml.com/docs/models/.

Fixes TAB-281.

TextInferenceEngineImpl(owned<llama_model> model, owned<llama_context> ctx) :
    model_(std::move(model)),
    ctx_(std::move(ctx)) {
  batch_ = llama_batch_init(N_BATCH, 0);
@wsxiaoys (Member Author) commented:

The previous usage of the batch API in Tabby triggers a segmentation fault with the updated llama.cpp version. Roll back to llama_batch_get_one as a workaround; we'll revisit this when integrating continuous batching support.
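
For context, a minimal sketch of the workaround described above, assuming the llama.cpp C API of that era (the helper name and its locals are hypothetical, not the PR's actual code): instead of allocating a batch with llama_batch_init, the existing token buffer is viewed as a single-sequence batch via llama_batch_get_one and decoded directly.

#include "llama.h"
#include <vector>

// Hypothetical helper: evaluate prompt tokens with llama_batch_get_one,
// which views an existing buffer as a batch without allocating one, so
// there is no llama_batch_init/llama_batch_free pair to mismanage.
// n_past is the number of tokens already evaluated in the KV cache.
bool eval_prompt(llama_context* ctx, std::vector<llama_token>& tokens, int n_past) {
  llama_batch batch = llama_batch_get_one(
      tokens.data(),            // token buffer to decode
      (int32_t) tokens.size(),  // number of tokens in the batch
      n_past,                   // position of the first token
      0);                       // sequence id (single sequence)
  return llama_decode(ctx, batch) == 0;  // 0 indicates success
}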

    self.path_string("ggml/q8_0.gguf")
}

pub fn ggml_q8_0_v2_file(&self) -> String {
@wsxiaoys (Member Author) commented:

The updated llama.cpp requires re-converting all StarCoder models, so the file path gets a v2 suffix to keep forward compatibility.
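
As an illustration of this versioning scheme, a minimal Rust sketch (hypothetical, not Tabby's actual code; the helper and the v2 file name are assumptions, the real name being whatever ggml_q8_0_v2_file returns): a loader can prefer the re-converted v2 file and fall back to the legacy one, so model directories converted before the upgrade keep working.

use std::path::{Path, PathBuf};

// Hypothetical loader helper: prefer the v2 GGUF re-converted with the
// updated llama.cpp, falling back to the legacy file name for model
// directories converted before the upgrade.
fn pick_ggml_file(model_dir: &Path) -> PathBuf {
    let v2 = model_dir.join("ggml").join("q8_0_v2.gguf"); // assumed v2 name
    if v2.exists() {
        v2
    } else {
        model_dir.join("ggml").join("q8_0.gguf") // legacy pre-upgrade file
    }
}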

@wsxiaoys marked this pull request as ready for review on October 27, 2023 at 19:17.
@wsxiaoys merged commit f378405 into main on October 27, 2023.
@wsxiaoys deleted the upgrade-llama-cpp branch on October 27, 2023 at 19:18.