sample grammar perf by MarcusDunn · Pull Request #4330 · ggml-org/llama.cpp · GitHub

sample grammar perf #4330

Merged
merged 10 commits into from
Dec 5, 2023
reserve candidates_grammar
MarcusDunn committed Dec 4, 2023
commit eb9d1fcd7dbef531a14783a0ff2fabbe4a2812ee
1 change: 1 addition & 0 deletions llama.cpp
@@ -7359,6 +7359,7 @@ void llama_sample_grammar(struct llama_context * ctx, llama_token_data_array * c
     std::vector<std::pair<std::vector<uint32_t>, llama_partial_utf8>> candidates_decoded;
     candidates_decoded.reserve(candidates->size);
     std::vector<llama_grammar_candidate> candidates_grammar;
+    candidates_grammar.reserve(candidates->size);

     for (size_t i = 0; i < candidates->size; ++i) {
         const llama_token id = candidates->data[i].id;
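
For context on why a single `reserve` call can matter here: a `std::vector` that grows through repeated appends has to reallocate and move its elements each time it runs out of capacity, while a vector that reserved enough space up front does not. The sketch below illustrates that effect in isolation. It is not code from this pull request: `candidate` is a hypothetical stand-in for `llama_grammar_candidate`, `count_reallocations` is an illustrative helper, and it assumes the vector is filled with `push_back` inside the loop, which the hunk above does not show.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for llama_grammar_candidate (assumption: the real
// struct is defined elsewhere in llama.cpp and has more fields).
struct candidate {
    std::size_t index;
};

// Appends n elements to a fresh vector and counts how many times its
// capacity changed, i.e. how many reallocations the appends triggered.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<candidate> v;
    if (reserve_first) {
        v.reserve(n);  // one allocation up front, as in the added line
    }

    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back({i});
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

Calling `count_reallocations(n, true)` returns 0 for any `n`, because `reserve(n)` guarantees capacity for at least `n` elements. Without the reserve, the count is implementation-dependent but grows with `n`, and each reallocation also moves every element already stored.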