fix: Always set logits_all = True when using speculative decoding · bellofils/llama-cpp-python@cb79171 · GitHub
Commit cb79171

fix: Always set logits_all = True when using speculative decoding
1 parent: 153a004, commit: cb79171

1 file changed: +1 -1 lines changed

llama_cpp/llama.py

Lines changed: 1 addition & 1 deletion
@@ -281,7 +281,7 @@ def __init__(
         )
         self.context_params.yarn_orig_ctx = yarn_orig_ctx if yarn_orig_ctx != 0 else 0
         self.context_params.mul_mat_q = mul_mat_q
-        self.context_params.logits_all = logits_all
+        self.context_params.logits_all = logits_all if draft_model is None else True # Must be set to True for speculative decoding
         self.context_params.embedding = embedding
         self.context_params.offload_kqv = offload_kqv
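The behavior of the changed line can be illustrated in isolation. The sketch below is not part of the commit: `resolve_logits_all` is a hypothetical helper name that only mirrors the conditional expression from the diff, and `object()` stands in for a real draft model. The likely motivation is that speculative decoding has the main model verify several drafted tokens in one evaluation, which needs logits for every position rather than only the last one.

```python
from typing import Optional


def resolve_logits_all(logits_all: bool, draft_model: Optional[object]) -> bool:
    """Hypothetical helper mirroring the expression in the diff.

    The caller's logits_all value is respected when no draft model is given;
    when a draft model is present, the result is forced to True.
    """
    return logits_all if draft_model is None else True


# No draft model: the user's setting passes through unchanged.
no_draft_false = resolve_logits_all(False, None)
no_draft_true = resolve_logits_all(True, None)

# Draft model present (object() is a stand-in): the result is always True.
with_draft_false = resolve_logits_all(False, object())
with_draft_true = resolve_logits_all(True, object())
```

With this change, a caller who passes `draft_model` but leaves `logits_all` at `False` no longer needs to remember to enable it by hand.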
