docs: Remove ref to llama_eval in llama_cpp.py docs by richdougherty · Pull Request #1819 · abetlen/llama-cpp-python

docs: Remove ref to llama_eval in llama_cpp.py docs #1819

Merged · 1 commit · Dec 6, 2024
llama_cpp/llama_cpp.py: 10 changes (5 additions & 5 deletions)
@@ -764,7 +764,7 @@ class llama_context_params(ctypes.Structure):
         cb_eval_user_data (ctypes.ctypes.c_void_p): user data for cb_eval
         type_k (int): data type for K cache
         type_v (int): data type for V cache
-        logits_all (bool): the llama_eval() call computes all logits, not just the last one (DEPRECATED - set llama_batch.logits instead)
+        logits_all (bool): the llama_decode() call computes all logits, not just the last one (DEPRECATED - set llama_batch.logits instead)
         embeddings (bool): if true, extract embeddings (together with logits)
         offload_kqv (bool): whether to offload the KQV ops (including the KV cache) to GPU
         flash_attn (bool): whether to use flash attention
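
The deprecation note above points at the replacement pattern: rather than computing logits for every position via logits_all, you mark individual positions in llama_batch.logits. A minimal sketch of that pattern against the low-level ctypes bindings in this module follows; the model path and token ids are placeholders, and error handling and cleanup are omitted.

import llama_cpp

# Sketch only: placeholder model path; error handling and cleanup omitted.
llama_cpp.llama_backend_init()
model = llama_cpp.llama_load_model_from_file(
    b"/path/to/model.gguf", llama_cpp.llama_model_default_params()
)
ctx = llama_cpp.llama_new_context_with_model(
    model, llama_cpp.llama_context_default_params()
)

tokens = [1, 15043, 2787]  # placeholder token ids
batch = llama_cpp.llama_batch_init(len(tokens), 0, 1)  # capacity, no embeddings, 1 seq
batch.n_tokens = len(tokens)
for i, tok in enumerate(tokens):
    batch.token[i] = tok
    batch.pos[i] = i
    batch.n_seq_id[i] = 1
    batch.seq_id[i][0] = 0
    batch.logits[i] = 0               # no logits for this position
batch.logits[batch.n_tokens - 1] = 1  # logits only for the final token

llama_cpp.llama_decode(ctx, batch)    # llama_decode(), not the removed llama_eval()

Marking only the final position keeps the logits buffer to a single row instead of one row per prompt token, which is the usual choice when generating text.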
@@ -2453,10 +2453,10 @@ def llama_synchronize(ctx: llama_context_p, /):
     "llama_get_logits", [llama_context_p_ctypes], ctypes.POINTER(ctypes.c_float)
 )
 def llama_get_logits(ctx: llama_context_p, /) -> CtypesArray[ctypes.c_float]:
-    """Token logits obtained from the last call to llama_eval()
-    The logits for the last token are stored in the last row
-    Logits for which llama_batch.logits[i] == 0 are undefined
-    Rows: n_tokens provided with llama_batch
+    """Token logits obtained from the last call to llama_decode()
+    The logits for which llama_batch.logits[i] != 0 are stored contiguously
+    in the order they have appeared in the batch.
+    Rows: number of tokens for which llama_batch.logits[i] != 0
     Cols: n_vocab

     Returns:
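
The corrected docstring also changes the indexing rule: rows are packed contiguously for exactly the positions where llama_batch.logits[i] != 0, so a row index is not in general a batch position. A short sketch of reading back the single row requested in the sketch above; the greedy pick is illustrative only, not part of the PR.

n_vocab = llama_cpp.llama_n_vocab(model)   # columns per row
logits = llama_cpp.llama_get_logits(ctx)   # ctypes float pointer over the packed rows
# Only the final position requested logits, so row 0 holds its values.
row = [logits[j] for j in range(n_vocab)]
next_token = max(range(n_vocab), key=row.__getitem__)  # greedy pick (illustrative)

The bindings also expose llama_get_logits_ith(ctx, i), which indexes by batch position and avoids tracking the row packing by hand.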