Add doc string for n_gpu_layers argument · iamudesharma/llama-cpp-python@d018c7b · GitHub
Commit d018c7b

Add doc string for n_gpu_layers argument
1 parent 66fb034 commit d018c7b

File tree

1 file changed (+1, −0 lines)


llama_cpp/llama.py

Lines changed: 1 addition & 0 deletions
@@ -239,6 +239,7 @@ def __init__(
     n_ctx: Maximum context size.
     n_parts: Number of parts to split the model into. If -1, the number of parts is automatically determined.
     seed: Random seed. -1 for random.
+    n_gpu_layers: Number of layers to offload to GPU (-ngl). If -1, all layers are offloaded.
     f16_kv: Use half-precision for key/value cache.
     logits_all: Return logits for all tokens, not just the last token.
     vocab_only: Only load the vocabulary no weights.
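The added docstring line documents the `-ngl` semantics: a non-negative value offloads that many layers to the GPU, while -1 offloads them all. As an illustrative sketch of that resolution logic (the helper name `resolve_gpu_layers` is hypothetical, not part of llama-cpp-python):

```python
def resolve_gpu_layers(n_gpu_layers: int, total_layers: int) -> int:
    """Map the n_gpu_layers argument onto a concrete layer count,
    following the semantics described in the docstring added here.

    -1 means "offload all layers"; otherwise offload at most the
    model's total layer count. (Hypothetical helper for illustration.)
    """
    if n_gpu_layers < 0:
        return total_layers
    return min(n_gpu_layers, total_layers)


if __name__ == "__main__":
    # For a hypothetical 32-layer model:
    print(resolve_gpu_layers(-1, 32))   # -1 -> all 32 layers
    print(resolve_gpu_layers(20, 32))   # 20 layers offloaded
    print(resolve_gpu_layers(100, 32))  # capped at the model's 32 layers
```

In the library itself the argument is passed when constructing `Llama`, e.g. `Llama(model_path="...", n_gpu_layers=32)`; offloading only takes effect in builds compiled with GPU support (e.g. CUDA or Metal).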

0 commit comments