Closed
Running the following command:
./llama.cpp/llama-cli \
-hf unsloth/DeepSeek-R1-0528-GGUF:IQ1_S \
--threads -1 \
--n-gpu-layers 99 \
--prio 3 \
--temp 0.6 \
--top_p 0.95 \
--min_p 0.01 \
--ctx-size 16384 \
--seed 3407
causes this crash:
/llama.cpp/ggml/src/ggml-cuda/ggml-cuda.cu:75: CUDA error
CUDA error: an illegal memory access was encountered
current device: 7, in function ggml_backend_cuda_synchronize at /llama.cpp/ggml/src/ggml-cuda/ggml-cuda.cu:2461
cudaStreamSynchronize(cuda_ctx->stream())
./llama.cpp/llama-cli(+0x751b7b)[0x5c5553040b7b]
./llama.cpp/llama-cli(+0x7521fe)[0x5c55530411fe]
./llama.cpp/llama-cli(+0x35f017)[0x5c5552c4e017]
./llama.cpp/llama-cli(+0x36200a)[0x5c5552c5100a]
./llama.cpp/llama-cli(+0x76bae0)[0x5c555305aae0]
./llama.cpp/llama-cli(+0x1d1d41)[0x5c5552ac0d41]
./llama.cpp/llama-cli(+0x1b7b27)[0x5c5552aa6b27]
./llama.cpp/llama-cli(+0x50a28)[0x5c555293fa28]
/lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x77c5c8e2a1ca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x77c5c8e2a28b]
./llama.cpp/llama-cli(+0x95b15)[0x5c5552984b15]
Aborted (core dumped)
Interestingly, using -ot (--override-tensor) removes the issue.
In fact, adding a dummy -ot "(999).ffn_(down)_exps.=CPU" (which should match no tensors at all)
also removes the CUDA error?!
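To illustrate why the override is a "dummy": -ot patterns in llama.cpp are regexes matched against tensor names such as blk.10.ffn_down_exps.weight, and DeepSeek-R1 has far fewer than 999 layers, so the pattern below should never match anything. This is a minimal sketch (using grep as a stand-in for llama.cpp's regex matching) showing the pattern matching no real tensor name:

```shell
# A representative DeepSeek-R1 expert tensor name (layer indices stay well below 999)
name="blk.10.ffn_down_exps.weight"

# The same pattern passed to -ot; it requires the literal substring "999",
# which no tensor name contains, so the match fails and "no match" is printed
echo "$name" | grep -E '(999).ffn_(down)_exps.' || echo "no match"
```

Despite matching nothing, merely passing the flag changes how buffers get allocated across devices, which is presumably why it masks the illegal memory access.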