Closed
Running the following command:
./llama.cpp/llama-cli \
-hf unsloth/DeepSeek-R1-0528-GGUF:IQ1_S \
--threads -1 \
--n-gpu-layers 99 \
--prio 3 \
--temp 0.6 \
--top_p 0.95 \
--min_p 0.01 \
--ctx-size 16384 \
--seed 3407
causes this crash:
/llama.cpp/ggml/src/ggml-cuda/ggml-cuda.cu:75: CUDA error
CUDA error: an illegal memory access was encountered
current device: 7, in function ggml_backend_cuda_synchronize at /llama.cpp/ggml/src/ggml-cuda/ggml-cuda.cu:2461
cudaStreamSynchronize(cuda_ctx->stream())
./llama.cpp/llama-cli(+0x751b7b)[0x5c5553040b7b]
./llama.cpp/llama-cli(+0x7521fe)[0x5c55530411fe]
./llama.cpp/llama-cli(+0x35f017)[0x5c5552c4e017]
./llama.cpp/llama-cli(+0x36200a)[0x5c5552c5100a]
./llama.cpp/llama-cli(+0x76bae0)[0x5c555305aae0]
./llama.cpp/llama-cli(+0x1d1d41)[0x5c5552ac0d41]
./llama.cpp/llama-cli(+0x1b7b27)[0x5c5552aa6b27]
./llama.cpp/llama-cli(+0x50a28)[0x5c555293fa28]
/lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x77c5c8e2a1ca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x77c5c8e2a28b]
./llama.cpp/llama-cli(+0x95b15)[0x5c5552984b15]
Aborted (core dumped)
Interestingly, using -ot (--override-tensor) removes the issue.
In fact, adding a dummy -ot "(999).ffn_(down)_exps.=CPU" (which should match no tensors at all)
also removes the CUDA error?!
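To illustrate why the override is a "dummy": -ot patterns in llama.cpp are regexes matched against tensor names such as blk.10.ffn_down_exps.weight, and DeepSeek-R1 has far fewer than 999 layers, so the pattern below should never match anything. This is a minimal sketch (using grep as a stand-in for llama.cpp's regex matching) showing the pattern matching no real tensor name:

```shell
# A representative DeepSeek-R1 expert tensor name (layer indices stay well below 999)
name="blk.10.ffn_down_exps.weight"

# The same pattern passed to -ot; it requires the literal substring "999",
# which no tensor name contains, so the match fails and "no match" is printed
echo "$name" | grep -E '(999).ffn_(down)_exps.' || echo "no match"
```

Despite matching nothing, merely passing the flag changes how buffers get allocated across devices, which is presumably why it masks the illegal memory access.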