Eval bug: mtmd in server mode crashes on too big image · Issue #13414 · ggml-org/llama.cpp · GitHub
Closed
@pwilkin

Description


Name and Version

ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3080, compute capability 8.6, VMM: yes
version: 5331 (33eff40)
built with cc (Ubuntu 14.2.0-4ubuntu2) 14.2.0 for x86_64-linux-gnu

Operating systems

Linux

GGML backends

CUDA

Hardware

i7-9700K + RTX 3080 (10 GB VRAM)

Models

Qwen2.5 VL (Q4_K_M)

Problem description & steps to reproduce

Upload a large image (e.g. ~3 MB in my case) without --no-mmproj-offload. The server crashes with an out-of-memory error.
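The crash can be reproduced by sending a large image through the server's OpenAI-compatible chat endpoint. Below is a minimal sketch of building such a request; the server URL, model name, and synthetic image bytes are all placeholders (assumptions, not from the report) — in practice you would read the real ~3 MB file and POST the body to the running llama-server.

```python
import base64
import json

def build_image_request(image_bytes: bytes, prompt: str) -> str:
    """Build an OpenAI-style chat request embedding an image as a base64
    data URI, the format llama-server's multimodal endpoint accepts."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    payload = {
        "model": "qwen2.5-vl",  # placeholder model name
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
                    },
                ],
            }
        ],
    }
    return json.dumps(payload)

# Stand-in for a ~3 MB image; substitute the real file's bytes to reproduce.
fake_image = b"\xff" * (3 * 1024 * 1024)
body = build_image_request(fake_image, "Describe this image.")
# POST `body` to the server (e.g. http://localhost:8080/v1/chat/completions)
# with Content-Type: application/json to trigger the OOM described below.
```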

First Bad Commit

33eff40

Relevant log output

srv  process_chun: processing image...
ggml_backend_cuda_buffer_type_alloc_buffer: allocating 4514.05 MiB on device 0: cudaMalloc failed: out of memory
ggml_gallocr_reserve_n: failed to allocate CUDA0 buffer of size 4733328896
/devel/tools/llama.cpp/ggml/src/ggml-backend.cpp:1662: GGML_ASSERT((char *)addr + ggml_backend_buffer_get_alloc_size(buffer, tensor) <= (char *)ggml_backend_buffer_get_base(buffer) + ggml_backend_buffer_get_size(buffer)) failed
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No stack.
The program is not being run.
Aborted (core dumped)
