batch : auto-gen positions + verify multi-sequence input #14177


Merged
merged 6 commits into master from gg/batch-mutli-seq-id-verify on Jun 15, 2025

Conversation

ggerganov (Member) commented on Jun 13, 2025
  • Auto-generate input positions when they are missing.
  • Sanitize input batches with multiple sequences per token.
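
Roughly, the two behaviors above could look like the following. This is a minimal sketch under stated assumptions, not the actual llama.cpp implementation; the token_in struct, the sanitize_batch() helper, and the seq_last map are hypothetical names for illustration:

// Minimal sketch (hypothetical, not the llama.cpp code) of batch preparation
// that (1) auto-generates positions when the caller provides none, and
// (2) verifies that each sequence continues from the last position already
// stored in the memory (KV cache).
#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <vector>

using llama_pos    = int32_t;
using llama_seq_id = int32_t;

struct token_in {
    llama_pos    pos;    // -1 stands in for "position not provided"
    llama_seq_id seq_id; // one sequence id per token, for simplicity
};

// seq_last[s] holds the last position stored in the memory for sequence s
// (-1 when the sequence is empty).
static bool sanitize_batch(std::vector<token_in> & batch,
                           std::unordered_map<llama_seq_id, llama_pos> & seq_last) {
    for (auto & t : batch) {
        llama_pos & last = seq_last.try_emplace(t.seq_id, -1).first->second;
        if (t.pos < 0) {
            // auto-generate: continue right after what the memory holds
            t.pos = last + 1;
        } else if (t.pos != last + 1) {
            // verify: reject input that does not continue the sequence
            fprintf(stderr, "sequence %d does not start from the last position stored in the memory\n",
                    (int) t.seq_id);
            return false;
        }
        last = t.pos;
    }
    return true;
}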

@ggerganov changed the title from "batch : verify multi-sequence input batches" to "batch : auto-gen positions + verify multi-sequence input" on Jun 14, 2025
@ggerganov marked this pull request as ready for review on Jun 14, 2025 08:04
@ggerganov merged commit b9912ac into master on Jun 15, 2025
1 check passed
@ggerganov deleted the gg/batch-mutli-seq-id-verify branch on Jun 15, 2025 06:18
broadbit-hu commented on Jun 16, 2025

This commit appears to have broken the --batch-size 1 workaround for multimodal, see: #13694 (comment)

The last working release is: https://github.com/ggml-org/llama.cpp/releases/tag/b5664

Test call:

./build/bin/llama-mtmd-cli -m ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf --mmproj ../models/mmproj-Qwen2.5-VL-7B-Instruct-f16.gguf --image ../4.png -p "Please first output bbox coordinates and colors of every rectangle in this image in JSON format, and then answer how many rectangles are there in the image." --seed 1 -ngl 99 --temp 0.0 -c 20000 -b 1

Result of release https://github.com/ggml-org/llama.cpp/releases/tag/b5666:

encoding image slice...
image slice encoded in 1396 ms
decoding image batch 1/1369, n_tokens_batch = 1
image decoded (batch 1/1369) in 3 ms
decoding image batch 2/1369, n_tokens_batch = 1
failed to decode image
init: sequence 0 does not start from the last position stored in the memory
decode: failed to initialize batch
failed to decode image
failed to eval chunk 1
llama_decode: failed to decode, ret = -1
Unable to eval prompt

Same results with -b 1360 or less...

Result of release https://github.com/ggml-org/llama.cpp/releases/tag/b5664:

encoding image slice...
image slice encoded in 1418 ms
decoding image batch 1/1369, n_tokens_batch = 1
image decoded (batch 1/1369) in 2 ms
decoding image batch 2/1369, n_tokens_batch = 1
image decoded (batch 2/1369) in 15 ms
decoding image batch 3/1369, n_tokens_batch = 1
image decoded (batch 3/1369) in 13 ms
decoding image batch 4/1369, n_tokens_batch = 1
image decoded (batch 4/1369) in 14 ms
decoding image batch 5/1369, n_tokens_batch = 1
image decoded (batch 5/1369) in 14 ms
...
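
If the new check works along the lines of the sketch in the PR description, the -b 1 failure would arise whenever a single-token image batch supplies an explicit position that does not continue from the last position stored for sequence 0. A hypothetical reproduction, continuing that sketch:

// Hypothetical repro using the sketch above: sequence 0 already ends at
// position 0 in the memory, and the next single-token batch skips ahead,
// so the batch is rejected with the same message seen in the log.
int main() {
    std::unordered_map<llama_seq_id, llama_pos> seq_last = {{0, 0}};
    std::vector<token_in> batch = {{/*pos=*/2, /*seq_id=*/0}}; // expected pos 1
    if (!sanitize_batch(batch, seq_last)) {
        // prints: sequence 0 does not start from the last position stored in the memory
        return 1;
    }
    return 0;
}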
