Why can't InternVL3-8B start vLLM after being converted to the Hugging Face format? It shows the error: `ValueError: 'limit_mm_per_prompt' is only supported for multimodal models.` #38000
Description

@FloSophorae

System Info

vllm 0.8.5.post1
transformers 4.52.0.dev0

Who can help?

@amyeroberts
@qubvel

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

CUDA_VISIBLE_DEVICES=0,1 vllm serve $MODEL_PATH \
    --tensor-parallel-size 2 \
    --port $MODEL_PORT \
    --host 0.0.0.0 \
    --dtype float16 \
    --max-model-len 65536 \
    --limit-mm-per-prompt image=30,video=0 \
    --enable-prefix-caching \
    --gpu-memory-utilization 0.6 \
    --block-size 16 > "$VLLM_LOG"

Expected behavior

I downloaded the model from OpenGVLab/InternVL3-8B, which natively supports serving OpenAI-style chat completions with vLLM. However, after converting it to the Hugging Face format using the script `transformers/src/transformers/models/internvl/convert_internvl_weights_to_hf.py`, launching vLLM failed with the error:

ValueError: 'limit_mm_per_prompt' is only supported for multimodal models.
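
For reference, vLLM decides whether a checkpoint counts as multimodal from the `architectures` field of its `config.json`, so comparing that field between the two checkpoints is a quick first check. A minimal sketch, assuming both checkpoints are local directories and that `$MODEL_PATH` and `$HF_MODEL_PATH` (a placeholder name) point at the original and converted models:

# Print the architecture name each checkpoint declares; vLLM uses this
# name to look the model up in its registry.
for p in "$MODEL_PATH" "$HF_MODEL_PATH"; do
    python -c "import json, sys; print(sys.argv[1], json.load(open(sys.argv[1] + '/config.json'))['architectures'])" "$p"
done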

The command I used to launch vLLM is the same one shown in the Reproduction section above.

The system runs correctly when I set MODEL_PATH to the original OpenGVLab/InternVL3-8B path, but it throws the error as soon as I point it at the converted InternVL3-8B-hf checkpoint: ValueError: 'limit_mm_per_prompt' is only supported for multimodal models.
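
A related check is to ask vLLM's model registry directly whether it classifies each architecture as multimodal; if the converted checkpoint's architecture is unknown to (or not registered as multimodal in) vLLM 0.8.5, this error would be expected. The sketch below assumes the original checkpoint reports `InternVLChatModel` and the converted one reports `InternVLForConditionalGeneration` in its `config.json`, and that the installed vLLM exposes `ModelRegistry.is_multimodal_model`:

python - <<'EOF'
# Ask vLLM's registry whether each architecture is treated as multimodal.
# The architecture names are assumptions taken from the two config.json files.
from vllm import ModelRegistry

for arch in ("InternVLChatModel", "InternVLForConditionalGeneration"):
    try:
        print(arch, "->", ModelRegistry.is_multimodal_model(arch))
    except Exception as exc:  # raised if this vLLM build does not know the arch
        print(arch, "->", exc)
EOF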

Could someone explain why this is happening and suggest solutions?
Thank you very much!
