Why can't InternVL3-8B start vLLM after being converted to the Hugging Face format? It shows the error: `ValueError: 'limit_mm_per_prompt' is only supported for multimodal models.` · Issue #38000 · huggingface/transformers · GitHub
I downloaded the model from OpenGVLab/InternVL3-8B, which natively supports serving OpenAI-style chat completions with vLLM. However, after converting it to the Hugging Face format using the script transformers/src/transformers/models/internvl/convert_internvl_weights_to_hf.py, launching vLLM fails with the error:
ValueError: 'limit_mm_per_prompt' is only supported for multimodal models.
The system runs correctly when I set MODEL_PATH to the original OpenGVLab/InternVL3-8B checkpoint, but it throws the error when I change the path to the converted InternVL-3B-hf directory: `ValueError: 'limit_mm_per_prompt' is only supported for multimodal models.`
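For reference, this is roughly how I launch the server (the exact paths and limit values below are illustrative, not my actual configuration; the `--limit-mm-per-prompt` syntax may differ slightly between vLLM versions):

```shell
# Works: original checkpoint (illustrative path)
vllm serve OpenGVLab/InternVL3-8B \
    --limit-mm-per-prompt image=4 \
    --trust-remote-code

# Fails with the ValueError: converted checkpoint (illustrative path)
vllm serve ./InternVL3-8B-hf \
    --limit-mm-per-prompt image=4
```

The only difference between the two invocations is the model path, which is why I suspect the conversion script drops something vLLM needs to recognize the model as multimodal.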
Could someone explain why this is happening and suggest solutions?
Thank you very much!