Description
Describe the bug (describe the bug and the steps to reproduce it; screenshots are helpful)
I am attempting to deploy Qwen3-VL models (specifically the 30B or 235B variants) using the MS-Swift inference pipeline with vLLM as the backend accelerator.
Running the swift infer command with these multimodal models fails with an AttributeError inside MS-Swift's vLLM integration layer. This suggests a disconnect between the vLLM internals the current MS-Swift integration layer expects to patch and those actually provided by the installed vLLM release.
Reproduction command:
swift infer Qwen/Qwen3-VL-30B-A3B-Instruct --infer_backend vllm [...]
File "/home/i-liuche/codes/swift39/ms-swift/swift/llm/infer/infer_engine/vllm_engine.py", line 766, in patch_remove_log
    async_llm_engine._origin_log_task_completion = async_llm_engine._log_task_completion
                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'vllm.engine.async_llm_engine' has no attribute '_log_task_completion'
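A likely cause is that vLLM 0.11.0 no longer defines the module-level _log_task_completion helper that MS-Swift's patch_remove_log monkey-patches. As a stopgap, the patch can be guarded so it becomes a no-op on vLLM builds that lack the attribute. The following is a minimal sketch of such a guard, assuming the patch only needs to silence the task-completion logging; the replacement body here is illustrative, not MS-Swift's actual implementation:

```python
import vllm.engine.async_llm_engine as async_llm_engine


def patch_remove_log():
    # Newer vLLM releases (e.g. 0.11.0) no longer define
    # _log_task_completion in vllm.engine.async_llm_engine,
    # so skip the patch instead of raising AttributeError.
    if not hasattr(async_llm_engine, '_log_task_completion'):
        return

    # Keep a handle to the original so it can be restored later.
    async_llm_engine._origin_log_task_completion = async_llm_engine._log_task_completion

    def _noop_log_task_completion(*args, **kwargs):
        # Illustrative replacement: drop the task-completion log output.
        pass

    async_llm_engine._log_task_completion = _noop_log_task_completion
```

Alternatively, pinning vLLM to an earlier release that still exposes this attribute should avoid the crash without any patching.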
Your hardware and system info (provide hardware and system details here, such as CUDA version, OS, GPU model, and torch version)
Please confirm the exact versions you are using, or specify "Latest" if installed recently:
Component | Version
-- | --
MS-Swift | Current
vLLM | 0.11.0
Python | 3.12
GPU/Hardware | (e.g., H100 80GB)
CUDA Version | (e.g., 12.8)
PyTorch Version | (e.g., 2.8.0)