Closed
Description
Checklist
- 1. I have searched related issues but cannot get the expected help.
- 2. The bug has not been fixed in the latest version.
- 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
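The Qwen3 model series supports a switch that controls whether thinking is enabled, for example "chat_template_kwargs": {"enable_thinking": false}, but the current lmdeploy version does not support it yet.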
Reproduction
The request is as follows:
curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "Qwen/Qwen3-8B",
"messages": [
{"role": "user", "content": "Give me a short introduction to large language models."}
],
"temperature": 0.7,
"top_p": 0.8,
"max_tokens": 1024,
"chat_template_kwargs": {"enable_thinking": false}
}'
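For anyone who prefers to reproduce this from Python, here is a minimal sketch of the same request using only the standard library. The helper names `build_chat_request` and `send_chat_request` are hypothetical and exist only for this illustration; the endpoint, model name, and sampling parameters are copied from the curl call above. It assumes the server returns the usual OpenAI-style JSON response.

```python
import json
import urllib.request


def build_chat_request(enable_thinking: bool) -> dict:
    """Build the request body from the reproduction above.

    Only the enable_thinking flag varies; everything else mirrors the curl call.
    """
    return {
        "model": "Qwen/Qwen3-8B",
        "messages": [
            {
                "role": "user",
                "content": "Give me a short introduction to large language models.",
            }
        ],
        "temperature": 0.7,
        "top_p": 0.8,
        "max_tokens": 1024,
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
    }


def send_chat_request(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """POST the payload to /v1/chat/completions and return the parsed JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (requires a running server at base_url):
# reply = send_chat_request(build_chat_request(enable_thinking=False))
# print(reply["choices"][0]["message"]["content"])
```

Sending the request once with `enable_thinking=True` and once with `False`, then comparing the two replies, should make it easy to see whether the flag has any effect on a given lmdeploy version.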
Environment
lmdeploy 0.7.3
transformers 4.51.3
Error traceback