Add Qwen3 #3229
Conversation
Signed-off-by: yuanwu <yuan.wu@intel.com>
@regisss Please help to review.
In backends/gaudi/tgi-entrypoint.sh:
-        if [[ "$*" == *"Llama-4"* ]]; then
-            echo 'ATTENTION=paged and Llama-4 detected'
+        if [[ "$*" == *"Llama-4"* || "$*" == *"Qwen3"* ]]; then
+            echo 'ATTENTION=paged and Llama-4 or Qwen3 detected'
             pip install git+https://github.com/huggingface/transformers.git@29338949
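For reference, the entrypoint check in the hunk above can be sketched as a standalone script (a minimal reproduction of the condition; the function name and the `--model-id` argument in the usage line are illustrative, only the detection logic comes from the diff):

```shell
#!/bin/bash
# Sketch of the tgi-entrypoint.sh check: when ATTENTION=paged and a Llama-4
# or Qwen3 model appears in the launch arguments, the entrypoint installs a
# pinned transformers commit before starting the server.
check_model_override() {
  if [[ "$ATTENTION" == "paged" && ( "$*" == *"Llama-4"* || "$*" == *"Qwen3"* ) ]]; then
    echo 'ATTENTION=paged and Llama-4 or Qwen3 detected'
    return 0
  fi
  return 1
}

ATTENTION=paged
if check_model_override --model-id Qwen/Qwen3-8B; then
  # The real entrypoint runs here:
  # pip install git+https://github.com/huggingface/transformers.git@29338949
  :
fi
```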
Transformers v4.52 should be released today, let's wait for it and update this line?
@yuanwu2017 We can use Transformers v4.52.2 now
Done.
Currently, Gaudi TGI cannot use the latest transformers, because the latest release moves VideoInput into video_utils while qwen2_5_vl.py still imports it from the old location. To run Llama-4 or Qwen3 with the latest transformers I would need to change qwen2_5_vl.py, but that change would break running Llama-3 with transformers 4.49. And using 4.52.2 for all models would require removing optimum-habana, which conflicts with the latest transformers. So I think we need to keep transformers.git@29338949 for now. After we remove optimum-habana (OH), I will update to the latest transformers.
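One common way to bridge this kind of symbol relocation is a fallback import. The sketch below is illustrative only, not code from this PR; the two module paths are taken from the comment above (newer transformers exposes VideoInput in transformers.video_utils, older releases in transformers.image_utils):

```python
import importlib

def load_video_input():
    """Return the VideoInput type from whichever module currently exposes it."""
    # Try the new location first, then fall back to the pre-4.52 one.
    for module_name in ("transformers.video_utils", "transformers.image_utils"):
        try:
            module = importlib.import_module(module_name)
            return getattr(module, "VideoInput")
        except (ImportError, AttributeError):
            continue
    raise ImportError("VideoInput not found in any known transformers location")
```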
Done.
Sounds good
Ok
Signed-off-by: yuanwu <yuan.wu@intel.com>
Signed-off-by: yuanwu <yuan.wu@intel.com>
Signed-off-by: yuanwu <yuan.wu@intel.com>
Signed-off-by: yuanwu <yuan.wu@intel.com>
Signed-off-by: yuanwu <yuan.wu@intel.com>
In backends/gaudi/tgi-entrypoint.sh:
-        if [[ "$*" == *"Llama-4"* ]]; then
-            echo 'ATTENTION=paged and Llama-4 detected'
+        if [[ "$*" == *"Llama-4"* || "$*" == *"Qwen3"* ]]; then
+            echo 'ATTENTION=paged and Llama-4 or Qwen3 detected'
             pip install git+https://github.com/huggingface/transformers.git@29338949
Sounds good
Signed-off-by: yuanwu <yuan.wu@intel.com>
LGTM
What does this PR do?
Enable the Qwen3 dense base models on Gaudi platform.
Run tests command (from https://github.com/yuanwu2017/llm-dbg):
./run_tgi_benchmark.sh
Result:
model=Qwen/Qwen3-8B
model=Qwen/Qwen3-32B
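A minimal sketch of how such a benchmark wrapper might select the tested model (the actual run_tgi_benchmark.sh lives in the linked llm-dbg repo; everything here except the `--model-id` launcher flag and the model names above is an assumption):

```shell
#!/bin/bash
# Hypothetical helper sketching how a benchmark script could build the TGI
# launch command for each tested model. Only --model-id is a real
# text-generation-launcher flag; the function and defaults are illustrative.
build_launch_cmd() {
  local model=${1:-Qwen/Qwen3-8B}   # default to the first tested model
  echo "ATTENTION=paged text-generation-launcher --model-id ${model}"
}

build_launch_cmd Qwen/Qwen3-8B
build_launch_cmd Qwen/Qwen3-32B
```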
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.