-
Notifications
You must be signed in to change notification settings - Fork 12k
sycl: GGML_SYCL_DISABLE_OPT on by default for all Intel Devices #13973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@ShanoToni But it doesn't cover the old iGPU in Intel Core CPU, since 11th. I design the hw_info to check if the hardware support reorder, to make the feature won't bring negative impact to the Intel GPUs even if reorder can't bring benefit. For newer GPU, this feature always bring benefit as your test. That's why I design the structure to check the GPU type to enable reorder on the GPU which can get benefit, and disable reorder for others. In this PR, it use the GPU name to check for Intel GPU. It's not good method: it can be used to check the GPU detail model. But use the architecture by SYCL API can support it. I suggest keep the legacy function. It's OK to set this feature to be enabled default. |
4c20538
to
bc14320
Compare
@NeoZhangJianyu Appreciate the comment.
|
LGTM. Not approving yet since we want to measure the impact on one more device before merging. |
Legacy code make two paths for new GPU and old GPU. This PR make the code be simple, to remove the path for old GPU and reduce the performance of old GPU. All my words are only suggestion. It's depends on you. No old user, no new user. |
This PR proposes moving the env variable
GGML_SYCL_DISABLE_OPT
from default ON (reorder disabled) to default OFF (reorders enabled). This would allow easier testing on newer hardware, without needing to modify the list of devices which support the reorder feature.Concerns regarding the reorder feature from #13254 have been resolved.
Regarding performance on older devices an amendment has been made to the README to suggest disabling the feature.
Below are performance's runs on 2 different models showing performance improvements of having the feature enabled:
Llama2-7B Q4_0 PVC
With reorder
Without
gemma2 2B Q4_K PVC
With Reorder
Without
Llama2-7B Q4_0 ARC-A770
With reorder
Without
gemma2 2B Q4_K ARC-A770
With Reorder
Without
Llama2-7B Q4_0 Lunar Lake
With reorder
Without
gemma2 2B Q4_K Lunar Lake
With Reorder
Without
Llama2-7B Q4_0 Intel B580
With reorder
Without
gemma2 2B Q4_K Intel B580
With Reorder
Without