opencl: Add support for multiple devices #12622
... but limited to one platform. A platform with a GPU will be preferred.
Additionally:
* Filter out devices that lack capabilities needed by the backend implementation (half support, OpenCL 2.0+, etc).
* Make ggml_backend_opencl_reg() thread-safe.
fixup: fix an error in sync_with_other_backends ... when there is only one OpenCL device available.
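The selection policy in the description above (one platform only, GPU platforms preferred, incapable devices filtered out) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `DeviceInfo`, `PlatformInfo`, `is_usable` and `select_devices` are hypothetical names, and the structs stand in for values a real backend would query through the OpenCL API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for properties queried from OpenCL.
struct DeviceInfo {
    std::string name;
    bool        is_gpu;    // device type is GPU
    bool        has_fp16;  // half-precision support is available
    int         cl_major;  // major part of the device's OpenCL version
};

struct PlatformInfo {
    std::string             name;
    std::vector<DeviceInfo> devices;
};

// Assumed capability filter: half support and OpenCL 2.0 or newer.
static bool is_usable(const DeviceInfo & d) {
    return d.has_fp16 && d.cl_major >= 2;
}

// Returns the usable devices of exactly one platform: the first platform
// that has a usable GPU, otherwise the first platform with any usable device.
static std::vector<DeviceInfo> select_devices(const std::vector<PlatformInfo> & platforms) {
    std::vector<DeviceInfo> fallback;
    for (const auto & p : platforms) {
        std::vector<DeviceInfo> usable;
        bool has_gpu = false;
        for (const auto & d : p.devices) {
            if (is_usable(d)) {
                usable.push_back(d);
                has_gpu = has_gpu || d.is_gpu;
            }
        }
        if (usable.empty()) {
            continue;               // nothing on this platform passes the filter
        }
        if (has_gpu) {
            return usable;          // a GPU platform wins immediately
        }
        if (fallback.empty()) {
            fallback = usable;      // remember the first non-GPU candidate
        }
    }
    return fallback;
}
```

Returning devices from a single platform keeps the sketch in line with the "limited to one platform" restriction stated in the description.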
Gentle ping, @max-krasnyansky, @lhez. Is this PR good for landing?
Thank you @linehill, it looks good.
@max-krasnyansky ping - I think this PR should be good to merge.
Does this PR bring back support for AMD/Nvidia GPUs or is it still missing? I would like to compare OpenCL and Vulkan performance.
* opencl: Add support for multiple devices
  ... but limited to one platform. A platform with a GPU will be preferred.
  Additionally:
  * Filter out devices that lack capabilities needed by the backend implementation (half support, OpenCL 2.0+, etc).
  * Make ggml_backend_opencl_reg() thread-safe.
* fixup: fix an error in sync_with_other_backends
  ... when there is only one OpenCL device available.
They aren't supported, if only because of the device whitelist in the backend. The backend might work on AMD and Nvidia OpenCL drivers if you remove that whitelist check, but beware: the current kernel implementations seem to be tailored for Intel and Qualcomm hardware.
The problem is that some of the kernels use subgroups and need to know the subgroup size, and Nvidia's OpenCL implementation does not support subgroups. I think AMD has subgroup support in OpenCL, so it should be relatively easy to enable AMD.
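To make the subgroup point concrete, here is a rough sketch of how a backend could test for subgroup support from a device's extension string. It assumes the string has already been obtained (for example via CL_DEVICE_EXTENSIONS) and checks only the `cl_khr_subgroups` and `cl_intel_subgroups` extension names; the helper names are hypothetical, and the actual backend may detect this differently.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// True if the space-separated extension list contains `wanted` as an exact
// token (a plain substring search could match a longer extension name).
static bool has_extension(const std::string & extensions, const std::string & wanted) {
    std::istringstream iss(extensions);
    std::string tok;
    while (iss >> tok) {
        if (tok == wanted) {
            return true;
        }
    }
    return false;
}

// Assumption: subgroup support is advertised through one of these extensions.
static bool supports_subgroups(const std::string & extensions) {
    return has_extension(extensions, "cl_khr_subgroups") ||
           has_extension(extensions, "cl_intel_subgroups");
}
```

A device that fails such a check could be filtered out alongside the other capability requirements rather than failing later at kernel build time.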
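As an illustration of the thread-safety item in the PR description, one common C++ way to make a registration function safe to call from several threads is a function-local static, whose initialization the language guarantees to run exactly once. This is only a sketch of that general technique: `Registry`, `discover_devices` and `get_registry` are hypothetical names, and the PR may use a different mechanism.

```cpp
#include <cassert>

// Hypothetical registry object built once per process.
struct Registry {
    int device_count;
};

static int g_discover_calls = 0;  // counts how often discovery actually runs

// Stand-in for the expensive, non-reentrant device discovery step.
static Registry discover_devices() {
    ++g_discover_calls;
    return Registry{1};
}

// Since C++11, initialization of a function-local static is thread-safe:
// concurrent callers block until the first initialization has finished.
static Registry & get_registry() {
    static Registry reg = discover_devices();
    return reg;
}
```

Every caller receives a reference to the same object, and discovery is not repeated on later calls.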