opencl: Add support for multiple devices #12622

linehill · 2025-03-28T08:20:34Z

... but limited to one platform for now. A platform with a GPU will be preferred.

Additionally:

* Filter out devices that lack capabilities needed by the backend implementation (half support, OpenCL 2.0+, etc).
* Make ggml_backend_opencl_reg() thread-safe.
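
For readers who want a concrete picture of the description above, here is a minimal sketch of the three ideas: capability filtering, preferring a platform that has a GPU, and one-time thread-safe initialization. It is not the PR's code. `DeviceInfo`, `PlatformInfo`, `is_usable`, `select_devices`, `enumerate_platforms` and `registered_devices` are hypothetical names, and the hard-coded platform list stands in for what a real backend would query through the OpenCL API. Requiring the preferred GPU to also pass the capability filter is an assumption of this sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for data a backend would query from OpenCL.
struct DeviceInfo {
    std::string name;
    bool is_gpu;
    int  cl_major;  // major OpenCL version reported by the device
    bool has_fp16;  // whether half-precision support is reported
};

struct PlatformInfo {
    std::string name;
    std::vector<DeviceInfo> devices;
};

// A device is kept only if it has the capabilities the kernels rely on
// (here: OpenCL 2.0+ and half support, as named in the description).
bool is_usable(const DeviceInfo & d) {
    return d.cl_major >= 2 && d.has_fp16;
}

// Pick ONE platform: the first one with a usable GPU, otherwise the first
// one with any usable device. Return only that platform's usable devices.
std::vector<DeviceInfo> select_devices(const std::vector<PlatformInfo> & platforms) {
    const PlatformInfo * chosen = nullptr;
    for (const auto & p : platforms) {
        bool any_usable = false;
        bool any_gpu    = false;
        for (const auto & d : p.devices) {
            if (!is_usable(d)) {
                continue;
            }
            any_usable = true;
            any_gpu    = any_gpu || d.is_gpu;
        }
        if (any_gpu) {
            chosen = &p;
            break;
        }
        if (any_usable && chosen == nullptr) {
            chosen = &p;
        }
    }
    std::vector<DeviceInfo> out;
    if (chosen != nullptr) {
        for (const auto & d : chosen->devices) {
            if (is_usable(d)) {
                out.push_back(d);
            }
        }
    }
    return out;
}

// Hypothetical sample data in place of real platform enumeration.
std::vector<PlatformInfo> enumerate_platforms() {
    return {
        {"cpu-no-fp16", {{"cpu-a", false, 3, false}}},
        {"cpu-fp16",    {{"cpu-b", false, 3, true}}},
        {"gpu",         {{"gpu-new", true, 3, true}, {"gpu-old", true, 1, true}}},
    };
}

// One-time initialization: C++11 guarantees that a function-local static is
// initialized exactly once, even when several threads call this concurrently.
const std::vector<DeviceInfo> & registered_devices() {
    static const std::vector<DeviceInfo> devices = select_devices(enumerate_platforms());
    return devices;
}
```

With the sample data, `registered_devices()` yields only `gpu-new`: the `gpu` platform wins over the CPU-only platforms, and `gpu-old` is dropped by the version check. A function-local static is one way to get thread-safe lazy initialization; a mutex or `std::call_once` would serve equally well.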

max-krasnyansky · 2025-04-11T04:58:35Z

@lhez please take a look. It makes sense to add multi-device support.

@linehill please rebase once we merge #12886, when you get the chance.

linehill · 2025-05-05T13:31:28Z

Gentle ping, @max-krasnyansky, @lhez. Is this PR good for landing?

lhez · 2025-05-06T06:20:53Z

Thank you @linehill, it looks good.

lhez · 2025-05-14T17:56:22Z

@max-krasnyansky ping - I think this PR should be good to merge.

acbits · 2025-05-22T00:14:29Z

Does this PR bring back support for AMD/Nvidia GPUs or is it still missing?

I would like to compare OpenCL and Vulkan performance.

linehill · 2025-05-22T10:35:40Z

> Does this PR bring back support for AMD/Nvidia GPUs or is it still missing?

They aren't supported, at least because of the device whitelist in here. The backend might work on AMD and NVIDIA OpenCL drivers if you remove this line, but beware: the current kernel implementations seem to be tailored for Intel and Qualcomm HW.

lhez · 2025-05-22T18:04:57Z

The problem is that some of the kernels use subgroups and need to know the subgroup size, and Nvidia's OpenCL implementation does not support subgroups. I think AMD has subgroup support in OpenCL, so it should be relatively easy to enable AMD.
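
As an illustration of the subgroup point, a backend could test a device's extension string before enabling subgroup-based kernels. The sketch below is not taken from the backend. `has_extension` and `supports_subgroups` are hypothetical helpers, and the input is assumed to be the space-separated string that `clGetDeviceInfo` returns for `CL_DEVICE_EXTENSIONS`. It checks only the `cl_khr_subgroups` and `cl_intel_subgroups` extension names; a driver may expose subgroups by other means that this check does not cover.

```cpp
#include <sstream>
#include <string>

// Returns true if `ext` appears as a whole token in a space-separated
// extension string (substring matches such as "cl_khr_subgroups_x" do not count).
bool has_extension(const std::string & extensions, const std::string & ext) {
    std::istringstream iss(extensions);
    std::string token;
    while (iss >> token) {
        if (token == ext) {
            return true;
        }
    }
    return false;
}

// Assumption of this sketch: a device qualifies if it lists either the
// Khronos or the Intel subgroup extension.
bool supports_subgroups(const std::string & extensions) {
    return has_extension(extensions, "cl_khr_subgroups") ||
           has_extension(extensions, "cl_intel_subgroups");
}
```

A device that fails this check would have to be filtered out, or be given kernels that do not depend on a known subgroup size.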

github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) Mar 28, 2025

linehill marked this pull request as draft March 28, 2025 11:40

linehill force-pushed the ocl-mdev branch from 2e7c4ba to 4daab0e Compare April 7, 2025 09:44

linehill marked this pull request as ready for review April 7, 2025 09:44

linehill force-pushed the ocl-mdev branch from 4daab0e to ca31c30 Compare April 10, 2025 08:32

linehill added 2 commits April 24, 2025 13:40

fixup: fix an error in sync_with_other_backends

e6d7896

... when there is only one OpenCL device available.

linehill force-pushed the ocl-mdev branch from ca31c30 to e6d7896 Compare April 24, 2025 10:45

max-krasnyansky approved these changes May 21, 2025

max-krasnyansky merged commit a4e8912 into ggml-org:master May 21, 2025
48 checks passed
