opencl: Add support for multiple devices by linehill · Pull Request #12622 · ggml-org/llama.cpp · GitHub

opencl: Add support for multiple devices #12622

Merged
merged 2 commits into ggml-org:master on May 21, 2025

Conversation

linehill
Copy link
Contributor

... but limited to one platform for now. A platform with a GPU will be preferred.

Additionally:

  • Filter out devices that lack capabilities needed by the backend implementation (half support, OpenCL 2.0+, etc.).

  • Make ggml_backend_opencl_reg() thread-safe.
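For illustration, the capability filter described above can be sketched as string checks on the values `clGetDeviceInfo()` reports for `CL_DEVICE_VERSION` and `CL_DEVICE_EXTENSIONS`. The function name and the exact criteria below are assumptions for the sketch, not the PR's actual code:

```cpp
#include <cstdio>
#include <string>

// Hypothetical capability filter (name and criteria are illustrative).
// `version` is what clGetDeviceInfo(..., CL_DEVICE_VERSION, ...) returns,
// e.g. "OpenCL 3.0 Adreno(TM) 740"; `extensions` is the space-separated
// CL_DEVICE_EXTENSIONS string.
static bool device_is_usable(const std::string & version,
                             const std::string & extensions) {
    // CL_DEVICE_VERSION is specified as "OpenCL <major>.<minor> <vendor info>".
    int major = 0, minor = 0;
    if (std::sscanf(version.c_str(), "OpenCL %d.%d", &major, &minor) != 2) {
        return false;
    }
    if (major < 2) {
        return false; // backend requires OpenCL 2.0+
    }
    // Require native half (fp16) support.
    return extensions.find("cl_khr_fp16") != std::string::npos;
}
```

A device passing this kind of predicate would then be registered; the rest would be skipped during enumeration.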

@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Mar 28, 2025
@linehill linehill marked this pull request as draft March 28, 2025 11:40
@linehill linehill marked this pull request as ready for review April 7, 2025 09:44
@max-krasnyansky
Copy link
Collaborator

@lhez please take a look. It makes sense to add multi-device support.

@linehill please rebase once we merge #12886 when you get the chance

opencl: Add support for multiple devices

... but limited to one platform. A platform with a GPU will be preferred.

Additionally:

* Filter out devices that lack capabilities needed by the backend
  implementation (half support, OpenCL 2.0+, etc).

* Make ggml_backend_opencl_reg() thread-safe.

fixup: fix an error in sync_with_other_backends

... when there is only one OpenCL device available.
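As a side note, a common way to make a one-time registration function such as ggml_backend_opencl_reg() thread-safe in C++11 is a function-local static, which the language guarantees is initialized exactly once even under concurrent calls. The sketch below is illustrative only; the PR's actual implementation may differ:

```cpp
// Illustrative sketch of thread-safe, one-time backend registration.
// The struct contents are placeholders, not the real registry state.
struct opencl_reg_state {
    int n_devices; // placeholder for the real registry contents
};

static opencl_reg_state * opencl_reg() {
    // C++11 "magic statics": initialization runs exactly once; concurrent
    // callers block until it completes.
    static opencl_reg_state state = [] {
        opencl_reg_state s{};
        s.n_devices = 1; // platform/device enumeration would happen here
        return s;
    }();
    return &state;
}
```

Every caller, from any thread, gets the same fully initialized registry pointer.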
@linehill
Copy link
Contributor Author
linehill commented May 5, 2025

Gentle ping, @max-krasnyansky, @lhez. Is this PR good for landing?

@lhez
Copy link
Contributor
lhez commented May 6, 2025

Thank you @linehill, it looks good.

@lhez
Copy link
Contributor
lhez commented May 14, 2025

@max-krasnyansky ping - I think this PR should be good to merge.

@max-krasnyansky max-krasnyansky merged commit a4e8912 into ggml-org:master May 21, 2025
48 checks passed
@acbits
Copy link
acbits commented May 22, 2025

Does this PR bring back support for AMD/Nvidia GPUs or is it still missing?

I would like to compare OpenCL and Vulkan performance.

infil00p pushed a commit to baseweight/llama.cpp that referenced this pull request May 22, 2025
@linehill
Copy link
Contributor Author

Does this PR bring back support for AMD/Nvidia GPUs or is it still missing?

They aren't supported, not least because of the device whitelist here. The backend might work with AMD and Nvidia OpenCL drivers if you remove this line, but beware: the current kernel implementations seem to be tailored for Intel and Qualcomm hardware.
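For context, a device whitelist of the kind mentioned here typically amounts to a substring match against the `CL_DEVICE_NAME` string. The names and the matching rule below are hypothetical, not the backend's actual list:

```cpp
#include <string>

// Hypothetical allow-list check (entries and matching rule are
// illustrative, not the backend's actual whitelist).
static bool device_in_whitelist(const std::string & device_name) {
    const char * allowed[] = { "Adreno", "Intel" };
    for (const char * entry : allowed) {
        if (device_name.find(entry) != std::string::npos) {
            return true;
        }
    }
    return false;
}
```

Removing or widening such a check lets other vendors' devices through, but as noted, the kernels themselves may still not run well on them.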

@lhez
Copy link
Contributor
lhez commented May 22, 2025

The problem is that some of the kernels use subgroups and need to know the subgroup size, and Nvidia's OpenCL implementation does not support subgroups. I think AMD supports subgroups in OpenCL, so it should be relatively easy to enable AMD.
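Enabling a vendor along those lines would presumably involve probing for subgroup support before selecting the device. The helper below is a sketch: cl_khr_subgroups and cl_intel_subgroups are real OpenCL extension names, but the backend's actual check may look different (subgroups are also core in OpenCL 2.1+, so a full check would consult CL_DEVICE_VERSION too):

```cpp
#include <string>

// Sketch: report whether a device advertises OpenCL subgroups via its
// CL_DEVICE_EXTENSIONS string. A real check would also consult
// CL_DEVICE_VERSION, since subgroups are core functionality in OpenCL 2.1+.
static bool supports_subgroups(const std::string & extensions) {
    return extensions.find("cl_khr_subgroups")   != std::string::npos ||
           extensions.find("cl_intel_subgroups") != std::string::npos;
}
```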
