musa: override warp_size of musa device to 32 by yeahdongcn · Pull Request #12445 · ggml-org/llama.cpp · GitHub

musa: override warp_size of musa device to 32 #12445


Merged
merged 1 commit into from
Mar 18, 2025
musa: override warp_size of musa device to 32
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
yeahdongcn committed Mar 18, 2025
commit 5c43a3f8ce4615093f32d9c76f339e213d04a065
2 changes: 2 additions & 0 deletions ggml/src/ggml-cuda/ggml-cuda.cu
@@ -262,6 +262,8 @@ static ggml_cuda_device_info ggml_cuda_init() {
                       id, prop.name, prop.gcnArchName, info.devices[id].cc & 0xffff,
                       device_vmm ? "yes" : "no", prop.warpSize);
 #elif defined(GGML_USE_MUSA)
+        // FIXME: Ensure compatibility with varying warp sizes across different MUSA archs.
+        info.devices[id].warp_size = 32;
         // TODO: refine the .cc to reflect MUSA's actual CC capabilities
         info.devices[id].smpbo = prop.sharedMemPerBlockOptin;
         info.devices[id].cc = 100*prop.major + 10*prop.minor;
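
The effect of the MUSA branch can be sketched in isolation. The following is a minimal, host-only illustration, not the actual llama.cpp code: `fake_device_prop`, `device_entry`, `backend`, `make_device_entry`, and `warps_per_block` are hypothetical names introduced here, and only the fields `warp_size`, `smpbo`, and `cc` and the expression `100*prop.major + 10*prop.minor` are taken from the diff. Using the reported `warpSize` for the non-MUSA case is an assumption made for contrast.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the runtime's device properties struct.
struct fake_device_prop {
    int         major;
    int         minor;
    int         warpSize;
    std::size_t sharedMemPerBlockOptin;
};

// Hypothetical per-device record; field names follow the diff.
struct device_entry {
    int         cc        = 0;
    int         warp_size = 0;
    std::size_t smpbo     = 0;
};

enum class backend { cuda, musa };

device_entry make_device_entry(backend b, const fake_device_prop & prop) {
    device_entry dev;
    dev.smpbo = prop.sharedMemPerBlockOptin;
    dev.cc    = 100*prop.major + 10*prop.minor;
    if (b == backend::musa) {
        // Mirrors the patch: ignore the reported value and pin the warp size to 32.
        dev.warp_size = 32;
    } else {
        // Assumption for illustration: other backends trust the reported value.
        dev.warp_size = prop.warpSize;
    }
    return dev;
}

// Number of warps in a block of block_size threads on this device.
int warps_per_block(const device_entry & dev, int block_size) {
    assert(dev.warp_size > 0 && block_size % dev.warp_size == 0);
    return block_size / dev.warp_size;
}
```

Any host-side logic that derives launch parameters from `warp_size`, such as the hypothetical `warps_per_block` above, then sees 32 on MUSA regardless of what the runtime reports, which is why the FIXME flags architectures with a different warp size as unresolved.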