CUDA: fix crash on large batch size for MoE models by JohannesGaessler · Pull Request #13384 · ggml-org/llama.cpp · GitHub

CUDA: fix crash on large batch size for MoE models #13384

Merged
JohannesGaessler merged 1 commit into ggml-org:master from JohannesGaessler:cuda-fix-moe-max-ub
May 9, 2025

Commits

Commits on May 8, 2025

0