cuda : speed-up by using CUBLAS_COMPUTE_32F instead of CUBLAS_COMPUTE_16F by ggerganov · Pull Request #3816 · ggml-org/llama.cpp · GitHub

cuda : speed-up by using CUBLAS_COMPUTE_32F instead of CUBLAS_COMPUTE_16F #3816

Closed
ggerganov wants to merge 1 commit into master from cuda-cublas-opts

Commits

Commits on Jan 2, 2024
