abetlen/llama-cpp-python · Discussions

  • Worse speed and GPU load than pure llama-cpp
    Mushoz asked Nov 14, 2024 in Q&A · Answered

  • Api/generate endpoint
    gl2007 asked Feb 5, 2025 in Q&A · Unanswered

  • Getting Embedding from CLIP model
    lelefontaa asked Feb 18, 2025 in Q&A · Unanswered

  • Segmentation fault on tensor conversion
    devashishraj asked Feb 15, 2025 in Q&A · Unanswered

  • Throughputs of Long Sequences #12608
    simmonssong asked Mar 27, 2025 in Q&A · Unanswered

  • Best way to apply chat templates locally
    bwilkie asked Feb 10, 2025 in Q&A · Unanswered
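
    As an illustration of what the chat-template question above is asking about, here is a minimal sketch using llama-cpp-python's `create_chat_completion` API, which applies a chat template locally before generation. The model path is a placeholder and `chat_format="chatml"` is an assumption about the model; recent builds can also pick up a template embedded in the GGUF metadata when `chat_format` is left unset.

```python
from llama_cpp import Llama

# Placeholder model path; chat_format="chatml" is an assumption about the model's template.
llm = Llama(model_path="models/model.gguf", chat_format="chatml", n_ctx=4096)

# create_chat_completion() formats the messages with the chat template before generating.
out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what a chat template does."},
    ],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```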

  • How to make it work with CUDA-support?
    itinance asked Aug 18, 2024 in Q&A · Unanswered
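
    For the CUDA question above, a minimal sketch assuming the package was built with GPU support enabled; the exact CMake flag has varied across releases, and the model path below is a placeholder. `n_gpu_layers=-1` asks for all layers to be offloaded to the GPU.

```python
from llama_cpp import Llama

# Assumes a CUDA-enabled build, installed with something like (flag name varies by version):
#   CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
# Placeholder model path; n_gpu_layers=-1 offloads every layer to the GPU.
llm = Llama(model_path="models/model.gguf", n_gpu_layers=-1, verbose=True)

# With verbose=True the llama.cpp load log reports how many layers were offloaded,
# which is a quick way to confirm the CUDA build is actually being used.
out = llm("Q: What is CUDA? A:", max_tokens=32)
print(out["choices"][0]["text"])
```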