Insights: abetlen/llama-cpp-python
Overview
- 0 Merged pull requests
- 2 Open pull requests
- 2 Closed issues
- 9 New issues
2 Pull requests opened by 2 people
- Flush libc stdout/stderr in suppress_stdout_stderr (#2015, opened May 7, 2025)
- Add support for Cohere Command models (#2018, opened May 12, 2025)
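PR #2015 concerns flushing libc's stdout/stderr inside the `suppress_stdout_stderr` helper. As context for why that flush matters, here is a minimal, hypothetical sketch (not the project's actual implementation; assumes a POSIX system where `ctypes.CDLL(None)` loads the process's libc) of such a context manager, with the C-level flush performed before and after the redirect so buffered output cannot leak at the wrong moment:

```python
import ctypes
import os
import sys

class suppress_stdout_stderr:
    """Silence C-level stdout/stderr (fds 1 and 2) for the duration of a block.

    Illustrative sketch only: flushes libc's buffers before redirecting,
    so output buffered by C code is emitted (or dropped) predictably.
    """

    def __enter__(self):
        self.libc = ctypes.CDLL(None)           # the process's own libc (POSIX)
        self.libc.fflush(None)                  # flush ALL open C streams
        sys.stdout.flush()                      # flush Python-level buffers too
        sys.stderr.flush()
        self.devnull = os.open(os.devnull, os.O_WRONLY)
        self.saved = [os.dup(1), os.dup(2)]     # save the real stdout/stderr fds
        os.dup2(self.devnull, 1)                # point fds 1 and 2 at /dev/null
        os.dup2(self.devnull, 2)
        return self

    def __exit__(self, *exc):
        self.libc.fflush(None)                  # drop anything buffered inside
        os.dup2(self.saved[0], 1)               # restore the original fds
        os.dup2(self.saved[1], 2)
        for fd in self.saved + [self.devnull]:
            os.close(fd)
        return False
```

Without the `fflush(None)` calls, output sitting in libc's stdio buffers when the redirect begins could surface later through the wrong file descriptor, which is the class of bug the PR targets.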
2 Issues closed by 2 people
- "eval time" and "prompt eval time" is 0.00ms after Ver0.3.0 (#1830, closed May 20, 2025)
- Qwen 3 model not working (#2008, closed May 11, 2025)
9 Issues opened by 9 people
- Support for jinja for custom chat templates (#2023, opened May 22, 2025)
- Assertion error when offloading Llama 4 layers to CPU (#2022, opened May 19, 2025)
- Is it possible to run bitnet.cpp through these bindings? (#2021, opened May 16, 2025)
- Installation URL for CUDA 12.5 in README results in 404 error (#2020, opened May 16, 2025)
- Macos wheel fails on 0.35, works on 0.34 (#2016, opened May 9, 2025)
- Is llama-cpp-python supports Llama-4? (#2014, opened May 5, 2025)
- Can't install with GPU support with Cuda toolkit 12.9 and Cuda 12.9 (#2013, opened May 5, 2025)
- How to install the latest version with GPU support (#2012, opened May 2, 2025)
- llama-cpp-python 0.3.8 with CUDA (#2010, opened May 1, 2025)
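Several of the issues above (#2010, #2012, #2013) concern building with CUDA support. For reference, the project's README documents a source build driven by `CMAKE_ARGS`; a typical invocation (flag name per the README, exact requirements depend on your CUDA toolkit and compiler setup) looks like:

```shell
# Build llama-cpp-python from source with the CUDA backend enabled.
# --no-cache-dir forces a fresh build instead of reusing a cached CPU wheel.
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --no-cache-dir
```

A cached CPU-only wheel is a common reason a "GPU install" silently runs on CPU, which is why the no-cache flag is worth including.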
22 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- CUDA llama-cpp-python build failed. (#1986, commented May 1, 2025 • 0 new comments)
- Retrieve attention score for all input tokens per generated token (#1141, commented May 2, 2025 • 0 new comments)
- Llama 4 not working (#1994, commented May 7, 2025 • 0 new comments)
- Segmentation fault (core dumped) appearing randomly (#2005, commented May 9, 2025 • 0 new comments)
- Setting seed to -1 (random) or using default LLAMA_DEFAULT_SEED generates a deterministic reply chain (#1809, commented May 12, 2025 • 0 new comments)
- Could not install llama-cpp-python 0.3.7 on Macbook Air M1 - Compilation issue (#1956, commented May 12, 2025 • 0 new comments)
- Trying to Install GPU Version - Getting CMake Error With _CMAKE_CUDA_WHOLE_FLAG (#1508, commented May 18, 2025 • 0 new comments)
- Can't install GPU version for windows for many times. (#1393, commented May 18, 2025 • 0 new comments)
- Steps to Build and Install llama-cpp-python 0.3.7 w/CUDA on Windows 11 [06/03/2025] (#1963, commented May 18, 2025 • 0 new comments)
- Running basic example from docs results in `TypeError: 'NoneType' object is not callable` (#1998, commented May 19, 2025 • 0 new comments)
- destructor llama error: TypeError: 'NoneType' object is not callable (#1610, commented May 19, 2025 • 0 new comments)
- Failed building wheel for llama-cpp-python (#1932, commented May 21, 2025 • 0 new comments)
- llama-server not using GPU (#1826, commented May 21, 2025 • 0 new comments)
- Include usage key in create_completion when streaming (#1498, commented May 23, 2025 • 0 new comments)
- Feature request: add support for streaming tool use (#1883, commented May 25, 2025 • 0 new comments)
- Add reranking support (#1794, commented May 30, 2025 • 0 new comments)
- pyinstaller hook script (#709, commented May 23, 2025 • 0 new comments)
- Add batch inference support (WIP) (#951, commented May 19, 2025 • 0 new comments)
- Feat: Support Ranking Method (#1820, commented May 27, 2025 • 0 new comments)
- fix(types): remove redundant type (typo with repeating lines) (#1971, commented May 9, 2025 • 0 new comments)
- feat: Add Gemma3 chat handler (#1976) (#1989, commented May 30, 2025 • 0 new comments)
- Added support for overriding tensor buffer types (#2007, commented May 22, 2025 • 0 new comments)
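Issue #1498 above asks for a `usage` key when `create_completion` streams. For context, the OpenAI-style streaming convention this request follows is to emit token counts in a final chunk; the sketch below only illustrates the shape of such a chunk (field names follow the OpenAI completion-chunk schema, and `final_chunk`/`total_tokens` are hypothetical names for illustration; nothing here calls llama-cpp-python itself):

```python
# Illustration of an OpenAI-style final streaming chunk carrying `usage`,
# the shape requested for create_completion(stream=True) in issue #1498.
final_chunk = {
    "id": "cmpl-example",
    "object": "text_completion",
    "choices": [],  # the usage-only final chunk carries no text
    "usage": {
        "prompt_tokens": 8,
        "completion_tokens": 32,
        "total_tokens": 40,
    },
}

def total_tokens(chunk: dict) -> int:
    """Sum prompt and completion tokens from a chunk's usage block."""
    usage = chunk["usage"]
    return usage["prompt_tokens"] + usage["completion_tokens"]
```

Clients consuming the stream would ignore chunks without a `usage` key and read counts only from this trailing chunk.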