Simple Python bindings for @ggerganov's llama.cpp
library.
This package provides:

- Low-level access to the C API via the `ctypes` interface
- High-level Python API for text completion (see the example below)
- OpenAI-like API
- LangChain compatibility
Documentation is available at https://llama-cpp-python.readthedocs.io/en/latest.
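As a quick illustration of the high-level API, here is a minimal sketch of running a text completion; the model path is a placeholder and assumes you have already downloaded a ggml-format model:

```python
from llama_cpp import Llama

# Load a ggml-format model from disk (the path is a placeholder).
llm = Llama(model_path="./models/7B/ggml-model.bin")

# Run a text completion; generation stops early at any stop sequence.
output = llm(
    "Q: Name the planets in the solar system? A: ",
    max_tokens=32,
    stop=["Q:", "\n"],
    echo=True,
)
print(output["choices"][0]["text"])
```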
Install from PyPI (requires a C compiler):

```bash
pip install llama-cpp-python
```
The above command will attempt to install the package and build llama.cpp
from source.
This is the recommended installation method as it ensures that llama.cpp
is built with the available optimizations for your system.
If you have previously installed `llama-cpp-python` through pip and want to upgrade to a newer version or rebuild the package with different compiler options, add the following flags to ensure that the package is rebuilt correctly:

```bash
pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
```
Note: If you are using an Apple Silicon (M1) Mac, make sure you have installed a version of Python that supports the arm64 architecture. For example:

```bash
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
bash Miniforge3-MacOSX-arm64.sh
```
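To confirm that your interpreter is running natively on arm64 rather than under Rosetta, a quick generic Python check (not specific to this package) is:

```python
import platform

# Prints "arm64" for a native Apple Silicon build,
# "x86_64" if the interpreter is running under Rosetta.
print(platform.machine())
```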
Otherwise, the installation will build the x86 version of llama.cpp, which will be 10x slower on Apple Silicon (M1) Macs.
`llama.cpp` supports multiple BLAS backends for faster processing. Use the `FORCE_CMAKE=1` environment variable to force the use of cmake and install the pip package for the desired BLAS backend.
To install with OpenBLAS, pass the `-DLLAMA_OPENBLAS=on` CMake flag via `CMAKE_ARGS` before installing:

```bash
CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
```
To install with cuBLAS, pass the `-DLLAMA_CUBLAS=on` CMake flag via `CMAKE_ARGS` before installing:

```bash
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
```