Update README.md · TaoAthe/falcon-cpp-python@ab2cab5

Commit ab2cab5

Update README.md
1 parent bb3b70b commit ab2cab5


1 file changed: +21 additions, -94 deletions


README.md

Lines changed: 21 additions & 94 deletions
@@ -1,87 +1,26 @@
-# 🦙 Python Bindings for `llama.cpp`
+# Python Bindings for `ggllm.cpp`

-[![Documentation Status](https://readthedocs.org/projects/llama-cpp-python/badge/?version=latest)](https://llama-cpp-python.readthedocs.io/en/latest/?badge=latest)
-[![Tests](https://github.com/abetlen/llama-cpp-python/actions/workflows/test.yaml/badge.svg?branch=main)](https://github.com/abetlen/llama-cpp-python/actions/workflows/test.yaml)
-[![PyPI](https://img.shields.io/pypi/v/llama-cpp-python)](https://pypi.org/project/llama-cpp-python/)
-[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/llama-cpp-python)](https://pypi.org/project/llama-cpp-python/)
-[![PyPI - License](https://img.shields.io/pypi/l/llama-cpp-python)](https://pypi.org/project/llama-cpp-python/)
-[![PyPI - Downloads](https://img.shields.io/pypi/dm/llama-cpp-python)](https://pypi.org/project/llama-cpp-python/)

-Simple Python bindings for **@ggerganov's** [`llama.cpp`](https://github.com/ggerganov/llama.cpp) library.
+Simple Python bindings for the [`ggllm.cpp`](https://github.com/cmp-nct/ggllm.cpp) library.
 This package provides:

 - Low-level access to C API via `ctypes` interface.
 - High-level Python API for text completion
 - OpenAI-like API
 - LangChain compatibility

-Documentation is available at [https://llama-cpp-python.readthedocs.io/en/latest](https://llama-cpp-python.readthedocs.io/en/latest).
+This project is currently in alpha development and is not yet completely functional. Any contributions are warmly welcomed.


-## Installation from PyPI (recommended)
-
-Install from PyPI (requires a c compiler):
-
-```bash
-pip install llama-cpp-python
-```
-
-The above command will attempt to install the package and build `llama.cpp` from source.
-This is the recommended installation method as it ensures that `llama.cpp` is built with the available optimizations for your system.
-
-If you have previously installed `llama-cpp-python` through pip and want to upgrade your version or rebuild the package with different compiler options, please add the following flags to ensure that the package is rebuilt correctly:
-
-```bash
-pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
-```
-
-Note: If you are using Apple Silicon (M1) Mac, make sure you have installed a version of Python that supports arm64 architecture. For example:
-```
-wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
-bash Miniforge3-MacOSX-arm64.sh
-```
-Otherwise, while installing it will build the llama.ccp x86 version which will be 10x slower on Apple Silicon (M1) Mac.
-
-### Installation with OpenBLAS / cuBLAS / CLBlast / Metal
-
-`llama.cpp` supports multiple BLAS backends for faster processing.
-Use the `FORCE_CMAKE=1` environment variable to force the use of `cmake` and install the pip package for the desired BLAS backend.
-
-To install with OpenBLAS, set the `LLAMA_OPENBLAS=1` environment variable before installing:
-
-```bash
-CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
-```
-
-To install with cuBLAS, set the `LLAMA_CUBLAS=1` environment variable before installing:
-
-```bash
-CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
-```
-
-To install with CLBlast, set the `LLAMA_CLBLAST=1` environment variable before installing:
-
-```bash
-CMAKE_ARGS="-DLLAMA_CLBLAST=on" FORCE_CMAKE=1 pip install llama-cpp-python
-```
-
-To install with Metal (MPS), set the `LLAMA_METAL=on` environment variable before installing:
-
-```bash
-CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install llama-cpp-python
-```
-
-Detailed MacOS Metal GPU install documentation is available at [docs/install/macos.md](docs/install/macos.md)
-
 ## High-level API

 The high-level API provides a simple managed interface through the `Llama` class.

 Below is a short example demonstrating how to use the high-level API to generate text:

 ```python
->>> from llama_cpp import Llama
->>> llm = Llama(model_path="./models/7B/ggml-model.bin")
+>>> from falcon_cpp import Falcon
+>>> llm = Falcon(model_path="./models/7B/ggml-model.bin")
 >>> output = llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stop=["Q:", "\n"], echo=True)
 >>> print(output)
 {
@@ -107,57 +46,45 @@ Below is a short example demonstrating how to use the high-level API to generate

 ## Web Server

-`llama-cpp-python` offers a web server which aims to act as a drop-in replacement for the OpenAI API.
-This allows you to use llama.cpp compatible models with any OpenAI compatible client (language libraries, services, etc).
+`falcon-cpp-python` offers a web server which aims to act as a drop-in replacement for the OpenAI API.
+This allows you to use ggllm.cpp to run inference on Falcon models with any OpenAI-compatible client (language libraries, services, etc).

 To install the server package and get started:

 ```bash
-pip install llama-cpp-python[server]
 python3 -m llama_cpp.server --model models/7B/ggml-model.bin
 ```

 Navigate to [http://localhost:8000/docs](http://localhost:8000/docs) to see the OpenAPI documentation.
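
Since the server aims to mirror the OpenAI REST API, any HTTP client can query it once it is running. Below is a minimal sketch using `requests`; the `/v1/completions` route and the request fields follow the usual OpenAI completion conventions and are assumptions here, not something verified against this fork:

```python
import requests

# Assumes the server started above is listening on http://localhost:8000 and
# exposes an OpenAI-style /v1/completions endpoint (an assumption, see above).
response = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "prompt": "Q: Name the planets in the solar system? A: ",
        "max_tokens": 32,
        "stop": ["Q:", "\n"],
    },
    timeout=60,
)
print(response.json())
```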

-## Docker image
-
-A Docker image is available on [GHCR](https://ghcr.io/abetlen/llama-cpp-python). To run the server:
-
-```bash
-docker run --rm -it -p 8000:8000 -v /path/to/models:/models -e MODEL=/models/ggml-model-name.bin ghcr.io/abetlen/llama-cpp-python:latest
-```
-
 ## Low-level API

 The low-level API is a direct [`ctypes`](https://docs.python.org/3/library/ctypes.html) binding to the C API provided by `llama.cpp`.
-The entire lowe-level API can be found in [llama_cpp/llama_cpp.py](https://github.com/abetlen/llama-cpp-python/blob/master/llama_cpp/llama_cpp.py) and directly mirrors the C API in [llama.h](https://github.com/ggerganov/llama.cpp/blob/master/llama.h).
+The entire low-level API can be found in [falcon_cpp/falcon_cpp.py](https://github.com/sirajperson/falcon-cpp-python/blob/master/falcon_cpp/falcon_cpp.py) and directly mirrors the C API in [libfalcon.h](https://github.com/cmp-nct/ggllm.cpp/blob/master/libfalcon.h).

 Below is a short example demonstrating how to use the low-level API to tokenize a prompt:

 ```python
->>> import llama_cpp
+>>> import falcon_cpp
 >>> import ctypes
->>> params = llama_cpp.llama_context_default_params()
+>>> params = falcon_cpp.falcon_context_default_params()
 # use bytes for char * params
->>> ctx = llama_cpp.llama_init_from_file(b"./models/7b/ggml-model.bin", params)
+>>> ctx = falcon_cpp.falcon_init_backend(b"./models/7b/ggml-model.bin", params)
 >>> max_tokens = params.n_ctx
 # use ctypes arrays for array params
->>> tokens = (llama_cpp.llama_token * int(max_tokens))()
->>> n_tokens = llama_cpp.llama_tokenize(ctx, b"Q: Name the planets in the solar system? A: ", tokens, max_tokens, add_bos=llama_cpp.c_bool(True))
->>> llama_cpp.llama_free(ctx)
+>>> tokens = (falcon_cpp.falcon_token * int(max_tokens))()
+>>> n_tokens = falcon_cpp.falcon_tokenize(ctx, b"Q: Name the planets in the solar system? A: ", tokens, max_tokens, add_bos=falcon_cpp.c_bool(True))
+>>> falcon_cpp.falcon_free(ctx)
 ```

 Check out the [examples folder](examples/low_level_api) for more examples of using the low-level API.

-
 # Documentation
-
-Documentation is available at [https://abetlen.github.io/llama-cpp-python](https://abetlen.github.io/llama-cpp-python).
-If you find any issues with the documentation, please open an issue or submit a PR.
+Coming soon...

 # Development

-This package is under active development and I welcome any contributions.
+Again, this package is under active development and I welcome any contributions.

 To get started, clone the repository and install the package in development mode:

@@ -179,12 +106,12 @@ poetry install --all-extras
 python3 setup.py develop
 ```

-# How does this compare to other Python bindings of `llama.cpp`?
-
-I originally wrote this package for my own use with two goals in mind:
+# This Project is a fork of llama-cpp-python

-- Provide a simple process to install `llama.cpp` and access the full C API in `llama.h` from Python
-- Provide a high-level Python API that can be used as a drop-in replacement for the OpenAI API so existing apps can be easily ported to use `llama.cpp`
+This project was originally llama-cpp-python and owes an immense thanks to @abetlen.
+This project's goals are to:
+- Provide a simple process to install `ggllm.cpp` and access the full C API in `libfalcon.h` from Python
+- Provide a high-level Python API that can be used as a drop-in replacement for the OpenAI API so existing apps can be easily ported to use `ggllm.cpp`

 Any contributions and changes to this package will be made with these goals in mind.
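
Given the drop-in OpenAI goal above, one way an existing application could be ported is by pointing the `openai` Python client at a local server started as shown in the Web Server section. The sketch below is illustrative only: it assumes a pre-1.0 `openai` client, a server listening on port 8000, and OpenAI-style `/v1` routes, none of which has been verified against this fork.

```python
import openai

# Hypothetical port of an existing OpenAI-based app: redirect the client to the
# local falcon-cpp-python server instead of api.openai.com (assumptions above).
openai.api_base = "http://localhost:8000/v1"
openai.api_key = "sk-not-needed-for-a-local-server"

completion = openai.Completion.create(
    model="ggml-model",  # placeholder name; the server loads the model passed at startup
    prompt="Q: Name the planets in the solar system? A: ",
    max_tokens=32,
    stop=["Q:", "\n"],
)
print(completion["choices"][0]["text"])
```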

0 commit comments
