GGBond (short for GGML Bounding) is a simple, naive Python binding for GGML built with pybind11. Work in progress.
pip install -e .
Enable the HIP backend at build time:
CMAKE_ARGS="-DGGML_HIP=ON" pip install -e .

Requirements:
- Python 3.10+
- CMake 3.15+
- C++17 compatible compiler
- ROCm/HIP toolchain (only for the hip backend)
GGBond provides three API styles:
The raw API is a near 1:1 mapping of the GGML C API. Functions follow GGML naming conventions (new_tensor_2d, mul_mat, backend_graph_compute, etc.). All GGML opaque pointers are wrapped as distinct Python types (ggml.Context, ggml.Tensor, ggml.Backend, etc.) for type safety. Use this API when you need full control over the GGML computation model.
The second style is an object-oriented wrapper around GGML primitives with lifecycle management (close() / context manager). It is suitable for production use when you want explicit control of context, graph, and backend lifecycles with less boilerplate than the raw API. It is not a complete 1:1 equivalent of the C API mapping: for example, Graph internally owns its own Context, and Context computes its memory size automatically from n_tensors. Compared with the tensor API, it exposes lower-level control and fewer convenience abstractions.
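The close() / context manager pairing is the standard Python resource pattern. The sketch below uses a hypothetical stand-in class, ManagedResource, rather than any real GGBond type, because the wrapper's class names and constructors are not listed here. It only illustrates how an explicit close() call and a with block relate.

```python
class ManagedResource:
    """Hypothetical stand-in for an object that owns native memory."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        # Idempotent: calling close() more than once is safe.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Always release, even if the body of the with block raised.
        self.close()
        return False  # do not swallow exceptions


# Explicit lifecycle: the caller is responsible for calling close().
explicit = ManagedResource("explicit")
try:
    pass  # ... use the resource ...
finally:
    explicit.close()

# Context-manager lifecycle: close() runs automatically on exit.
with ManagedResource("scoped") as scoped:
    open_inside_block = not scoped.closed
```

The with form is usually preferable when the resource's lifetime matches a single block of code; the explicit form fits objects that must outlive the function that created them.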
The tensor API is built around Session and Tensor. Session owns a backend and manages all resource lifetimes. Tensor is a lazily evaluated tensor bound to a session: operations build a computation graph, which is materialized on compute() or numpy(). GGUF model weights are loaded directly onto the target backend (CPU/Metal/HIP) without intermediate copies.
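To make the idea of lazy evaluation concrete, here is a toy sketch in pure Python. LazyMatrix is a hypothetical illustration, not GGBond's Tensor implementation: it shows only the general technique of recording operations as graph nodes and doing the arithmetic when compute() is called.

```python
class LazyMatrix:
    """Toy lazy matrix: records operations, computes only on demand."""

    def __init__(self, data=None, op=None, inputs=()):
        self.data = data      # concrete rows for leaf nodes, else None
        self.op = op          # name of the deferred operation, if any
        self.inputs = inputs  # upstream nodes this node depends on

    def __add__(self, other):
        # No arithmetic happens here; a graph node is recorded instead.
        return LazyMatrix(op="add", inputs=(self, other))

    def compute(self):
        # Materialize: evaluate the inputs first, then this node, and cache.
        if self.data is None:
            left, right = (node.compute() for node in self.inputs)
            if self.op == "add":
                self.data = [[x + y for x, y in zip(r1, r2)]
                             for r1, r2 in zip(left, right)]
            else:
                raise ValueError(f"unknown op: {self.op}")
        return self.data


a = LazyMatrix([[1, 2], [3, 4]])
b = LazyMatrix([[10, 20], [30, 40]])
c = a + b + a              # two deferred add nodes, nothing evaluated yet
pending = c.data is None   # the result does not exist until compute()
result = c.compute()       # [[12, 24], [36, 48]]
```

Deferring work this way lets a library see the whole expression before running it, which is what allows a graph to be scheduled on a backend in one pass.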
Supported backends:
- cpu (always available)
- metal (macOS only)
- hip (requires a build with GGML_HIP=ON; alias: rocm)
import ggbond
matrix_a = [[2, 8], [5, 1], [4, 2], [8, 6]]
matrix_b = [[10, 5], [9, 9], [5, 4]]
s = ggbond.Session("cpu")
a = s.tensor(matrix_a)
b = s.tensor(matrix_b)
print((a @ b).numpy())
s.close()

Tensor can also be instantiated directly, but the session must be passed explicitly:
import ggbond
s = ggbond.Session("cpu")
matrix_a = [[2, 8], [5, 1], [4, 2], [8, 6]]
matrix_b = [[10, 5], [9, 9], [5, 4]]
a = ggbond.Tensor(matrix_a, session=s)
b = ggbond.Tensor(matrix_b, session=s)
print((a @ b).numpy())
s.close()

Examples:
- examples/simple.py: Matrix multiplication
- examples/magika.py: File type detection with Magika
- examples/gpt2.py: GPT-2 text generation
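The matrices in the usage snippets above are 4x2 and 3x2, which is not a valid pair for a conventional row-by-column product. GGML's mul_mat instead requires both operands to share their innermost dimension and takes dot products between their rows, which amounts to A times the transpose of B. The pure-Python sketch below reproduces that arithmetic. It assumes that the @ operator forwards to mul_mat with this convention, which is not confirmed here, and the array returned by numpy() may be laid out as the transpose of this result depending on how the binding maps GGML dimensions to NumPy axes.

```python
def rowwise_products(a, b):
    """Return a times b-transposed: entry [i][j] is dot(a[i], b[j])."""
    if len(a[0]) != len(b[0]):
        raise ValueError("rows of a and b must have the same length")
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b]
            for row_a in a]


matrix_a = [[2, 8], [5, 1], [4, 2], [8, 6]]   # 4 rows, 2 columns
matrix_b = [[10, 5], [9, 9], [5, 4]]          # 3 rows, 2 columns

product = rowwise_products(matrix_a, matrix_b)  # 4 rows, 3 columns
for row in product:
    print(row)
```

Under these assumptions the values are 60, 90, 42 in the first row, then 55, 54, 29, then 50, 54, 28, and finally 110, 126, 64.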
License: MIT
