Allow Llama objects to be freed earlier again · notwa/llama-cpp-python@a79c7ff · GitHub
Commit a79c7ff

Allow Llama objects to be freed earlier again
Commit 9018270 introduced a cyclic dependency within Llama objects. That change causes old models to linger in memory longer than necessary, creating memory bloat in most applications that switch between models at runtime. This patch simply removes the problematic line, allowing models to deallocate promptly without relying on the cyclic garbage collector. One might also consider combining `weakref.ref` with a `@property` if the `llama` attribute absolutely must be exposed on the tokenizer class.
1 parent 63b0c37 commit a79c7ff
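The `weakref.ref` plus `@property` approach suggested in the commit message could look roughly like the sketch below. The `Llama` stand-in class is hypothetical and only illustrates the pattern; it is not the real `llama_cpp.Llama`. The key point is that a weak reference does not participate in a reference cycle, so in CPython the model is freed as soon as the last strong reference goes away.

```python
import weakref

class Llama:
    """Hypothetical stand-in for llama_cpp.Llama, for illustration only."""
    def __init__(self):
        self._model = object()  # placeholder for the loaded model

class LlamaTokenizer:
    def __init__(self, llama):
        # Hold only a weak reference so the tokenizer does not keep
        # the Llama object (and its model weights) alive.
        self._llama_ref = weakref.ref(llama)
        self._model = llama._model

    @property
    def llama(self):
        # Dereference the weakref; returns None once the Llama
        # object has been freed.
        return self._llama_ref()

llm = Llama()
tok = LlamaTokenizer(llm)
assert tok.llama is llm    # strong reference still exists

del llm                    # last strong reference dropped
assert tok.llama is None   # tokenizer did not keep the model alive
```

With this design the public `llama` attribute survives, but deleting the last external reference to the model deallocates it immediately under CPython's reference counting, which is the behavior this commit restores.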

File tree

1 file changed: +0 −1 lines changed

llama_cpp/llama_tokenizer.py

Lines changed: 0 additions & 1 deletion
@@ -27,7 +27,6 @@ def detokenize(
 
 
 class LlamaTokenizer(BaseLlamaTokenizer):
     def __init__(self, llama: llama_cpp.Llama):
-        self.llama = llama
         self._model = llama._model  # type: ignore
 
     def tokenize(

0 commit comments

Comments
 (0)
0