8000 GGUF metadata KV overrides, re #1011 by phiharri · Pull Request #1116 · abetlen/llama-cpp-python · GitHub
[go: up one dir, main page]

Skip to content

GGUF metadata KV overrides, re #1011 #1116

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 24, 2024
Merged

Conversation

phiharri
Copy link
Contributor

(starting from a clean branch)

76aafa6 didn't work for me, _kv_overrides_array[i] refs uninitialised i and I think it's necessary to instantiate the CStructs array with () or it ends up a generic PyCArrayType.

Let me know what you think about this PR instead.

Simple validation that it takes effect:

>>> llm=llama_cpp.Llama(model_path="../models/mixtral-8x7b-instruct-v0.1.Q6_K.gguf", kv_overrides={"llama.embedding_length":4095})

validate_override: Using metadata override (  int) 'llama.embedding_length' = 4095
llama_model_load: error loading model: invalid n_rot: 128, expected 127 

@abetlen
Copy link
Owner
abetlen commented Jan 22, 2024

@phiharri looks good, is the sentinel value not required, I saw that in your original PR.

@phiharri
Copy link
Contributor Author

Not sure where I got that idea from tbh, pretty sure it isn't needed. Thought about checking len(key)<128 for potential overflow but it seems ctypes takes care of that already 🤷

>>> k='a'*130
>>> kvo.key=k.encode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: bytes too long (130, maximum length 128)

@abetlen
Copy link
Owner
abetlen commented Jan 22, 2024

I think it is necessary to add an additional key and set it to the null byte, see https://github.com/ggerganov/llama.cpp/blob/6f9939d119b2d004c264952eb510bd106455531e/llama.cpp#L2250

@phiharri
Copy link
Contributor Author

Sure, how's this look? Elements get initialised empty/null but if you wanted to be explicit we can add
self._kv_overrides_array[-1].key = b'\0'

@abetlen
Copy link
Owner
abetlen commented Jan 23, 2024

@phiharri that's correct however I think(?) we still need to store the reference to that byte string because we're passing it into a c data structure and want to avoid any memory issues.

@phiharri
Copy link
Contributor Author

Aren't we just modifying data in the existing _kv_overrides_array, not allocating new memory (unless doing create_string_buffer)?

Definitely seems the array gets null initialised so I don't think self._kv_overrides_array[-1].key = b'\0' is needed. https://docs.python.org/3/library/ctypes.html#arrays "the array contents is initialized to zeros"

Peeking the memory of the last element without doing any b'\0' assignment shows all zeros:

null init

@abetlen
Copy link
Owner
abetlen commented Jan 23, 2024

Doh you're right, I thought the key was a char * not a fixed size array, yes as long as it's zero initialized (I'd like to be explicit about this) then it should be okay.

@abetlen
Copy link
Owner
abetlen commented Jan 24, 2024

Awesome, okay good to merge it then. Thanks for the contribution @phiharri

@abetlen abetlen merged commit fe5d6ea into abetlen:main Jan 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0