gguf-py: Support identity operation in TensorNameMap #3095
Merged
edit: Just to make it a bit more clear what this is trying to do:

`TensorNameMap` is used to map the assorted naming conventions for types of tensors in various models to GGUF convention. A HuggingFace LLaMA model might call the attention norm tensor `model.layers.1.input_layernorm`, the `.pth` version might call it something different and so on. In GGUF it's called `blk.1.attn_norm`.

However, currently `TensorNameMap` only maps the non-GGUF names to the GGUF name. If you already have the GGUF name and try to map, it'll fail. This pull just adds an entry for the GGUF-style name to the list so trying to map a name that's already correct is a no-op.

Before:
After:
This also fixes an issue where you had to specify `try_suffixes` to `TensorNameMap.get_name` and friends. It just sets the default value for the keyword param to an empty sequence (I meant to do this originally but apparently I messed it up).
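
For illustration only, here is a minimal sketch of the two behaviours described above: an identity entry for the GGUF-style name, and a `try_suffixes` keyword that defaults to an empty sequence. `SimpleTensorNameMap` is a hypothetical, simplified stand-in and not the actual gguf-py code; the `.pth`-style alias and the suffix handling are assumptions made for the example.

```python
from typing import Dict, Optional, Sequence


class SimpleTensorNameMap:
    """Hypothetical, simplified stand-in for the mapping described above."""

    def __init__(self, n_blocks: int) -> None:
        self.mapping: Dict[str, str] = {}
        # GGUF-style template -> names other formats might use.
        templates = {
            "blk.{bid}.attn_norm": (
                "model.layers.{bid}.input_layernorm",  # HuggingFace-style name from the description
                "layers.{bid}.attention_norm",         # assumed .pth-style name, for illustration
            ),
        }
        for gguf_tmpl, aliases in templates.items():
            for bid in range(n_blocks):
                gguf_name = gguf_tmpl.format(bid=bid)
                # Identity entry: a name that is already GGUF-style maps to itself.
                self.mapping[gguf_name] = gguf_name
                for alias in aliases:
                    self.mapping[alias.format(bid=bid)] = gguf_name

    def get_name(self, key: str, try_suffixes: Sequence[str] = ()) -> Optional[str]:
        # Default of () means callers no longer have to pass try_suffixes.
        result = self.mapping.get(key)
        if result is not None:
            return result
        for suffix in try_suffixes:
            if suffix and key.endswith(suffix):
                base = self.mapping.get(key[: len(key) - len(suffix)])
                if base is not None:
                    return base + suffix
        return None


name_map = SimpleTensorNameMap(n_blocks=2)
print(name_map.get_name("model.layers.1.input_layernorm"))
print(name_map.get_name("blk.1.attn_norm"))  # already GGUF-style: a no-op
print(name_map.get_name("blk.1.attn_norm.weight", try_suffixes=(".weight", ".bias")))
```

With the identity entry in place, callers in this sketch do not need to know whether a name has already been converted before looking it up.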