Support tie embedding for chatglm models #13328

piDack · 2025-05-06T03:48:11Z

The GLM 1.5B model uses tied embeddings, but the chatglm architecture does not take this into account, resulting in a "missing tensor 'output.weight'" error. This PR fixes the issue.
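As a rough illustration of the fallback described above (not the actual C++ change in src/llama-model.cpp), the loader can treat `output.weight` as optional and reuse the token embedding when it is absent. The function name `resolve_output_weight` and the dict-of-lists tensor store are hypothetical stand-ins used only for this sketch.

```python
def resolve_output_weight(tensors):
    """Pick the tensor to use for the output projection.

    `tensors` is a hypothetical mapping from GGUF tensor name to tensor
    data; plain nested lists stand in for real tensors here.
    """
    # output.weight is treated as optional: it may be absent when the
    # model was exported with tied embeddings.
    output = tensors.get("output.weight")
    if output is None:
        # Tied embeddings: reuse the input token embedding matrix.
        output = tensors["token_embd.weight"]
    return output


# A tied model ships only the token embedding.
tied = {"token_embd.weight": [[0.1, 0.2], [0.3, 0.4]]}

# An untied model ships a separate output projection.
untied = {
    "token_embd.weight": [[0.1, 0.2], [0.3, 0.4]],
    "output.weight": [[0.5, 0.6], [0.7, 0.8]],
}
```

With this shape, a file that still carries its own `output.weight` loads exactly as before, and only a file without one takes the fallback path.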

src/llama-model.cpp

CISC · 2025-05-06T17:19:20Z

BTW, I checked both the safetensors and GGUFs, and even though tied word embeddings are enabled, the safetensors contain both sets of (most likely identical) weights, and so do the GGUFs.

Which GGUF file gave you this error?

To make such a GGUF you would have to do something similar to this (which could make sense to incorporate into this PR, but I'd like to know where and why such a GGUF already exists first):

llama.cpp/convert_hf_to_gguf.py

Lines 4309 to 4313 in 764b856

    
           # assuming token_embd.weight is seen before output.weight 
        
           if self._tok_embd is not None and new_name == output_name: 
        
               if torch.equal(self._tok_embd, data_torch): 
        
                   logger.debug(f"{output_name} is equivalent to {tok_embd_name}, omitting") 
        
                   return []
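
The quoted converter logic can be tried in isolation with a small self-contained sketch. This is only an approximation under stated assumptions: the class `TiedOutputFilter` and its `modify_tensors` signature are hypothetical, plain nested lists stand in for torch tensors, and `==` stands in for `torch.equal`.

```python
class TiedOutputFilter:
    """Drop output.weight when it duplicates token_embd.weight.

    Hypothetical helper for illustration; it is not part of
    convert_hf_to_gguf.py.
    """

    def __init__(self, tok_embd_name="token_embd.weight",
                 output_name="output.weight"):
        self.tok_embd_name = tok_embd_name
        self.output_name = output_name
        self._tok_embd = None  # remembered once the embedding is seen

    def modify_tensors(self, new_name, data):
        """Return the (name, data) pairs that should be written out."""
        if new_name == self.tok_embd_name:
            self._tok_embd = data
        # assuming token_embd.weight is seen before output.weight
        if self._tok_embd is not None and new_name == self.output_name:
            if self._tok_embd == data:  # stand-in for torch.equal
                # The output projection is identical to the embedding,
                # so it is omitted and the loader must fall back.
                return []
        return [(new_name, data)]


tied_filter = TiedOutputFilter()
kept_embd = tied_filter.modify_tensors("token_embd.weight", [[1.0, 2.0]])
kept_output = tied_filter.modify_tensors("output.weight", [[1.0, 2.0]])

untied_filter = TiedOutputFilter()
untied_filter.modify_tensors("token_embd.weight", [[1.0, 2.0]])
kept_untied = untied_filter.modify_tensors("output.weight", [[3.0, 4.0]])
```

A GGUF produced this way has no `output.weight`, which is exactly the case where the loader needs to reuse the token embedding.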

piDack · 2025-05-07T05:39:31Z

Sorry, this weight does not come from Hugging Face; it is one I made for my own testing. Indeed, there is no output.weight in the safetensors file.

[Screenshot: Hugging Face weight output]

[Screenshot: my own weight output]

I think adding the relevant checks is the right thing to do; at the very least, it can prevent some problems. Right?

engineer1109 · 2025-05-08T07:07:24Z

Good Job

support tie embedding for chatglm models

5ba8f23

CISC requested changes May 6, 2025

src/llama-model.cpp (outdated; resolved)

Update src/llama-model.cpp

5aaf38c

Format adjustment Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

CISC approved these changes May 7, 2025

CISC merged commit 6c7fd67 into ggml-org:master May 7, 2025
46 checks passed
