can't quant llama3 with expanded tokenizer #13628

Open
SicariusSicariiStuff opened this issue May 19, 2025 · 0 comments

Name and Version

The latest llama.cpp won't convert/quantize a Llama 3 model with an expanded BPE tokenizer (the same model works fine at FP16 and FP8 under Aphrodite / Transformers / KoboldCpp).

Operating systems

Linux

GGML backends

CUDA

Hardware

2x RTX A6000

Models

Llama 3.1

Problem description & steps to reproduce

Run convert_hf_to_gguf.py on the finetuned model. Conversion aborts in get_vocab_base_pre() with a NotImplementedError because the expanded BPE pre-tokenizer is not recognized (full traceback in the relevant log output below).
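
For context, convert_hf_to_gguf.py identifies the BPE pre-tokenizer by encoding a fixed probe string with the model's tokenizer and hashing the resulting token IDs. Expanding the vocabulary changes that encoding, so the hash matches none of the known entries and the conversion aborts. A minimal sketch of the check, assuming a hypothetical local model path and abbreviating the long probe string from the script:

from hashlib import sha256
from transformers import AutoTokenizer

# Placeholder path to the finetune with the expanded tokenizer.
tokenizer = AutoTokenizer.from_pretrained("/path/to/expanded-llama3")

# The real script defines a much longer multilingual probe string; abbreviated here.
chktxt = "\n \n\n \n\n\n \t \t\t \t\n  \n   \n    \n     \n..."

# Same hashing scheme as get_vocab_base_pre(): hash the repr of the token-ID list.
chktok = tokenizer.encode(chktxt)
chkhsh = sha256(str(chktok).encode()).hexdigest()

# get_vocab_base_pre() compares chkhsh against a table of known hashes
# (the stock Llama 3 tokenizer maps to res = "llama-bpe"); an unknown
# hash falls through to the NotImplementedError above.
print(chkhsh)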

First Bad Commit

No response

Relevant log output

Traceback (most recent call last):
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 5689, in <module>
    main()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 5575, in main
    model_instance.write()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 441, in write
    self.prepare_metadata(vocab_only=False)
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 434, in prepare_metadata
    self.set_vocab()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 1624, in set_vocab
    self._set_vocab_gpt2()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 746, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 527, in get_vocab_base
    tokpre = self.get_vocab_base_pre(tokenizer)
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 734, in get_vocab_base_pre
    raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()
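
The usual fix is to teach the converter about the new tokenizer hash: either add the model to the models list in convert_hf_to_gguf_update.py and re-run that script to regenerate the hash table, or add a branch by hand in get_vocab_base_pre(). A hypothetical sketch of the manual branch, to be placed alongside the existing checks in convert_hf_to_gguf.py; the hash string is a placeholder for the chkhsh value the converter logs before raising:

# Hypothetical addition inside get_vocab_base_pre() in convert_hf_to_gguf.py.
# Since the base model is Llama 3.1, mapping the expanded tokenizer to the
# existing "llama-bpe" pre-tokenizer is the likely intent.
if chkhsh == "<chkhsh logged by the converter for this model>":
    # ref: Llama 3.1 finetune with expanded tokenizer (this issue)
    res = "llama-bpe"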