Add chat format to support baichuan2 by caiyesd · Pull Request #936 · abetlen/llama-cpp-python

Conversation

@caiyesd
Contributor
@caiyesd caiyesd commented Nov 22, 2023

I found that this cool project doesn't support baichuan2, so I made this PR to add support for it.

What is added

I added a new chat format named "baichuan-2" to support the baichuan2 chat models.
I have only tested Baichuan2-7B-Chat; I am not sure whether Baichuan2-13B-Chat works.
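For reference, the prompt layout a Baichuan2-style chat format produces can be sketched as below. This is a minimal illustration, not the code merged in this PR: the `<reserved_106>`/`<reserved_107>` role tokens follow the Baichuan2 model card, and the helper name is hypothetical.

```python
# Hypothetical sketch of a Baichuan2-style chat prompt builder.
# Per the Baichuan2 model card, <reserved_106> marks a user turn and
# <reserved_107> an assistant turn; this is NOT the exact PR code.
def format_baichuan2_prompt(messages):
    prompt = ""
    for message in messages:
        if message["role"] == "system":
            # System text is prepended untagged.
            prompt += message["content"]
        elif message["role"] == "user":
            prompt += "<reserved_106>" + message["content"]
        elif message["role"] == "assistant":
            prompt += "<reserved_107>" + message["content"]
    # End with the assistant token so the model generates its reply next.
    return prompt + "<reserved_107>"
```

A chat format handler registered with the server would build such a prompt from the OpenAI-style `messages` list before handing it to the model.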

How to use it

  1. Convert the Baichuan2 model to Baichuan1 format.

see: https://github.com/baichuan-inc/Baichuan2

Replace lm_head.weight with the new one, as described in that repository.

  2. Use llama.cpp's convert.py to convert Baichuan2-7B-Chat to GGUF format.
python3 convert.py --outfile models/baichuan-2-7b-chat-baichuan-f16.gguf ../baichuan/Baichuan2-7B-Chat-Baichuan/
  3. Quantize the model, taking Q3_K_M as an example.
./quantize models/baichuan-2-7b-chat-baichuan-f16.gguf models/baichuan-2-7b-chat-baichuan-Q3_K_M.gguf Q3_K_M
  4. Launch the llama-cpp-python server and test it. Remember to pass --chat_format baichuan-2.
python3 -m llama_cpp.server --model models/baichuan-2-7b-chat-baichuan-Q3_K_M.gguf --host 0.0.0.0 --port 9000 --n_gpu_layers -1 --chat_format baichuan-2
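Once the server is up, it exposes an OpenAI-compatible endpoint. A minimal sketch of a test request follows; the host and port match the launch command above, everything else is Python stdlib, and the prompt text is just an example.

```python
import json
import urllib.request

def build_chat_request(prompt):
    # Build the JSON body for the OpenAI-compatible chat completions endpoint.
    return json.dumps({"messages": [{"role": "user", "content": prompt}]})

if __name__ == "__main__":
    body = build_chat_request("Who are you?").encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:9000/v1/chat/completions",  # matches --host/--port above
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
        print(reply["choices"][0]["message"]["content"])
```

The server applies the selected chat format (here baichuan-2) to the `messages` list before generation, so the client never has to build the raw prompt itself.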


Signed-off-by: caiyesd <caiyesd@gmail.com>
@abetlen
Owner
abetlen commented Nov 22, 2023

@caiyesd thank you for the contribution!

@abetlen abetlen merged commit b8f29f4 into abetlen:main Nov 22, 2023