Add Llama-3 chat format #1371
Conversation
Force-pushed from a1bfeb3 to f114963
I made the system message be just like any other role, since from the reference code there doesn't seem to be a distinction between those.
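For context, a minimal sketch of that idea: every role, including system, gets the same header/<|eot_id|> wrapping described in the reference tokenizer. The helper names below are illustrative only, not the actual functions in this PR.

```python
# Illustrative sketch only -- not the PR's actual helpers. It shows the uniform
# per-role wrapping from the Llama-3 reference tokenizer: no special casing for
# the system message.
def render_message(role: str, content: str) -> str:
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"

def render_prompt(messages: list) -> str:
    prompt = "<|begin_of_text|>"
    for message in messages:
        prompt += render_message(message["role"], message["content"])
    # Open an assistant header so the model generates the reply next.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt
```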
Includes proper Llama-3 <|eot_id|> token handling.
Force-pushed from c7a0548 to 93833a1
@abetlen also added a chat template for format auto-detection and bumped llama.cpp to the latest version to properly support the eot token. N.B. (unrelated): I noticed the LLAMA_CUBLAS cmake arg has been deprecated in favor of LLAMA_CUDA, so that should eventually be changed in this codebase too.
Just some suggestions based on similar changes I saw in the llama.cpp project.
_messages = _map_roles(messages, _roles)
_messages.append((_roles["assistant"], None))
_prompt = _format_no_colon_single(_begin_token, _messages, _sep)
return ChatFormatterResponse(prompt=_prompt, stop=_sep)
We could consider adding "<|im_end|>" and "<end_of_turn>" as additional stop tokens. I don't know if it's completely necessary, but ChatFormatterResponse looks like it accepts a list of stop tokens, and the llama.cpp project uses all three as stop tokens for Llama 3: ggml-org/llama.cpp@8960fe8
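If the formatter went that route, the return might look roughly like the sketch below. It assumes ChatFormatterResponse's stop field accepts a list of strings (as noted above) and that _sep is "<|eot_id|>"; the _finish wrapper and the import path are illustrative assumptions, not code from this PR.

```python
# Illustrative only: return multiple stop tokens instead of a single one.
# Assumes ChatFormatterResponse.stop accepts a list, as suggested above.
from llama_cpp.llama_chat_format import ChatFormatterResponse

def _finish(_prompt: str, _sep: str) -> ChatFormatterResponse:
    # _sep would be "<|eot_id|>" for Llama-3; the extra tokens mirror the
    # linked llama.cpp commit.
    return ChatFormatterResponse(
        prompt=_prompt,
        stop=[_sep, "<|im_end|>", "<end_of_turn>"],
    )
```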
I think it makes sense to follow what's done in the llama.cpp library. One thing I'm not sure about is whether that code from the linked commit is specific to Llama-3 or generic to any Llama models. If it's the latter, maybe it's better to just stick to the <|eot_id|> token that's explicitly defined in the released Llama-3 code?
Force-pushed from 93833a1 to 71bc488
Thanks for this PR! I have been using these changes with success locally to test Llama 3.
@andreabak thank you for this! I'll go ahead and merge this shortly!
* feat: Add Llama-3 chat format
* feat: Auto-detect Llama-3 chat format from gguf template
* feat: Update llama.cpp to b2715 (includes proper Llama-3 <|eot_id|> token handling)

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
This PR adds support for the recently-released Llama-3 models by Meta. Specifically, this chat format is used for their Instruct pretrained models.

Model card: https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md
See the reference implementation for the chat format: https://github.com/meta-llama/llama3/blob/main/llama/tokenizer.py#L202-L229
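For anyone picking this up after the merge, a rough usage sketch. The GGUF filename is a placeholder, and the chat_format name is assumed to be the one this PR registers ("llama-3"); verify against the merged code if in doubt.

```python
from llama_cpp import Llama

# Placeholder model path; chat_format="llama-3" is assumed to match the name
# registered by this PR.
llm = Llama(
    model_path="./Meta-Llama-3-8B-Instruct.Q4_K_M.gguf",
    chat_format="llama-3",
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what the <|eot_id|> token is used for."},
    ],
)
print(response["choices"][0]["message"]["content"])
```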