Add Google's Gemma formatting via `chat_format="gemma"` by alvarobartt · Pull Request #1210 · abetlen/llama-cpp-python


Merged
4 commits merged into abetlen:main on Feb 23, 2024

Conversation

alvarobartt (Contributor)

Description

This PR adds support for Google's recently released Gemma chat format, so that it can be used via the `chat_format` arg of the `Llama` class, i.e. `chat_format="gemma"`.

More information about the models is available in the HuggingFace Collection at https://huggingface.co/collections/google/gemma-release-65d5efbccdbb8c4202ec078b

Note that some of the GGUF versions currently uploaded to the HuggingFace Hub do not work correctly with llama.cpp, but I got it working when using e.g. https://huggingface.co/rahuldshetty/gemma-7b-it-gguf-quantized/blob/main/gemma-7b-it-Q4_K_M.gguf.

Example

from llama_cpp import Llama

# Load the Gemma GGUF model with the new chat format; n_gpu_layers=-1
# offloads all layers to the GPU
llm = Llama(
    model_path="./models/gemma-7b-it-Q4_K_M.gguf",
    chat_format="gemma",
    n_gpu_layers=-1,
)
# Multi-turn request; max_tokens=4 keeps the reply short for the demo
print(
    llm.create_chat_completion(
        messages=[
            {"role": "user", "content": "What's the capital of Spain"},
            {"role": "assistant", "content": "Barcelona"},
            {"role": "user", "content": "No, it's not, try again."},
        ],
        max_tokens=4,
    )
)
# {'id': 'chatcmpl-6037698e-2f05-496d-b49f-f7d61dde32f3', 'object': 'chat.completion', 'created': 1708599453, 'model': './models/gemma-7b-it-Q4_K_M.gguf', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'The answer is Madrid'}, 'finish_reason': 'length'}], 'usage': {'prompt_tokens': 37, 'completion_tokens': 4, 'total_tokens': 41}}
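
For reference, Gemma's instruction-tuned models delimit turns with <start_of_turn> and <end_of_turn>, so the messages above should render to roughly the following prompt (a sketch of the expected layout, not the formatter's verbatim output):

<start_of_turn>user
What's the capital of Spain<end_of_turn>
<start_of_turn>model
Barcelona<end_of_turn>
<start_of_turn>user
No, it's not, try again.<end_of_turn>
<start_of_turn>model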

alvarobartt (Contributor, Author)

P.S. I'm still unsure whether format_gemma should raise a ValueError when a system_prompt is provided; maybe it's safer to instead just print a warning stating that the system_prompt cannot be used, i.e. that it will be ignored. WDYT @abetlen?
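
For concreteness, the warn-and-ignore option could look something like this (a minimal sketch assuming OpenAI-style message dicts; drop_system_messages is a hypothetical helper, not a function from llama-cpp-python):

import warnings

# Hypothetical helper: drop system messages and warn, rather than raising
def drop_system_messages(messages):
    kept = [m for m in messages if m["role"] != "system"]
    if len(kept) != len(messages):
        warnings.warn("Gemma defines no system role; system messages are ignored.")
    return kept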

felipelo (Contributor) commented Feb 22, 2024

P.S. I'm still unsure whether format_gemma should raise a ValueError when a system_prompt is provided; maybe it's safer to instead just print a warning stating that the system_prompt cannot be used, i.e. that it will be ignored. WDYT @abetlen?

Agree ^^^

There are cases where a prompt system/repository is in use and happens to hold a 'system' value. I would even suggest appending the system_prompt to the first user message, as in the sketch below.
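
A minimal sketch of that suggestion, again assuming OpenAI-style message dicts (the helper name is hypothetical, not part of llama-cpp-python):

# Hypothetical helper: fold any system prompt(s) into the first user turn
def merge_system_into_first_user(messages):
    system = [m["content"] for m in messages if m["role"] == "system"]
    rest = [dict(m) for m in messages if m["role"] != "system"]
    if system and rest and rest[0]["role"] == "user":
        # Prepend the system prompt(s) to the first user message
        rest[0]["content"] = "\n\n".join(system + [rest[0]["content"]])
    return rest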

abetlen (Owner) commented Feb 23, 2024

@alvarobartt thank you for the contribution.

As for the system message, I'm a little conflicted on this, since there are at least three entirely valid ways to handle it: raise an error, concatenate it to the user message, or ignore it entirely. For now I think ignoring a system message is the safest option (potentially raising a warning in verbose mode). In the future I'd like to migrate all of the chat templates to Jinja2, so that people can easily pass custom templates to override these model-defined presets.
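
For illustration, a Jinja2 preset along those lines might look like the following (a sketch assuming the stock jinja2 package; this is not the template the library ships, and it silently drops system messages as discussed above):

from jinja2 import Template

# Illustrative Gemma-style template: map 'assistant' to Gemma's 'model'
# role, skip 'system' messages, and end with an open model turn
GEMMA_TEMPLATE = Template(
    "{% for m in messages %}"
    "{% if m.role != 'system' %}"
    "<start_of_turn>{{ 'model' if m.role == 'assistant' else 'user' }}\n"
    "{{ m.content }}<end_of_turn>\n"
    "{% endif %}"
    "{% endfor %}"
    "<start_of_turn>model\n"
)

prompt = GEMMA_TEMPLATE.render(
    messages=[
        {"role": "user", "content": "What's the capital of Spain"},
        {"role": "assistant", "content": "Barcelona"},
        {"role": "user", "content": "No, it's not, try again."},
    ]
)
print(prompt)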

alvarobartt (Contributor, Author)

As for the system message, I'm a little conflicted on this, since there are at least three entirely valid ways to handle it: raise an error, concatenate it to the user message, or ignore it entirely. For now I think ignoring a system message is the safest option (potentially raising a warning in verbose mode). In the future I'd like to migrate all of the chat templates to Jinja2, so that people can easily pass custom templates to override these model-defined presets.

Fair, then I'll do that for the moment, and if you happen to open a draft PR with the Jinja2 migration I'm happy to help out! 🤝🏻

abetlen merged commit 251a8a2 into abetlen:main on Feb 23, 2024