Multimodal Llama3 Support · Issue #1403 · abetlen/llama-cpp-python · GitHub

Multimodal Llama3 Support #1403

Open
Description

@xx025

I came across a model on Hugging Face that adds multimodal support to Llama3, Bunny-Llama-3-8B-V (bunny-llama), and I'd like to be able to deploy it using llama-cpp-python!

However, the existing chat_format "llama-3" doesn't seem to support running it.
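For context on why a text-only chat format falls short: in llama-cpp-python, vision models are normally driven by a dedicated chat handler (e.g. Llava15ChatHandler) that consumes OpenAI-style messages mixing text and image parts. A minimal sketch of that payload shape, with made-up prompt and URL:

```python
# Sketch of the OpenAI-style multimodal message that llama-cpp-python's
# vision chat handlers accept. A plain text chat format like "llama-3"
# only renders text content, so the model's vision side never gets used.
def build_image_message(prompt: str, image_url: str) -> dict:
    """Build a user message combining an image reference and a text prompt."""
    return {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": prompt},
        ],
    }

msg = build_image_message("What is in this image?", "https://example.com/cat.png")
```

Serving the model with a vision-aware handler, rather than a text-only chat_format, is what makes messages of this shape usable.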

I converted it to GGUF format via llama.cpp and ran it with the following configuration:

python llama.cpp/convert.py \
    Bunny-Llama-3-8B-V --outtype f16 \
    --outfile converted.bin \
    --vocab-type bpe

{
    "host": "0.0.0.0",
    "port": 8080,
    "api_key": "xx",
    "models": [
        {
            "model": "bunny-llama.gguf",
            "model_alias": "bunny-llama",
            "chat_format": "llama-3",
            "n_gpu_layers": -1,
            "offload_kqv": true,
            "n_threads": 12,
            "n_batch": 512,
            "n_ctx": 2048
        }
    ]
}

python3 -m llama_cpp.server \
    --config_file bunny-llama.json
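For comparison, a sketch of how multimodal models are usually configured for the llama-cpp-python server: llama.cpp vision models ship a separate CLIP/projector GGUF, passed via clip_model_path, together with a vision-aware chat_format. The handler name "llava-1-5" below is only an analogy and the file names are placeholders; whether Bunny-Llama-3-8B-V works with an existing handler is untested.

```json
{
    "host": "0.0.0.0",
    "port": 8080,
    "models": [
        {
            "model": "bunny-llama.gguf",
            "model_alias": "bunny-llama",
            "chat_format": "llava-1-5",
            "clip_model_path": "bunny-llama-mmproj.gguf",
            "n_gpu_layers": -1,
            "n_ctx": 2048
        }
    ]
}
```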
