Feature Request: Mapping model name to LoRA config · Issue #11031 · ggml-org/llama.cpp
Feature Request: Mapping model name to LoRA config #11031


Open
ngxson opened this issue Jan 1, 2025 · 5 comments
Labels: enhancement, good first issue, server

Comments

@ngxson (Collaborator) commented Jan 1, 2025

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

I came across this idea while working on #10994.

The idea is that we maintain a list of model names mapped to LoRA configs, for example:

{
    "llama-base":               [{"id": 0, "scale": 0.0}, {"id": 1, "scale": 0.0}],
    "llama-story":              [{"id": 0, "scale": 1.0}, {"id": 1, "scale": 0.0}],
    "llama-abliteration":       [{"id": 0, "scale": 0.0}, {"id": 1, "scale": 1.0}],
    "llama-story-abliteration": [{"id": 0, "scale": 0.5}, {"id": 1, "scale": 0.5}]
}

Then, a user can switch models by specifying the model name in the request, for example:

# first user:
{
    "model": "llama-story-abliteration",
    "messages": [
        {"role": "user", "content": "Write a NSFW story"}
    ]
}

# second user:
{
    "model": "llama-base",
    "messages": [
        {"role": "user", "content": "Is this NSFW?"}
    ]
}
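
On the server side, resolving the alias could be as simple as a lookup table. A minimal, self-contained C++ sketch (the lora_scale struct and resolve_alias helper are made-up names for illustration, not actual server code):

#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-adapter scale entry, mirroring {"id": ..., "scale": ...} above.
struct lora_scale { int id; float scale; };

// alias -> per-adapter scales, matching the JSON config above
static const std::map<std::string, std::vector<lora_scale>> presets = {
    {"llama-base",               {{0, 0.0f}, {1, 0.0f}}},
    {"llama-story",              {{0, 1.0f}, {1, 0.0f}}},
    {"llama-abliteration",       {{0, 0.0f}, {1, 1.0f}}},
    {"llama-story-abliteration", {{0, 0.5f}, {1, 0.5f}}},
};

// Resolve the "model" field of a request to LoRA scales,
// falling back to the base preset for unknown aliases.
static const std::vector<lora_scale> & resolve_alias(const std::string & model) {
    auto it = presets.find(model);
    return it != presets.end() ? it->second : presets.at("llama-base");
}

int main() {
    for (const auto & s : resolve_alias("llama-story-abliteration")) {
        std::printf("adapter %d -> scale %.2f\n", s.id, s.scale);
    }
}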

Motivation

N/A

Possible Implementation

No response

ngxson added the enhancement, good first issue, and server labels on Jan 1, 2025
@joeamroo commented Jan 9, 2025

I can work on this if possible.

@ngxson (Collaborator, Author) commented Jan 10, 2025

@joeamroo Can you first describe how you would implement it?

Also, I'm thinking about a more general case. Maybe users want to map not just a list of LoRA adapters to a model name, but also generation params. For example:

{
    "llama-base":               {"lora": [{"id": 0, "scale": 0.0}, {"id": 1, "scale": 0.0}], "top_k": 1},
    "llama-story":              {"lora": [{"id": 0, "scale": 1.0}, {"id": 1, "scale": 0.0}], "temperature": 0.8},
    "llama-abliteration":       {"lora": [{"id": 0, "scale": 0.0}, {"id": 1, "scale": 1.0}]},
    "llama-story-abliteration": {"lora": [{"id": 0, "scale": 0.5}, {"id": 1, "scale": 0.5}]}
}

So we should probably implement it the more generic way.
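
To illustrate the merging behavior, here is a minimal sketch using nlohmann::json (which the server already depends on); base stands in for params_base and all names here are illustrative:

#include <nlohmann/json.hpp>
#include <iostream>

using json = nlohmann::json;

int main() {
    // Global defaults, standing in for params_base.
    json base = { {"temperature", 1.0}, {"top_k", 40} };

    // One alias entry from the config above.
    json preset = json::parse(R"({
        "lora": [{"id": 0, "scale": 1.0}, {"id": 1, "scale": 0.0}],
        "temperature": 0.8
    })");

    // Per-alias values override the defaults; everything else falls through.
    json effective = base;
    effective.update(preset);

    std::cout << effective.dump(2) << std::endl; // temperature 0.8, top_k 40, lora [...]
}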

@brokad commented Feb 17, 2025

This feature would be useful.

Briefly reading through the code, here is a thought:

I see there is already a /props endpoint whose POST handler doesn't do anything. Assuming model ids are URL-safe, how about we expand this to /props/model-alias? I'm not sure if this is what was planned for this endpoint, or if it was destined for something else.

We'd add a mapping of model-alias -> common_params to the server ctx. The POST handler can be implemented like the /lora-adapters endpoints.

When the user POSTs a new model-alias, we can also start from the ambient server_context.params_base so that global defaults apply unless the user explicitly overrides them.

Finally, in handle_completions_impl, instead of using ctx_server.params_base for defaults, we'd pass the alias-specific common_params. A rough sketch of the handler follows below.
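
Roughly (a standalone sketch assuming cpp-httplib and nlohmann::json, both already server dependencies; the route shape, the alias_presets map, and the empty placeholder for params_base are made up for illustration):

#include <httplib.h>
#include <nlohmann/json.hpp>

#include <map>
#include <mutex>
#include <string>

using json = nlohmann::json;

int main() {
    httplib::Server svr;

    std::map<std::string, json> alias_presets; // stands in for the mapping on the server ctx
    std::mutex presets_mutex;

    // POST /props/model-alias with body {"name": "...", "params": {...}}
    svr.Post("/props/model-alias", [&](const httplib::Request & req, httplib::Response & res) {
        json body = json::parse(req.body);

        // Start from global defaults (params_base in the real server),
        // then apply the user's explicit overrides on top.
        json preset = json::object(); // placeholder for a serialized params_base
        preset.update(body.at("params"));

        std::lock_guard<std::mutex> lock(presets_mutex);
        alias_presets[body.at("name").get<std::string>()] = preset;

        res.set_content(json{{"success", true}}.dump(), "application/json");
    });

    svr.listen("127.0.0.1", 8080);
}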

What do you think @ngxson?

@ngxson (Collaborator, Author) commented Feb 22, 2025

IMO using a POST endpoint to set this will make it less reproducible. Many users expect to run llama-server on demand, so it would be annoying to have to set /props each time they start the server.

A better UX/DX is to provide an argument, for example --alias-presets-file my_preset.json, and the user can simply put the config into the file:

{
    "llama-base":               {"lora": [{"id": 0, "scale": 0.0}, {"id": 1, "scale": 0.0}], "top_k": 1},
    "llama-story":              {"lora": [{"id": 0, "scale": 1.0}, {"id": 1, "scale": 0.0}], "temperature": 0.8},
    "llama-abliteration":       {"lora": [{"id": 0, "scale": 0.0}, {"id": 1, "scale": 1.0}]},
    "llama-story-abliteration": {"lora": [{"id": 0, "scale": 0.5}, {"id": 1, "scale": 0.5}]}
}
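
Loading it at startup could look roughly like this (a self-contained sketch; the flag name follows the proposal above, the rest is illustrative rather than actual llama-server code):

#include <nlohmann/json.hpp>

#include <fstream>
#include <iostream>
#include <map>
#include <string>

using json = nlohmann::json;

int main(int argc, char ** argv) {
    std::map<std::string, json> alias_presets; // alias -> lora/sampling overrides

    for (int i = 1; i + 1 < argc; i++) {
        if (std::string(argv[i]) == "--alias-presets-file") {
            std::ifstream f(argv[i + 1]);
            if (!f) {
                std::cerr << "cannot open presets file: " << argv[i + 1] << "\n";
                return 1;
            }
            for (auto & [name, preset] : json::parse(f).items()) {
                alias_presets[name] = preset;
            }
        }
    }

    std::cout << "loaded " << alias_presets.size() << " alias presets\n";
}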

@prd-tuong-nguyen commented

I think this feature would be really helpful. The configuration mapping a LoRA name to its settings could also come from an environment variable. I'm looking forward to this feature being released.
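
For example, a hypothetical LLAMA_ALIAS_PRESETS variable could carry the same JSON as the presets file (a tiny sketch; the variable name is made up):

#include <nlohmann/json.hpp>

#include <cstdlib>
#include <iostream>

int main() {
    // Fall back to an empty preset map when the variable is unset.
    const char * env = std::getenv("LLAMA_ALIAS_PRESETS");
    nlohmann::json presets = env ? nlohmann::json::parse(env) : nlohmann::json::object();
    std::cout << presets.size() << " alias presets from environment\n";
}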
