Labels
Models: LLM and ML model repos and links
llm: Large Language Models
llm-applications: Topics related to practical applications of Large Language Models in various fields
llm-evaluation: Evaluating Large Language Models performance and behavior through human-written evaluation sets
llm-inference-engines: Software to run inference on large language models
llm-serving-optimisations: Tips, tricks and tools to speedup inference of large language models
Description
Previously, adding non-llama architectures to llama.cpp was discouraged. However, now that the Falcon architecture has been added (see Pull Request #2717), that stance may be worth reconsidering.
One distinguishing feature of StarCoder is that it comes as a complete series of models, ranging from 1B to roughly 15B parameters. Having small and large models from the same family is highly beneficial for speculative decoding, where a small model drafts tokens that a larger one verifies, and for making coding models available on edge devices (e.g., M1/M2 Macs).
I can contribute the PR if it matches llama.cpp's roadmap.
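To illustrate why a same-family series of model sizes helps speculative decoding, here is a minimal sketch of the greedy variant of the technique. It is not llama.cpp code: `speculative_decode`, `greedy_decode`, `toy_target`, and `toy_draft` are hypothetical names, and the "models" are toy functions that map a token sequence to the next token. A real engine would verify all drafted tokens in one batched forward pass of the large model, and would typically use probabilistic acceptance rather than exact matching.

```python
from typing import Callable, List

# Assumption: a "model" is any function from a token sequence to its
# greedy next token. Real models return logits; this is a simplification.
Model = Callable[[List[int]], int]


def greedy_decode(model: Model, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Plain autoregressive decoding, one model call per token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       max_new_tokens: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: the draft proposes, the target verifies."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1. The small draft model proposes up to k tokens (cheap).
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))
        # 2. The large target model checks each proposed position.
        #    (Done per position here; batched in a real engine.)
        for i, tok in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if expected != tok:
                # First mismatch: keep the agreed prefix plus the
                # target's own token, then start a new draft round.
                tokens.extend(proposal[:i])
                tokens.append(expected)
                break
        else:
            # Every drafted token matched the target's choice.
            tokens.extend(proposal)
    return tokens


# Toy stand-ins: the draft agrees with the target except after token 5.
def toy_target(seq: List[int]) -> int:
    return (seq[-1] * 3 + 1) % 7


def toy_draft(seq: List[int]) -> int:
    return 0 if seq[-1] == 5 else (seq[-1] * 3 + 1) % 7
```

Because every accepted token is one the target would have chosen anyway, the output matches plain greedy decoding with the target model. The speedup depends on how often the draft agrees with the target, which is where a small model trained alongside the large one is expected to help.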
Suggested labels
{ "key": "LLM-Applications", "value": "Practical applications of Large Language Models, such as edge device coding models and speculative decoding" }
{ "key": "Multimodal-LM", "value": "LLMs that combine modes such as text and image recognition" }