Feature Request: Llama-bench improvement · Issue #13671 · ggml-org/llama.cpp · GitHub
Feature Request: Llama-bench improvement #13671

Open
4 tasks done
RommelCF opened this issue May 20, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@RommelCF

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

llama-bench is missing many useful metrics that batched-bench already reports. Can we add these to llama-bench as well?

PP   - prompt tokens per batch
TG   - generated tokens per batch
B    - number of batches
T_PP - prompt processing time (i.e. time to first token)
S_PP - prompt processing speed ((B*PP)/T_PP or PP/T_PP)
T_TG - time to generate all batches
S_TG - text generation speed ((B*TG)/T_TG)
T    - total time
S    - total speed (i.e. all tokens / total time)

Motivation

To get richer batching and throughput metrics from llama-bench, matching what batched-bench already reports.

Possible Implementation

No response

@RommelCF RommelCF added the enhancement New feature or request label May 20, 2025