Open
Description
Motivation
Response metrics are very useful for benchmarking the performance of different configurations. LMDeploy could implement metrics similar to vLLM's RequestMetrics.
From skimming the source code, I think adding basic metrics such as first token time and finish time to AsyncEngine should be fairly straightforward. I'm not sure whether changes would be required in other areas.
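To make the idea concrete, here is a minimal sketch of the kind of per-request timing record I have in mind. Everything in it is an assumption for illustration: the class, its field names, and the `on_token` / `on_finish` hooks are hypothetical and are not part of LMDeploy's current API, and the fields are only loosely inspired by the timestamps mentioned above.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequestMetrics:
    """Hypothetical per-request timing record (not an existing LMDeploy class)."""

    arrival_time: float                       # when the request was received
    first_token_time: Optional[float] = None  # when the first token was produced
    finished_time: Optional[float] = None     # when generation completed
    num_output_tokens: int = 0                # tokens produced so far

    def on_token(self, now: Optional[float] = None) -> None:
        """Record one generated token; the first call sets first_token_time."""
        now = time.monotonic() if now is None else now
        if self.first_token_time is None:
            self.first_token_time = now
        self.num_output_tokens += 1

    def on_finish(self, now: Optional[float] = None) -> None:
        """Record the time at which the request finished."""
        self.finished_time = time.monotonic() if now is None else now

    @property
    def time_to_first_token(self) -> Optional[float]:
        """Seconds from arrival to first token, or None if no token yet."""
        if self.first_token_time is None:
            return None
        return self.first_token_time - self.arrival_time

    @property
    def total_latency(self) -> Optional[float]:
        """Seconds from arrival to finish, or None if not finished."""
        if self.finished_time is None:
            return None
        return self.finished_time - self.arrival_time


# Usage with explicit timestamps so the numbers are reproducible.
metrics = RequestMetrics(arrival_time=10.0)
metrics.on_token(now=10.5)
metrics.on_token(now=10.75)
metrics.on_finish(now=11.0)
print(metrics.time_to_first_token, metrics.total_latency, metrics.num_output_tokens)
```

In practice the engine's generation loop would call something like `on_token` for each streamed token and `on_finish` at the end, then attach the record to the response. Where exactly those hooks would go in AsyncEngine is something I have not worked out yet.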
If there is interest, I am happy to make a PR.