-
Yes, correct.
-
I am also having the same question. Looking into the code, I didn't find any built-in methods. If anyone knows how to do it, please let us know.
-
Has anyone figured out a solution for this?
-
LiteLlm has cost counters.
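For context, a minimal sketch of what those counters look like when calling litellm directly (the model name here is just an example; `completion_cost` prices the call from litellm's model cost map):

```python
import litellm

response = litellm.completion(
    model="gpt-4o-mini",  # example model; any litellm-supported model works
    messages=[{"role": "user", "content": "Hello!"}],
)
# Standard OpenAI-style usage object on the response:
print(response.usage.prompt_tokens, response.usage.completion_tokens)
# litellm can also estimate the cost of the call:
cost = litellm.completion_cost(completion_response=response)
print(f"Cost: ${cost:.6f}")
```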
-
I had the same problem since I wanted to track some costs/usage stats, so I tried to intercept the LiteLLM client calls. I finally came up with something like this:

```python
import litellm
from litellm import CustomStreamWrapper, ModelResponse

from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm, LiteLLMClient


def track_llm_usage(res: ModelResponse | CustomStreamWrapper, model):
    # Log token usage and cost before returning
    try:
        usage = res.get("usage", {})
        prompt_tokens = usage.get("prompt_tokens", 0)
        completion_tokens = usage.get("completion_tokens", 0)
        total_tokens = usage.get("total_tokens", prompt_tokens + completion_tokens)
        # Calculate cost if possible
        cost = litellm.completion_cost(completion_response=res) or 0.0
        print(f"[acompletion] Model: {model}")
        print(f"[acompletion] Prompt tokens: {prompt_tokens}")
        print(f"[acompletion] Completion tokens: {completion_tokens}")
        print(f"[acompletion] Total tokens: {total_tokens}")
        print(f"[acompletion] Cost: ${cost:.6f}")
        print(f"[acompletion] Time: {(res._response_ms / 1000):.2f}s")
    except Exception as e:
        print(f"Error logging token usage: {e}")
    return res


class KLiteLLMClient(LiteLLMClient):
    async def acompletion(self, model, messages, tools, **kwargs):
        return track_llm_usage(
            await super().acompletion(
                model=model,
                messages=messages,
                tools=tools,
                **kwargs,
            ),
            model=model,
        )


root_agent = LlmAgent(
    name="weather_time_agent",
    model=LiteLlm(
        model="gemini/gemini-2.5-flash-preview-04-17",
        llm_client=KLiteLLMClient(),
    ),
    # [...]
)
```

Hope it helps!
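If you'd rather accumulate running totals than print per call, a small accumulator could sit behind `track_llm_usage` above. This is a hypothetical sketch, not part of the original snippet:

```python
# Hypothetical per-model usage accumulator (sketch only).
# track_llm_usage above could call totals.add(...) instead of printing.
from collections import defaultdict


class UsageTotals:
    def __init__(self):
        self.tokens = defaultdict(int)    # per-model cumulative tokens
        self.cost = defaultdict(float)    # per-model cumulative cost (USD)

    def add(self, model: str, total_tokens: int, cost: float):
        self.tokens[model] += total_tokens
        self.cost[model] += cost


totals = UsageTotals()
# e.g. inside track_llm_usage: totals.add(model, total_tokens, cost)
```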
-
Search within the terminal output using Ctrl+F for "total_tokens". Hope that helps.
-
If you are using LiteLlm as an agent model, add:

```python
LiteLlm(
    'azure/gpt-4.1',
    stream_options={"include_usage": True},
)
```

and you can get the usage tokens from the events:

```python
events = runner.run_async(
    session_id=session.id,
    user_id=session.user_id,
    new_message=self.__messages_to_agent_new_message(messages),
    run_config=RunConfig(streaming_mode=StreamingMode.SSE),
)
async for event in events:
    # Example output:
    # cache_tokens_details=None cached_content_token_count=None
    # candidates_token_count=None candidates_tokens_details=None
    # prompt_token_count=172 prompt_tokens_details=[ModalityTokenCount(
    #     modality=<MediaModality.TEXT: 'TEXT'>, token_count=172)]
    # thoughts_token_count=44 tool_use_prompt_token_count=None
    # tool_use_prompt_tokens_details=None total_token_count=216 traffic_type=None
    print(event.usage_metadata)
```
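To turn that metadata into a single number for the run, one option is to keep the last `total_token_count` seen on the stream. A sketch under an assumption: whether per-event counts are deltas or cumulative can vary by provider, so "last value wins" may need adjusting:

```python
# Sketch: pull a run-level token total out of the event stream.
# Assumes the usage_metadata fields shown in the dump above; the
# "last value wins" logic assumes counts are cumulative per response.
total_tokens = 0
async for event in events:
    md = event.usage_metadata
    if md is not None and md.total_token_count is not None:
        total_tokens = md.total_token_count
print(f"Total tokens for this run: {total_tokens}")
```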
-
Are there any built-in functions to determine the cost/usage of queries when using the ADK? At the moment, I am using the free experimental models (gemini-2.0-flash-exp). I need to understand what the costs/usage will be for my flows when they are put into production.