
Expose per-invocation cost in LlmResponse (LiteLLM and other providers) #3309

@caichuanwang

Description



Is your feature request related to a problem? Please describe.
ADK exposes token usage via LlmResponse.usage_metadata but does not expose per-call cost. With LiteLLM (and some other providers), a computed cost is available after each invocation (e.g., completion_cost(response) or _hidden_params["response_cost"]). Because ADK doesn’t surface this, applications must either (a) maintain price sheets and recompute cost from tokens (risking drift and inconsistency) or (b) patch adapters to pull provider cost directly.

Describe the solution you'd like
Add an optional, typed field on LlmResponse to standardize cost exposure across providers, e.g.:

  • LlmResponse.cost_metadata: Optional[CostMetadata]
    • total_cost_usd: Optional[float]
    • prompt_cost_usd: Optional[float]
    • output_cost_usd: Optional[float]
    • currency: Optional[str] (default “USD” when known)
    • provider: Optional[str] (e.g., “litellm”, “vertexai”, “openai”)
    • source: Optional[Literal['provider','adapter','computed']]
    • raw: Optional[dict] (provider-specific passthrough like response_cost)

Population strategy:

  • LiteLLM:
    • Non-streaming: prefer litellm.completion_cost(response); fallback to response._hidden_params['response_cost'] when present.
    • Streaming: read from stream wrapper if exposed; otherwise leave unset (no silent recompute).
  • Other providers:
    • If SDK exposes cost, pass it through into cost_metadata; else None.
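The proposed field and the LiteLLM population path above could be sketched roughly as follows. This is illustrative only: the `CostMetadata` model and the `cost_from_litellm` helper are hypothetical names for this issue, and the sketch assumes pydantic (which `LlmResponse` already uses) and the LiteLLM APIs named above (`litellm.completion_cost` and `_hidden_params["response_cost"]`).

```python
from typing import Literal, Optional

from pydantic import BaseModel


class CostMetadata(BaseModel):
    """Per-invocation cost as reported by the provider or adapter (proposed)."""

    total_cost_usd: Optional[float] = None
    prompt_cost_usd: Optional[float] = None
    output_cost_usd: Optional[float] = None
    currency: Optional[str] = None  # "USD" when known
    provider: Optional[str] = None  # e.g. "litellm", "vertexai", "openai"
    source: Optional[Literal["provider", "adapter", "computed"]] = None
    raw: Optional[dict] = None  # provider-specific passthrough


def cost_from_litellm(response) -> Optional[CostMetadata]:
    """Non-streaming path: prefer completion_cost(), fall back to _hidden_params."""
    try:
        import litellm  # available wherever the LiteLLM adapter runs

        total = litellm.completion_cost(completion_response=response)
    except Exception:
        # Fallback: LiteLLM attaches the computed cost to the response object.
        total = getattr(response, "_hidden_params", {}).get("response_cost")
    if total is None:
        return None  # e.g. streaming without an exposed cost: leave unset
    return CostMetadata(
        total_cost_usd=float(total),
        currency="USD",
        provider="litellm",
        source="provider",
        raw={"response_cost": total},
    )
```

Returning `None` rather than recomputing keeps the "no silent recompute" contract for streaming responses that do not expose a cost.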

Describe alternatives you've considered

  • Client-side recomputation using usage_metadata + custom price sheets: works, but requires price maintenance, risks drift, and yields provider-inconsistent results.
  • Stuffing cost into custom_metadata: not discoverable/typed; harder to rely on across apps.
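For contrast, the client-side recomputation alternative looks like this in practice. The model name and per-token prices below are entirely made up; a real price sheet must be maintained per model and per provider, which is exactly the drift burden described above.

```python
# Hypothetical price sheet (USD per 1M tokens); values are illustrative only
# and go stale whenever a provider changes pricing.
PRICE_SHEET = {
    "example-model": {"prompt": 0.15, "completion": 0.60},
}


def recompute_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Client-side recomputation from usage_metadata-style token counts."""
    prices = PRICE_SHEET[model]  # KeyError for any model not in the sheet
    return (
        prompt_tokens * prices["prompt"]
        + completion_tokens * prices["completion"]
    ) / 1_000_000
```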

Additional context

  • Suggested touch points:
    • src/google/adk/models/llm_response.py: add CostMetadata and cost_metadata.
    • src/google/adk/models/lite_llm.py: populate for non-streaming and streaming when available.
    • (Optional) plugins/logging_plugin.py: log cost if present.
    • (Optional) telemetry/tracing.py: emit span attribute (e.g., gen_ai.cost.total_usd).
  • Example (illustrative):
"cost_metadata": {
  "total_cost_usd": 0.00123,
  "prompt_cost_usd": 0.00040,
  "output_cost_usd": 0.00083,
  "currency": "USD",
  "provider": "litellm",
  "source": "provider",
  "raw": { "response_cost": 0.00123 }
}
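The optional logging touch point could be as small as the sketch below. It assumes the proposed `cost_metadata` field names from this issue; `log_cost` and the `"adk.cost"` logger name are hypothetical.

```python
import logging

logger = logging.getLogger("adk.cost")


def log_cost(llm_response) -> None:
    """Log per-call cost when the (proposed) cost_metadata field is populated."""
    cm = getattr(llm_response, "cost_metadata", None)
    if cm is None or cm.total_cost_usd is None:
        return  # cost unavailable: skip rather than silently recompute
    logger.info(
        "LLM call cost: %.6f %s (provider=%s, source=%s)",
        cm.total_cost_usd,
        cm.currency or "USD",
        cm.provider,
        cm.source,
    )
```

A tracing hook would be analogous: read the same field and, when `total_cost_usd` is set, emit it as a span attribute such as gen_ai.cost.total_usd.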

Metadata

Assignees

No one assigned

Labels

models [Component] Issues related to model support