Agent Metrics
Access RunMetrics, MessageMetrics, and SessionMetrics from agent runs.
The RunOutput from an agent run includes detailed metrics about token usage, cost, timing, and per-model breakdowns.
```python
from kern.agent import Agent
from kern.models.openai import OpenAIResponses
from kern.tools.hackernews import HackerNewsTools
from kern.db.sqlite import SqliteDb
from rich.pretty import pprint

agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"),
    tools=[HackerNewsTools()],
    db=SqliteDb(db_file="tmp/agents.db"),
    markdown=True,
)

run_response = agent.run("What are the top stories on HackerNews?")

# Message metrics (MessageMetrics)
for message in run_response.messages:
    if message.role == "assistant":
        pprint(message.metrics.to_dict())

# Run metrics (RunMetrics)
pprint(run_response.metrics.to_dict())

# Per-model breakdown
if run_response.metrics.details:
    for model_type, model_metrics_list in run_response.metrics.details.items():
        for m in model_metrics_list:
            print(f"{model_type}: {m.provider}/{m.id} - {m.total_tokens} tokens")

# Session metrics (SessionMetrics)
pprint(agent.get_session_metrics().to_dict())
```

Metrics are available at multiple levels:
- Per message: Each assistant message has `MessageMetrics` with per-API-call token counts and timing.
- Per run: Each `RunOutput` has `RunMetrics` with aggregated totals and a `details` breakdown by model type.
- Per session: `agent.get_session_metrics()` returns `SessionMetrics` aggregated across all runs.
| Level | Type | Access |
|---|---|---|
| Per message | MessageMetrics | message.metrics |
| Per run | RunMetrics | run_response.metrics |
| Per session | SessionMetrics | agent.get_session_metrics() |
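The relationship between the three levels can be illustrated with a small, library-independent sketch. `TokenCounts` and `aggregate` below are hypothetical stand-ins, not kern classes, and the sketch assumes token counts are simply additive; timing fields such as `time_to_first_token` would not roll up this way, and the library's own aggregation may differ.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TokenCounts:
    """Hypothetical stand-in for a metrics object (not a kern class)."""

    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        # Mirrors the documented definition: input_tokens + output_tokens.
        return self.input_tokens + self.output_tokens

    def __add__(self, other: "TokenCounts") -> "TokenCounts":
        return TokenCounts(
            self.input_tokens + other.input_tokens,
            self.output_tokens + other.output_tokens,
        )


def aggregate(items: Iterable[TokenCounts]) -> TokenCounts:
    """Sum a sequence of counts into one total."""
    total = TokenCounts()
    for item in items:
        total = total + item
    return total


# Per-message (per-API-call) counts for two runs; the numbers are made up.
run_1_messages = [TokenCounts(120, 30), TokenCounts(200, 80)]
run_2_messages = [TokenCounts(90, 45)]

# Messages roll up into runs, and runs roll up into the session.
run_1 = aggregate(run_1_messages)
run_2 = aggregate(run_2_messages)
session = aggregate([run_1, run_2])

print(run_1.total_tokens, run_2.total_tokens, session.total_tokens)  # 430 135 565
```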
Run fields (RunMetrics)
| Field | Description |
|---|---|
| `input_tokens` | Tokens sent to the model. |
| `output_tokens` | Tokens generated by the model. |
| `total_tokens` | Sum of `input_tokens` and `output_tokens`. |
| `audio_input_tokens` | Audio tokens in the input. |
| `audio_output_tokens` | Audio tokens in the output. |
| `audio_total_tokens` | Total audio tokens. |
| `cache_read_tokens` | Tokens served from cache. |
| `cache_write_tokens` | Tokens written to cache. |
| `reasoning_tokens` | Tokens used for reasoning. |
| `cost` | Cost of the run. |
| `duration` | Run duration in seconds. |
| `time_to_first_token` | Time from run start to first token (seconds). |
| `details` | Per-model breakdown by model type. See Metrics reference. |
| `additional_metrics` | Extra metrics (e.g., `eval_duration`). |
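These fields can be combined into derived figures such as output throughput. The sketch below is one possible approach, assuming `to_dict()` yields a plain dictionary keyed by the field names above and that any field may be missing or `None`. The helper `derive_run_stats` and the sample values are hypothetical. Treating `cache_read_tokens` as a share of `input_tokens` is also an assumption, since providers may count cached tokens differently.

```python
from typing import Any, Mapping, Optional


def derive_run_stats(metrics: Mapping[str, Any]) -> dict:
    """Compute derived figures from a run-metrics dictionary (hypothetical helper)."""
    input_tokens = metrics.get("input_tokens") or 0
    output_tokens = metrics.get("output_tokens") or 0
    cache_read = metrics.get("cache_read_tokens") or 0
    duration = metrics.get("duration")

    # Guard against a missing or zero duration before dividing.
    tokens_per_second: Optional[float] = (
        output_tokens / duration if duration else None
    )
    # Assumption: cached tokens are counted within input_tokens.
    cache_read_share: Optional[float] = (
        cache_read / input_tokens if input_tokens else None
    )
    return {
        "total_tokens": input_tokens + output_tokens,
        "output_tokens_per_second": tokens_per_second,
        "cache_read_share": cache_read_share,
    }


# Made-up values standing in for run_response.metrics.to_dict().
sample = {
    "input_tokens": 1000,
    "output_tokens": 250,
    "cache_read_tokens": 400,
    "duration": 5.0,
}
stats = derive_run_stats(sample)
print(stats)
```

Returning `None` rather than `0` for undefined ratios keeps "no data" distinguishable from "measured zero" in downstream dashboards.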
Message fields (MessageMetrics)
| Field | Description |
|---|---|
| `input_tokens` | Tokens sent to the model. |
| `output_tokens` | Tokens generated by the model. |
| `total_tokens` | Sum of `input_tokens` and `output_tokens`. |
| `audio_input_tokens` | Audio tokens in the input. |
| `audio_output_tokens` | Audio tokens in the output. |
| `audio_total_tokens` | Total audio tokens. |
| `cache_read_tokens` | Tokens served from cache. |
| `cache_write_tokens` | Tokens written to cache. |
| `reasoning_tokens` | Tokens used for reasoning. |
| `cost` | Cost of this API call. |
| `duration` | Duration of this API call (seconds). |
| `time_to_first_token` | Time to first token for this API call (seconds). |
| `provider_metrics` | Provider-specific metrics (e.g., Ollama timing, Groq timing, Cerebras timing). |
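Because message metrics are recorded per API call, they are useful for spotting which call dominated a run. The following is a minimal sketch, assuming each assistant message's metrics have been converted to a dictionary (for example via `to_dict()`, as in the example at the top of this page) and that `duration` may be absent. The helper `slowest_call` and the sample data are hypothetical.

```python
from typing import Any, Mapping, Optional, Sequence


def slowest_call(
    message_metrics: Sequence[Mapping[str, Any]],
) -> Optional[Mapping[str, Any]]:
    """Return the metrics dict with the largest duration, or None if none are timed."""
    timed = [m for m in message_metrics if m.get("duration") is not None]
    if not timed:
        return None
    return max(timed, key=lambda m: m["duration"])


# Made-up per-call metrics; the last entry has no timing information.
samples = [
    {"input_tokens": 300, "output_tokens": 40, "duration": 1.2, "time_to_first_token": 0.4},
    {"input_tokens": 900, "output_tokens": 210, "duration": 3.5, "time_to_first_token": 0.6},
    {"input_tokens": 50, "output_tokens": 10, "duration": None},
]

slowest = slowest_call(samples)
if slowest is not None:
    print(f"Slowest call: {slowest['duration']}s, {slowest['input_tokens']} input tokens")
```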