Accuracy with Tools

This example runs an evaluation that passes the given input to the given agent and then scores the agent's answer against the expected output.

Create a Python file named accuracy_with_tools.py

from typing import Optional

from kern.agent import Agent
from kern.eval.accuracy import AccuracyEval, AccuracyResult
from kern.models.openai import OpenAIResponses
from kern.tools.calculator import CalculatorTools

evaluation = AccuracyEval(
    name="Tools Evaluation",
    model=OpenAIResponses(id="gpt-5.2"),
    # The agent under evaluation, equipped with calculator tools.
    agent=Agent(
        model=OpenAIResponses(id="gpt-5.2"),
        tools=[CalculatorTools()],
    ),
    input="What is 10!?",
    expected_output="3628800",
)

result: Optional[AccuracyResult] = evaluation.run(print_results=True)
# Fail if no result was returned or the average score is below 8.
assert result is not None and result.avg_score >= 8
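The expected_output value can be double-checked independently of the agent. The following is a minimal sketch that uses only Python's standard library; it is not part of kern, and the variable name `expected` is chosen here for illustration.

```python
import math

# Compute 10! directly, without the agent or the calculator tools.
expected = str(math.factorial(10))
print(expected)  # 3628800
```

The printed string matches the expected_output passed to AccuracyEval above.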

Set up your virtual environment

macOS/Linux:

uv venv --python 3.12
source .venv/bin/activate

Windows:

uv venv --python 3.12
.venv\Scripts\activate

Install dependencies

uv pip install -U openai kern-ai

Export your OpenAI API key

macOS/Linux:

export OPENAI_API_KEY="your_openai_api_key_here"

Windows (PowerShell):

$Env:OPENAI_API_KEY="your_openai_api_key_here"
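Before running anything that calls the API, it can help to confirm that the key is actually visible to Python. The sketch below is one way to do that with the standard library; the helper name `missing_api_key` is hypothetical and not part of kern or the OpenAI SDK.

```python
import os
from typing import Mapping, Optional


def missing_api_key(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when OPENAI_API_KEY is absent or blank in the given environment."""
    # Default to the real process environment when no mapping is supplied.
    env = os.environ if env is None else env
    return not env.get("OPENAI_API_KEY", "").strip()


if missing_api_key():
    print("OPENAI_API_KEY is not set; export it before running the evaluation.")
```

Accepting the environment as a parameter keeps the check easy to exercise with a plain dictionary instead of the real process environment.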

Run the evaluation

python accuracy_with_tools.py
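Because the script ends with an assert, a failed evaluation raises AssertionError and the interpreter exits with a non-zero status. A wrapper can rely on that to gate other steps. The following is a minimal sketch under that assumption; the helper name `evaluation_passed` is hypothetical and not part of kern.

```python
import subprocess
import sys


def evaluation_passed(script_path: str) -> bool:
    """Run a script in a child interpreter and report whether it exited with status 0."""
    # An uncaught AssertionError in the child makes Python exit with a non-zero status.
    completed = subprocess.run([sys.executable, script_path])
    return completed.returncode == 0
```

Calling `evaluation_passed("accuracy_with_tools.py")` from the directory that contains the script would then return False whenever the final assert fails.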