Basic Accuracy

This example shows how to check how complete, correct, and accurate a Kern Agent's response is.

Create a Python file named basic.py

from typing import Optional

from kern.agent import Agent
from kern.eval.accuracy import AccuracyEval, AccuracyResult
from kern.models.openai import OpenAIResponses
from kern.tools.calculator import CalculatorTools

evaluation = AccuracyEval(
    name="Calculator Evaluation",
    model=OpenAIResponses(id="gpt-5.2"),
    # The agent under evaluation, equipped with calculator tools.
    agent=Agent(
        model=OpenAIResponses(id="gpt-5.2"),
        tools=[CalculatorTools()],
    ),
    input="What is 10*5 then to the power of 2? do it step by step",
    # 10 * 5 = 50, and 50 ** 2 = 2500.
    expected_output="2500",
    additional_guidelines="Agent output should include the steps and the final answer.",
    # Run the evaluation three times.
    num_iterations=3,
)

result: Optional[AccuracyResult] = evaluation.run(print_results=True)
# Fail if no result is returned or the average score is below 8.
assert result is not None and result.avg_score >= 8
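
The final assertion compares the average score across the three iterations against a threshold of 8. As a rough illustration of that check, independent of the Kern library, the sketch below averages a list of made-up per-iteration scores. The function name `passes_threshold` and the sample scores are hypothetical and not part of Kern's API, and the sketch assumes `avg_score` is a plain arithmetic mean.

```python
from statistics import mean
from typing import Sequence


def passes_threshold(scores: Sequence[float], threshold: float = 8.0) -> bool:
    """Return True if the arithmetic mean of `scores` meets the threshold."""
    if not scores:
        # No iterations ran, so there is nothing to pass.
        return False
    return mean(scores) >= threshold


# Hypothetical scores from three iterations (not real evaluation output).
sample_scores = [9, 8, 7]
print(passes_threshold(sample_scores))
```

A single low-scoring iteration can pull the average under the threshold, which is one reason to run several iterations rather than one.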

Set up your virtual environment

macOS/Linux:

uv venv --python 3.12
source .venv/bin/activate

Windows:

uv venv --python 3.12
.venv\Scripts\activate

Install dependencies

uv pip install -U openai kern-ai
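
To confirm that the install landed in the active virtual environment, a small check along these lines may help. It is a sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution names are taken from the install command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Distribution names as used in the install command.
for name in ("openai", "kern-ai"):
    print(name, installed_version(name) or "not installed")
```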

Export your OpenAI API key

macOS/Linux:

export OPENAI_API_KEY="your_openai_api_key_here"

Windows (PowerShell):

$Env:OPENAI_API_KEY="your_openai_api_key_here"
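
Because the key is read from the environment, it can be worth checking that it is visible to Python before starting an evaluation. The following is a minimal sketch; the helper `has_api_key` is a hypothetical name and not part of Kern or the OpenAI SDK.

```python
import os
from typing import Mapping


def has_api_key(env: Mapping[str, str] = os.environ) -> bool:
    """Return True if OPENAI_API_KEY is present and not blank."""
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not has_api_key():
    print("OPENAI_API_KEY is not set; export it before running the evaluation.")
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary.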

Run Agent

python basic.py