Accuracy with Database Logging

This example shows how to store accuracy evaluation results in a PostgreSQL database for later tracking and analysis.

Create a Python file named accuracy_db_logging.py

"""Example showing how to store evaluation results in the database."""

from typing import Optional

from kern.agent import Agent
from kern.db.postgres.postgres import PostgresDb
from kern.eval.accuracy import AccuracyEval, AccuracyResult
from kern.models.openai import OpenAIResponses
from kern.tools.calculator import CalculatorTools

# Setup the database
db_url = "postgresql+psycopg://ai:ai@localhost:5432/ai"
db = PostgresDb(db_url=db_url, eval_table="eval_runs_cookbook")


evaluation = AccuracyEval(
    db=db,  # Pass the database to the evaluation. Results will be stored in the database.
    name="Calculator Evaluation",
    model=OpenAIResponses(id="gpt-5.2"),
    agent=Agent(
        model=OpenAIResponses(id="gpt-5.2"),
        tools=[CalculatorTools()],
    ),
    input="What is 10*5 then to the power of 2? do it step by step",
    expected_output="2500",
    additional_guidelines="Agent output should include the steps and the final answer.",
    num_iterations=1,
)

result: Optional[AccuracyResult] = evaluation.run(print_results=True)
assert result is not None and result.avg_score >= 8
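The db_url in the script appears to follow the SQLAlchemy-style dialect+driver URL form. As a minimal sketch using only the standard library, the helper below (describe_db_url is a hypothetical name, not part of kern) splits that URL into its parts, which can help confirm the user, host, port, and database name before running the evaluation.

```python
from urllib.parse import urlsplit


def describe_db_url(db_url: str) -> dict:
    """Split a dialect+driver database URL into its components.

    Hypothetical helper for illustration; it does not connect to anything.
    """
    parts = urlsplit(db_url)
    # The scheme has the form "dialect+driver", e.g. "postgresql+psycopg".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


info = describe_db_url("postgresql+psycopg://ai:ai@localhost:5432/ai")
print(info)
```

With the URL from the script, this reports the user ai, host localhost, port 5432, and database ai, so a PostgreSQL server must be reachable with those settings for results to be stored.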

Set up your virtual environment

Mac/Linux

uv venv --python 3.12
source .venv/bin/activate

Windows

uv venv --python 3.12
.venv\Scripts\activate

Install dependencies

uv pip install -U openai kern-ai psycopg
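A quick way to confirm the installation is to check that the modules can be found in the active virtual environment. The sketch below uses only the standard library; missing_modules is a hypothetical helper, and the module names openai, kern, and psycopg are assumed from the import lines and package names in this example.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found."""
    return [name for name in module_names if find_spec(name) is None]


# Assumed import names for the packages installed in this step.
required = ["openai", "kern", "psycopg"]
missing = missing_modules(required)

if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

If anything is reported as missing, check that the virtual environment is activated and rerun the install command.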

Export your OpenAI API key

Mac/Linux

export OPENAI_API_KEY="your_openai_api_key_here"

Windows (PowerShell)

$Env:OPENAI_API_KEY="your_openai_api_key_here"
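Since the key is read from the environment, it can be worth checking that it is visible to Python before starting a run. This is a minimal sketch with a hypothetical helper, api_key_is_set; it only checks that the variable is present and non-empty, not that the key is valid.

```python
import os


def api_key_is_set(env=os.environ, name="OPENAI_API_KEY"):
    """Return True if the variable exists and is not blank."""
    return bool(env.get(name, "").strip())


if api_key_is_set():
    print("OPENAI_API_KEY is set.")
else:
    print("OPENAI_API_KEY is not set in this shell.")
```

The variable must be exported in the same shell session that later runs the script.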

Run the evaluation

python accuracy_db_logging.py