Accuracy with Database Logging

This example shows how to store accuracy evaluation results in a PostgreSQL database so they can be tracked and analyzed later.

Create a Python file named accuracy_db_logging.py

"""Example showing how to store evaluation results in the database."""

from typing import Optional

from kern.agent import Agent
from kern.db.postgres.postgres import PostgresDb
from kern.eval.accuracy import AccuracyEval, AccuracyResult
from kern.models.openai import OpenAIResponses
from kern.tools.calculator import CalculatorTools

# Set up the database
db_url = "postgresql+psycopg://ai:ai@localhost:5432/ai"
db = PostgresDb(db_url=db_url, eval_table="eval_runs_cookbook")


evaluation = AccuracyEval(
    db=db,  # Pass the database to the evaluation. Results will be stored in the database.
    name="Calculator Evaluation",
    model=OpenAIResponses(id="gpt-5.2"),
    agent=Agent(
        model=OpenAIResponses(id="gpt-5.2"),
        tools=[CalculatorTools()],
    ),
    input="What is 10*5 then to the power of 2? do it step by step",
    expected_output="2500",
    additional_guidelines="Agent output should include the steps and the final answer.",
    num_iterations=1,
)

result: Optional[AccuracyResult] = evaluation.run(print_results=True)
# Fail if the evaluation returned no result or the average score is below 8
assert result is not None and result.avg_score >= 8
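
The final assertion compares the run's average score with a threshold of 8. As a minimal sketch of that check, assuming the average is the arithmetic mean of per-iteration scores on a 1 to 10 scale (an assumption about the scoring, not something stated above), the hypothetical helper passes_threshold below reproduces the pass/fail logic without calling the library:

```python
from statistics import mean
from typing import Sequence


def passes_threshold(scores: Sequence[float], threshold: float = 8.0) -> bool:
    """Return True when the mean of the per-iteration scores meets the threshold.

    Hypothetical helper: it assumes the evaluation's average score is the
    arithmetic mean of the scores from each iteration.
    """
    if not scores:
        # No scores at all is treated as a failure, similar in spirit to
        # the `result is not None` guard in the script.
        return False
    return mean(scores) >= threshold


if __name__ == "__main__":
    print(passes_threshold([9]))        # a single iteration, as with num_iterations=1
    print(passes_threshold([9, 6, 7]))  # the mean falls below the threshold
```

With more than one iteration, a single low score can pull the mean under the threshold, which is one reason to raise num_iterations when a result looks borderline.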

Set up your virtual environment

Mac/Linux:

uv venv --python 3.12
source .venv/bin/activate

Windows:

uv venv --python 3.12
.venv\Scripts\activate

Install dependencies

uv pip install -U openai kern-ai psycopg
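
The psycopg package appears in this list because the connection URL in the script names it as the driver: the scheme postgresql+psycopg follows the SQLAlchemy-style dialect+driver convention. As a rough illustration using only the standard library, the hypothetical helper describe_db_url below splits such a URL into its parts. It is a way to sanity-check a URL before running the script, not a description of how the library itself parses it:

```python
from urllib.parse import urlsplit


def describe_db_url(db_url: str) -> dict:
    """Split a dialect+driver database URL into its components.

    Hypothetical helper for inspection only; the password is deliberately
    left out of the returned dictionary.
    """
    parts = urlsplit(db_url)
    # The scheme has the form "dialect+driver"; the driver part is optional.
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    print(describe_db_url("postgresql+psycopg://ai:ai@localhost:5432/ai"))
```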

Export your OpenAI API key

Mac/Linux:

export OPENAI_API_KEY="your_openai_api_key_here"

Windows (PowerShell):

$Env:OPENAI_API_KEY="your_openai_api_key_here"
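
Both the agent and the evaluation in the script use an OpenAI model, so a missing key would likely surface only once the run has started. A small pre-flight check can fail earlier with a clearer message. The helper name require_env below is hypothetical, and the sketch uses only the standard library:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the evaluation.")
    return value


if __name__ == "__main__":
    try:
        require_env("OPENAI_API_KEY")
        print("OPENAI_API_KEY is set")
    except RuntimeError as err:
        print(err)
```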

Run the evaluation

python accuracy_db_logging.py