Agent as Judge with Custom Evaluator

Using a custom evaluator agent with specific instructions

This example shows how to pass a custom evaluator agent, configured with its own instructions, to AgentAsJudgeEval so that the judge scores the output against a stricter standard.

Add the following code to your Python file

from kern.agent import Agent
from kern.eval.agent_as_judge import AgentAsJudgeEval
from kern.models.openai import OpenAIResponses

agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"),
    instructions="Explain technical concepts simply.",
)

response = agent.run("What is machine learning?")

# Create a custom evaluator with specific instructions
custom_evaluator = Agent(
    model=OpenAIResponses(id="gpt-5.2"),
    description="Strict technical evaluator",
    instructions="You are a strict evaluator. Only give high scores to exceptionally clear and accurate explanations.",
)

evaluation = AgentAsJudgeEval(
    name="Technical Accuracy",
    criteria="Explanation must be technically accurate and comprehensive",
    scoring_strategy="numeric",
    threshold=8,
    evaluator_agent=custom_evaluator,
)

result = evaluation.run(
    input="What is machine learning?",
    output=str(response.content),
)

print(f"Score: {result.results[0].score}/10")
print(f"Passed: {result.results[0].passed}")
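The last two lines of the example read score and passed from the first result. The sketch below illustrates how a numeric score and the threshold of 8 might relate. It assumes a result passes when its score is at or above the threshold, which the example does not state. JudgeResult, judge, and summarize are hypothetical stand-ins written for this illustration, not part of kern.

```python
from dataclasses import dataclass


@dataclass
class JudgeResult:
    """Hypothetical stand-in for one evaluation result (not a kern class)."""
    score: int
    passed: bool


def judge(score: int, threshold: int = 8) -> JudgeResult:
    # Assumption: a numeric score passes when it meets or exceeds the threshold.
    return JudgeResult(score=score, passed=score >= threshold)


def summarize(results: list[JudgeResult]) -> str:
    # Guard against an empty result list before indexing results[0].
    if not results:
        return "No results returned"
    first = results[0]
    return f"Score: {first.score}/10, Passed: {first.passed}"


if __name__ == "__main__":
    print(summarize([judge(9)]))
    print(summarize([judge(7)]))
```

The empty-list guard is worth carrying over to real code that indexes results[0], since an evaluation that fails before producing a result would otherwise raise an IndexError.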

Set up your virtual environment

macOS/Linux:
uv venv --python 3.12
source .venv/bin/activate

Windows:
uv venv --python 3.12
.venv\Scripts\activate

Install dependencies

uv pip install -U kern-ai openai

Export your OpenAI API key

macOS/Linux:
export OPENAI_API_KEY="your_openai_api_key_here"

Windows (PowerShell):
$Env:OPENAI_API_KEY="your_openai_api_key_here"
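If the key is missing, the failure may only surface once the first model call is made. As an optional sketch, a small check at the top of the script can fail earlier with a clearer message. The helper name require_api_key is hypothetical and uses only the standard library; it is not part of kern or the OpenAI client.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the key from the given environment mapping or raise a clear error."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the example")
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("OPENAI_API_KEY is set")
    except RuntimeError as exc:
        print(exc)
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary instead of the real process environment.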

Run the example

python agent_as_judge_custom_evaluator.py