Getting Started

Install Kern and run your first local agent in 5 minutes.

Kern is a Python framework for building AI agents optimized for small language models (1-7B parameters). It gives you structured outputs, automatic JSON repair, tool use, and workflow orchestration out of the box—all designed to work reliably with models running on your laptop.

Prerequisites

Python 3.9+ (Python 3.11 or 3.12 is recommended for best performance).
An OpenAI-Compatible Model Server (Ollama, llama.cpp, vLLM, or LM Studio running locally).

Supported Local Servers

Kern connects to any server that exposes an OpenAI-compatible chat completions endpoint.

Server	Default Base URL	Example Models	Param Range
Ollama	`http://localhost:11434/v1`	`llama3.2:1b`, `llama3.2:3b`, `phi4-mini`	1-4B
llama.cpp server	`http://localhost:8080/v1`	Any GGUF model	1-7B
vLLM	`http://localhost:8000/v1`	Any HuggingFace model	1-7B
LM Studio	`http://localhost:1234/v1`	Any GGUF from UI	1-7B

Installation

Install Kern from PyPI:

1pip install kern-ai

You can also install optional extras for tools and development dependencies:

1# Install with optional extras
2pip install 'kern-ai[tools]'       # DuckDuckGo, Calculator, and built-in tools
3pip install 'kern-ai[all]'         # Everything including extras and dev tools

Verify your installation:

1import kern
2print(kern.__version__)

Your First Agent

Create a file agent.py to run a basic agent that connects to a local model server (e.g. llama.cpp running on port 8080):

1from kern import Agent
2from kern.models.openai import OpenAIChat
3
4agent = Agent(
5    model=OpenAIChat(
6        id="local-model",
7        base_url="http://localhost:8080/v1",
8    ),
9    description="You are a helpful assistant optimized for concise answers.",
10)
11
12result = agent.run("What is quantum computing?")
13print(result.content)

What just happened?

OpenAIChat connects to your local model server using the standard OpenAI-compatible completions API.
description sets the system prompt. For small models, keep instructions short, specific, and focused on one task.
agent.run() sends the prompt, handles response parsing, and returns a RunOutput containing the response text and usage metadata.

Connecting to Model Providers

Ollama (Local)

Ollama is the easiest way to run small models locally. Pull a model and run:

1from kern import Agent
2from kern.models.openai import OpenAIChat
3
4# Connect to Ollama running a small model
5agent = Agent(
6    model=OpenAIChat(
7        id="llama3.2:3b",
8        base_url="http://localhost:11434/v1",
9    ),
10    description="You are a helpful research assistant.",
11)
12
13result = agent.run("Explain how neural networks learn")
14print(result.content)

Cloud Providers

Kern works with cloud APIs too. Just omit the base_url to use OpenAI's default endpoint, or set the OPENAI_API_KEY environment variable.

1from kern import Agent
2from kern.models.openai import OpenAIChat
3
4agent = Agent(
5    model=OpenAIChat(id="gpt-4o-mini"),
6    description="You are a helpful assistant.",
7)
8
9result = agent.run("Summarize the benefits of small language models")
10print(result.content)

Streaming Responses

For interactive applications, use agent.run_stream() to receive tokens as they arrive:

1from kern import Agent
2from kern.models.openai import OpenAIChat
3
4agent = Agent(
5    model=OpenAIChat(
6        id="local-model",
7        base_url="http://localhost:8080/v1",
8    ),
9    description="You are a creative writing assistant.",
10)
11
12# Stream tokens as they arrive
13for chunk in agent.run_stream("Write a short poem about debugging"):
14    print(chunk.content, end="", flush=True)

Structured Output

Instead of sending raw JSON Schema to the model (which confuses models under 7B parameters), Kern generates fill-in-the-blank templates that the model can complete reliably. Define a Pydantic model and pass it as output_schema:

1from pydantic import BaseModel, Field
2from typing import Literal
3from kern import Agent
4from kern.models.openai import OpenAIChat
5
6class ArticleAnalysis(BaseModel):
7    title: str = Field(description="Article title")
8    summary: str = Field(description="2-3 sentence summary")
9    sentiment: Literal["positive", "negative", "neutral"]
10    confidence: float = Field(description="Confidence score from 0.0 to 1.0")
11    key_topics: list[str] = Field(description="List of 3-5 key topics")
12
13agent = Agent(
14    model=OpenAIChat(
15        id="local-model",
16        base_url="http://localhost:8080/v1",
17    ),
18    description="You are a news analyst.",
19    output_schema=ArticleAnalysis,
20)
21
22result = agent.run("Analyze this article about renewable energy adoption")
23print(result.content.title)
24print(result.content.sentiment)
25print(result.content.key_topics)

If the model produces slightly malformed JSON, Kern's built-in repair engine fixes missing brackets or commentary automatically.

Adding Tools

Pass built-in tools (like web search or a calculator) or custom python functions to the agent:

Custom Python Tools

1from kern import Agent
2from kern.models.openai import OpenAIChat
3from kern.tools import Tool
4
5def get_weather(city: str) -> str:
6    """Get current weather for a city."""
7    return f"Weather in {city}: 22C, partly cloudy"
8
9weather_tool = Tool(
10    name="get_weather",
11    description="Get current weather for a given city",
12    func=get_weather,
13)
14
15agent = Agent(
16    model=OpenAIChat(
17        id="local-model",
18        base_url="http://localhost:8080/v1",
19    ),
20    description="You are a helpful weather assistant.",
21    tools=[weather_tool],
22)
23
24result = agent.run("What is the weather in Tokyo?")
25print(result.content)