Getting Started

Install Kern and run your first local agent in 5 minutes.

Kern is a Python framework for building AI agents optimized for small language models (1-7B parameters). It gives you structured outputs, automatic JSON repair, tool use, and workflow orchestration out of the box—all designed to work reliably with models running on your laptop.

Prerequisites

  1. Python 3.9+ (Python 3.11 or 3.12 is recommended for best performance).
  2. An OpenAI-Compatible Model Server (Ollama, llama.cpp, vLLM, or LM Studio running locally).

Supported Local Servers

Kern connects to any server that exposes an OpenAI-compatible chat completions endpoint.

ServerDefault Base URLExample ModelsParam Range
Ollamahttp://localhost:11434/v1llama3.2:1b, llama3.2:3b, phi4-mini1-4B
llama.cpp serverhttp://localhost:8080/v1Any GGUF model1-7B
vLLMhttp://localhost:8000/v1Any HuggingFace model1-7B
LM Studiohttp://localhost:1234/v1Any GGUF from UI1-7B

Installation

Install Kern from PyPI:

1pip install kern-ai

You can also install optional extras for tools and development dependencies:

1# Install with optional extras
2pip install 'kern-ai[tools]' # DuckDuckGo, Calculator, and built-in tools
3pip install 'kern-ai[all]' # Everything including extras and dev tools

Verify your installation:

1import kern
2print(kern.__version__)

Your First Agent

Create a file agent.py to run a basic agent that connects to a local model server (e.g. llama.cpp running on port 8080):

1from kern import Agent
2from kern.models.openai import OpenAIChat
3
4agent = Agent(
5 model=OpenAIChat(
6 id="local-model",
7 base_url="http://localhost:8080/v1",
8 ),
9 description="You are a helpful assistant optimized for concise answers.",
10)
11
12result = agent.run("What is quantum computing?")
13print(result.content)

What just happened?

  1. OpenAIChat connects to your local model server using the standard OpenAI-compatible completions API.
  2. description sets the system prompt. For small models, keep instructions short, specific, and focused on one task.
  3. agent.run() sends the prompt, handles response parsing, and returns a RunOutput containing the response text and usage metadata.

Connecting to Model Providers

Ollama (Local)

Ollama is the easiest way to run small models locally. Pull a model and run:

1from kern import Agent
2from kern.models.openai import OpenAIChat
3
4# Connect to Ollama running a small model
5agent = Agent(
6 model=OpenAIChat(
7 id="llama3.2:3b",
8 base_url="http://localhost:11434/v1",
9 ),
10 description="You are a helpful research assistant.",
11)
12
13result = agent.run("Explain how neural networks learn")
14print(result.content)

Cloud Providers

Kern works with cloud APIs too. Just omit the base_url to use OpenAI's default endpoint, or set the OPENAI_API_KEY environment variable.

1from kern import Agent
2from kern.models.openai import OpenAIChat
3
4agent = Agent(
5 model=OpenAIChat(id="gpt-4o-mini"),
6 description="You are a helpful assistant.",
7)
8
9result = agent.run("Summarize the benefits of small language models")
10print(result.content)

Streaming Responses

For interactive applications, use agent.run_stream() to receive tokens as they arrive:

1from kern import Agent
2from kern.models.openai import OpenAIChat
3
4agent = Agent(
5 model=OpenAIChat(
6 id="local-model",
7 base_url="http://localhost:8080/v1",
8 ),
9 description="You are a creative writing assistant.",
10)
11
12# Stream tokens as they arrive
13for chunk in agent.run_stream("Write a short poem about debugging"):
14 print(chunk.content, end="", flush=True)

Structured Output

Instead of sending raw JSON Schema to the model (which confuses models under 7B parameters), Kern generates fill-in-the-blank templates that the model can complete reliably. Define a Pydantic model and pass it as output_schema:

1from pydantic import BaseModel, Field
2from typing import Literal
3from kern import Agent
4from kern.models.openai import OpenAIChat
5
6class ArticleAnalysis(BaseModel):
7 title: str = Field(description="Article title")
8 summary: str = Field(description="2-3 sentence summary")
9 sentiment: Literal["positive", "negative", "neutral"]
10 confidence: float = Field(description="Confidence score from 0.0 to 1.0")
11 key_topics: list[str] = Field(description="List of 3-5 key topics")
12
13agent = Agent(
14 model=OpenAIChat(
15 id="local-model",
16 base_url="http://localhost:8080/v1",
17 ),
18 description="You are a news analyst.",
19 output_schema=ArticleAnalysis,
20)
21
22result = agent.run("Analyze this article about renewable energy adoption")
23print(result.content.title)
24print(result.content.sentiment)
25print(result.content.key_topics)

If the model produces slightly malformed JSON, Kern's built-in repair engine fixes missing brackets or commentary automatically.


Adding Tools

Pass built-in tools (like web search or a calculator) or custom python functions to the agent:

Custom Python Tools

1from kern import Agent
2from kern.models.openai import OpenAIChat
3from kern.tools import Tool
4
5def get_weather(city: str) -> str:
6 """Get current weather for a city."""
7 return f"Weather in {city}: 22C, partly cloudy"
8
9weather_tool = Tool(
10 name="get_weather",
11 description="Get current weather for a given city",
12 func=get_weather,
13)
14
15agent = Agent(
16 model=OpenAIChat(
17 id="local-model",
18 base_url="http://localhost:8080/v1",
19 ),
20 description="You are a helpful weather assistant.",
21 tools=[weather_tool],
22)
23
24result = agent.run("What is the weather in Tokyo?")
25print(result.content)