Getting Started
Install Kern and run your first local agent in 5 minutes.
Kern is a Python framework for building AI agents optimized for small language models (1-7B parameters). It gives you structured outputs, automatic JSON repair, tool use, and workflow orchestration out of the box—all designed to work reliably with models running on your laptop.
Prerequisites
- Python 3.9+ (Python 3.11 or 3.12 is recommended for best performance).
- An OpenAI-Compatible Model Server (Ollama, llama.cpp, vLLM, or LM Studio running locally).
Supported Local Servers
Kern connects to any server that exposes an OpenAI-compatible chat completions endpoint.
| Server | Default Base URL | Example Models | Param Range |
|---|---|---|---|
| Ollama | http://localhost:11434/v1 | llama3.2:1b, llama3.2:3b, phi4-mini | 1-4B |
| llama.cpp server | http://localhost:8080/v1 | Any GGUF model | 1-7B |
| vLLM | http://localhost:8000/v1 | Any HuggingFace model | 1-7B |
| LM Studio | http://localhost:1234/v1 | Any GGUF from UI | 1-7B |
Installation
Install Kern from PyPI:
1pip install kern-aiYou can also install optional extras for tools and development dependencies:
1# Install with optional extras2pip install 'kern-ai[tools]' # DuckDuckGo, Calculator, and built-in tools3pip install 'kern-ai[all]' # Everything including extras and dev toolsVerify your installation:
1import kern2print(kern.__version__)Your First Agent
Create a file agent.py to run a basic agent that connects to a local model server (e.g. llama.cpp running on port 8080):
1from kern import Agent2from kern.models.openai import OpenAIChat34agent = Agent(5 model=OpenAIChat(6 id="local-model",7 base_url="http://localhost:8080/v1",8 ),9 description="You are a helpful assistant optimized for concise answers.",10)1112result = agent.run("What is quantum computing?")13print(result.content)What just happened?
OpenAIChatconnects to your local model server using the standard OpenAI-compatible completions API.descriptionsets the system prompt. For small models, keep instructions short, specific, and focused on one task.agent.run()sends the prompt, handles response parsing, and returns aRunOutputcontaining the response text and usage metadata.
Connecting to Model Providers
Ollama (Local)
Ollama is the easiest way to run small models locally. Pull a model and run:
1from kern import Agent2from kern.models.openai import OpenAIChat34# Connect to Ollama running a small model5agent = Agent(6 model=OpenAIChat(7 id="llama3.2:3b",8 base_url="http://localhost:11434/v1",9 ),10 description="You are a helpful research assistant.",11)1213result = agent.run("Explain how neural networks learn")14print(result.content)Cloud Providers
Kern works with cloud APIs too. Just omit the base_url to use OpenAI's default endpoint, or set the OPENAI_API_KEY environment variable.
1from kern import Agent2from kern.models.openai import OpenAIChat34agent = Agent(5 model=OpenAIChat(id="gpt-4o-mini"),6 description="You are a helpful assistant.",7)89result = agent.run("Summarize the benefits of small language models")10print(result.content)Streaming Responses
For interactive applications, use agent.run_stream() to receive tokens as they arrive:
1from kern import Agent2from kern.models.openai import OpenAIChat34agent = Agent(5 model=OpenAIChat(6 id="local-model",7 base_url="http://localhost:8080/v1",8 ),9 description="You are a creative writing assistant.",10)1112# Stream tokens as they arrive13for chunk in agent.run_stream("Write a short poem about debugging"):14 print(chunk.content, end="", flush=True)Structured Output
Instead of sending raw JSON Schema to the model (which confuses models under 7B parameters), Kern generates fill-in-the-blank templates that the model can complete reliably. Define a Pydantic model and pass it as output_schema:
1from pydantic import BaseModel, Field2from typing import Literal3from kern import Agent4from kern.models.openai import OpenAIChat56class ArticleAnalysis(BaseModel):7 title: str = Field(description="Article title")8 summary: str = Field(description="2-3 sentence summary")9 sentiment: Literal["positive", "negative", "neutral"]10 confidence: float = Field(description="Confidence score from 0.0 to 1.0")11 key_topics: list[str] = Field(description="List of 3-5 key topics")1213agent = Agent(14 model=OpenAIChat(15 id="local-model",16 base_url="http://localhost:8080/v1",17 ),18 description="You are a news analyst.",19 output_schema=ArticleAnalysis,20)2122result = agent.run("Analyze this article about renewable energy adoption")23print(result.content.title)24print(result.content.sentiment)25print(result.content.key_topics)If the model produces slightly malformed JSON, Kern's built-in repair engine fixes missing brackets or commentary automatically.
Adding Tools
Pass built-in tools (like web search or a calculator) or custom python functions to the agent:
Custom Python Tools
1from kern import Agent2from kern.models.openai import OpenAIChat3from kern.tools import Tool45def get_weather(city: str) -> str:6 """Get current weather for a city."""7 return f"Weather in {city}: 22C, partly cloudy"89weather_tool = Tool(10 name="get_weather",11 description="Get current weather for a given city",12 func=get_weather,13)1415agent = Agent(16 model=OpenAIChat(17 id="local-model",18 base_url="http://localhost:8080/v1",19 ),20 description="You are a helpful weather assistant.",21 tools=[weather_tool],22)2324result = agent.run("What is the weather in Tokyo?")25print(result.content)