Models

A Model in Kern wraps any LLM provider into a uniform interface.

A Model in Kern wraps any LLM provider into a uniform interface. Pass it to an Agent and switch providers without changing your application code.

Kern is heavily optimized for small language models (1-7B parameters) that run locally on your own hardware for zero cost, full privacy, and low latency.


Why small models?

Models under 7 billion parameters (like Llama 3.2 3B or Phi-4 Mini) run efficiently on consumer laptops and cost nothing to operate. They are highly capable at structured extraction, classification, summarization, and basic tool use.

Kern's template-based structured output, JSON repair, and prompt tuning are specifically designed to maximize the reliability of these resource-constrained models.


The Model Concept

Every model in Kern is a Python class that wraps a specific LLM provider's API. Regardless of the provider, the model object exposes the same methods, making your agent code fully portable:

1from kern.models.openai import OpenAIChat
2from kern import Agent
3
4# Define the model — e.g. pointing to a local model server
5model = OpenAIChat(
6 id="local-model",
7 base_url="http://localhost:8080/v1",
8 temperature=0.3,
9)
10
11agent = Agent(
12 model=model,
13 description="You are a helpful assistant.",
14)
15
16result = agent.run("Explain transformers in one sentence")
17print(result.content)

Supported Providers

Local Models (Recommended)

Run models on your own hardware for full privacy and zero API costs. Any OpenAI-compatible endpoint works out of the box:

laptop

Ollama

Install Ollama, pull a model (ollama pull llama3.2:3b), and import Ollama or connect via OpenAIChat.

server

llama.cpp / vLLM / LM Studio

Connect to any local OpenAI-compatible endpoint by setting base_url.

1# Ollama example
2from kern.models.ollama import Ollama
3model = Ollama(id="llama3.2:3b")
4
5# OpenAI-Compatible local server (llama.cpp)
6from kern.models.openai import OpenAIChat
7model = OpenAIChat(
8 id="local-model",
9 base_url="http://localhost:8080/v1",
10)

Cloud Providers

Kern also supports cloud-based model providers when you need models larger than 7B:

  • OpenAI (e.g. gpt-4o-mini — best-in-class small cloud model)
  • Anthropic (e.g. claude-3-5-sonnet)
  • Google (e.g. gemini-2.0-flash)
  • Groq / Together AI / Fireworks AI (High throughput cloud endpoints for open-source models)

Recommended Models

ModelIDParametersBest For
Llama 3.2 3Bllama3.2:3b3BGeneral purpose local tasks
Phi-4 Miniphi4-mini3.8BReasoning-heavy tasks, local coding
Llama 3.2 1Bllama3.2:1b1BUltra-fast local classification / extraction
GPT-4o Minigpt-4o-mini~8BCloud primary, tool use, and structured outputs

Model Shorthand (String Syntax)

For quick prototyping, you can pass a model identifier string directly to the agent instead of importing the model class:

1from kern import Agent
2
3# Kern automatically infers the model provider from the string format
4agent = Agent(
5 model="gpt-4o-mini",
6 description="You are a helpful assistant."
7)
8
9# Explicit provider shorthand
10agent_ollama = Agent(model="ollama:llama3.2:3b")
11agent_together = Agent(model="together:meta-llama/Llama-3.2-3B-Instruct-Turbo")

Fallback Models (Resilience)

If your primary model fails due to rate limits or outages, Kern can automatically failover to secondary models:

1from kern import Agent
2from kern.models.openai import OpenAIChat
3from kern.models.ollama import Ollama
4
5agent = Agent(
6 model=OpenAIChat(id="gpt-4o-mini"),
7 fallback_models=[
8 Ollama(id="llama3.2:3b"), # Fallback to local model if OpenAI fails
9 ],
10 description="You are a resilient assistant.",
11)