Reasoning Models

Reasoning models are a class of large language models trained to think before they answer: they produce a long internal chain of thought before responding.

Examples of reasoning models include:

OpenAI o1-pro and gpt-5-mini
Claude 3.7 Sonnet in extended-thinking mode
Gemini 2.0 Flash Thinking
DeepSeek-R1

Reasoning models deeply consider and think through a plan before taking action. It's all about what the model does before it starts generating a response. Reasoning models excel at single-shot use cases: they are well suited to hard problems (coding, math, physics) that don't require multiple turns or sequential tool calls.

Examples

gpt-5-mini

from kern.agent import Agent
from kern.models.openai import OpenAIResponses

# Set up your Agent using a reasoning model
agent = Agent(model=OpenAIResponses(id="gpt-5-mini"))

# Run the Agent
agent.print_response(
    "Solve the trolley problem. Evaluate multiple ethical frameworks. Include an ASCII diagram of your solution.",
    stream=True,
    show_full_reasoning=True,
)

gpt-5-mini with tools

from kern.agent import Agent
from kern.models.openai import OpenAIResponses
from kern.tools.hackernews import HackerNewsTools

# Set up your Agent using a reasoning model
agent = Agent(
    model=OpenAIResponses(id="gpt-5-mini"),
    tools=[HackerNewsTools()],
    markdown=True,
)

# Run the Agent
agent.print_response("What is the best basketball team in the NBA this year?", stream=True)

gpt-5-mini with reasoning effort

from kern.agent import Agent
from kern.models.openai import OpenAIResponses
from kern.tools.hackernews import HackerNewsTools

# Set up your Agent using a reasoning model with high reasoning effort
agent = Agent(
    model=OpenAIResponses(id="gpt-5-mini", reasoning_effort="high"),
    tools=[HackerNewsTools()],
    markdown=True,
)

# Run the Agent
agent.print_response("What is the best basketball team in the NBA this year?", stream=True)

DeepSeek-R1 using Groq

from kern.agent import Agent
from kern.models.groq import Groq

# Set up your Agent using a reasoning model
agent = Agent(
    model=Groq(
        id="deepseek-r1-distill-llama-70b", temperature=0.6, max_tokens=1024, top_p=0.95
    ),
    markdown=True,
)

# Run the Agent
agent.print_response("9.11 and 9.9 -- which is bigger?", stream=True)

Reasoning Model + Response Model

When you run the DeepSeek-R1 Agent above, you'll notice that the response is not that great. This is because DeepSeek-R1 is good at solving problems but weaker at responding in a natural way, something models like Claude Sonnet or GPT-4.5 do well.

To solve this problem, Kern supports using separate models for reasoning and response generation. This approach leverages a reasoning model for problem-solving while using a different model optimized for natural language responses, combining the strengths of both.

DeepSeek-R1 + Claude Sonnet

from kern.agent import Agent
from kern.models.anthropic import Claude
from kern.models.groq import Groq

# Set up your Agent using an extra reasoning model
deepseek_plus_claude = Agent(
    model=Claude(id="claude-sonnet-4-5"),
    reasoning_model=Groq(
        id="deepseek-r1-distill-llama-70b", temperature=0.6, max_tokens=1024, top_p=0.95
    ),
)

# Run the Agent
deepseek_plus_claude.print_response("9.11 and 9.9 -- which is bigger?", stream=True)

Streaming Reasoning Content

When using a `reasoning_model`, you can stream the reasoning content as it's being generated. This allows you to see the model's thought process in real time.

To enable streaming reasoning, set `stream=True` and `stream_events=True` when running the agent:

from kern.agent import Agent
from kern.models.anthropic import Claude

# Create an agent with a reasoning model
agent = Agent(
    reasoning_model=Claude(
        id="claude-sonnet-4-5",
        thinking={"type": "enabled", "budget_tokens": 1024},
    ),
    reasoning=True,
    instructions="Think step by step about the problem.",
)

# Stream the response with reasoning events
agent.print_response(
    "What is 25 * 37? Show your reasoning.",
    stream=True,
    stream_events=True,
)

Capturing Reasoning Events

You can also capture individual reasoning events. This gives you fine-grained control over how reasoning content is displayed:

from kern.agent import Agent
from kern.models.anthropic import Claude
from kern.run.agent import RunEvent

agent = Agent(
    reasoning_model=Claude(
        id="claude-sonnet-4-5",
        thinking={"type": "enabled", "budget_tokens": 1024},
    ),
    reasoning=True,
    instructions="Think step by step about the problem.",
)

for run_output_event in agent.run(
    "What is 25 * 37? Show your reasoning.",
    stream=True,
    stream_events=True,
):
    if run_output_event.event == RunEvent.run_started:
        print(f"EVENT: {run_output_event.event}")
    elif run_output_event.event == RunEvent.reasoning_started:
        print(f"EVENT: {run_output_event.event}")
        print("Reasoning started...\n")
    elif run_output_event.event == RunEvent.reasoning_content_delta:
        # Stream reasoning content as it's being generated
        print(run_output_event.reasoning_content, end="", flush=True)
    elif run_output_event.event == RunEvent.run_content:
        if run_output_event.content:
            print(run_output_event.content, end="", flush=True)
    elif run_output_event.event == RunEvent.run_completed:
        print(f"EVENT: {run_output_event.event}")

The key events for streaming reasoning are:

Event	Description
`RunEvent.reasoning_started`	Emitted when reasoning begins
`RunEvent.reasoning_content_delta`	Emitted for each chunk of reasoning content as it streams
`RunEvent.run_content`	Emitted for the final response content
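
As a rough illustration of how the events in the table might be consumed, the sketch below collects reasoning text and the final answer into separate buffers instead of printing them as they arrive. So that it runs without an API key, it uses hand-written stand-in events and plain string event names rather than Kern's real `RunEvent` values. `FakeEvent`, `split_reasoning_and_answer`, and the string constants are assumptions made for this example and are not part of Kern.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Hypothetical stand-ins for the event names listed in the table above.
REASONING_STARTED = "reasoning_started"
REASONING_CONTENT_DELTA = "reasoning_content_delta"
RUN_CONTENT = "run_content"


@dataclass
class FakeEvent:
    """Hypothetical event shape mirroring the attributes read in the loop above."""

    event: str
    reasoning_content: Optional[str] = None
    content: Optional[str] = None


def split_reasoning_and_answer(events: Iterable[FakeEvent]) -> Tuple[str, str]:
    """Collect reasoning chunks and answer chunks into two separate strings."""
    reasoning_parts = []
    answer_parts = []
    for ev in events:
        if ev.event == REASONING_CONTENT_DELTA and ev.reasoning_content:
            reasoning_parts.append(ev.reasoning_content)
        elif ev.event == RUN_CONTENT and ev.content:
            answer_parts.append(ev.content)
        # Events without text (e.g. reasoning_started) are skipped.
    return "".join(reasoning_parts), "".join(answer_parts)


# A hand-written stream standing in for a real streamed agent run.
sample_stream = [
    FakeEvent(event=REASONING_STARTED),
    FakeEvent(event=REASONING_CONTENT_DELTA, reasoning_content="25 * 37 = 25 * 40 - 25 * 3 "),
    FakeEvent(event=REASONING_CONTENT_DELTA, reasoning_content="= 1000 - 75 = 925"),
    FakeEvent(event=RUN_CONTENT, content="25 * 37 = "),
    FakeEvent(event=RUN_CONTENT, content="925"),
]

reasoning, answer = split_reasoning_and_answer(sample_stream)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

Keeping the two buffers apart makes it straightforward to log or hide the reasoning while showing only the answer to the end user. With a real agent, the same loop body would presumably compare against the `RunEvent` members shown in the table.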
