Session Summaries

Automatically condense long conversations into concise summaries

As conversations grow longer, passing the entire chat history to your LLM becomes expensive and slow. Session summaries solve this by automatically condensing conversations into concise summaries that capture the key points.

Think of it like taking notes during a long meeting: you don't need a transcript of everything said, just the important bits.

The Problem: Growing Token Costs

Without summaries, every message adds to your context window:

Run 1: 100 tokens
Run 2: 250 tokens (100 history + 150 new)
Run 3: 450 tokens (250 history + 200 new)
Run 4: 750 tokens (450 history + 300 new)
...unbounded growth: every run resends the full history

This quickly becomes expensive and hits context limits.
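The arithmetic above can be reproduced with a few lines of Python. This is only an illustrative sketch: the per-run token counts are the ones from the example, and the variable names are made up for this snippet.

```python
from itertools import accumulate

# New tokens added by each run (taken from the example above).
new_tokens_per_run = [100, 150, 200, 300]

# Without summaries, each run resends everything that came before it,
# so the context size is the running total of all new tokens so far.
context_sizes = list(accumulate(new_tokens_per_run))

# Total tokens sent across all four runs.
total_tokens_sent = sum(context_sizes)

print(context_sizes)      # [100, 250, 450, 750]
print(total_tokens_sent)  # 1550
```

Because every run pays for all earlier runs again, the total sent grows much faster than the number of new tokens (1550 sent for 750 tokens of actual conversation).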

The Solution: Automatic Summaries

Session summaries condense your history:

Run 1: 100 tokens
Run 2: 250 tokens
[Summary created: 50 tokens]
Run 3: 250 tokens (50 summary + 200 new)
Run 4: 350 tokens (50 summary + 300 new)
...per-run context stays roughly flat, so total cost grows linearly
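A small sketch of the same arithmetic, assuming (as the example does) that the summary is created after run 2 and stays at about 50 tokens as later runs are folded into it. The function name and parameters are hypothetical and only model the numbers, not how Kern builds its context.

```python
from typing import List


def context_sizes_with_summary(
    new_tokens: List[int], summary_tokens: int, summarize_after: int
) -> List[int]:
    """Context size per run when a summary replaces history after a given run."""
    sizes = []
    history = 0
    for run_index, new in enumerate(new_tokens, start=1):
        if run_index <= summarize_after:
            # No summary yet: the full history is resent.
            history += new
            sizes.append(history)
        else:
            # Summary exists: send the summary plus the new tokens only.
            sizes.append(summary_tokens + new)
    return sizes


sizes = context_sizes_with_summary(
    [100, 150, 200, 300], summary_tokens=50, summarize_after=2
)
print(sizes)       # [100, 250, 250, 350]
print(sum(sizes))  # 950
```

Under these assumptions the four runs send 950 tokens in total, and each later run costs only the summary plus its own new tokens.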

Benefits:

✅ Dramatically reduced token costs
✅ Avoid context window limits
✅ Maintain conversation continuity
✅ Automatic creation and updates

How It Works

Session summaries follow a simple three-step pattern:

Enable Summary Generation

Set enable_session_summaries=True on your agent or team. Summaries are automatically created and updated after runs when there are meaningful messages to summarize, then stored in your database.

Use Summaries in Context

Set add_session_summary_to_context=True to include the summary in your messages (this is enabled by default if you enable session summary generation). Instead of sending dozens of historical messages, only the condensed summary is sent, dramatically reducing tokens while maintaining context.

Customize (Optional)

Use SessionSummaryManager to control summary generation: choose a cheaper model, customize prompts, or change the summary format. This lets you optimize costs by using a lightweight model for summaries while keeping your main agent powerful.
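The three steps above can be pictured with a library-independent toy model. Everything in this sketch is hypothetical (run_turn, naive_summarize, and the plain dict standing in for the database); it illustrates the pattern of "summarize after a run, reuse the summary in the next run" and is not Kern's implementation.

```python
from typing import Callable, Dict, List


def naive_summarize(previous: str, message: str) -> str:
    """Hypothetical summarizer: keeps the first five words of each message.

    A real summarizer would call a model; this stand-in is the swappable part.
    """
    gist = " ".join(message.split()[:5])
    return f"{previous}; {gist}" if previous else gist


def run_turn(
    summaries: Dict[str, str],
    session_id: str,
    user_message: str,
    summarize: Callable[[str, str], str],
) -> List[str]:
    """Build the context for one run, then update the stored summary."""
    previous = summaries.get(session_id, "")
    # Use the stored summary in context instead of the full history.
    context = ([f"Summary: {previous}"] if previous else []) + [user_message]
    # After the run, fold the new message into the rolling summary and store it.
    summaries[session_id] = summarize(previous, user_message)
    return context


store: Dict[str, str] = {}
first = run_turn(store, "conversation_123", "Hi my name is John and I live in New York", naive_summarize)
second = run_turn(store, "conversation_123", "What is my name?", naive_summarize)
print(second)
```

The second run receives the stored summary plus the new message rather than the whole earlier exchange, which is the effect the two flags are meant to produce.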

Enable Session Summaries

Set enable_session_summaries=True to have Kern maintain a rolling summary for each session. Summaries sit alongside the stored history and can be reused later to save tokens.

from kern.agent import Agent
from kern.db.postgres import PostgresDb
from kern.models.openai import OpenAIResponses

agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"),
    db=PostgresDb(db_url="postgresql+psycopg://ai:ai@localhost:5532/ai"),
    enable_session_summaries=True,
)

agent.print_response("Hi my name is John and I live in New York", session_id="conversation_123")

# Retrieve the summary
summary = agent.get_session_summary(session_id="conversation_123")
if summary:
    print(summary.summary, summary.topics)

from kern.team import Team
from kern.db.postgres import PostgresDb
from kern.models.openai import OpenAIResponses

team = Team(
    model=OpenAIResponses(id="gpt-5.2"),
    db=PostgresDb(db_url="postgresql+psycopg://ai:ai@localhost:5532/ai"),
    enable_session_summaries=True,
)

team.print_response("Hi my name is John and I live in New York", session_id="conversation_123")

# Retrieve the summary
summary = team.get_session_summary(session_id="conversation_123")
if summary:
    print(summary.summary, summary.topics)

Customizing Generation

Provide a SessionSummaryManager to specify a cheaper model or custom prompt
Run summary generation out-of-band by instantiating a lightweight Agent that just calls get_session_summary across all sessions
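As a rough sketch of the out-of-band idea, the helper below loops over session IDs and calls get_session_summary on whatever agent it is given. Only get_session_summary(session_id=...) and the summary attribute come from the examples on this page; collect_session_summaries, StandInAgent, and the way session IDs are supplied are assumptions made so the snippet runs without a database.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, Optional


def collect_session_summaries(
    agent: Any, session_ids: Iterable[str]
) -> Dict[str, Optional[str]]:
    """Call get_session_summary for each session and keep the summary text."""
    results: Dict[str, Optional[str]] = {}
    for session_id in session_ids:
        summary = agent.get_session_summary(session_id=session_id)
        # Sessions without a summary are recorded as None.
        results[session_id] = summary.summary if summary else None
    return results


class StandInAgent:
    """Hypothetical stand-in for an Agent, used only to make the sketch runnable."""

    def get_session_summary(self, session_id: str):
        if session_id == "conversation_123":
            return SimpleNamespace(summary="John lives in New York", topics=["introduction"])
        return None


collected = collect_session_summaries(
    StandInAgent(), ["conversation_123", "conversation_456"]
)
print(collected)
```

In a real setup you would pass an Agent configured with your database in place of StandInAgent and supply the session IDs from wherever your application tracks them.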

Use Summary in Context

When enable_session_summaries=True, add_session_summary_to_context defaults to True. To use previously stored summaries in context without generating new ones, set add_session_summary_to_context=True on its own. To keep generating summaries but leave them out of the context, set add_session_summary_to_context=False.

from kern.agent import Agent
from kern.db.postgres import PostgresDb
from kern.models.openai import OpenAIResponses

db = PostgresDb(db_url="postgresql+psycopg://ai:ai@localhost:5532/ai")

agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"),
    db=db,
    add_session_summary_to_context=True,
)

agent.print_response("Hi my name is John and I live in New York", session_id="conversation_123")

from kern.team import Team
from kern.db.postgres import PostgresDb
from kern.models.openai import OpenAIResponses

db = PostgresDb(db_url="postgresql+psycopg://ai:ai@localhost:5532/ai")

team = Team(
    model=OpenAIResponses(id="gpt-5.2"),
    db=db,
    add_session_summary_to_context=True,
)

team.print_response("Hi my name is John and I live in New York", session_id="conversation_123")

Kern automatically loads the latest summary from storage before each run. You can still mix in recent history:

# Assumes Agent, OpenAIResponses, and db are defined as in the previous example
agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"),
    db=db,
    add_session_summary_to_context=True,
    add_history_to_context=True,
    num_history_runs=2,  # Summary for long-term memory, last 2 runs for detail
)

# Assumes Team, OpenAIResponses, and db are defined as in the previous example
team = Team(
    model=OpenAIResponses(id="gpt-5.2"),
    db=db,
    add_session_summary_to_context=True,
    add_history_to_context=True,
    num_history_runs=2,  # Summary for long-term memory, last 2 runs for detail
)
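To make the "summary plus recent history" combination concrete, here is a minimal sketch of how such a context could be assembled. The build_context function and its message format are hypothetical and only illustrate the idea; Kern's actual message construction may differ.

```python
from typing import List, Optional


def build_context(
    summary: Optional[str],
    runs: List[List[str]],
    num_history_runs: int,
    new_message: str,
) -> List[str]:
    """Combine a stored summary, the last N runs, and the new message."""
    context: List[str] = []
    if summary:
        # Long-term memory: one condensed entry instead of the full history.
        context.append(f"Summary of earlier conversation: {summary}")
    # Short-term detail: only the most recent runs are included verbatim.
    recent_runs = runs[-num_history_runs:] if num_history_runs > 0 else []
    for run in recent_runs:
        context.extend(run)
    context.append(new_message)
    return context


runs = [["u1", "a1"], ["u2", "a2"], ["u3", "a3"]]
context = build_context("John lives in New York", runs, num_history_runs=2, new_message="u4")
print(context)
```

With num_history_runs=2, the first run is represented only through the summary, while the last two runs are passed through word for word.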

When to Use Session Summaries

✅ Perfect for:

Long-running customer support conversations
Multi-day or multi-week interactions
Conversations with 10+ turns
Production systems where cost matters

⚠️ Consider alternatives for:

Short conversations (fewer than 5 turns)
When full detail is critical
Real-time chat that only needs recent context