Production Best Practices

Avoid common pitfalls, optimize costs, and ensure reliable memory behavior in production.

Memory is powerful, but without careful configuration, it can lead to unexpected token consumption, behavioral issues, and high costs. This guide shows you what to watch out for and how to optimize your memory usage for production.

Quick Reference

Default to automatic memory (update_memory_on_run=True) unless you have a specific reason for agentic control
Always provide user_id; don't rely on the default "default" user
Use cheaper models for memory operations when using agentic memory
Implement pruning for long-running applications
Monitor token usage in production to catch memory-related cost spikes
Test with realistic data: 100+ memories behave very differently than 5 memories

The Agentic Memory Token Trap

The Problem: When you use enable_agentic_memory=True, every memory operation triggers a separate, nested LLM call. This architecture can cause token usage to explode, especially as memories accumulate.

Here's what happens under the hood:

1. User sends a message → Main LLM call processes it
2. Agent decides to update memory → Calls update_user_memory tool
3. Nested LLM call fires with:
   - Detailed system prompt (~50 lines)
   - ALL existing user memories loaded into context
   - Memory management instructions and tools
4. Memory LLM makes tool calls (add, update, delete)
5. Control returns to main conversation

Real-world impact:

from kern.agent import Agent
from kern.models.openai import OpenAIResponses

# Scenario: user with 100 existing memories (db is your configured database)
agent = Agent(
    db=db,
    enable_agentic_memory=True,
    model=OpenAIResponses(id="gpt-5.2"),
)

# 10-message conversation where the agent updates memory 7 times:
# Normal conversation: 10 × 500 tokens = 5,000 tokens
# With agentic memory: (10 × 500) + (7 × 5,000) = 40,000 tokens
# Result: 8x the token usage of the same conversation without memory updates

As memories accumulate, each memory operation gets more expensive. With 200 memories, a single memory update could consume 10,000+ tokens just loading context.
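The arithmetic above can be reproduced with a small estimator. This is a rough sketch, not a measurement: it assumes each stored memory adds about 50 tokens to the nested call (the figure implied by 100 memories costing roughly 5,000 tokens) and ignores the fixed system-prompt overhead unless you pass one in. The function names are hypothetical helpers, not part of the library.

```python
def memory_update_tokens(num_memories: int,
                         tokens_per_memory: int = 50,
                         prompt_overhead: int = 0) -> int:
    """Estimate tokens for one nested memory call.

    Assumes cost grows linearly with the number of stored memories,
    since all of them are loaded into the nested call's context.
    """
    return prompt_overhead + num_memories * tokens_per_memory


def conversation_tokens(messages: int,
                        tokens_per_message: int,
                        memory_updates: int,
                        num_memories: int) -> int:
    """Estimate total tokens for a conversation with agentic memory updates."""
    base = messages * tokens_per_message
    memory = memory_updates * memory_update_tokens(num_memories)
    return base + memory


# The scenario from this guide: 10 messages, 7 memory updates, 100 memories.
baseline = conversation_tokens(10, 500, memory_updates=0, num_memories=100)
agentic = conversation_tokens(10, 500, memory_updates=7, num_memories=100)
print(baseline, agentic, agentic / baseline)
```

Plugging in your own message sizes and memory counts gives a quick sanity check before enabling agentic memory for long-lived users.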

Mitigation Strategy #1: Use Automatic Memory

For most use cases, automatic memory is the better choice. It processes memories once after the conversation instead of firing a nested LLM call for each update:

# Recommended: Single memory processing after conversation
agent = Agent(
    db=db,
    update_memory_on_run=True  # Processes memories once at end
)

# Only use agentic memory when you specifically need:
# - Real-time memory updates during conversation
# - User-directed memory commands ("forget my address")
# - Complex memory reasoning within the conversation flow

Mitigation Strategy #2: Use a Cheaper Model for Memory Operations

If you do need agentic memory, use a less expensive model for memory management while keeping a powerful model for conversation:

from kern.agent import Agent
from kern.memory import MemoryManager
from kern.models.openai import OpenAIResponses

# Cheaper model for memory operations.
# "gpt-5-mini" is a placeholder id; substitute whichever smaller model you use.
memory_manager = MemoryManager(
    db=db,
    model=OpenAIResponses(id="gpt-5-mini"),
)

# More capable (and more expensive) model for the main conversation
agent = Agent(
    db=db,
    model=OpenAIResponses(id="gpt-5.2"),
    memory_manager=memory_manager,
    enable_agentic_memory=True,
)

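The savings scale with the price gap between the two models: a memory model that is 60x cheaper per token cuts memory-related costs by about 98%, while the main model keeps conversation quality unchanged.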
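To see how a cheaper memory model affects the whole bill rather than just the memory portion, the following sketch prices the earlier 10-message scenario (5,000 conversation tokens, 35,000 memory tokens). The prices are hypothetical units chosen only to give a 60x gap; real per-token prices vary by provider and model.

```python
def run_cost(main_tokens: int, memory_tokens: int,
             main_price: float, memory_price: float) -> float:
    """Total cost in arbitrary price units (prices are per token)."""
    return main_tokens * main_price + memory_tokens * memory_price


MAIN_PRICE = 60.0   # hypothetical units per token for the main model
CHEAP_PRICE = 1.0   # hypothetical: a memory model 60x cheaper

# Same model for everything vs. a cheaper model for memory operations.
same_model = run_cost(5_000, 35_000, MAIN_PRICE, MAIN_PRICE)
split_models = run_cost(5_000, 35_000, MAIN_PRICE, CHEAP_PRICE)

memory_saving = 1 - CHEAP_PRICE / MAIN_PRICE      # saving on memory tokens only
total_saving = 1 - split_models / same_model      # saving on the whole conversation

print(f"memory cost reduction: {memory_saving:.1%}")
print(f"total cost reduction:  {total_saving:.1%}")
```

Under these assumptions the memory portion drops by about 98%, while the total drops by about 86%, because the main conversation is still billed at the higher rate.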

Mitigation Strategy #3: Guide Memory Behavior with Instructions

Add explicit instructions to prevent frivolous memory updates:

agent = Agent(
    db=db,
    enable_agentic_memory=True,
    instructions=[
        "Only update memories when users share significant new information.",
        "Don't create memories for casual conversation or temporary states.",
        "Batch multiple memory updates together when possible."
    ]
)

Mitigation Strategy #4: Implement Memory Pruning

Prevent memory bloat by periodically cleaning up old or irrelevant memories:

from datetime import datetime, timedelta

def prune_old_memories(db, user_id, days=90):
    """Remove memories last updated more than `days` days ago (default: 90)."""
    # Assumes updated_at is a Unix timestamp in seconds
    cutoff_timestamp = int((datetime.now() - timedelta(days=days)).timestamp())

    memories = db.get_user_memories(user_id=user_id)
    for memory in memories:
        if memory.updated_at and memory.updated_at < cutoff_timestamp:
            db.delete_user_memory(memory_id=memory.memory_id)

# Run periodically or before high-cost operations
prune_old_memories(db, user_id="john_doe@example.com")
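Age-based pruning does not bound the total number of memories for a very active user. A count cap is one possible complement, sketched below. `MemoryRecord` is a hypothetical stand-in carrying only the two fields used in the example above (`memory_id`, `updated_at`), and `ids_to_prune` is an illustrative helper, not a library function.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MemoryRecord:
    """Hypothetical stand-in for a stored memory."""
    memory_id: str
    updated_at: Optional[int]  # Unix timestamp in seconds, or None if unknown


def ids_to_prune(memories: List[MemoryRecord], keep: int) -> List[str]:
    """Return the ids of all but the `keep` most recently updated memories."""
    if keep < 0:
        raise ValueError("keep must be non-negative")
    # Memories without a timestamp are treated as the oldest.
    ordered = sorted(memories, key=lambda m: m.updated_at or 0, reverse=True)
    return [m.memory_id for m in ordered[keep:]]


records = [
    MemoryRecord("a", 300),
    MemoryRecord("b", 100),
    MemoryRecord("c", None),
    MemoryRecord("d", 200),
]
stale = ids_to_prune(records, keep=2)
print(stale)
```

The returned ids could then be passed one by one to `db.delete_user_memory(memory_id=...)`, as in the age-based example.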

Mitigation Strategy #5: Set Tool Call Limits

Prevent runaway memory operations by limiting tool calls per conversation:

agent = Agent(
    db=db,
    enable_agentic_memory=True,
    tool_call_limit=5  # Prevents excessive memory operations
)

Common Pitfalls

The user_id Pitfall

The Problem: Forgetting to set user_id causes all memories to default to user_id="default", mixing different users' memories together.

# ❌ Bad: All users share the same memories
agent.print_response("I love pizza")
agent.print_response("I'm allergic to dairy")

# ✅ Good: Each user has isolated memories
agent.print_response("I love pizza", user_id="user_123")
agent.print_response("I'm allergic to dairy", user_id="user_456")

Best practice: Always pass user_id explicitly, especially in multi-user applications.
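The mixing effect is easy to reproduce with a toy in-memory store. This is only an illustration of the fallback behavior described above, not the library's storage layer; `ToyMemoryStore` and `require_user_id` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class ToyMemoryStore:
    """Toy store that falls back to the "default" user when user_id is missing."""

    def __init__(self) -> None:
        self._by_user: Dict[str, List[str]] = defaultdict(list)

    def add(self, memory: str, user_id: Optional[str] = None) -> None:
        self._by_user[user_id or "default"].append(memory)

    def get(self, user_id: str = "default") -> List[str]:
        return list(self._by_user.get(user_id, []))


def require_user_id(user_id: Optional[str]) -> str:
    """Guard for request handlers: reject a missing or placeholder user_id."""
    if not user_id or user_id == "default":
        raise ValueError("user_id must be set explicitly")
    return user_id


store = ToyMemoryStore()
store.add("Loves pizza")          # first user, user_id forgotten
store.add("Allergic to dairy")    # second user, user_id forgotten
shared = store.get("default")     # both users' facts land in one bucket

store.add("Loves pizza", user_id="user_123")
store.add("Allergic to dairy", user_id="user_456")
print(shared, store.get("user_123"), store.get("user_456"))
```

A guard like `require_user_id` at the edge of your application is one way to make a missing user_id fail loudly instead of silently sharing memories.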

The Double-Enable Pitfall

The Problem: Setting both update_memory_on_run=True and enable_agentic_memory=True doesn't give you both behaviors. Agentic mode overrides automatic mode.

# ❌ Doesn't work as expected - automatic memory is disabled
agent = Agent(
    db=db,
    update_memory_on_run=True,
    enable_agentic_memory=True  # This disables automatic behavior
)

# ✅ Choose one approach
agent = Agent(db=db, update_memory_on_run=True)  # Automatic
# OR
agent = Agent(db=db, enable_agentic_memory=True)  # Agentic
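If agent settings come from configuration files or environment variables, a small check can catch the conflicting combination before the agent is built. The helper below is a hypothetical sketch; it only inspects the two flag values and does not call the library.

```python
def resolve_memory_mode(update_memory_on_run: bool = False,
                        enable_agentic_memory: bool = False) -> str:
    """Return the intended memory mode, rejecting the conflicting combination."""
    if update_memory_on_run and enable_agentic_memory:
        raise ValueError(
            "Set only one of update_memory_on_run and enable_agentic_memory: "
            "agentic mode overrides automatic mode."
        )
    if enable_agentic_memory:
        return "agentic"
    if update_memory_on_run:
        return "automatic"
    return "disabled"


print(resolve_memory_mode(update_memory_on_run=True))
print(resolve_memory_mode(enable_agentic_memory=True))
```

Raising an error here is a design choice; logging a warning and continuing in agentic mode would match the override behavior described above.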

Memory Growth Monitoring

Track memory counts to catch issues early:

from kern.agent import Agent

agent = Agent(db=db, update_memory_on_run=True)

# Check memory count for a user
memories = agent.get_user_memories(user_id="user_123")
print(f"User has {len(memories)} memories")

# Alert if memory count is unusually high
if len(memories) > 500:
    print("⚠️ Warning: User has excessive memories. Consider pruning.")
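Memory counts are one signal; the quick reference also recommends watching token usage for memory-related cost spikes. The sketch below flags a run whose token count far exceeds a user's recent average. How you obtain per-run token counts depends on your setup, so the monitor takes them as plain integers; the class name, window size, and 3x threshold are all assumptions to tune.

```python
from collections import defaultdict, deque
from statistics import mean
from typing import Deque, Dict


class TokenSpikeMonitor:
    """Flag runs whose token usage exceeds `factor` times the recent average."""

    def __init__(self, window: int = 20, factor: float = 3.0,
                 min_samples: int = 5) -> None:
        self.factor = factor
        self.min_samples = min_samples
        # One bounded history of recent token counts per user.
        self._history: Dict[str, Deque[int]] = defaultdict(
            lambda: deque(maxlen=window)
        )

    def record(self, user_id: str, tokens: int) -> bool:
        """Store one run's token count; return True if it looks like a spike."""
        history = self._history[user_id]
        is_spike = (
            len(history) >= self.min_samples
            and tokens > self.factor * mean(history)
        )
        history.append(tokens)
        return is_spike


monitor = TokenSpikeMonitor()
runs = [500, 520, 480, 510, 490, 5_000]
flags = [monitor.record("user_123", tokens) for tokens in runs]
print(flags)
```

A flagged run is a prompt to check the user's memory count and consider pruning or switching to automatic memory.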
