Knowledge

Hybrid search with embeddings, chunking, and metadata in the Dash team.

Knowledge is the retrieval primitive: a vector index with an optional keyword index and optional reranker. Dash uses it heavily for grounding text-to-SQL.

from kern.knowledge import Knowledge
from kern.vector_db.pgvector import PgVector

# Agent and DB_URL are assumed to be imported and defined elsewhere.
dash_knowledge = Knowledge(
    vector_db=PgVector(
        table_name="dash_knowledge",
        db_url=DB_URL,
        search_type="hybrid",  # vector + BM25
    ),
)

dash = Agent(
    knowledge=dash_knowledge,
    add_knowledge_to_context=True,  # auto-search before each run
    search_knowledge=True,  # also expose as a tool
)

Loading content

Three ways to put content in:

# From a directory
dash_knowledge.add_content_from_path("knowledge/tables/")

# From a URL
dash_knowledge.add_content_from_url("https://example.com/article")

# Programmatically
dash_knowledge.add_content(
    name="MRR definition",
    content="MRR is sum of active subscriptions excluding trials.",
    metadata={"category": "business_rules"},
)

Demo OS loads via scripts:

python -m agents.dash.scripts.load_knowledge

Re-run with --recreate to rebuild from scratch. Without it, content is upserted by primary key.

Chunking and embedding

By default, Knowledge chunks long content into ~500-token segments and embeds each chunk with text-embedding-3-small. Override with:

from kern.embedder.openai import OpenAIEmbedder
from kern.knowledge.chunking.text import TextChunkingStrategy

dash_knowledge = Knowledge(
    vector_db=PgVector(...),
    embedder=OpenAIEmbedder(id="text-embedding-3-large"),
    chunking_strategy=TextChunkingStrategy(chunk_size=1000, overlap=100),
)

Other chunking strategies live under kern.knowledge.chunking.*: by markdown headers, by code structure, by recursive token count, by semantic boundaries.
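To make the chunk_size and overlap parameters concrete, here is a minimal sketch of fixed-size chunking with overlap. It is not the library's implementation: chunk_tokens is a hypothetical helper, and whitespace splitting stands in for a real tokenizer.

```python
def chunk_tokens(tokens, chunk_size=500, overlap=0):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


# Assumption: whitespace-separated words approximate tokens.
words = ("word " * 2500).split()
chunks = chunk_tokens(words, chunk_size=1000, overlap=100)
print([len(c) for c in chunks])
```

With these settings each window starts 900 tokens after the previous one, so the last 100 tokens of a chunk reappear at the start of the next. The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.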

Hybrid search

search_type="hybrid" runs both:

Index              Catches
Vector (semantic)  "different words for the same idea"
BM25 (keyword)     "find the doc that mentions this exact term"

Results from both are merged with reciprocal rank fusion. Hybrid usually beats either index alone, because each index catches matches the other misses.
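Reciprocal rank fusion can be sketched in a few lines. This is an illustration of the general technique, not the code the vector store runs: the function name and document ids are made up, and k=60 is the constant conventionally used for RRF, which the library may or may not share.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids; each hit contributes 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two indexes.
vector_hits = ["revenue_rules", "mrr_definition", "churn_notes"]
bm25_hits = ["mrr_definition", "subscriptions_table", "revenue_rules"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that appear in both lists accumulate two contributions, so they rise above documents that only one index found. Because only ranks are used, the raw vector and BM25 scores never need to be put on a common scale.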

Metadata filtering

Metadata attached at ingest time becomes a filter at query time:

# Ingest
dash_knowledge.add_content(
    name="MRR definition",
    content="...",
    metadata={"category": "business_rules", "team": "finance"},
)

# Retrieve only finance team rules
dash_knowledge.search(
    query="how do we calculate MRR?",
    filters={"team": "finance"},
)

Useful for multi-tenant agents (filter by tenant_id) or topic scoping (filter by category).
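The filter semantics can be pictured as an exact-match test on each chunk's metadata. The sketch below assumes every key in filters must match (AND semantics); that is an assumption about the behavior, and matches, filter_chunks, and the sample records are hypothetical.

```python
def matches(metadata, filters):
    """True when every filter key is present in metadata with an equal value."""
    return all(key in metadata and metadata[key] == value
               for key, value in filters.items())


def filter_chunks(chunks, filters):
    """Keep only the chunks whose metadata satisfies all filters."""
    return [chunk for chunk in chunks if matches(chunk["metadata"], filters)]


# Hypothetical stored chunks.
chunks = [
    {"name": "MRR definition",
     "metadata": {"category": "business_rules", "team": "finance"}},
    {"name": "Lead scoring",
     "metadata": {"category": "business_rules", "team": "sales"}},
    {"name": "Invoices table",
     "metadata": {"category": "tables", "team": "finance"}},
]

finance = filter_chunks(chunks, {"team": "finance"})
print([chunk["name"] for chunk in finance])
```

In a real vector store the filter would normally be applied inside the database query rather than in Python, so that excluded rows never compete for the top-k slots.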

When the model gets the chunks

With add_knowledge_to_context=True:

  1. User message arrives.
  2. AgentOS runs knowledge.search(message) automatically.
  3. Top-k chunks get inserted into the system prompt.
  4. The model answers with the chunks visible.
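The four steps above can be sketched as a search followed by prompt assembly. Everything here is illustrative: toy_search is a word-overlap stand-in for the real search, and the prompt wording and build_system_prompt helper are assumptions, not the framework's actual template.

```python
import re


def toy_search(query, corpus):
    """Hypothetical stand-in for a knowledge search: rank docs by shared words."""
    query_words = set(re.findall(r"\w+", query.lower()))
    scored = [(len(query_words & set(re.findall(r"\w+", doc.lower()))), doc)
              for doc in corpus]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [doc for score, doc in ranked if score > 0]


def build_system_prompt(base_prompt, chunks, top_k=5):
    """Append the top-k retrieved chunks to the system prompt as numbered references."""
    selected = chunks[:top_k]
    if not selected:
        return base_prompt
    references = "\n\n".join(f"[{i}] {text}"
                             for i, text in enumerate(selected, start=1))
    return f"{base_prompt}\n\nUse the following references when relevant:\n\n{references}"


corpus = [
    "MRR is sum of active subscriptions excluding trials.",
    "The events table tracks customer lifecycle events.",
]
message = "how do we calculate MRR?"          # step 1: user message arrives
hits = toy_search(message, corpus)            # step 2: automatic search
prompt = build_system_prompt("You are Dash.", hits)  # step 3: inject chunks
print(prompt)                                 # step 4: the model sees this prompt
```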

With search_knowledge=True:

The agent gets a search_knowledge_base(query) tool. The model decides when to call it. Useful for follow-up retrieval mid-run.

The two flags are commonly set together: auto-search runs first, and the tool covers "I need to look up something else" cases.

Reranking

For larger knowledge bases, add a reranker:

from kern.rerank.cohere import CohereReranker

dash_knowledge = Knowledge(
    vector_db=PgVector(...),
    reranker=CohereReranker(model="rerank-3.5"),
)

The vector DB returns the top 50 candidates, the reranker scores them, and the top 10 reach the model. This two-stage retrieval (cast wide, rerank tight) is a common production setup.
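As a rough sketch of the two-stage flow, the function below takes a first-stage candidate list, rescores a shortlist, and keeps the best few. The names are hypothetical, and overlap_score is a toy word-overlap scorer standing in for a real reranking model.

```python
def overlap_score(query, doc):
    """Toy relevance score: number of words shared by query and document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def two_stage_retrieve(query, candidates, score_fn, recall_k=50, final_k=10):
    """Stage 1: keep the first recall_k candidates. Stage 2: rescore, keep final_k."""
    shortlist = candidates[:recall_k]
    reranked = sorted(shortlist, key=lambda doc: score_fn(query, doc), reverse=True)
    return reranked[:final_k]


# Hypothetical first-stage output: 60 documents in the vector DB's order.
candidates = [f"note {i}" for i in range(60)]
candidates[45] = "how to count active subscriptions"  # relevant, inside the top 50
candidates[55] = "active subscriptions count rules"   # relevant, outside the top 50

top = two_stage_retrieve("count active subscriptions", candidates, overlap_score)
print(top[0], len(top))
```

The document at position 45 is promoted to first place by the second stage, while the one at position 55 never reaches the reranker. That is the trade-off behind the wide first stage: the reranker can only reorder what the first stage returns.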

See it in action

@Dash what's the right way to count active subscriptions?
@Dash show me a query for MRR by plan
@Dash which tables track customer lifecycle events?

Source: agents/dash/, Knowledge docs
