Custom Retriever

Implement custom retrieval logic for full control over how agents search knowledge.

Custom retrievers let you implement your own search logic instead of using the default knowledge search. This is useful when you need to:

  • Query external APIs or databases directly
  • Implement custom ranking or filtering
  • Reformulate queries before searching
  • Combine multiple data sources

from kern.agent import Agent

def knowledge_retriever(query: str, num_documents: int = 5, **kwargs) -> list[dict]:
    # Your custom retrieval logic here
    return [{"content": "..."}]

agent = Agent(
    knowledge_retriever=knowledge_retriever,
    search_knowledge=True,
)

How It Works

When the agent decides to search for information:

  1. The agent calls your knowledge_retriever function with the query
  2. Your function retrieves documents however you want
  3. Results are returned to the agent as a list of dictionaries
  4. The agent uses the retrieved content to generate a response
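The four steps above can be pictured with a small stand-in that uses no framework code. The names `simulate_search_step` and `format_context` are hypothetical and invented for this sketch. They are not part of kern, and the real agent decides internally when to search and how to use the results.

```python
from typing import Callable, Optional

Retriever = Callable[..., Optional[list[dict]]]


def knowledge_retriever(query: str, num_documents: int = 5, **kwargs) -> Optional[list[dict]]:
    # Step 2: retrieve documents however you want (here: a hard-coded list).
    documents = [
        {"content": "Massaman curry uses coconut milk, potatoes, and peanuts."},
        {"content": "Pad Thai uses rice noodles, tamarind, and bean sprouts."},
    ]
    words = query.lower().split()
    matches = [d for d in documents if any(w in d["content"].lower() for w in words)]
    return matches[:num_documents]


def format_context(documents: Optional[list[dict]]) -> str:
    # Step 4 (illustrative only): turn retrieved documents into text a model could read.
    if not documents:
        return "No relevant documents found."
    return "\n".join(f"- {doc['content']}" for doc in documents)


def simulate_search_step(retriever: Retriever, query: str) -> str:
    # Step 1: the caller passes the query to the retriever.
    documents = retriever(query=query, num_documents=3)
    # Step 3: the retriever hands back a list of dictionaries (or None).
    return format_context(documents)


context = simulate_search_step(knowledge_retriever, "massaman ingredients")
```

Because the retriever is passed in as a plain callable, it can be exercised like this in a unit test before it is attached to an agent.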

Retriever Function Signature

from typing import Optional
from kern.agent import Agent

def knowledge_retriever(
    query: str,
    agent: Optional[Agent] = None,
    num_documents: int = 5,
    **kwargs
) -> Optional[list[dict]]:
    """
    Args:
        query: The search query from the agent
        agent: The agent instance (optional, for accessing agent state)
        num_documents: Number of documents to retrieve
        **kwargs: Additional arguments passed from the agent

    Returns:
        List of documents as dictionaries, or None if search fails
    """
    # Your logic here
    return [{"content": "..."}]
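As a concrete but deliberately simple instance of this signature, the sketch below ranks an in-memory corpus by keyword overlap. The `DOCUMENTS` list and the scoring rule are assumptions made for illustration, and `agent` is typed as `Any` only so the snippet runs without kern installed.

```python
from typing import Any, Optional

# Hypothetical in-memory corpus standing in for a real data source.
DOCUMENTS = [
    {"id": "1", "content": "Vacation days accrue monthly."},
    {"id": "2", "content": "Remote work requires manager approval."},
    {"id": "3", "content": "Vacation requests need advance notice."},
]


def knowledge_retriever(
    query: str,
    agent: Optional[Any] = None,  # Any instead of Agent so the sketch has no kern dependency
    num_documents: int = 5,
    **kwargs,
) -> Optional[list[dict]]:
    try:
        terms = set(query.lower().split())
        scored = []
        for doc in DOCUMENTS:
            words = set(doc["content"].lower().replace(".", "").split())
            overlap = len(terms & words)
            if overlap:
                scored.append((overlap, doc))
        # Highest overlap first; list.sort is stable, so ties keep corpus order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:num_documents]]
    except Exception:
        # Mirror the documented contract: None signals that the search failed.
        return None
```

An empty list and `None` mean different things here: an empty list says the search ran and found nothing, while `None` says the search itself failed.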

Example: Direct Vector Database Query

This example bypasses the Knowledge abstraction and queries Qdrant directly:

from typing import Optional

from kern.agent import Agent
from kern.knowledge.embedder.openai import OpenAIEmbedder
from qdrant_client import QdrantClient

embedder = OpenAIEmbedder(id="text-embedding-3-small")
qdrant_client = QdrantClient(url="http://localhost:6333")

def knowledge_retriever(
    query: str, num_documents: int = 5, **kwargs
) -> Optional[list[dict]]:
    try:
        # Generate embedding for the query
        query_embedding = embedder.get_embedding(query)

        # Search Qdrant directly
        results = qdrant_client.query_points(
            collection_name="recipes",
            query=query_embedding,
            limit=num_documents,
        )

        # Convert the response to plain dictionaries, one per matched point
        return results.model_dump().get("points")
    except Exception as e:
        print(f"Search error: {e}")
        return None

agent = Agent(
    knowledge_retriever=knowledge_retriever,
    search_knowledge=True,
)

agent.print_response("What ingredients do I need for Massaman Gai?")
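The retriever above returns the dumped points as they come back from Qdrant, where each point typically carries fields such as `id`, `score`, and `payload`. If you would rather hand the agent a flatter shape, a helper along these lines may be useful. This is a sketch: `points_to_documents` is a hypothetical name, and it assumes the document text was stored under a `content` key in each payload, which depends on how your collection was populated.

```python
from typing import Any, Optional


def points_to_documents(
    points: Optional[list[dict[str, Any]]],
    content_key: str = "content",  # assumed payload field holding the text
) -> list[dict[str, Any]]:
    documents = []
    for point in points or []:
        payload = point.get("payload") or {}
        # Skip points whose payload lacks the assumed text field.
        if content_key not in payload:
            continue
        documents.append(
            {
                "id": point.get("id"),
                "content": payload[content_key],
                "score": point.get("score"),
            }
        )
    return documents
```

In the example above it would wrap the return value, for instance `return points_to_documents(results.model_dump().get("points"))`.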

Example: Query Reformulation

Expand or modify queries before searching:

from kern.knowledge.knowledge import Knowledge

# vector_db is assumed to be a vector database instance configured elsewhere
knowledge = Knowledge(vector_db=vector_db)

def knowledge_retriever(query: str, num_documents: int = 5, **kwargs) -> list[dict]:
    # Expand common terms
    expanded_query = query.replace("vacation", "vacation PTO paid time off")
    expanded_query = expanded_query.replace("WFH", "work from home remote")

    # Search with expanded query
    results = knowledge.search(expanded_query, max_results=num_documents)

    return [doc.to_dict() for doc in results]
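One caveat with `str.replace` as used above: it is case-sensitive and matches substrings, so "wfh" is left untouched and "vacations" becomes "vacation PTO paid time offs". A possible refinement, sketched here with a hypothetical `EXPANSIONS` table and `expand_query` helper, is to match whole words case-insensitively with a regular expression.

```python
import re

# Hypothetical synonym table; extend it with your own domain terms.
EXPANSIONS = {
    "vacation": "vacation PTO paid time off",
    "wfh": "work from home remote",
}

# One pattern that matches any key as a whole word, regardless of case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, EXPANSIONS)) + r")\b",
    re.IGNORECASE,
)


def expand_query(query: str) -> str:
    # Replace each whole-word match with its expansion.
    return _PATTERN.sub(lambda m: EXPANSIONS[m.group(0).lower()], query)
```

Inside the retriever, `expand_query(query)` would take the place of the two chained `replace` calls.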

Example: Multi-Source Retrieval

Combine results from multiple knowledge bases:

# policy_knowledge and faq_knowledge are assumed to be Knowledge instances
# created elsewhere, one per knowledge base.
def knowledge_retriever(query: str, num_documents: int = 5, **kwargs) -> list[dict]:
    # Search multiple sources
    policy_results = policy_knowledge.search(query, max_results=3)
    faq_results = faq_knowledge.search(query, max_results=3)

    # Combine and deduplicate
    all_results = []
    seen_ids = set()

    for doc in policy_results + faq_results:
        if doc.id not in seen_ids:
            all_results.append(doc.to_dict())
            seen_ids.add(doc.id)

    return all_results[:num_documents]
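Note that concatenating the lists puts every policy result ahead of every FAQ result, so the final slice can drop FAQ hits whenever `num_documents` is small. One alternative, shown here as a sketch rather than as the library's behavior, is to interleave the sources rank by rank. The helper name `interleave_unique` is hypothetical, and it assumes each document is a dictionary with an `id` key.

```python
from itertools import zip_longest


def interleave_unique(sources: list[list[dict]], limit: int) -> list[dict]:
    if limit <= 0:
        return []
    merged = []
    seen_ids = set()
    # zip_longest walks the sources in lockstep: rank 1 of each, then rank 2, ...
    # Shorter sources are padded with None, which is skipped below.
    for tier in zip_longest(*sources):
        for doc in tier:
            if doc is None or doc["id"] in seen_ids:
                continue
            merged.append(doc)
            seen_ids.add(doc["id"])
            if len(merged) == limit:
                return merged
    return merged
```

With this approach the top hit from each knowledge base survives truncation, at the cost of ignoring any relevance scores the sources may provide.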

When to Use Custom Retrievers

Use Case               | Why Custom Retriever
-----------------------+-----------------------------------------------------------
Direct database access | Skip the Knowledge abstraction for performance
Query expansion        | Add synonyms or related terms before searching
Multi-source search    | Combine results from multiple knowledge bases
External APIs          | Search third-party services (Elasticsearch, Algolia, etc.)
Custom ranking         | Implement domain-specific relevance scoring
Conditional logic      | Apply different search strategies based on query type

For most use cases, the built-in Knowledge search is sufficient. Use custom retrievers when you need full control over the retrieval process.
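The "Conditional logic" use case in the table has no example above, so here is a minimal sketch of one way to route queries. The two backend functions are placeholders, and the question-detection heuristic is an assumption chosen only to keep the example short.

```python
from typing import Callable, Optional

Retriever = Callable[..., Optional[list[dict]]]


def search_policies(query: str, num_documents: int = 5, **kwargs) -> list[dict]:
    # Placeholder: replace with a real policy search.
    return [{"source": "policies", "content": f"policy result for: {query}"}][:num_documents]


def search_faq(query: str, num_documents: int = 5, **kwargs) -> list[dict]:
    # Placeholder: replace with a real FAQ search.
    return [{"source": "faq", "content": f"faq result for: {query}"}][:num_documents]


def pick_retriever(query: str) -> Retriever:
    # Naive heuristic: questions go to the FAQ, everything else to policies.
    q = query.strip().lower()
    if q.endswith("?") or q.startswith(("how ", "what ", "can ", "when ")):
        return search_faq
    return search_policies


def knowledge_retriever(query: str, num_documents: int = 5, **kwargs) -> Optional[list[dict]]:
    # Delegate to whichever strategy matches the query type.
    return pick_retriever(query)(query=query, num_documents=num_documents, **kwargs)
```

Keeping the routing decision in its own function makes it easy to test and to swap for something smarter, such as a classifier, later on.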

Next Steps