Agents with Knowledge
Understanding knowledge and how to use it with Kern agents
Knowledge stores domain-specific content that can be added to the context of the agent to enable better decision making.
Kern has a generic knowledge solution that supports many forms of content.
See more details in the knowledge documentation.
The Agent can search this knowledge at runtime to make better decisions and provide more accurate responses. This searching on demand pattern is called Agentic RAG.
Example: Say we are building a Text2Sql Agent, we'll need to give the table schemas, column names, data types, example queries, etc to the agent to help it generate the best-possible SQL query.
It is not viable to put this all in the system message, instead we store this information as knowledge and let the Agent query it at runtime.
Using this information, the Agent can then generate the best-possible SQL query. This is called dynamic few-shot learning.
Knowledge for Agents
Kern Agents use Agentic RAG by default, meaning when we provide knowledge to an Agent, it will search this knowledge base, at runtime, for the specific information it needs to achieve its task.
For example:
1import asyncio23from kern.agent import Agent4from kern.db.postgres.postgres import PostgresDb5from kern.knowledge.embedder.openai import OpenAIEmbedder6from kern.knowledge.knowledge import Knowledge7from kern.vectordb.pgvector import PgVector89db = PostgresDb(10 db_url="postgresql+psycopg://ai:ai@localhost:5532/ai",11 knowledge_table="knowledge_contents",12)1314# Create Knowledge Instance15knowledge = Knowledge(16 name="Basic SDK Knowledge Base",17 description="Kern 2.0 Knowledge Implementation",18 contents_db=db,19 vector_db=PgVector(20 table_name="vectors",21 db_url="postgresql+psycopg://ai:ai@localhost:5532/ai",22 embedder=OpenAIEmbedder(),23 ),24)25# Add from URL to the knowledge base26asyncio.run(27 knowledge.ainsert(28 name="Recipes",29 url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf",30 metadata={"user_tag": "Recipes from website"},31 )32)3334agent = Agent(35 name="My Agent",36 description="Kern 2.0 Agent Implementation",37 knowledge=knowledge,38 search_knowledge=True,39)4041agent.print_response(42 "How do I make chicken and galangal in coconut milk soup?",43 markdown=True,44)We can give our agent access to the knowledge base in the following ways:
- We can set
search_knowledge=Trueto add asearch_knowledge_base()tool to the Agent.search_knowledgeisTrueby default if you addknowledgeto an Agent. - We can set
add_knowledge_to_context=Trueto automatically add references from the knowledge base to the Agent's context, based in your user message. This is the traditional RAG approach.
Custom knowledge retrieval
If you need complete control over the knowledge base search, you can pass your own knowledge_retriever function with the following signature:
1def knowledge_retriever(agent: Agent, query: str, num_documents: Optional[int], **kwargs) -> Optional[list[dict]]:2 ...Example of how to configure an agent with a custom retriever:
1def knowledge_retriever(agent: Agent, query: str, num_documents: Optional[int], **kwargs) -> Optional[list[dict]]:2 ...34agent = Agent(5 knowledge_retriever=knowledge_retriever,6 search_knowledge=True,7)This function is called during search_knowledge_base() and is used by the Agent to retrieve references from the knowledge base.
Async retrievers are supported. Simply create an async function and pass it to
the knowledge_retriever parameter.
Knowledge storage
Knowledge content is tracked in a "Contents DB" and vectorized and stored in a "Vector DB".
Contents database
The Contents DB is a database that stores the name, description, metadata and other information for any content you add to the knowledge base.
Below is the schema for the Contents DB:
| Field | Type | Description |
|---|---|---|
id | str | The unique identifier for the knowledge content. |
name | str | The name of the knowledge content. |
description | str | The description of the knowledge content. |
metadata | dict | The metadata for the knowledge content. |
type | str | The type of the knowledge content. |
size | int | The size of the knowledge content. Applicable only to files. |
linked_to | str | The ID of the knowledge content that this content is linked to. |
access_count | int | The number of times this content has been accessed. |
status | str | The status of the knowledge content. |
status_message | str | The message associated with the status of the knowledge content. |
created_at | int | The timestamp when the knowledge content was created. |
updated_at | int | The timestamp when the knowledge content was last updated. |
external_id | str | The external ID of the knowledge content. Used when external vector stores are used, like LightRAG. |
This data is best displayed on the knowledge page of the AgentOS UI.
Vector databases
Vector databases offer the best solution for retrieving relevant results from dense information quickly.
Adding contents
The typical way content is processed when being added to the knowledge base is:
Parse the content
A reader is used to parse the content based on the type of content that is being inserted
Chunk the information
The content is broken down into smaller chunks to ensure our search query returns only relevant results.
Embed each chunk
The chunks are converted into embedding vectors and stored in a vector database.
For example, to add a PDF to the knowledge base:
1...2knowledge = Knowledge(3 name="Basic SDK Knowledge Base",4 description="Kern 2.0 Knowledge Implementation",5 vector_db=vector_db,6 contents_db=contents_db,7)89asyncio.run(10 knowledge.ainsert(11 name="CV",12 path="cookbook/08_knowledge/testing_resources/cv_1.pdf",13 metadata={"user_tag": "Engineering Candidates"},14 )15)See more details on Loading the Knowledge Base.
Knowledge filters are currently supported on the following knowledge base types: PDF, PDF_URL, Text, JSON, and DOCX. For more details, see the Knowledge Filters documentation.
Example: Agentic RAG Agent
Let's build a RAG Agent that answers questions from a PDF.
Set up the database
Let's use Postgres as both our contents and vector databases.
Install docker desktop and run Postgres on port 5532 using:
1docker run -d \2 -e POSTGRES_DB=ai \3 -e POSTGRES_USER=ai \4 -e POSTGRES_PASSWORD=ai \5 -e PGDATA=/var/lib/postgresql/data/pgdata \6 -v pgvolume:/var/lib/postgresql/data \7 -p 5532:5432 \8 --name pgvector \9 agnohq/pgvector:16This docker container contains a general purpose Postgres database with the pgvector extension installed.
Install required packages:
1uv pip install -U pgvector pypdf psycopg sqlalchemy1uv pip install -U pgvector pypdf psycopg sqlalchemyDo agentic RAG
Create a file agentic_rag.py with the following contents
1import asyncio2from kern.agent import Agent3from kern.models.openai import OpenAIResponses4from kern.knowledge.embedder.openai import OpenAIEmbedder5from kern.knowledge.knowledge import Knowledge6from kern.vectordb.pgvector import PgVector78db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"910db = PostgresDb(11 db_url=db_url,12 knowledge_table="knowledge_contents",13)1415knowledge = Knowledge(16 contents_db=db,17 vector_db=PgVector(18 table_name="recipes",19 db_url=db_url,20 embedder=OpenAIEmbedder(),21 )22)2324agent = Agent(25 model=OpenAIResponses(id="gpt-5.2"),26 db=db,27 knowledge=knowledge,28 markdown=True,29)30if __name__ == "__main__":31 asyncio.run(32 knowledge.ainsert(33 name="Recipes",34 url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf",35 metadata={"user_tag": "Recipes from website"}36 )37 )38 # Create and use the agent39 asyncio.run(40 agent.aprint_response(41 "How do I make chicken and galangal in coconut milk soup?",42 markdown=True,43 )44 )Run the agent
Run the agent
1python agentic_rag.pyDeveloper Resources
- View the Agent schema
- View the Knowledge schema
- View Cookbook