Agents with Knowledge

Understanding knowledge and how to use it with Kern agents

Knowledge stores domain-specific content that can be added to the context of the agent to enable better decision making.

Note

Kern has a generic knowledge solution that supports many forms of content.

See more details in the knowledge documentation.

The Agent can search this knowledge at runtime to make better decisions and provide more accurate responses. This search-on-demand pattern is called Agentic RAG.

Tip

Example: Say we are building a Text2Sql Agent. We'll need to give the agent the table schemas, column names, data types, example queries, and so on to help it generate the best-possible SQL query.

It is not viable to put all of this in the system message. Instead, we store this information as knowledge and let the Agent query it at runtime.

Using this information, the Agent can then generate the best-possible SQL query. This is called dynamic few-shot learning.
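
To make the idea of dynamic few-shot learning concrete, here is a minimal sketch that picks the stored example queries most similar to the user's question and places them in the prompt. It does not use Kern at all: the EXAMPLES list, the table names, and the word-overlap scoring are assumptions made for illustration, and a real setup would rely on the knowledge base's vector search instead.

```python
import re

# Hypothetical knowledge entries for a Text2Sql agent: each pairs a
# natural-language question with a known-good SQL query.
EXAMPLES = [
    {"question": "total revenue per customer",
     "sql": "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id;"},
    {"question": "number of orders per month",
     "sql": "SELECT date_trunc('month', created_at), COUNT(*) FROM orders GROUP BY 1;"},
    {"question": "list all customers in a country",
     "sql": "SELECT name FROM customers WHERE country = 'TH';"},
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def select_examples(question: str, k: int = 2) -> list[dict]:
    """Return the k stored examples sharing the most words with the question."""
    q = tokenize(question)
    ranked = sorted(
        EXAMPLES,
        key=lambda ex: len(q & tokenize(ex["question"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, k: int = 2) -> str:
    """Assemble a few-shot prompt: selected examples first, then the new question."""
    parts = [f"-- Q: {ex['question']}\n{ex['sql']}" for ex in select_examples(question, k)]
    parts.append(f"-- Q: {question}")
    return "\n\n".join(parts)
```

Calling build_prompt("revenue per customer last year") puts the revenue example first, because it shares the most words with the question. The examples are chosen per question at runtime instead of being fixed in the system message.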

Knowledge for Agents

Kern Agents use Agentic RAG by default, meaning when we provide knowledge to an Agent, it will search this knowledge base, at runtime, for the specific information it needs to achieve its task.

For example:

import asyncio

from kern.agent import Agent
from kern.db.postgres.postgres import PostgresDb
from kern.knowledge.embedder.openai import OpenAIEmbedder
from kern.knowledge.knowledge import Knowledge
from kern.vectordb.pgvector import PgVector

db = PostgresDb(
    db_url="postgresql+psycopg://ai:ai@localhost:5532/ai",
    knowledge_table="knowledge_contents",
)

# Create Knowledge Instance
knowledge = Knowledge(
    name="Basic SDK Knowledge Base",
    description="Kern 2.0 Knowledge Implementation",
    contents_db=db,
    vector_db=PgVector(
        table_name="vectors",
        db_url="postgresql+psycopg://ai:ai@localhost:5532/ai",
        embedder=OpenAIEmbedder(),
    ),
)

# Add from URL to the knowledge base
asyncio.run(
    knowledge.ainsert(
        name="Recipes",
        url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf",
        metadata={"user_tag": "Recipes from website"},
    )
)

agent = Agent(
    name="My Agent",
    description="Kern 2.0 Agent Implementation",
    knowledge=knowledge,
    search_knowledge=True,
)

agent.print_response(
    "How do I make chicken and galangal in coconut milk soup?",
    markdown=True,
)

We can give our agent access to the knowledge base in the following ways:

  • We can set search_knowledge=True to add a search_knowledge_base() tool to the Agent. search_knowledge is True by default if you add knowledge to an Agent.
  • We can set add_knowledge_to_context=True to automatically add references from the knowledge base to the Agent's context, based on your user message. This is the traditional RAG approach.
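
The difference between the two approaches can be shown with a toy sketch. This is not how Kern implements either mode: the DOCS dictionary, the keyword search, and the choose_query callback (which stands in for the model deciding whether to call the tool) are all assumptions made for illustration.

```python
import re
from typing import Callable, Optional

# Hypothetical knowledge base: title -> content.
DOCS = {
    "tom kha gai": "Simmer chicken with galangal, lemongrass and coconut milk.",
    "pad thai": "Stir-fry rice noodles with tamarind sauce, egg and peanuts.",
}


def search_knowledge_base(query: str) -> list[str]:
    """Toy keyword search standing in for a real vector search."""
    words = set(re.findall(r"\w+", query.lower()))
    return [text for title, text in DOCS.items() if words & set(title.split())]


def traditional_rag_context(message: str) -> str:
    """Traditional RAG: always search with the user message and prepend the results."""
    refs = search_knowledge_base(message)
    return "\n".join(["References:", *refs, "", f"User: {message}"])


def agentic_rag_context(message: str, choose_query: Callable[[str], Optional[str]]) -> str:
    """Agentic RAG: the model (here, choose_query) decides whether and what to search."""
    query = choose_query(message)
    if query is None:
        return f"User: {message}"  # no tool call, nothing added to the context
    refs = search_knowledge_base(query)
    return "\n".join([f"User: {message}", f"Tool call: search_knowledge_base({query!r})", *refs])
```

In the traditional path the search always runs, even for a greeting. In the agentic path a search happens only when the model asks for one, and the model can phrase its own query.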

Custom knowledge retrieval

If you need complete control over the knowledge base search, you can pass your own knowledge_retriever function with the following signature:

def knowledge_retriever(agent: Agent, query: str, num_documents: Optional[int], **kwargs) -> Optional[list[dict]]:
    ...

Example of how to configure an agent with a custom retriever:

from typing import Optional

from kern.agent import Agent


def knowledge_retriever(agent: Agent, query: str, num_documents: Optional[int], **kwargs) -> Optional[list[dict]]:
    ...


agent = Agent(
    knowledge_retriever=knowledge_retriever,
    search_knowledge=True,
)

This function is called during search_knowledge_base() and is used by the Agent to retrieve references from the knowledge base.
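
As a rough illustration of what the body of such a function might look like, the sketch below ranks an in-memory list of documents by word overlap with the query. The DOCUMENTS list and the scoring are hypothetical, the agent parameter is typed as Any so the snippet runs without Kern installed, and returning None when nothing matches is an assumption about how the Optional return type is meant to be used.

```python
import re
from typing import Any, Optional

# Hypothetical in-memory corpus; a real retriever would query your own store.
DOCUMENTS = [
    {"name": "tom_kha_gai", "content": "Chicken and galangal in coconut milk soup."},
    {"name": "pad_thai", "content": "Stir-fried rice noodles with tamarind and peanuts."},
]


def _terms(text: str) -> set[str]:
    """Lowercase word tokens of the text."""
    return set(re.findall(r"\w+", text.lower()))


def knowledge_retriever(
    agent: Any, query: str, num_documents: Optional[int], **kwargs
) -> Optional[list[dict]]:
    """Return documents sharing at least one word with the query, best match first."""
    query_terms = _terms(query)
    scored = [(len(query_terms & _terms(doc["content"])), doc) for doc in DOCUMENTS]
    ranked = [
        doc
        for score, doc in sorted(scored, key=lambda pair: pair[0], reverse=True)
        if score > 0
    ]
    if not ranked:
        return None
    # Respect the requested limit when one is given.
    return ranked if num_documents is None else ranked[:num_documents]
```

Each returned item is a plain dict, matching the list[dict] return type in the signature above.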

Tip

Async retrievers are supported. Simply create an async function and pass it to the knowledge_retriever parameter.
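
A minimal sketch of an async retriever, assuming the same signature as the sync version. The fetch_documents helper is hypothetical and only simulates an awaitable call to an external search service. The function is named async_knowledge_retriever here purely to distinguish it from the sync sketch.

```python
import asyncio
from typing import Any, Optional


async def fetch_documents(query: str) -> list[dict]:
    """Hypothetical stand-in for an awaitable call to a search service."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    corpus = [
        {"name": "tom_kha_gai", "content": "Chicken and galangal in coconut milk soup."},
        {"name": "pad_thai", "content": "Stir-fried rice noodles with tamarind."},
    ]
    return [doc for doc in corpus if query.lower() in doc["content"].lower()]


async def async_knowledge_retriever(
    agent: Any, query: str, num_documents: Optional[int], **kwargs
) -> Optional[list[dict]]:
    """Await the lookup, apply the document limit, and return None if nothing matched."""
    docs = await fetch_documents(query)
    if num_documents is not None:
        docs = docs[:num_documents]
    return docs or None
```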

Knowledge storage

Knowledge content is tracked in a "Contents DB" and vectorized and stored in a "Vector DB".

Contents database

The Contents DB is a database that stores the name, description, metadata and other information for any content you add to the knowledge base.

Below is the schema for the Contents DB:

| Field | Type | Description |
| --- | --- | --- |
| id | str | The unique identifier for the knowledge content. |
| name | str | The name of the knowledge content. |
| description | str | The description of the knowledge content. |
| metadata | dict | The metadata for the knowledge content. |
| type | str | The type of the knowledge content. |
| size | int | The size of the knowledge content. Applicable only to files. |
| linked_to | str | The ID of the knowledge content that this content is linked to. |
| access_count | int | The number of times this content has been accessed. |
| status | str | The status of the knowledge content. |
| status_message | str | The message associated with the status of the knowledge content. |
| created_at | int | The timestamp when the knowledge content was created. |
| updated_at | int | The timestamp when the knowledge content was last updated. |
| external_id | str | The external ID of the knowledge content. Used when external vector stores are used, like LightRAG. |
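
For readers who prefer code to tables, the schema above could be mirrored by a dataclass along these lines. The field names and types come from the table. The class name ContentRecord, the choice of which fields are optional, the defaults, and the reading of the timestamps as Unix epoch seconds are assumptions, not Kern's actual model.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Optional


def _now() -> int:
    """Current time as integer seconds since the Unix epoch (an assumed convention)."""
    return int(time.time())


@dataclass
class ContentRecord:
    """Illustrative mirror of one row in the Contents DB."""

    id: str
    name: Optional[str] = None
    description: Optional[str] = None
    metadata: dict[str, Any] = field(default_factory=dict)
    type: Optional[str] = None
    size: Optional[int] = None  # only meaningful for files
    linked_to: Optional[str] = None  # ID of the content this one is linked to
    access_count: int = 0
    status: Optional[str] = None
    status_message: Optional[str] = None
    created_at: int = field(default_factory=_now)
    updated_at: int = field(default_factory=_now)
    external_id: Optional[str] = None  # used with external vector stores


record = ContentRecord(
    id="content-1",  # hypothetical identifier
    name="Recipes",
    metadata={"user_tag": "Recipes from website"},
)
```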

This data is best displayed on the knowledge page of the AgentOS UI.

Vector databases

Vector databases store content as embedding vectors and retrieve the entries closest to a query embedding, which makes them well suited to finding relevant results in dense information quickly.
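
The core operation behind vector search can be sketched in a few lines: score every stored vector against the query vector and keep the best matches. The three-dimensional vectors and labels below are made up for illustration, and this brute-force scan is only a conceptual stand-in. Real vector databases typically use indexes to avoid comparing against every row.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], rows: list[tuple[str, list[float]]], k: int = 2):
    """Return the k (score, label) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), label) for label, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up embeddings: the first axis loosely means "cooking".
ROWS = [
    ("soup recipe", [0.9, 0.1, 0.0]),
    ("noodle recipe", [0.7, 0.6, 0.1]),
    ("tax form", [0.0, 0.1, 0.9]),
]

best = top_k([1.0, 0.0, 0.0], ROWS, k=2)
```

Here best lists "soup recipe" first and "noodle recipe" second, since their vectors point closest to the query direction.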

Adding contents

The typical way content is processed when being added to the knowledge base is:

1. Parse the content: A reader is used to parse the content, based on the type of content that is being inserted.

2. Chunk the information: The content is broken down into smaller chunks to ensure our search query returns only relevant results.

3. Embed each chunk: The chunks are converted into embedding vectors and stored in a vector database.
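
The three stages above can be sketched end to end with toy stand-ins. Nothing here is Kern's implementation: parse just decodes bytes, chunk uses fixed-size character windows with overlap, and embed is a hashing trick that counts words into a few buckets instead of calling a real embedding model. All function names and parameters are hypothetical.

```python
import hashlib


def parse(raw: bytes) -> str:
    """Stand-in for a reader: turn raw content into text."""
    return raw.decode("utf-8")


def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed(chunk_text: str, dims: int = 8) -> list[float]:
    """Toy embedding: count each word into one of `dims` hash buckets."""
    vec = [0.0] * dims
    for word in chunk_text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    return vec


def insert(raw: bytes, store: list[dict]) -> int:
    """Parse, chunk, embed, and append each chunk to the store; return the chunk count."""
    chunks = chunk(parse(raw))
    for piece in chunks:
        store.append({"text": piece, "vector": embed(piece)})
    return len(chunks)
```

The overlap keeps a sentence that straddles a chunk boundary searchable from both sides, at the cost of storing some text twice.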

For example, to add a PDF to the knowledge base:

...
knowledge = Knowledge(
    name="Basic SDK Knowledge Base",
    description="Kern 2.0 Knowledge Implementation",
    vector_db=vector_db,
    contents_db=contents_db,
)

asyncio.run(
    knowledge.ainsert(
        name="CV",
        path="cookbook/08_knowledge/testing_resources/cv_1.pdf",
        metadata={"user_tag": "Engineering Candidates"},
    )
)

Tip

See more details on Loading the Knowledge Base.

Note

Knowledge filters are currently supported on the following knowledge base types: PDF, PDF_URL, Text, JSON, and DOCX. For more details, see the Knowledge Filters documentation.
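
Conceptually, a knowledge filter narrows the search to content whose metadata matches the given key-value pairs. The sketch below shows that idea on plain dictionaries, reusing the user_tag values from the examples on this page. The CHUNKS list and both function names are hypothetical, and exact-match semantics are an assumption. See the Knowledge Filters documentation for the actual behavior.

```python
# Hypothetical stored chunks, each carrying the metadata it was inserted with.
CHUNKS = [
    {"text": "Tom kha gai uses galangal.", "metadata": {"user_tag": "Recipes from website"}},
    {"text": "Five years of Python experience.", "metadata": {"user_tag": "Engineering Candidates"}},
]


def matches(metadata: dict, filters: dict) -> bool:
    """True when every filter key is present in the metadata with an equal value."""
    return all(metadata.get(key) == value for key, value in filters.items())


def filter_chunks(chunks: list[dict], filters: dict) -> list[dict]:
    """Keep only the chunks whose metadata satisfies all filters."""
    return [c for c in chunks if matches(c["metadata"], filters)]


candidates = filter_chunks(CHUNKS, {"user_tag": "Engineering Candidates"})
```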

Example: Agentic RAG Agent

Let's build a RAG Agent that answers questions from a PDF.

Set up the database

Let's use Postgres as both our contents and vector databases.

Install Docker Desktop and run Postgres on port 5532 using:

docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  agnohq/pgvector:16

Note

This docker container contains a general purpose Postgres database with the pgvector extension installed.
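
The connection string used in the examples on this page, postgresql+psycopg://ai:ai@localhost:5532/ai, lines up with the docker flags above: user, password, and database are all "ai", and 5532 is the host side of the port mapping. The standard-library sketch below splits the URL into those parts and offers a quick check that something is listening on the port. Both helper names are hypothetical, and the port check only confirms that a TCP connection can be opened, not that Postgres itself is healthy.

```python
import socket
from urllib.parse import urlsplit

DB_URL = "postgresql+psycopg://ai:ai@localhost:5532/ai"


def describe_db_url(url: str) -> dict:
    """Break a SQLAlchemy-style database URL into its components."""
    parts = urlsplit(url)
    return {
        "driver": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


info = describe_db_url(DB_URL)
```

Once the container is running, port_is_open(info["host"], info["port"]) should return True. If it returns False, check the -p 5532:5432 mapping before debugging the Python code.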

Install required packages:

uv pip install -U pgvector pypdf psycopg sqlalchemy

Do agentic RAG

Create a file agentic_rag.py with the following contents:

import asyncio

from kern.agent import Agent
from kern.db.postgres.postgres import PostgresDb
from kern.knowledge.embedder.openai import OpenAIEmbedder
from kern.knowledge.knowledge import Knowledge
from kern.models.openai import OpenAIResponses
from kern.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

# Contents DB: tracks what has been added to the knowledge base
db = PostgresDb(
    db_url=db_url,
    knowledge_table="knowledge_contents",
)

# Vector DB: stores the embedded chunks used for search
knowledge = Knowledge(
    contents_db=db,
    vector_db=PgVector(
        table_name="recipes",
        db_url=db_url,
        embedder=OpenAIEmbedder(),
    ),
)

agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"),
    db=db,
    knowledge=knowledge,
    markdown=True,
)

if __name__ == "__main__":
    # Load the PDF into the knowledge base
    asyncio.run(
        knowledge.ainsert(
            name="Recipes",
            url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf",
            metadata={"user_tag": "Recipes from website"},
        )
    )
    # Ask the agent a question that requires the knowledge base
    asyncio.run(
        agent.aprint_response(
            "How do I make chicken and galangal in coconut milk soup?",
            markdown=True,
        )
    )

Run the agent

python agentic_rag.py

Developer Resources