LLMs.txt Reader
The LLMs.txt Reader reads an llms.txt file, follows the linked documentation pages, and turns them into documents for your knowledge base.
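To make the "follows the linked documentation pages" step concrete, here is a minimal sketch of how links might be pulled out of an llms.txt file. It assumes the usual llms.txt layout (an H1 title, a blockquote summary, and H2 sections that list markdown links). The `SAMPLE` text, the `example.com` URLs, and the `extract_links` helper are hypothetical and are not part of the reader's API; the reader's own parsing may differ.

```python
import re

# Hypothetical sample following the common llms.txt layout: an H1 title,
# a blockquote summary, and H2 sections that list markdown links.
SAMPLE = """\
# Example Project

> Short summary of the project.

## Docs

- [Quickstart](https://example.com/quickstart.md): Getting started
- [API](https://example.com/api.md): Reference

## Optional

- [Changelog](https://example.com/changelog.md)
"""

# Matches markdown links of the form [title](http(s)://url).
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def extract_links(text):
    """Return (section, title, url) tuples for every markdown link in the text."""
    section = None
    links = []
    for line in text.splitlines():
        if line.startswith("## "):
            # Track the current H2 section so links can be grouped by it.
            section = line[3:].strip()
            continue
        for title, url in LINK_RE.findall(line):
            links.append((section, title, url))
    return links


links = extract_links(SAMPLE)
for section, title, url in links:
    print(f"{section}: {title} -> {url}")
```

Each extracted URL is a candidate page to fetch and convert into a document. Keeping the section name alongside each link is what makes it possible to treat the `Optional` section differently later on.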
Code
```python
from kern.agent import Agent
from kern.knowledge.knowledge import Knowledge
from kern.knowledge.reader.llms_txt_reader import LLMsTxtReader
from kern.vectordb.pgvector import PgVector

# Connection string for the PgVector container started in the Usage steps
db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

# Knowledge base backed by a PgVector table
knowledge = Knowledge(
    name="LLMs.txt Docs",
    vector_db=PgVector(table_name="llms_txt_docs", db_url=db_url),
)

# Read the llms.txt file and fetch at most 10 of the URLs it links to
knowledge.insert(
    url="https://kern.ndx.rocks/llms.txt",
    reader=LLMsTxtReader(max_urls=10),
)

# Agent that searches the knowledge base when answering
agent = Agent(
    knowledge=knowledge,
    search_knowledge=True,
)

agent.print_response("What is Kern?", markdown=True)
```

Usage
Set up your virtual environment
Mac/Linux:

```bash
uv venv --python 3.12
source .venv/bin/activate
```

Windows:

```bash
uv venv --python 3.12
.venv\Scripts\activate
```

Install dependencies
```bash
uv pip install -U beautifulsoup4 sqlalchemy psycopg pgvector kern-ai openai
```

Set environment variables
```bash
export OPENAI_API_KEY=xxx
```

Run PgVector
```bash
docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  kern/pgvector:16
```

Run Agent
```bash
python examples/basics/knowledge/concepts/readers/overview/llms_txt_reader.py
```

Params
| Parameter | Type | Default | Description |
|---|---|---|---|
| `url` | `str` | Required | URL of the llms.txt file to read |
| `max_urls` | `int` | `20` | Maximum number of linked URLs to fetch from the file |
| `timeout` | `int` | `60` | HTTP timeout in seconds |
| `proxy` | `Optional[str]` | `None` | Optional HTTP proxy URL |
| `skip_optional` | `bool` | `False` | Skip entries under the `## Optional` section |
| `chunking_strategy` | `Optional[ChunkingStrategy]` | `FixedSizeChunking()` | Strategy for chunking content |
| `allowed_hosts` | `Optional[List[str]]` | `None` | Hostnames the reader is allowed to fetch from. See Restricting URL Fetches. |
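
As a rough illustration of how `max_urls`, `skip_optional`, and `allowed_hosts` could narrow the set of pages that get fetched, the sketch below filters a list of `(section, url)` pairs. The `select_urls` helper and the example URLs are hypothetical. The order of the checks and the exact-hostname comparison are assumptions made for illustration, not the reader's documented behavior.

```python
from urllib.parse import urlparse


def select_urls(entries, max_urls=20, skip_optional=False, allowed_hosts=None):
    """Pick URLs to fetch from (section, url) pairs.

    Hypothetical helper: the order of the checks and the exact-hostname
    match are assumptions, not the reader's documented behavior.
    """
    selected = []
    for section, url in entries:
        # Stop once the cap is reached.
        if len(selected) >= max_urls:
            break
        # Drop entries listed under the "Optional" section when requested.
        if skip_optional and section == "Optional":
            continue
        # Drop URLs whose hostname is not in the allow-list (if one is given).
        host = urlparse(url).hostname
        if allowed_hosts is not None and host not in allowed_hosts:
            continue
        selected.append(url)
    return selected


entries = [
    ("Docs", "https://docs.example.com/intro.md"),
    ("Docs", "https://other.example.org/guide.md"),
    ("Optional", "https://docs.example.com/changelog.md"),
]

picked = select_urls(
    entries,
    max_urls=10,
    skip_optional=True,
    allowed_hosts=["docs.example.com"],
)
print(picked)
```

With these inputs only the first entry survives: the second is on a host outside the allow-list and the third sits under `Optional`. In this sketch, leaving `allowed_hosts` as `None` disables host filtering entirely.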