Performance Tips

Optimize knowledge base performance, search quality, and content loading speed.

Kern's defaults work well for most use cases. But if you're seeing slow searches, memory issues, or poor results, a few strategic changes might help.

Quick Wins

1. Choose the Right Vector Database

Database choice has the biggest impact at scale:

Database            Use Case
LanceDB / ChromaDB  Development, testing (zero setup)
PgVector            Production up to 1M docs, need SQL
Pinecone            Managed service, auto-scaling

    from kern.vectordb.lancedb import LanceDb
    from kern.vectordb.pgvector import PgVector

    # Development
    dev_db = LanceDb(table_name="docs", uri="./local_db")

    # Production (db_url is your PostgreSQL connection string, defined elsewhere)
    prod_db = PgVector(table_name="docs", db_url=db_url)

2. Skip Already-Processed Files

Skipping files that were already ingested gives the biggest speed-up when re-running ingestion:

    knowledge.insert(
        path="documents/",
        skip_if_exists=True,  # Don't reprocess existing files
    )

    # Batch loading with filters
    knowledge.insert_many(
        paths=["docs/", "policies/"],
        skip_if_exists=True,
        include=["*.pdf", "*.md"],
        exclude=["*temp*", "*draft*"],
    )
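
How Kern decides that a file "exists" is not described here. As a rough illustration of the general technique, the sketch below fingerprints file contents with SHA-256 and applies include/exclude glob patterns before doing any work. The names `content_hash`, `matches`, and `plan_ingestion` are hypothetical helpers, not part of Kern.

```python
import fnmatch
import hashlib
from pathlib import PurePosixPath


def content_hash(data: bytes) -> str:
    """Stable fingerprint of a file's bytes."""
    return hashlib.sha256(data).hexdigest()


def matches(path: str, include: list[str], exclude: list[str]) -> bool:
    """True if the file name matches an include pattern and no exclude pattern."""
    name = PurePosixPath(path).name
    included = any(fnmatch.fnmatch(name, pattern) for pattern in include)
    excluded = any(fnmatch.fnmatch(name, pattern) for pattern in exclude)
    return included and not excluded


def plan_ingestion(
    files: dict[str, bytes],
    seen_hashes: set[str],
    include: list[str],
    exclude: list[str],
) -> list[str]:
    """Return the paths that still need processing and record their hashes."""
    todo = []
    for path, data in sorted(files.items()):
        if not matches(path, include, exclude):
            continue
        digest = content_hash(data)
        if digest in seen_hashes:
            continue  # already ingested: skip the expensive parse/chunk/embed work
        seen_hashes.add(digest)
        todo.append(path)
    return todo
```

On a second run with the same `seen_hashes`, `plan_ingestion` returns an empty list, which is where the time savings come from.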

3. Use Metadata Filters

Narrow the candidate set with metadata before the vector search runs:

    # Slow: search everything
    results = knowledge.search("deployment process")

    # Fast: filter first, then search
    results = knowledge.search(
        query="deployment process",
        filters={"department": "engineering", "type": "procedure"},
    )

    # Validate filters to catch typos
    valid_filters, invalid_keys = knowledge.validate_filters({
        "department": "engineering",
        "invalid_key": "value",  # This gets flagged
    })
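
To see why filtering first helps, here is a minimal in-memory sketch of the idea: metadata filters shrink the candidate set, and only the survivors are scored by similarity. It is an illustration only, not how Kern or any particular vector database implements filtering; `cosine`, `validate_filters`, and `filtered_search` are hypothetical names, and the two-dimensional vectors are toy data.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def validate_filters(filters: dict, known_keys: set[str]) -> tuple[dict, list[str]]:
    """Split filters into those with known keys and a list of unknown keys."""
    valid = {key: value for key, value in filters.items() if key in known_keys}
    invalid = [key for key in filters if key not in known_keys]
    return valid, invalid


def filtered_search(query_vec, docs, filters, max_results=5):
    """Keep only docs whose metadata matches every filter, then rank by similarity."""
    candidates = [
        doc for doc in docs
        if all(doc["meta"].get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(candidates, key=lambda doc: cosine(query_vec, doc["vec"]), reverse=True)
    return ranked[:max_results]
```

The similarity function runs once per candidate, so every document removed by the filter is work the search never has to do.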

4. Match Chunking to Content

Strategy    Speed   Quality  Best For
Fixed Size  Fast    Good     Uniform content
Semantic    Slower  Best     Complex documents
Recursive   Fast    Good     Structured docs

    from kern.knowledge.chunking.fixed_size_chunking import FixedSizeChunking
    from kern.knowledge.chunking.semantic_chunking import SemanticChunking

    # Fast processing
    FixedSizeChunking(chunk_size=5000, overlap=200)

    # Better quality (slower)
    SemanticChunking(similarity_threshold=0.5)
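
For intuition about what `chunk_size` and `overlap` control, the following is a minimal character-based sketch of fixed-size chunking. It assumes sizes are measured in characters and is not Kern's `FixedSizeChunking` implementation; `fixed_size_chunks` is a hypothetical helper.

```python
def fixed_size_chunks(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With `chunk_size=5000` and `overlap=200`, each window advances 4800 characters, so a 12,000-character document yields three chunks. Larger overlaps preserve more context across boundaries at the cost of more chunks to embed and store.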

5. Use Async for Batch Operations

Process multiple sources concurrently:

    import asyncio

    async def load_knowledge():
        await asyncio.gather(
            knowledge.ainsert(path="docs/hr/"),
            knowledge.ainsert(path="docs/engineering/"),
            knowledge.ainsert(url="https://company.com/api-docs"),
        )

    asyncio.run(load_knowledge())
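
With many sources, unbounded concurrency can exhaust memory or hit rate limits. One possible pattern, sketched below, caps the number of in-flight loads with a semaphore and collects failures instead of aborting the whole batch. `load_all` and `fake_insert` are hypothetical; `fake_insert` is a stand-in for a coroutine such as `knowledge.ainsert`.

```python
import asyncio


async def load_all(sources, load_one, max_concurrent=2):
    """Run load_one over all sources, with at most max_concurrent running at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(source):
        async with semaphore:
            return await load_one(source)

    # return_exceptions=True keeps one failed source from cancelling the rest;
    # failures come back as exception objects in the result list.
    return await asyncio.gather(
        *(guarded(source) for source in sources),
        return_exceptions=True,
    )


async def fake_insert(source: str) -> str:
    """Stand-in loader used only for this example."""
    await asyncio.sleep(0.01)
    if source.startswith("bad"):
        raise ValueError(f"cannot load {source}")
    return f"loaded {source}"
```

Results come back in the same order as `sources`, so each failure can be matched to the source that caused it.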

Common Issues

Irrelevant Search Results

Causes: Chunks too large/small, wrong chunking strategy.

Fixes:

  • Try semantic chunking for better context
  • Increase max_results to check if relevant results are ranked lower
  • Add metadata filters to narrow scope

    # Debug search quality
    results = knowledge.search("your query", max_results=10)
    for doc in results:
        print(doc.content[:200])

Slow Content Loading

Causes: Reprocessing existing files, semantic chunking on large datasets.

Fixes:

  • Use skip_if_exists=True
  • Switch to fixed-size chunking
  • Process in batches

    # Only process new PDFs
    knowledge.insert(
        path="documents/",
        include=["*.pdf"],
        exclude=["*draft*", "*backup*"],
        skip_if_exists=True,
    )

Memory Issues

Causes: Loading too many large files at once, chunk sizes too large.

Fixes:

  • Process in smaller batches
  • Reduce chunk size
  • Use include/exclude patterns
  • Clear outdated content with knowledge.remove_content_by_id(content_id)
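
As a rough sketch of the "smaller batches" fix, the generator below splits any iterable of paths into fixed-size groups so that only one group is in memory at a time. `batched` is a hypothetical helper written for this example, not a Kern function.

```python
from itertools import islice
from typing import Iterable, Iterator


def batched(items: Iterable[str], batch_size: int) -> Iterator[list[str]]:
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    # islice pulls the next batch_size items; an empty list means we are done.
    while batch := list(islice(iterator, batch_size)):
        yield batch
```

Each yielded batch could then be passed to a call such as `knowledge.insert_many(paths=batch, skip_if_exists=True)`, so a failure or memory spike affects one batch rather than the whole run.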

Advanced Optimizations

Hybrid Search

Combine vector and keyword search:

    from kern.vectordb.pgvector import PgVector, SearchType

    vector_db = PgVector(
        table_name="docs",
        db_url=db_url,
        search_type=SearchType.hybrid,
    )
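
How `SearchType.hybrid` merges vector and keyword results is not specified here. One widely used merging technique is reciprocal rank fusion (RRF), sketched below purely as an illustration of the concept. The function name is hypothetical, and `k=60` is the constant commonly used with RRF.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) in every list where it appears,
    so items ranked well by both searches rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Because RRF uses only ranks, it avoids having to put vector distances and keyword scores on a common scale.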

Reranking

Improve result ordering:

    from kern.knowledge.reranker.cohere import CohereReranker

    vector_db = PgVector(
        table_name="docs",
        db_url=db_url,
        reranker=CohereReranker(model="rerank-v3.5", top_n=10),
    )
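
Conceptually, reranking is a second pass: retrieve a generous candidate set cheaply, rescore each candidate against the query with a more precise scorer, and keep the top `top_n`. The sketch below shows that shape with a toy word-overlap scorer standing in for a real reranking model; `token_overlap` and `rerank` are hypothetical names, and the scorer is far cruder than a model such as Cohere's.

```python
def token_overlap(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def rerank(query: str, candidates: list[str], top_n: int, score=token_overlap) -> list[str]:
    """Reorder candidates by score (highest first) and keep the top_n best."""
    ordered = sorted(candidates, key=lambda text: score(query, text), reverse=True)
    return ordered[:top_n]
```

The rescoring cost grows with the number of candidates, so the size of the first-stage result set is the main knob for trading latency against quality.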

Smaller Embedding Dimensions

Trade a small amount of quality for faster search and a smaller index:

    from kern.knowledge.embedder.openai import OpenAIEmbedder

    embedder = OpenAIEmbedder(
        id="text-embedding-3-large",
        dimensions=1024,  # Instead of 3072
    )
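
A back-of-the-envelope calculation shows the scale of the saving. The sketch assumes vectors are stored as 32-bit floats and counts raw vector data only (no index structures or metadata); the one-million-chunk figure is an illustrative assumption, and `index_size_gb` is a hypothetical helper.

```python
BYTES_PER_FLOAT32 = 4  # assumption: each dimension is stored as a 32-bit float


def index_size_gb(num_vectors: int, dimensions: int) -> float:
    """Approximate raw storage for the vectors alone, in gigabytes (10^9 bytes)."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32 / 1e9


full = index_size_gb(1_000_000, 3072)     # about 12.3 GB
reduced = index_size_gb(1_000_000, 1024)  # about 4.1 GB
```

Under these assumptions, cutting dimensions from 3072 to 1024 reduces raw vector storage to one third, and distance computations shrink by the same factor.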

Monitoring

    import time

    # Time searches (perf_counter is a monotonic clock suited to measuring intervals)
    start = time.perf_counter()
    results = knowledge.search("test query", max_results=5)
    print(f"Search: {time.perf_counter() - start:.2f}s")

    # Check failed content
    content_list, total = knowledge.get_content()
    for content in content_list:
        if content.status == "failed":
            status, message = knowledge.get_content_status(content.id)
            print(f"{content.name}: {message}")
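
A single timing can be misleading because of caching and warm-up effects. A possible refinement, sketched below, runs several queries repeatedly and reports summary statistics. `measure_latency` is a hypothetical helper; `search` is any callable that takes a query string, for example `lambda q: knowledge.search(q, max_results=5)`.

```python
import statistics
import time


def measure_latency(search, queries, repeats=3):
    """Time search(query) for every query, `repeats` times each, and summarize."""
    samples = []
    for query in queries:
        for _ in range(repeats):
            start = time.perf_counter()
            search(query)
            samples.append(time.perf_counter() - start)
    return {
        "count": len(samples),
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }
```

Comparing the median before and after a change (a new chunking strategy, a metadata filter, fewer dimensions) gives a more stable signal than one-off timings.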

Next Steps