Chroma Hybrid Search

This example demonstrates how to use ChromaDB with hybrid search, which combines dense vector similarity search (semantic) with full-text search (keyword/lexical) and merges the two result lists using Reciprocal Rank Fusion (RRF).

Hybrid search is useful when you want to:

  • Combine semantic understanding with exact keyword matching
  • Improve retrieval accuracy for queries with specific terms
  • Handle both conceptual and lexical search needs

The RRF algorithm fuses rankings from both search methods using:

RRF(d) = sum(1 / (k + rank_i(d))) for each ranking i
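
Here rank_i(d) is the position of document d in ranking i and k is a smoothing constant. The formula can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the code ChromaDb runs internally: the function name rrf_fuse, the 1-based ranks, and the document IDs are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant from the RRF formula.
    Returns (doc_id, score) pairs sorted by descending fused score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Assumption: ranks are 1-based, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score (descending); ties are broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the two retrievers (IDs are made up).
dense_results = ["doc_a", "doc_b", "doc_c"]
keyword_results = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([dense_results, keyword_results], k=60)
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

In this toy input, doc_b ranks first after fusion because it sits near the top of both lists, while doc_c and doc_d each appear in only one list and fall to the bottom.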

Code

import asyncio

from kern.agent import Agent
from kern.knowledge.knowledge import Knowledge
from kern.vectordb.chroma import ChromaDb
from kern.vectordb.search import SearchType

# Create Knowledge Instance with ChromaDB using Hybrid Search
knowledge = Knowledge(
    name="Thai Recipes Knowledge Base",
    description="Knowledge base for Thai recipes with hybrid search (RRF fusion)",
    vector_db=ChromaDb(
        collection="thai_recipes_hybrid",
        path="tmp/chromadb_hybrid",
        persistent_client=True,
        # Enable hybrid search - combines vector similarity with keyword matching using RRF
        search_type=SearchType.hybrid,
        # RRF (Reciprocal Rank Fusion) constant - controls ranking smoothness.
        # Higher values (e.g., 60) give more weight to lower-ranked results;
        # lower values make top results more dominant. Default is 60 (per original RRF paper).
        hybrid_rrf_k=60,
    ),
)

# Load content into the knowledge base
asyncio.run(
    knowledge.ainsert(
        name="Thai Recipes",
        url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf",
        metadata={"doc_type": "recipe_book", "cuisine": "thai"},
    )
)

# Create an agent with the hybrid search knowledge base
agent = Agent(
    knowledge=knowledge,
    search_knowledge=True,
    instructions="You are a helpful Thai cooking assistant. Use the knowledge base to answer questions about Thai recipes.",
)

# Hybrid search will:
# 1. Find semantically similar documents (via dense embeddings)
# 2. Find documents containing query keywords (via FTS)
# 3. Fuse results using RRF for optimal ranking
agent.print_response("What are the ingredients for Massaman curry?", markdown=True)
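
The comment on hybrid_rrf_k can be checked with a little arithmetic. The sketch below compares how much a single ranking contributes for a rank-1 document versus a rank-10 document under two values of k. The helper rrf_contribution is a hypothetical name introduced for illustration, and 1-based ranks are assumed.

```python
def rrf_contribution(rank, k):
    """Score that one ranking contributes for a document at a 1-based rank."""
    return 1.0 / (k + rank)


for k in (1, 60):
    top = rrf_contribution(1, k)
    tenth = rrf_contribution(10, k)
    # The ratio shows how strongly the top result dominates a lower-ranked one.
    print(f"k={k}: rank 1 -> {top:.4f}, rank 10 -> {tenth:.4f}, ratio {top / tenth:.2f}")

# Prints:
# k=1: rank 1 -> 0.5000, rank 10 -> 0.0909, ratio 5.50
# k=60: rank 1 -> 0.0164, rank 10 -> 0.0143, ratio 1.15
```

With k=1 the top hit is worth 5.5 times as much as the tenth; with k=60 the gap shrinks to about 1.15 times, so documents that appear in both rankings, even at modest positions, can outrank a document that is first in only one.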

Usage

Set up your virtual environment

macOS/Linux:

uv venv --python 3.12
source .venv/bin/activate

Windows:

uv venv --python 3.12
.venv\Scripts\activate

Install dependencies

uv pip install -U chromadb openai kern-ai

Run Agent

python cookbook/08_knowledge/vector_db/chroma_db/chroma_db_hybrid_search.py