ChromaDB Vector Database
Use ChromaDB as a vector database for your Knowledge Base.
Setup
```shell
uv pip install chromadb
```

Example
```python
import asyncio

from kern.agent import Agent
from kern.knowledge.knowledge import Knowledge
from kern.vectordb.chroma import ChromaDb

# Create Knowledge Instance with ChromaDB
knowledge = Knowledge(
    name="Basic SDK Knowledge Base",
    description="Kern 2.0 Knowledge Implementation with ChromaDB",
    vector_db=ChromaDb(
        collection="vectors", path="tmp/chromadb", persistent_client=True
    ),
)

asyncio.run(
    knowledge.ainsert(
        name="Recipes",
        url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf",
        metadata={"doc_type": "recipe_book"},
    )
)

# Create and use the agent
agent = Agent(knowledge=knowledge)
agent.print_response("List down the ingredients to make Massaman Gai", markdown=True)

# Delete operations examples
vector_db = knowledge.vector_db
vector_db.delete_by_name("Recipes")
# or
vector_db.delete_by_metadata({"user_tag": "Recipes from website"})
```

For hosted ChromaDB (Chroma Cloud)
```python
from chromadb.config import Settings

from kern.vectordb.chroma import ChromaDb

# Point the client at a hosted Chroma instance instead of a local path
vector_db = ChromaDb(
    collection="vectors",
    settings=Settings(
        chroma_api_impl="chromadb.api.fastapi.FastAPI",
        chroma_server_host="your-tenant-id.api.trychroma.com",
        chroma_server_http_port=443,
        chroma_server_ssl_enabled=True,
        chroma_client_auth_provider="chromadb.auth.token_authn.TokenAuthClientProvider",
        chroma_client_auth_credentials="your-api-key",
    ),
)
```

Async Support ⚡
ChromaDB also supports asynchronous operations, so inserts and queries can run concurrently, which can improve throughput.
```python
# install chromadb - `pip install chromadb`

import asyncio

from kern.agent import Agent
from kern.knowledge.knowledge import Knowledge
from kern.vectordb.chroma import ChromaDb

# Initialize ChromaDB
vector_db = ChromaDb(collection="recipes", path="tmp/chromadb", persistent_client=True)

# Create knowledge base
knowledge = Knowledge(
    vector_db=vector_db,
)

# Create the agent
agent = Agent(knowledge=knowledge)

if __name__ == "__main__":
    # Comment out after first run
    asyncio.run(
        knowledge.ainsert(url="https://kern.ndx.rocks/introduction/agents.md")
    )

    # Query the agent
    asyncio.run(
        agent.aprint_response("What is the purpose of a Kern Agent?", markdown=True)
    )
```

Tip
Use ainsert() and aprint_response() methods with asyncio.run() for non-blocking operations in high-throughput applications.
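
To make the benefit concrete, here is a minimal sketch of running several inserts concurrently with asyncio.gather(). It does not call Kern: fake_ainsert is a hypothetical stand-in that only sleeps to simulate network and embedding latency, and the URLs are placeholders. In a real application you would await knowledge.ainsert(...) in its place.

```python
import asyncio
import time


async def fake_ainsert(url: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in for knowledge.ainsert(); only simulates I/O latency."""
    await asyncio.sleep(delay)
    return url


async def insert_all(urls: list[str]) -> list[str]:
    # gather() schedules every coroutine at once, so the waits overlap
    # instead of running back to back. Results keep the input order.
    return await asyncio.gather(*(fake_ainsert(url) for url in urls))


if __name__ == "__main__":
    urls = [f"https://example.com/doc-{i}.pdf" for i in range(5)]
    start = time.perf_counter()
    inserted = asyncio.run(insert_all(urls))
    elapsed = time.perf_counter() - start
    print(f"Inserted {len(inserted)} documents in {elapsed:.2f}s")
```

Because the five simulated inserts wait at the same time, the total run time is close to one delay, not five.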
Note
ChromaDB has a batch size limit due to SQLite constraints. When inserting documents that exceed this limit, Kern automatically splits them into smaller batches. The batch size is auto-detected from ChromaDB's server configuration.
You can also set batch_size to override the auto-detected value.
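
As a rough illustration of the splitting described in this note, the sketch below breaks a list of documents into consecutive batches no larger than a given batch size. split_into_batches is a hypothetical helper written for this example and is not Kern's actual implementation.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def split_into_batches(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]


if __name__ == "__main__":
    documents = [f"doc-{i}" for i in range(250)]
    batches = list(split_into_batches(documents, batch_size=100))
    print([len(batch) for batch in batches])  # [100, 100, 50]
```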
ChromaDb Params
| Parameter | Type | Default | Description |
|---|---|---|---|
| collection | str | - | The name of the collection to use. |
| embedder | Embedder | OpenAIEmbedder() | The embedder to use for embedding document contents. |
| distance | Distance | cosine | The distance metric to use. |
| path | str | "tmp/chromadb" | The path where ChromaDB data will be stored. |
| persistent_client | bool | False | Whether to use a persistent ChromaDB client. |
| batch_size | int | None | Maximum number of documents per batch operation. Auto-detected from ChromaDB's server limit if not set; falls back to 100 if auto-detection fails. |
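
The precedence described for batch_size (explicit value, then auto-detected limit, then a fallback of 100) can be sketched as follows. resolve_batch_size, the detect callable, and the sample limit of 5000 are assumptions made for illustration; they are not part of Kern's or ChromaDB's API.

```python
from typing import Callable, Optional

DEFAULT_BATCH_SIZE = 100  # fallback value stated in the parameter table


def resolve_batch_size(explicit: Optional[int], detect: Callable[[], int]) -> int:
    """Pick the explicit value, else the detected limit, else the default."""
    if explicit is not None:
        return explicit
    try:
        return detect()
    except Exception:
        # Auto-detection failed (for example, the server is unreachable).
        return DEFAULT_BATCH_SIZE


def failing_probe() -> int:
    """Hypothetical probe that simulates a failed auto-detection."""
    raise ConnectionError("server unreachable")


if __name__ == "__main__":
    print(resolve_batch_size(500, lambda: 5000))    # 500
    print(resolve_batch_size(None, lambda: 5000))   # 5000
    print(resolve_batch_size(None, failing_probe))  # 100
```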