ClickHouse Vector Database
Use ClickHouse as a vector database for your Knowledge Base.
Setup
```shell
docker run -d \
  -e CLICKHOUSE_DB=ai \
  -e CLICKHOUSE_USER=ai \
  -e CLICKHOUSE_PASSWORD=ai \
  -e CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1 \
  -v clickhouse_data:/var/lib/clickhouse/ \
  -v clickhouse_log:/var/log/clickhouse-server/ \
  -p 8123:8123 \
  -p 9000:9000 \
  --ulimit nofile=262144:262144 \
  --name clickhouse-server \
  clickhouse/clickhouse-server
```
Example
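Before running the example, it can help to confirm that the container started in Setup is accepting connections on the published HTTP port (8123). The following is a minimal sketch using only the Python standard library and ClickHouse's `/ping` health-check endpoint. The helper names `build_ping_url` and `is_clickhouse_up` are hypothetical and not part of any library.

```python
import http.client
import urllib.request


def build_ping_url(host: str = "localhost", port: int = 8123) -> str:
    """Return the URL of the ClickHouse HTTP health-check endpoint."""
    return f"http://{host}:{port}/ping"


def is_clickhouse_up(host: str = "localhost", port: int = 8123, timeout: float = 2.0) -> bool:
    """Return True if the HTTP interface answers /ping with 'Ok.'."""
    try:
        with urllib.request.urlopen(build_ping_url(host, port), timeout=timeout) as resp:
            return resp.status == 200 and resp.read().strip() == b"Ok."
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, or a malformed reply:
        # treat all of these as "server not ready".
        return False


if __name__ == "__main__":
    print("ClickHouse reachable:", is_clickhouse_up())
```

If the check reports `False` right after `docker run`, the server may still be starting; retrying after a few seconds is usually enough.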
```python
from kern.agent import Agent
from kern.knowledge.knowledge import Knowledge
from kern.db.sqlite import SqliteDb
from kern.vectordb.clickhouse import Clickhouse

knowledge = Knowledge(
    vector_db=Clickhouse(
        table_name="recipe_documents",
        host="localhost",
        port=8123,
        username="ai",
        password="ai",
    ),
)

knowledge.insert(
    url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"
)

agent = Agent(
    db=SqliteDb(db_file="kern.db"),
    knowledge=knowledge,
    # Enable the agent to search the knowledge base
    search_knowledge=True,
    # Enable the agent to read the chat history
    read_chat_history=True,
)
# Comment out after first run
agent.knowledge.load(recreate=False)  # type: ignore

agent.print_response("How do I make pad thai?", markdown=True)
agent.print_response("What was my last question?", stream=True)
```
Async Support ⚡
The Clickhouse vector database also supports asynchronous operations, so inserts and queries can run concurrently instead of blocking one another, which can improve throughput.
```python
import asyncio

from kern.agent import Agent
from kern.knowledge.knowledge import Knowledge
from kern.db.sqlite import SqliteDb
from kern.vectordb.clickhouse import Clickhouse

agent = Agent(
    db=SqliteDb(db_file="kern.db"),
    knowledge=Knowledge(
        vector_db=Clickhouse(
            table_name="recipe_documents",
            host="localhost",
            port=8123,
            username="ai",
            password="ai",
        ),
    ),
    # Enable the agent to search the knowledge base
    search_knowledge=True,
    # Enable the agent to read the chat history
    read_chat_history=True,
)

if __name__ == "__main__":
    # Comment out after first run
    asyncio.run(
        agent.knowledge.ainsert(
            url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"
        )
    )

    # Create and use the agent
    asyncio.run(agent.aprint_response("How to make Tom Kha Gai", markdown=True))
```
Tip
Use the asynchronous methods, such as ainsert() and aprint_response() from the example above, with asyncio.run() for non-blocking operations in high-throughput applications.
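To illustrate why awaitable methods help under load, here is a minimal sketch of running several I/O-bound calls concurrently inside a single event loop. It deliberately does not call the Clickhouse class: `fake_search` is a hypothetical stand-in that only sleeps to simulate network latency, and `run_all` and `timed_run` are illustrative helpers.

```python
import asyncio
import time


async def fake_search(query: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in for an awaitable call such as a vector search."""
    await asyncio.sleep(delay)  # simulates waiting on network I/O
    return f"results for {query!r}"


async def run_all(queries: list[str]) -> list[str]:
    """Start every query at once and wait until all of them have finished."""
    return await asyncio.gather(*(fake_search(q) for q in queries))


def timed_run(queries: list[str]) -> tuple[list[str], float]:
    """Run all queries in one event loop and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(run_all(queries))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run(["pad thai", "tom kha gai", "green curry"])
    for line in results:
        print(line)
    # Three 0.2 s waits overlap, so the total stays well below 0.6 s.
    print(f"elapsed: {elapsed:.2f}s")
```

Wrapping all awaitable work in one coroutine and calling `asyncio.run()` once keeps everything on the same event loop, which is generally the safer pattern when a client holds connections tied to that loop.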
Clickhouse Params
| Parameter | Type | Default | Description |
|---|---|---|---|
| table_name | str | None | Name of the table to store vectors and metadata in ClickHouse |
| host | str | None | Hostname of the ClickHouse server |
| username | Optional[str] | None | Username for ClickHouse authentication |
| password | str | "" | Password for ClickHouse authentication |
| port | int | 0 | Port number for the ClickHouse connection |
| database_name | str | "ai" | Name of the database to use in ClickHouse |
| dsn | Optional[str] | None | DSN string for the ClickHouse connection |
| compress | str | "lz4" | Compression algorithm to use |
| client | Optional[Client] | None | Optional pre-configured ClickHouse client |
| embedder | Optional[Embedder] | OpenAIEmbedder() | Embedder instance to generate embeddings |
| distance | Distance | Distance.cosine | Distance metric to use for similarity search |
| index | Optional[HNSW] | HNSW() | HNSW index configuration for vector similarity search |
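The `distance` parameter determines how "closeness" between embeddings is measured. As a rough illustration of what the default cosine metric means, the sketch below computes cosine distance (conventionally 1 minus cosine similarity) and Euclidean (L2) distance in plain Python and ranks a few toy vectors. How the `Distance` enum is translated into a query internally is not shown in this page, so the function names and data here are purely illustrative.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity: 0 for the same direction, 2 for opposite directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance, which is sensitive to vector magnitude."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_cosine(query: list[float], docs: dict[str, list[float]]) -> list[str]:
    """Return document names ordered from nearest to farthest by cosine distance."""
    return sorted(docs, key=lambda name: cosine_distance(query, docs[name]))


if __name__ == "__main__":
    query = [1.0, 0.0]
    docs = {
        "opposite": [-1.0, 0.0],
        "same_direction": [2.0, 0.0],  # longer vector, same direction as the query
        "orthogonal": [0.0, 1.0],
    }
    print(rank_by_cosine(query, docs))
    # Cosine ignores length, L2 does not:
    print(cosine_distance(query, docs["same_direction"]))
    print(l2_distance(query, docs["same_direction"]))
```

Cosine distance compares direction only, which suits text embeddings whose magnitude carries little meaning, while L2 distance also penalizes differences in length.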