Cassandra Vector Database
Use Cassandra as a vector database for your Knowledge Base.
Setup
Install cassandra packages
```shell
uv pip install cassandra-driver
```

Run cassandra
```shell
docker run -d \
  --name cassandra-db \
  -p 9042:9042 \
  cassandra:latest
```

Example
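A freshly started Cassandra container usually needs some time before it accepts client connections on port 9042, so connecting right after `docker run` may fail. The sketch below polls the port using only the standard library. It is not part of the kern API: the helper name `wait_for_port` and the timeout values are illustrative assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not reachable yet (refused or timed out); wait and retry.
            time.sleep(interval)
    return False


# Example usage (assumes the container from the docker command is running):
# if wait_for_port("127.0.0.1", 9042, timeout=120.0):
#     ...  # now attempt Cluster().connect()
```

An open port is only a rough readiness signal. Depending on how Docker forwards the port, a connection may be accepted before Cassandra itself is ready, so retrying the driver's connect call may still be necessary.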
```python
import os

from kern.agent import Agent
from kern.knowledge.knowledge import Knowledge
from kern.vectordb.cassandra import Cassandra
from kern.knowledge.embedder.mistral import MistralEmbedder
from kern.models.mistral import MistralChat
from cassandra.cluster import Cluster

# (Optional) Set up your Cassandra DB

cluster = Cluster()

session = cluster.connect()
session.execute(
    """
    CREATE KEYSPACE IF NOT EXISTS testkeyspace
    WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }
    """
)

# Knowledge base backed by a Cassandra table in the keyspace created above
knowledge_base = Knowledge(
    vector_db=Cassandra(table_name="recipes", keyspace="testkeyspace", session=session, embedder=MistralEmbedder()),
)

# Load the PDF into the knowledge base
knowledge_base.insert(
    url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"
)

agent = Agent(
    model=MistralChat(provider="mistral-large-latest", api_key=os.getenv("MISTRAL_API_KEY")),
    knowledge=knowledge_base,
)

agent.print_response(
    "What are the health benefits of Khao Niew Dam Piek Maphrao Awn?", markdown=True, show_full_reasoning=True
)
```

Async Support ⚡
The Cassandra vector database also supports asynchronous operations, so inserts and queries can run without blocking the event loop, which can improve throughput when several operations run concurrently.
```python
import asyncio

from kern.agent import Agent
from kern.knowledge.embedder.mistral import MistralEmbedder
from kern.knowledge.knowledge import Knowledge
from kern.models.mistral import MistralChat
from kern.vectordb.cassandra import Cassandra

try:
    from cassandra.cluster import Cluster  # type: ignore
except (ImportError, ModuleNotFoundError):
    raise ImportError(
        "Could not import cassandra-driver python package. Please install it with pip install cassandra-driver."
    )

cluster = Cluster()

session = cluster.connect()
session.execute(
    """
    CREATE KEYSPACE IF NOT EXISTS testkeyspace
    WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }
    """
)

knowledge_base = Knowledge(
    vector_db=Cassandra(
        table_name="recipes",
        keyspace="testkeyspace",
        session=session,
        embedder=MistralEmbedder(),
    ),
)

agent = Agent(
    model=MistralChat(),
    knowledge=knowledge_base,
)

if __name__ == "__main__":
    # Load the PDF into the knowledge base asynchronously
    asyncio.run(
        knowledge_base.ainsert(
            url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf",
        )
    )

    # Create and use the agent
    asyncio.run(
        agent.aprint_response(
            "What are the health benefits of Khao Niew Dam Piek Maphrao Awn?",
            markdown=True,
        )
    )
```

Tip
Use the `ainsert()` and `aprint_response()` methods with `asyncio.run()` for non-blocking operations in high-throughput applications.
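To illustrate why awaitable methods help with throughput, the sketch below runs several stand-in coroutines concurrently with `asyncio.gather()`. `fake_ainsert` is a hypothetical placeholder that only sleeps to simulate I/O latency; it is not the kern API. In real code it would be replaced by awaitable calls on the knowledge base, assuming those calls can safely run concurrently against the same session.

```python
import asyncio
import time


async def fake_ainsert(url: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in for an awaitable insert; sleeps to simulate I/O."""
    await asyncio.sleep(delay)
    return url


async def insert_all(urls: list[str]) -> list[str]:
    """Start all inserts at once and return their results in input order."""
    return await asyncio.gather(*(fake_ainsert(u) for u in urls))


def timed_run(urls: list[str]) -> tuple[list[str], float]:
    """Run insert_all in a fresh event loop and measure wall-clock time."""
    start = time.perf_counter()
    results = asyncio.run(insert_all(urls))
    return results, time.perf_counter() - start


# Example usage: three simulated inserts overlap instead of running back to back.
# results, elapsed = timed_run(["doc-a.pdf", "doc-b.pdf", "doc-c.pdf"])
```

Because the coroutines overlap while waiting, the total time stays close to the slowest single operation and does not grow to the sum of all of them.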
Cassandra Params
| Parameter | Type | Default | Description |
|---|---|---|---|
| `table_name` | `str` | `None` | Name of the Cassandra table that stores vectors and metadata |
| `keyspace` | `str` | `None` | Name of the Cassandra keyspace in which the table is created |
| `embedder` | `Optional[Embedder]` | `OpenAIEmbedder()` | Embedder instance used to generate embeddings |
| `session` | `CassandraSession` | `None` | Active Cassandra session object used for database operations |