Couchbase Vector Database

Use Couchbase as a vector database for your Knowledge Base.

Setup

Local Setup (Docker)

Run Couchbase locally using Docker:

docker run -d --name couchbase-server \
  -p 8091-8096:8091-8096 \
  -p 11210:11210 \
  -e COUCHBASE_ADMINISTRATOR_USERNAME=Administrator \
  -e COUCHBASE_ADMINISTRATOR_PASSWORD=password \
  couchbase:latest

  1. Open the Couchbase UI at http://localhost:8091
  2. Log in with the username Administrator and the password password
  3. Create a bucket named recipe_bucket, a scope named recipe_scope, and a collection named recipes

Managed Setup (Capella)

For a managed cluster, use Couchbase Capella:

  • Follow Capella's UI to create a database, bucket, scope, and collection

Environment Variables

Set up your environment variables:

export COUCHBASE_USER="Administrator"
export COUCHBASE_PASSWORD="password"
export COUCHBASE_CONNECTION_STRING="couchbase://localhost"
export OPENAI_API_KEY=xxx

For Capella, set COUCHBASE_CONNECTION_STRING to your Capella connection string.
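
A missing variable only shows up later as an authentication or connection error, so it can help to check the environment before building the knowledge base. The sketch below is a minimal illustration: the helper name missing_env_vars is hypothetical and not part of any library, and the variable names are the four exported above.

```python
import os

# The four variable names come from the export lines above.
REQUIRED_VARS = (
    "COUCHBASE_USER",
    "COUCHBASE_PASSWORD",
    "COUCHBASE_CONNECTION_STRING",
    "OPENAI_API_KEY",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check
    something other than the real process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_env_vars() at the top of a script and stopping when the returned list is non-empty gives a clearer error than a failed cluster connection.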

Install Dependencies

uv pip install couchbase

Example

import os
import time
from kern.agent import Agent
from kern.knowledge.embedder.openai import OpenAIEmbedder
from kern.knowledge.knowledge import Knowledge
from kern.vectordb.couchbase import CouchbaseSearch
from couchbase.options import ClusterOptions, KnownConfigProfiles
from couchbase.auth import PasswordAuthenticator
from couchbase.management.search import SearchIndex

# Couchbase connection settings
username = os.getenv("COUCHBASE_USER")
password = os.getenv("COUCHBASE_PASSWORD")
connection_string = os.getenv("COUCHBASE_CONNECTION_STRING")

# Create cluster options with authentication
auth = PasswordAuthenticator(username, password)
cluster_options = ClusterOptions(auth)
cluster_options.apply_profile(KnownConfigProfiles.WanDevelopment)

knowledge_base = Knowledge(
    vector_db=CouchbaseSearch(
        bucket_name="recipe_bucket",
        scope_name="recipe_scope",
        collection_name="recipes",
        couchbase_connection_string=connection_string,
        cluster_options=cluster_options,
        search_index="vector_search_fts_index",
        embedder=OpenAIEmbedder(
            id="text-embedding-3-large",
            dimensions=3072,
            api_key=os.getenv("OPENAI_API_KEY"),
        ),
        wait_until_index_ready=60,
        overwrite=True,
    ),
)

# Load the knowledge base
knowledge_base.insert(
    url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"
)

# Wait for the vector index to sync with KV
time.sleep(20)

# Create and use the agent
agent = Agent(knowledge=knowledge_base)
agent.print_response("How to make Thai curry?", markdown=True)
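
The fixed time.sleep(20) in the example is a simple way to give the index time to catch up, but it always waits the full duration. One alternative, sketched here under assumptions, is to poll a readiness check with a deadline. The wait_until helper is hypothetical, and the predicate you pass in (for example, a test search that returns at least one hit) is something you would write yourself; the source does not define one.

```python
import time


def wait_until(predicate, timeout=20.0, interval=0.5):
    """Poll `predicate` until it returns True or `timeout` seconds pass.

    Returns True if the predicate succeeded, False on timeout.
    Hypothetical helper; not part of kern or the Couchbase SDK.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With a suitable predicate, the script continues as soon as the documents are searchable and still gives up after the same upper bound as the fixed sleep.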

Async Support ⚡

CouchbaseSearch also supports asynchronous operations, so inserts and queries can run concurrently with other work in an asyncio application.

import asyncio
import os
from kern.agent import Agent
from kern.knowledge.embedder.openai import OpenAIEmbedder
from kern.knowledge.knowledge import Knowledge
from kern.vectordb.couchbase import CouchbaseSearch
from couchbase.options import ClusterOptions, KnownConfigProfiles
from couchbase.auth import PasswordAuthenticator
from couchbase.management.search import SearchIndex

# Couchbase connection settings
username = os.getenv("COUCHBASE_USER")
password = os.getenv("COUCHBASE_PASSWORD")
connection_string = os.getenv("COUCHBASE_CONNECTION_STRING")

# Create cluster options with authentication
auth = PasswordAuthenticator(username, password)
cluster_options = ClusterOptions(auth)
cluster_options.apply_profile(KnownConfigProfiles.WanDevelopment)

knowledge_base = Knowledge(
    vector_db=CouchbaseSearch(
        bucket_name="recipe_bucket",
        scope_name="recipe_scope",
        collection_name="recipes",
        couchbase_connection_string=connection_string,
        cluster_options=cluster_options,
        search_index="vector_search_fts_index",
        embedder=OpenAIEmbedder(
            id="text-embedding-3-large",
            dimensions=3072,
            api_key=os.getenv("OPENAI_API_KEY"),
        ),
        wait_until_index_ready=60,
        overwrite=True,
    ),
)

# Create and use the agent
agent = Agent(knowledge=knowledge_base)


async def run_agent():
    await knowledge_base.ainsert(
        url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf",
    )
    # Wait for the vector index to sync with KV without blocking the event loop
    await asyncio.sleep(5)
    await agent.aprint_response("How to make Thai curry?", markdown=True)


if __name__ == "__main__":
    asyncio.run(run_agent())

Tip

Use the ainsert() and aprint_response() methods inside a coroutine started with asyncio.run() to keep high-throughput applications from blocking on I/O.
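
The benefit of the async methods shows up when several awaitable calls run at once. The sketch below illustrates the pattern with asyncio.gather. To stay runnable without a cluster, it uses a stand-in coroutine named fake_query, which is hypothetical and only sleeps; in a real application the stand-in would be replaced by an awaitable call on the agent or knowledge base.

```python
import asyncio


async def fake_query(question, delay=0.05):
    """Stand-in for an awaitable agent call; it only sleeps and echoes."""
    await asyncio.sleep(delay)
    return f"answer to: {question}"


async def ask_all(questions):
    """Run all queries concurrently and return answers in input order."""
    return await asyncio.gather(*(fake_query(q) for q in questions))


def run_demo():
    """Drive the coroutine from synchronous code, as asyncio.run() does above."""
    questions = ["How to make Thai curry?", "How to make Pad Thai?"]
    return asyncio.run(ask_all(questions))
```

Because the two stand-in calls overlap, the total wait is roughly one delay rather than the sum of both.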

Key Configuration Notes

Connection Profiles

Apply KnownConfigProfiles.WanDevelopment for both local and cloud deployments. The profile raises the SDK's default timeouts, which makes connections more tolerant of network latency.

Couchbase Params

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| bucket_name | str | Name of the Couchbase bucket | Required |
| scope_name | str | Name of the scope within the bucket | Required |
| collection_name | str | Name of the collection within the scope | Required |
| couchbase_connection_string | str | Couchbase cluster connection string | Required |
| cluster_options | ClusterOptions | Options for configuring the Couchbase cluster connection | Required |
| search_index | Union[str, SearchIndex] | Search index configuration, either as an index name or a SearchIndex definition | Required |
| embedder | Embedder | Embedder instance for generating embeddings | OpenAIEmbedder() |
| overwrite | bool | Whether to overwrite the existing collection | False |
| is_global_level_index | bool | Whether the search index is at the global level | False |
| wait_until_index_ready | Optional[float] | Time in seconds to wait until the index is ready | None |
| batch_limit | int | Maximum number of documents to process in a single batch (applies to both sync and async operations) | 500 |
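
To make the batch_limit parameter concrete, the sketch below shows what splitting a document list into batches of at most 500 looks like. It illustrates the general idea only: in_batches is a hypothetical helper, and this is not a description of how CouchbaseSearch batches documents internally.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def in_batches(items: Sequence[T], batch_limit: int = 500) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `batch_limit` long."""
    if batch_limit < 1:
        raise ValueError("batch_limit must be at least 1")
    for start in range(0, len(items), batch_limit):
        yield list(items[start:start + batch_limit])
```

With 1,200 documents and the default limit, this yields batches of 500, 500, and 200 items.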