SurrealDB Vector Database

Use SurrealDB as a vector database for your Knowledge Base.

Setup

docker run --rm \
  --pull always \
  -p 8000:8000 \
  surrealdb/surrealdb:latest \
  start \
  --user root \
  --pass root

or

./cookbook/scripts/run_surrealdb.sh
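
Either command starts a server listening on port 8000, and the examples below assume it is already reachable. As a minimal sketch (the helper name `wait_for_port` is hypothetical and not part of kern or the SurrealDB SDK), a script can poll the port with the standard library before it creates a client:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that something is listening;
            # it does not prove SurrealDB has finished starting up.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(0.5)
    return False
```

Calling `wait_for_port("localhost", 8000)` before connecting gives a clearer failure than a connection error raised from inside the client. The 30-second default is an arbitrary choice for illustration.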

Example

from kern.agent import Agent
from kern.knowledge.embedder.openai import OpenAIEmbedder
from kern.knowledge.knowledge import Knowledge
from kern.vectordb.surrealdb import SurrealDb
from surrealdb import Surreal

# SurrealDB connection parameters
SURREALDB_URL = "ws://localhost:8000"
SURREALDB_USER = "root"
SURREALDB_PASSWORD = "root"
SURREALDB_NAMESPACE = "test"
SURREALDB_DATABASE = "test"

# Create a client
client = Surreal(url=SURREALDB_URL)
client.signin({"username": SURREALDB_USER, "password": SURREALDB_PASSWORD})
client.use(namespace=SURREALDB_NAMESPACE, database=SURREALDB_DATABASE)

surrealdb = SurrealDb(
    client=client,
    collection="recipes",  # Collection name for storing documents
    efc=150,  # HNSW construction time/accuracy trade-off
    m=12,  # HNSW max number of connections per element
    search_ef=40,  # HNSW search time/accuracy trade-off
)


def sync_demo():
    """Demonstrate synchronous usage of SurrealDb"""
    knowledge_base = Knowledge(
        vector_db=surrealdb,
        embedder=OpenAIEmbedder(),
    )

    # Load data synchronously
    knowledge_base.insert(
        url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"
    )

    # Create agent and query synchronously
    agent = Agent(knowledge=knowledge_base)
    agent.print_response(
        "What are the 3 categories of Thai SELECT is given to restaurants overseas?",
        markdown=True,
    )


if __name__ == "__main__":
    # Run synchronous demo
    print("Running synchronous demo...")
    sync_demo()

Async Support ⚡

The SurrealDb vector store also supports asynchronous operations, so loading and querying can run concurrently with other work instead of blocking it.

import asyncio

from kern.agent import Agent
from kern.knowledge.embedder.openai import OpenAIEmbedder
from kern.knowledge.knowledge import Knowledge
from kern.vectordb.surrealdb import SurrealDb
from surrealdb import AsyncSurreal

# SurrealDB connection parameters
SURREALDB_URL = "ws://localhost:8000"
SURREALDB_USER = "root"
SURREALDB_PASSWORD = "root"
SURREALDB_NAMESPACE = "test"
SURREALDB_DATABASE = "test"

# Create a client
client = AsyncSurreal(url=SURREALDB_URL)

surrealdb = SurrealDb(
    async_client=client,
    collection="recipes",  # Collection name for storing documents
    efc=150,  # HNSW construction time/accuracy trade-off
    m=12,  # HNSW max number of connections per element
    search_ef=40,  # HNSW search time/accuracy trade-off
)


async def async_demo():
    """Demonstrate asynchronous usage of SurrealDb"""

    # Sign in and select the namespace/database inside the event loop
    await client.signin({"username": SURREALDB_USER, "password": SURREALDB_PASSWORD})
    await client.use(namespace=SURREALDB_NAMESPACE, database=SURREALDB_DATABASE)

    knowledge_base = Knowledge(
        vector_db=surrealdb,
        embedder=OpenAIEmbedder(),
    )

    # Load data asynchronously
    await knowledge_base.ainsert(
        url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"
    )

    # Create agent and query asynchronously
    agent = Agent(knowledge=knowledge_base)
    await agent.aprint_response(
        "What are the 3 categories of Thai SELECT is given to restaurants overseas?",
        markdown=True,
    )


if __name__ == "__main__":
    # Run asynchronous demo
    print("\nRunning asynchronous demo...")
    asyncio.run(async_demo())
Tip

Using ainsert() and aprint_response() with asyncio provides non-blocking operations, making your application more responsive under load.
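
To illustrate what non-blocking means here, the sketch below runs three awaitable calls concurrently with `asyncio.gather`. The coroutine `fake_query` is a hypothetical stand-in for an awaitable call such as `aprint_response()`; it only sleeps, so the example runs without a database or an API key.

```python
import asyncio
import time


async def fake_query(name: str, delay: float) -> str:
    """Hypothetical stand-in for an awaitable call such as aprint_response()."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return f"{name} done"


async def run_concurrently() -> tuple[list[str], float]:
    start = time.perf_counter()
    # gather() schedules all three coroutines on the event loop at once,
    # so their waits overlap instead of adding up.
    results = await asyncio.gather(
        fake_query("q1", 0.2),
        fake_query("q2", 0.2),
        fake_query("q3", 0.2),
    )
    return list(results), time.perf_counter() - start


results, elapsed = asyncio.run(run_concurrently())
print(results, f"{elapsed:.2f}s")
```

Because the three waits overlap, the total elapsed time is close to the longest single delay rather than the sum of all three. Real queries are limited by the database and the model provider, so the gain in practice depends on where the time is actually spent.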

SurrealDB Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| client | Optional[Union[BlockingWsSurrealConnection, BlockingHttpSurrealConnection]] | None | A blocking connection, either HTTP or WS |
| async_client | Optional[Union[AsyncWsSurrealConnection, AsyncHttpSurrealConnection]] | None | An async connection, either HTTP or WS |
| collection | str | "documents" | Collection name to store documents |
| distance | Distance | Distance.cosine | Distance metric to use (cosine, l2, or max_inner_product) |
| efc | int | 150 | HNSW construction time/accuracy trade-off |
| m | int | 12 | HNSW max number of connections per element |
| search_ef | int | 40 | HNSW search time/accuracy trade-off |
| embedder | Optional[Embedder] | OpenAIEmbedder() | Embedder instance for creating embeddings |
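
The three options for the distance parameter rank neighbours differently. The sketch below uses the textbook definitions in plain Python to show how they diverge on the same vectors. It is an illustration only: the function names are hypothetical, and how SurrealDB computes these metrics internally is not described here.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity: depends on direction only, not magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def l2_distance(a, b):
    """Euclidean distance: sensitive to both direction and magnitude."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Dot product: larger means more similar, and it grows with magnitude."""
    return sum(x * y for x, y in zip(a, b))


query = [1.0, 0.0]
same_direction = [2.0, 0.0]  # parallel to the query, but twice as long
orthogonal = [0.0, 1.0]      # unrelated direction, same length as the query

for name, vec in [("same_direction", same_direction), ("orthogonal", orthogonal)]:
    print(
        name,
        round(cosine_distance(query, vec), 3),
        round(l2_distance(query, vec), 3),
        round(inner_product(query, vec), 3),
    )
```

Cosine distance treats the parallel vector as identical to the query (distance 0), while L2 still reports a distance of 1 because the lengths differ. For embeddings normalised to unit length, the three metrics generally produce the same ranking.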