LanceDB Vector Database

Use LanceDB as a vector database for your Knowledge Base.

Setup

uv pip install lancedb

Example

import typer
from typing import Optional
from rich.prompt import Prompt

from kern.agent import Agent
from kern.knowledge.knowledge import Knowledge
from kern.vectordb.lancedb import LanceDb
from kern.vectordb.search import SearchType

# LanceDB Vector DB
vector_db = LanceDb(
    table_name="recipes",
    uri="/tmp/lancedb",
    search_type=SearchType.keyword,
)

# Knowledge Base
knowledge_base = Knowledge(
    vector_db=vector_db,
)

def lancedb_agent(user: str = "user"):
    agent = Agent(
        knowledge=knowledge_base,
        debug_mode=True,
    )

    while True:
        message = Prompt.ask(f"[bold] :sunglasses: {user} [/bold]")
        if message in ("exit", "bye"):
            break
        agent.print_response(message, session_id=f"{user}_session")

if __name__ == "__main__":
    # Comment out after first run
    knowledge_base.insert(
        url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"
    )

    typer.run(lancedb_agent)

Async Support ⚡

The LanceDb integration also supports asynchronous operations, so inserts and queries can run concurrently with other work instead of blocking on I/O.

# install lancedb - `pip install lancedb`
import asyncio

from kern.agent import Agent
from kern.knowledge.knowledge import Knowledge
from kern.vectordb.lancedb import LanceDb

# Initialize LanceDB
vector_db = LanceDb(
    table_name="recipes",
    uri="tmp/lancedb",  # You can change this path to store data elsewhere
)

# Create knowledge base
knowledge_base = Knowledge(
    vector_db=vector_db,
)
agent = Agent(knowledge=knowledge_base, debug_mode=True)

if __name__ == "__main__":
    # Load knowledge base asynchronously
    asyncio.run(
        knowledge_base.ainsert(
            url="https://kern-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"
        )
    )

    # Create and use the agent asynchronously
    asyncio.run(agent.aprint_response("How to make Tom Kha Gai", markdown=True))

Tip

Use the ainsert() and aprint_response() methods inside an asyncio event loop (for example via asyncio.run()) to avoid blocking on I/O in high-throughput applications.
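
Each asyncio.run() call starts and tears down its own event loop. The sketch below shows one way to drive several awaitables from a single loop instead: insert first, then let independent queries overlap with asyncio.gather. The coroutines fake_insert and fake_query are hypothetical stand-ins that only sleep, so the snippet runs without kern or LanceDB installed. In real code they would be replaced by knowledge_base.ainsert(...) and agent.aprint_response(...). Whether one Agent instance can safely serve overlapping calls is not stated here, so treat that as an assumption to verify.

```python
import asyncio


async def fake_insert(url: str) -> str:
    # Hypothetical stand-in for `await knowledge_base.ainsert(url=url)`.
    await asyncio.sleep(0.1)
    return f"inserted {url}"


async def fake_query(question: str) -> str:
    # Hypothetical stand-in for `await agent.aprint_response(question)`.
    await asyncio.sleep(0.1)
    return f"answered {question}"


async def main() -> list:
    # Load the knowledge base first, because the queries depend on it.
    await fake_insert("ThaiRecipes.pdf")
    # Independent queries can then overlap on the same event loop.
    return await asyncio.gather(
        fake_query("How to make Tom Kha Gai"),
        fake_query("Which ingredients does it need?"),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

asyncio.gather returns results in the order the awaitables were passed, regardless of which one finishes first.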

LanceDb Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| uri | str | - | The URI to connect to. |
| table | LanceTable | - | The Lance table to use. |
| table_name | str | - | The name of the table to use. |
| connection | DBConnection | - | The database connection to use. |
| api_key | str | - | The API key to use. |
| embedder | Embedder | - | The embedder to use. |
| search_type | SearchType | vector | The search type to use. |
| distance | Distance | cosine | The distance to use. |
| nprobes | int | - | The number of probes to use. |
| reranker | Reranker | - | The reranker to use. |
| use_tantivy | bool | - | Whether to use tantivy. |
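
The default distance is cosine. As a rough illustration of what that metric measures, here is the textbook definition (1 minus cosine similarity) in plain Python. This is not LanceDB's implementation, and the helper name cosine_distance is hypothetical; it assumes equal-length, non-zero vectors.

```python
import math


def cosine_distance(a: list, b: list) -> float:
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # same direction
    print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal
    print(cosine_distance([1.0, 2.0], [2.0, 4.0]))  # same direction, different length
```

Because the metric depends only on direction, scaling a vector does not change its cosine distance to another vector. Identical directions give a value near 0, orthogonal vectors give 1, and opposite directions give 2.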